ENTITY ROCm

ROCm

PulseAugur coverage of ROCm: every cluster mentioning ROCm across labs, papers, and developer communities, ranked by signal.

Total · 30d: 38 (38 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 3 (3 over 90d)

TIER MIX · 90D

significant 2
research 4
tool 25
commentary 5
meme 2

SENTIMENT · 30D

19 days with sentiment data

RECENT · PAGE 1/2 · 38 TOTAL

COMMENTARY · CL_110098 · Jun 25 · 05:07

LLMs haven't spurred competition against NVIDIA's CUDA, user asks why

The user questions why LLMs, despite their coding capabilities, haven't significantly accelerated the development of alternative software ecosystems like ROCm and Intel's stack to compete with NVIDIA's CUDA. They observ…
TOOL · CL_110107 · Jun 24 · 15:16

AMD Strix Halo NPUs Now Usable for LLM Inference with Lemonade Software

Lemonade, newly released software, enables the Neural Processing Unit (NPU) on AMD Strix Halo devices to run large language models. This allows for hybrid models that leverage b…
RESEARCH · CL_108825 · Jun 24 · 12:36

Qualcomm acquires AI chip software firm Modular for $4B

Qualcomm is acquiring chip software startup Modular for nearly $4 billion in a deal that includes $300 million for Modular employees. This acquisition aims to bolster Qualcomm's expansion beyond mobile chips into areas …
TOOL · CL_106685 · Jun 22 · 20:57

Ideogram 4 LoRA training achieved on AMD Strix Halo with ROCm

A user successfully trained an Ideogram 4 face LoRA on an AMD Strix Halo APU using ROCm and the AI-Toolkit. The process involved several AMD-specific challenges, including the incompatibility of bitsandbytes, issues wit…
TOOL · CL_102581 · Jun 21 · 11:20

AMD ships ATOM + ATOMesh for ROCm LLM serving with disaggregation

AMD has released ATOM and ATOMesh, a new LLM serving stack designed for its Instinct GPUs and ROCm software. This stack introduces a technique called prefill/decode disaggregation, which separates the compute-intensive …
RESEARCH · CL_100348 · Jun 19 · 07:32

MoonMath AI open-sources AMD MI300X attention kernel outperforming AITER v3 · 3 sources tracked

MoonMath AI has released an open-source HIP attention kernel for AMD's MI300X GPU, which reportedly outperforms AMD's own AITER v3. The kernel achieves speedups of up to 1.26x by optimizing memory placement and using on…
TOOL · CL_95288 · Jun 16 · 19:55

Ideogram 4 LoRA training detailed for AMD hardware and style examples

Users are sharing their experiences and results training LoRAs for the Ideogram 4 model, a diffusion model praised for its open-source capabilities. One user detailed the process of training a face LoRA on an AMD Strix …
COMMENTARY · CL_91744 · Jun 15 · 10:00

User asks about Linux/ROCm performance boost for AMD R9700

A user is inquiring about the potential performance gains of switching from a Windows-based AMD R9700 setup to Linux with ROCm for running Wan-2.2. They are seeking community experiences to determine if the effort is wo…
TOOL · CL_89223 · Jun 13 · 03:30

AMD launches $3999 mini-PC for local AI development

AMD has begun accepting pre-orders for its new "Ryzen AI Halo" development machine, priced at $3999 (approximately 640,000 JPY). This compact PC is designed to run large AI models, including those with up to 200 billion…
TOOL · CL_87111 · Jun 12 · 05:17

llama.cpp Releases Enhance Performance and Add New Features

The llama.cpp project has released several updates, including b9608, which features an update to cpp-httplib and provides pre-compiled binaries for various platforms like macOS, Linux, Android, and Windows. Release b960…
TOOL · CL_86315 · Jun 11 · 21:13

Step-3.7-Flash on AMD/ROCm faces context corruption and requires thinking budget

A user running the Step-3.7-Flash model on AMD hardware with ROCm has identified two key issues. First, ROCm appears to corrupt context windows beyond approximately 94,000 tokens, causing the model to loop and fail to p…
TOOL · CL_85992 · Jun 11 · 17:08

New tool visualizes NPU and iGPU activity on AMD Strix Halo

A new terminal monitoring tool called xdna-top has been released to help users visualize the activity of NPUs and iGPUs on AMD's Strix Halo processors. This tool addresses the current difficulty in tracking NPU performa…
TOOL · CL_76428 · Jun 7 · 17:14

User script boosts SDXL performance on older AMD GPUs

A user has developed a script to enable Stable Diffusion XL (SDXL) to run more efficiently on older AMD GPUs with 8GB of VRAM. The script bypasses the problematic DirectML backend on Windows, opting instead for native R…
TOOL · CL_75292 · Jun 6 · 19:01

AMD MI50 GPUs show strong performance with llama.cpp on Debian

A user on Reddit's r/LocalLLaMA shared benchmarks for AMD MI50 GPUs running llama.cpp on Debian Testing. The benchmarks, conducted using the llama-benchy tool with the unsloth/Qwen3.6-35B-A3B-GG…
MEME · CL_71129 · Jun 4 · 12:04

BC250 device performance benchmarked with custom llama.cpp setup

A user on Reddit shared performance metrics for a BC250 device running Fedora 44 with a customized llama.cpp setup. The user detailed their process of overclocking the device to 2 GHz and unlocking 40 Compute Units, whic…
TOOL · CL_69327 · Jun 3 · 16:03

Unsloth Studio adds Gemma 4 12B, new UI, and live tools

Unsloth has released a beta update (v0.1.44-beta) that includes a new chat UI, project management features, and experimental canvas capabilities. This update also integrates Google's Gemma 4 12B model, which can run loc…
COMMENTARY · CL_64914 · Jun 2 · 04:59

ROCm vs CUDA: Choosing the Right AI Development Platform

This article compares ROCm and CUDA, two prominent platforms for AI development. It details the author's personal experience attempting to train a PyTorch model on an AMD GPU using ROCm, highlighting the challenges enco…
TOOL · CL_64679 · Jun 1 · 23:25

AMD ROCm improves support for Windows Subsystem for Linux 2

AMD's ROCm platform now offers improved support for Windows Subsystem for Linux 2 (WSL2), enabling users to run Linux-based AI workloads more effectively on Windows. While this update brings the system closer to a stabl…
RESEARCH · CL_63787 · Jun 1 · 14:10

Mistral.rs boosts CUDA inference speed; non-CUDA status debated

The mistral.rs project has released version 0.8.2, reaching CUDA inference speeds up to 2.8 times those of llama.cpp on various NVIDIA GPUs. This update focuses on optimizing throughput for models l…
TOOL · CL_61830 · May 31 · 19:21

Ollama v0.30.0-rc32 improves multi-GPU support and embeddings API

Ollama has published release candidate v0.30.0-rc32, which includes several follow-up fixes and improvements for its llama-server functionality. These updates address issues with ROCm build flags for multi-GPU …