Strix Halo
PulseAugur coverage of Strix Halo — every cluster mentioning Strix Halo across labs, papers, and developer communities, ranked by signal.
-
Guide: Run LLMs on AMD NPUs with FastFlowLM on Fedora
This guide details how to run Large Language Models (LLMs) on AMD NPUs using FastFlowLM on Fedora Linux. It outlines a four-layer setup requiring building XRT, the NPU plugin, and FastFlowLM from source, as pre-built pa…
-
USB4 RDMA implementation could boost local LLM performance
An experimental implementation of Remote Direct Memory Access (RDMA) over USB4 has been demonstrated, potentially enabling high-speed data transfer between devices connected via USB4. This development, detailed in a blo…
-
GLM-5.2 performance on dual Strix Halo hardware questioned
A user on Reddit's r/LocalLLaMA community is inquiring about the performance and value of running the GLM-5.2 language model on a dual Strix Halo setup with 256GB of RAM. The post includes a link to a YouTube video, pre…
-
AMD's Strix Halo APU benchmarks reveal potential to replace discrete GPUs for AI
Leaked benchmarks for AMD's upcoming "Strix Halo" APU, specifically the Ryzen AI Max+ 395, indicate a significant performance leap. The chip achieved a Time Spy score of 10,106, surpassing the GeForce RTX 3060. This adv…
-
AMD Strix Halo NPUs Now Usable for LLM Inference with Lemonade Software
Newly released software called Lemonade enables the Neural Processing Unit (NPU) on AMD Strix Halo devices to run large language models. This allows for hybrid models that leverage b…
-
LLMKube operator fixes its own bug using a local 27B model on AMD hardware
An open-source Kubernetes operator called LLMKube, designed for self-hosted LLM inference across various hardware, has demonstrated its agentic capabilities. Its agent, Foreman, successfully identified and fixed a bug i…
-
AMD Strix Halo Desktop Challenges NVIDIA DGX Spark with Lower Price
AMD has launched the Strix Halo desktop, a new workstation designed to compete with NVIDIA's DGX Spark. Priced at $3,999, the Strix Halo aims to undercut NVIDIA's offering by $700. It features support for Windows 11 and…
-
AMD launches $3,999 mini-PC for local AI development
AMD has begun accepting pre-orders for its new "Ryzen AI Halo" development machine, priced at $3,999 (approximately 640,000 JPY). This compact PC is designed to run large AI models, including those with up to 200 billion…
-
New tool visualizes NPU and iGPU activity on AMD Strix Halo
A new terminal monitoring tool called xdna-top has been released to help users visualize the activity of NPUs and iGPUs on AMD's Strix Halo processors. This tool addresses the current difficulty in tracking NPU performa…
-
Hyperparameter search yields minor gains for speculative decoding
A user on Reddit's r/LocalLLaMA subreddit shared their experience with hyperparameter tuning for speculative decoding, specifically using the "draft-mtp" method with the Qwen3.6 27B model on a Strix Halo platform. Despi…
-
Qwen 3.6 27B FP16 vs Q8 quantization performance debated
A user on Reddit's r/LocalLLaMA subreddit is inquiring about the performance differences between FP16 and Q8 quantization for the Qwen 3.6 27B model. They are experiencing slow FP16 performance on their setup and are se…
-
DGX Spark and Strix Halo prices double amid AI hardware demand surge
Prices for DGX Spark and Strix Halo hardware have reportedly doubled, prompting speculation that demand for local, on-premises AI hardware is surging.
-
Rejected llama.cpp PR boosts MoE model speed on Strix Halo
A llama.cpp pull request that was rejected for inclusion in the main project offers a performance boost for Mixture of Experts (MoE) models on Strix Halo hardware. The modification, developed by pedapudi, can incr…
-
Arint.info adds MTP support for Strix Halo AI hardware
Arint.info has announced new support for Strix Halo, a significant development for AI hardware acceleration. This update integrates MTP (Multi-Token Prediction) capabilities, enhancing performance for AI workloads. T…
-
AMD launches Gorgon Halo chips with up to 192GB memory for LLMs
AMD has refreshed its Ryzen AI Max processor line with the new Gorgon Halo chips, which offer up to 192GB of unified memory. These processors utilize Zen 5 CPU cores, RDNA 3.5 GPU cores, and an XDNA 2 NPU, with the flag…
-
Apple M5 Mac chip benchmarks show lead over NVIDIA DGX Spark
New benchmarks indicate that Apple's upcoming M5 Mac chip may outperform NVIDIA's DGX Spark system for local AI tasks. The tests emphasize the importance of memory bandwidth for token generation speed. The comparison al…
-
AMD Ryzen AI Max+ PRO 495 APU leaks with 192GB memory, modest performance gains
AMD is reportedly developing a new flagship APU, the Ryzen AI Max+ PRO 495, which is expected to be an incremental upgrade over the existing Strix Halo lineup. Leaked benchmarks suggest a modest performance increase and…
-
Mini PCs with AMD's Ryzen AI Max+ 395 offer powerful local LLM capabilities amid price hikes
The price of mini PCs capable of running large language models locally has significantly increased, with some models seeing a 60% price hike in just six months. This surge is attributed to factors like rising LPDDR5 pri…