PulseAugur

AMD MI300X

PulseAugur coverage of AMD MI300X: every cluster that mentions the AMD MI300X across labs, papers, and developer communities, ranked by signal.

Total · 30d: 8 (8 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
Sentiment · 30d: 3 days with sentiment data

RECENT · PAGE 1/1 · 8 TOTAL
  1. TOOL · CL_106546

    MoonMath AI open-sources HIP attention kernel for AMD MI300X, beating AITER v3

    MoonMath AI has open-sourced a new bf16 forward attention kernel for AMD's MI300X GPU, written in HIP. This kernel reportedly outperforms AMD's own AITER v3 across various configurations, achieving up to a 1.26x speedup…

  2. RESEARCH · CL_100348

    MoonMath AI open-sources AMD MI300X attention kernel outperforming AITER v3 · 3 sources tracked

    MoonMath AI has released an open-source HIP attention kernel for AMD's MI300X GPU, which reportedly outperforms AMD's own AITER v3. The kernel achieves speedups of up to 1.26x by optimizing memory placement and using on…

  3. TOOL · CL_91546

    Qwen3 32B fine-tuning fails on AMD MI300X

    A fine-tuning attempt of the Qwen3 32B model on AMD MI300X hardware encountered significant issues, leading to wasted resources and a lack of learning. The process reportedly consumed $10 in GPU credits before it was re…

  4. TOOL · CL_59358

    Kog AI achieves 3,000 tokens/s LLM inference on standard GPUs

    Kog AI has launched a tech preview of its Kog Inference Engine (KIE), demonstrating significantly faster real-time LLM inference speeds on standard datacenter GPUs. The engine achieves 3,000 output tokens per second on …

  5. TOOL · CL_54717

    Triton MoE kernel achieves high performance on AMD, NVIDIA

    A new fused Mixture-of-Experts (MoE) dispatch kernel, written entirely in Triton, achieves 89-131% of the performance of Stanford's Megablocks library. This kernel notably runs on AMD MI300X hardware without any code mo…

  6. TOOL · CL_26256

    MachinaCheck uses specialized agents to ensure manufactured parts are machinable

    MachinaCheck is a novel multi-agent system designed to bridge the gap between CAD design and CNC manufacturing, ensuring parts are machinable before production. This system utilizes specialized agents that parse geometr…

  7. RESEARCH · CL_15158

    Zyphra's TSP strategy boosts LLM training throughput by 2.6x

    Zyphra has developed a new technique called Tensor and Sequence Parallelism (TSP) designed to optimize the training and inference of large transformer models. This hardware-aware strategy combines aspects of Tensor Para…

  8. RESEARCH · CL_25306

    MachinaCheck automates CNC manufacturability analysis using on-premise AI

    A new system called MachinaCheck has been developed to automate the manufacturability assessment of CNC parts, reducing the process from an hour to 30 seconds. This multi-agent AI system leverages the Qwen 2.5 7B Instru…