PulseAugur

Cuda

PulseAugur coverage of Cuda — every cluster mentioning Cuda across labs, papers, and developer communities, ranked by signal.

Total · 30d: 38 (38 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 16 (16 over 90d)
TIER MIX · 90D
RELATIONSHIPS
SENTIMENT · 30D: 3 days with sentiment data

RECENT · PAGE 2/2 · 34 TOTAL
  1. RESEARCH · CL_07063 ·

    New GPU framework accelerates quantum state calculations for complex systems

    Researchers have developed QiankunNet-cuSCI, a novel framework that fully accelerates the NNQS-SCI method for solving complex quantum systems using GPUs. This new approach addresses the scalability limitations of previo…

  2. RESEARCH · CL_10487 ·

    AMD's MI300X falls short of Nvidia in AI training due to software issues

    A recent benchmark analysis by SemiAnalysis found that AMD's MI300X GPU, despite theoretical advantages in specifications and total cost of ownership, does not compete effectively with Nvidia's H100 and H200 in training…

  3. RESEARCH · CL_06196 ·

    PointTransformerX offers portable, efficient 3D point cloud processing without sparse algorithms

    Researchers have developed PointTransformerX (PTX), a new vision transformer backbone for processing 3D point clouds that eliminates the need for custom CUDA operators. This PyTorch-native model achieves competitive acc…

  4. RESEARCH · CL_03577 ·

    llama.cpp and ik_llama.cpp add FP4 inference support for VRAM savings

    The llama.cpp and ik_llama.cpp projects have both integrated support for FP4 (4-bit floating-point) inference, a significant advancement for model quantization. llama.cpp now includes NVFP4, an Nvidia-specific format, w…
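The VRAM savings FP4 targets are easy to estimate. As a back-of-envelope sketch (not llama.cpp's actual NVFP4 storage layout, which also carries per-block scale metadata), weight memory scales linearly with bits per weight:

```python
def weight_vram_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate VRAM needed for model weights alone, ignoring
    KV cache, activations, and runtime buffers."""
    return n_params * bits_per_weight / 8 / 1024**3

params = 70e9  # a hypothetical 70B-parameter model
fp16 = weight_vram_gb(params, 16)   # ~130 GB
# Quantized formats cost slightly more than their nominal width once
# per-block scales are included; ~4.5 bits/weight is a rough figure.
fp4 = weight_vram_gb(params, 4.5)   # ~37 GB
print(f"FP16: {fp16:.1f} GB, FP4-ish: {fp4:.1f} GB")
```

Roughly a 3.5x reduction, which is what makes large models fit on a single consumer GPU.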

  5. TOOL · CL_03576 ·

    llama.cpp CUDA pull request optimizes MMQ stream-k overhead for MoE models

    A pull request to the llama.cpp project aims to reduce overhead in CUDA's MMQ stream-k operations. This optimization targets Mixture of Experts (MoE) models, potentially leading to faster prompt processing speeds. The c…

  6. FRONTIER RELEASE · CL_03105 ·

    DeepSeek releases V4 Pro and Flash models with 1M context, runs on Huawei chips

    DeepSeek has released its new V4 family of models, including V4 Pro and V4 Flash, which boast a 1 million token context window. These models were trained on 32 trillion tokens and feature a novel hybrid attention system…

  7. SIGNIFICANT · CL_05791 ·

Tianshu Zhixin cuts inference chip prices to gain market share amid revenue concerns

    Chinese AI chip designer Tianshu Zhixin reported 10.34 billion yuan in revenue for 2025, a 91.6% year-over-year increase, though this fell short of market expectations. The company's training chip series, "Tianhe," rema…

  8. FRONTIER RELEASE · CL_05793 ·

    DeepSeek V4 to launch late April with trillion parameters, Huawei Ascend chip support

    DeepSeek founder Liang Wenfeng has revealed that the company's next-generation flagship model, DeepSeek V4, is slated for release in late April. This new model is expected to feature trillion-scale parameters and a mill…

  9. TOOL · CL_18066 ·

    AI coding assistants like Claude reignite passion for older developers

    Several older developers are finding renewed passion for coding due to AI coding assistants like Claude Code. These tools allow them to focus on architectural design and problem-solving without getting bogged down in th…

  10. TOOL · CL_17743 ·

    PHP-ORT brings machine learning inference to PHP developers

    A new infrastructure project called PHP-ORT aims to bring machine learning inference capabilities directly to PHP, the server-side language used by a significant portion of the web. This development seeks to empower mil…

  11. TOOL · CL_17711 ·

    ParaQuery launches GPU-accelerated Spark SQL for cost-efficient data processing

    ParaQuery, a new startup, has launched a GPU-accelerated Spark and SQL data processing solution. The platform aims to offer cost and performance benefits over existing solutions like Google BigQuery. ParaQuery leverages…

  12. TOOL · CL_17783 ·

    NetHack ML model performance plummets 40% due to mysterious bug

    Researchers Bartłomiej Cupiał and Maciej Wołczyk observed a significant performance drop in their neural network trained to play NetHack. The model, which had been consistently scoring around 5,000 points, suddenly bega…

  13. SIGNIFICANT · CL_00880 ·

George Hotz's tiny corp unveils $15K AI computer and RISC-inspired tinygrad framework

    George Hotz's company, tiny corp, has launched the tinybox, a $15,000 personal AI computer designed for local model training and inference. The tinybox boasts 738 FP16 TFLOPS and 144 GB of GPU RAM, capable of running a …

  14. COMMENTARY · CL_04729 ·

    Eugene Yan: MOOCs offer diminishing returns; real learning comes from doing

    Eugene Yan argues that while Massive Open Online Courses (MOOCs) can be useful for initial learning, they often lead to diminishing returns and can even become a form of procrastination. He suggests that true learning, …