CUDA
PulseAugur coverage of CUDA — every cluster mentioning CUDA across labs, papers, and developer communities, ranked by signal.
- developed by NVIDIA 100%
- used by NVIDIA H100 90%
- used by CPP 90%
- competes with ROCm 80%
- competes with Huawei Ascend 80%
- used by graphics processing unit 70%
- used by Vulkan 70%
- used by Apple Silicon 70%
- used by Metal 70%
- used by Microsoft Windows 70%
- instance of graphics processing unit 70%
- used by vLLM 70%
- Old NVIDIA Kepler GPUs resurrected for modern LLM inference
A technical project has enabled modern large language models (LLMs) to run on an older NVIDIA Kepler GPU, specifically a GTX 770, from a generation typically considered obsolete. This was achieved by patching the pro…
- New system KernelPro autonomously optimizes GPU kernel code using LLMs
Researchers have developed KernelPro, an autonomous system designed to optimize GPU kernel code for large language models. This system integrates LLM code generation with hardware profiler feedback and specialized analy…
- audio.cpp framework offers faster audio model inference
A new C++ inference framework called audio.cpp has been developed, built on top of ggml, to run various audio models including TTS, ASR, and voice conversion. The framework aims to consolidate multiple audio models into…
- Run Alibaba's Qwen LLM locally and offline with Off Grid AI Desktop
Off Grid AI Desktop is a new, free, open-source application that allows users to run Alibaba Group's Qwen large language models locally on their personal computers. This enables offline, private AI interactions, with th…
- Run Google's Gemma LLM Locally with New Open-Source App
A new open-source application called Off Grid AI Desktop allows users to run Google's Gemma language models locally on their Mac or Windows computers. This approach prioritizes user privacy by keeping all prompts and da…
- Run LLMs locally on Windows and Mac with Off Grid AI Desktop
Off Grid AI Desktop is a new, free, open-source application that allows users to run large language models locally on their Windows PCs or Macs. The software supports offline use, eliminating the need for subscriptions …
- User asks why LLMs haven't spurred competition against NVIDIA's CUDA
The user questions why LLMs, despite their coding capabilities, haven't significantly accelerated the development of alternative software ecosystems like ROCm and Intel's stack to compete with NVIDIA's CUDA. They observ…
- New GPU-accelerated MPC solver TurboMPC achieves significant speedups
Researchers have developed TurboMPC, a novel model predictive control (MPC) solver designed for efficient execution on GPUs. This solver supports complex robotic applications by handling state and control inequality con…
- Developer boosts C LLM inference speed by 25x, hitting DRAM limits
A developer details the process of optimizing a C-based LLM inference engine, Project Zero, to achieve significantly faster performance on CPUs. Initially running BitNet b1.58 at 1.4 tokens/second, the project evolved o…
- Nvidia leads data center Ethernet switching market with integrated AI platforms
Nvidia has rapidly ascended to become the leading vendor in the data center Ethernet switching market, capturing a 21.5% share in Q1 2026. This growth is attributed to Nvidia's strategy of selling networking as an integ…
- Qualcomm acquires AI chip software firm Modular for $4B
Qualcomm is acquiring chip software startup Modular for nearly $4 billion in a deal that includes $300 million for Modular employees. This acquisition aims to bolster Qualcomm's expansion beyond mobile chips into areas …
- China black market Nvidia GPU prices surge amid import bans
Prices for Nvidia's A100 server GPUs have tripled on the Chinese black market, reaching up to $82,000, due to a U.S. smuggling crackdown and China's customs freeze on approved chips. This has led buyers to repurpose gam…
- New Neural Particle Automata Learn Self-Organizing Dynamics
Researchers have introduced Neural Particle Automata (NPA), a novel framework that extends Neural Cellular Automata (NCA) to dynamic particle systems. Unlike traditional NCA, NPA treats each cell as a particle with cont…
- New PyTorch CUDA operator speeds up knowledge graph embedding updates
Researchers have developed FuseSampleAgg, a novel PyTorch CUDA operator designed to optimize knowledge graph (KG) embedding updates. This new operator streamlines the neighborhood estimation process by fusing sampling a…
- Frontier LLMs struggle with multi-GPU kernel generation, new benchmark reveals
A new benchmark called ParallelKernelBench (PKB) has been developed to evaluate the ability of frontier large language models to generate efficient multi-GPU kernels. Testing models like GPT-5.5, Gemini 3 Pro, and Opus …
- Qualcomm to Acquire AI Startup Modular for $4 Billion
Qualcomm is acquiring AI startup Modular for approximately $4 billion, aiming to bolster its AI infrastructure ambitions. This move will integrate Modular's AI-native software platform, which allows AI models to run eff…
- Teenager builds fully local AI assistant O-AI for privacy
A 16-year-old developer from Pune, India, has created O-AI, a fully local AI desktop assistant designed for privacy and offline functionality. The assistant runs large language models and voice recognition entirely on t…
- Alienware Aurora R16 gaming PC with RTX 5070 drops to $1,479
Alienware is offering a significant discount on its Aurora R16 gaming desktop, reducing the price to $1,479. This configuration includes an RTX 5070 GPU and an Intel Core Ultra 7 265F CPU, making it capable of 4K gaming…
- Moebius image inpainting model ported to browser using Claude Code
Simon Willison successfully ported the Moebius 0.2B image inpainting model to run in a web browser using Claude Code. The process involved converting the model to ONNX format and leveraging WebGPU for browser-based exec…
- Open-source Nvidia Vulkan driver NVK adds experimental DLSS support on Linux
The open-source Vulkan driver NVK, developed for Nvidia GPUs on Linux, has introduced experimental support for Nvidia's DLSS upscaling technology. This integration is achieved by loading pre-compiled CUDA binaries direc…