graphics processing unit
PulseAugur coverage of graphics processing unit — every cluster mentioning graphics processing unit across labs, papers, and developer communities, ranked by signal.
- used by Vulkan 90%
- used by Triton 90%
- used by central processing unit 70%
- competes with Tensor Processing Unit 70%
- competes with application-specific integrated circuit 70%
- instance of high-performance computing 70%
- uses data processing unit 70%
- used by H.1000 Gnome 70%
- used by Innu-aimun 70%
- used by SemiAnalysis 70%
- competes with Cerebras Systems 70%
- used by AI inference 70%
-
New lossless compression speeds up ML training and inference
Researchers have developed a new lossless compression algorithm called Invariant Bit Packing (IBP) to address GPU memory limitations in machine learning. IBP identifies and removes redundant bits across tensor groups, e…
-
DriftSched improves LLM inference efficiency with adaptive scheduling
Researchers have developed DriftSched, a framework to improve the efficiency of multi-tenant GPU inference for large language models. This system addresses the challenge of runtime token drift, where actual output lengt…
-
r/LocalLLaMA users seek multi-GPU power and cooling solutions
Users on the r/LocalLLaMA subreddit are seeking advice on managing power and cooling for multi-GPU setups. One user is concerned about insufficient power cables for an RTX 3090 Ti and an additional RTX 3080, exploring o…
-
User builds 4-GPU PC for local LLM inference
A user details their experience building a personal computer equipped with four GPUs specifically for running large language models locally. The article aims to fill a perceived gap in Russian-language online resources …
-
Ricoh releases free LLM safeguard model; NVIDIA integrates GaN tech for efficiency
Ricoh has released a free safeguard model designed to detect harmful information in the input and output of large language models. This model aims to enhance the safety and security of AI systems. Separately, NVIDIA is …
-
AI infrastructure efficiency is key to avoid wasted computing power
An AI expert points out that without proper utilization, high-performance computing resources can become inefficient, akin to 'AI space heaters' that consume excessive electricity. This highlights the rapidly evolving l…
-
Reddit user analyzes GPU specs for LLM prefill performance
A Reddit user on r/LocalLLaMA has analyzed various GPUs and machines for their suitability in running large language models, emphasizing the importance of prefill performance over raw generation speed. The analysis sugg…
-
llama.cpp users seek VRAM optimization beyond tensor-split
A user on the r/LocalLLaMA subreddit is seeking more efficient methods for optimizing VRAM usage with llama.cpp, particularly for Mixture of Experts (MoE) models across multiple GPUs. They currently rely on manual adjus…
-
Mirantis k0rdent AI adds tools for GPU monetization
Mirantis has updated its k0rdent AI platform with new features designed to help companies monetize their GPU computing power. The update, released on May 14, 2026, introduces tools for tracking and managing GPU usage, e…
-
AI Infrastructure Buildout Faces Power, Supply Chain, and Investment Challenges
The AI infrastructure buildout is entering a more complex phase, with vendors and operators facing challenges in power, supply chains, and geopolitical risks. Companies are shifting focus from solely GPUs to broader AI …
-
Podcast explores brain-like AI hardware beyond GPUs
A podcast episode from "The Neuron: AI Explained" discusses the future of AI hardware beyond traditional GPUs. The episode features Great Sky's perspective on developing brain-like AI architectures. It explores potentia…
-
Memristor SNN accelerator slashes energy use for edge AI
Researchers have developed a novel memristor-based accelerator designed to enhance the energy efficiency of spiking neural networks (SNNs). This analog accelerator integrates in-memory computation with neuron functional…
-
AI chip startup XCENA raises $135M for memory-centric architecture
XCENA, a startup focused on AI infrastructure, has raised $135 million in Series B funding at a $570 million valuation. The company is developing a new chip architecture that aims to reduce AI's memory bottleneck by pla…
-
China redesigns AI chip industry amid US export curbs
China's AI chip industry is undergoing a significant redesign in response to US export controls, pushing domestic companies to explore alternatives to Nvidia's dominant GPUs. Major players like Huawei and Cambricon are …
-
GPU acceleration speeds up GP-GOMEA symbolic regression
Researchers have developed a GPU-accelerated version of GP-GOMEA, an evolutionary algorithm for symbolic regression. This new approach significantly increases the speed of fitness evaluations, allowing for more complex …
-
Stable Diffusion user struggles with CPU-only processing on AMD GPU
A user is experiencing significant performance issues with Stable Diffusion, where the software consistently utilizes the CPU instead of the GPU, even when using AMD-specific tools like SDNext. Despite having an AMD RX …
-
Financial giants launch AI token and GPU rental futures markets
Financial institutions are developing new markets for AI tokens, similar to how gold and oil are traded. The Shanghai Futures Exchange is designing a derivatives market for AI tokens, while CME Group and Intercontinenta…
-
AI datacenters without GPUs spark debate on future utility
The tech industry is facing a peculiar challenge with AI datacenters that lack GPUs, prompting questions about their future utility. This situation has led to discussions about repurposing existing infrastructure, such …
-
Trillion-parameter AI models challenge Kubernetes orchestration
Running trillion-parameter AI models within Kubernetes clusters presents significant challenges beyond standard container orchestration. These massive models require distributed systems approaches, where a single 'repli…
-
LLMs are language calculators, not true AI, author argues
An LLM is no more impressive than any other complex GPU computation, such as rendering a realistic 3D game. Both require massive parallel processing and are equally amazing feats of technology. The author argues that LL…