graphics processing unit
PulseAugur coverage of graphics processing unit — every cluster mentioning graphics processing unit across labs, papers, and developer communities, ranked by signal.
- used by Vulkan 90%
- used by Triton 90%
- used by central processing unit 70%
- competes with Tensor Processing Unit 70%
- competes with application-specific integrated circuit 70%
- competes with Apple Neural Engine 70%
- instance of high-performance computing 70%
- used by AI inference 70%
- used by H.1000 Gnome 70%
- used by Innu-aimun 70%
- competes with Cerebras Systems 70%
- used by SemiAnalysis 70%
-
FlashSinkhorn solver accelerates optimal transport on GPUs
Researchers have developed FlashSinkhorn, a new GPU-accelerated solver for entropic optimal transport (EOT) that significantly reduces memory input/output operations. By rewriting stabilized log-domain Sinkhorn updates …
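For orientation, the stabilized log-domain Sinkhorn updates mentioned above can be sketched in plain Python. This is the textbook baseline iteration, not FlashSinkhorn's rewritten GPU version; the function names, the toy marginals, and the values of `eps` and `iters` are all illustrative assumptions.

```python
import math

def logsumexp(values):
    # Subtract the max before exponentiating to avoid overflow/underflow.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sinkhorn_log(a, b, cost, eps=0.5, iters=100):
    """Textbook log-domain Sinkhorn (hypothetical helper, not the paper's code).

    a, b  : source/target marginals (positive, each summing to 1)
    cost  : n x m cost matrix as nested lists
    Returns the transport plan P with P[i][j] = a_i b_j exp((f_i + g_j - C_ij) / eps).
    """
    n, m = len(a), len(b)
    log_a = [math.log(x) for x in a]
    log_b = [math.log(x) for x in b]
    f = [0.0] * n  # dual potential for rows
    g = [0.0] * m  # dual potential for columns
    for _ in range(iters):
        # Row update: enforce row marginals given the current g.
        f = [-eps * logsumexp([log_b[j] + (g[j] - cost[i][j]) / eps
                               for j in range(m)]) for i in range(n)]
        # Column update: enforce column marginals given the new f.
        g = [-eps * logsumexp([log_a[i] + (f[i] - cost[i][j]) / eps
                               for i in range(n)]) for j in range(m)]
    return [[math.exp(log_a[i] + log_b[j] + (f[i] + g[j] - cost[i][j]) / eps)
             for j in range(m)] for i in range(n)]

# Toy problem (assumed values): two source points, two target points.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_log(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

Each iteration touches the full cost matrix twice, which is the kind of memory traffic a GPU solver would want to cut down.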
-
LLMs and new frameworks boost GPU kernel optimization
Researchers are exploring novel ways to optimize GPU kernel performance for large language models. One approach uses language models as surrogates to predict kernel performance, significantly increasing the number of ca…
-
New framework speeds up discrete optimization on GPUs
Researchers have developed a new CPU-GPU framework to accelerate optimization problems with discrete variables, which have historically been challenging for GPUs. This framework processes branch and bound nodes in batch…
-
AI industry pivots to token economics, focusing on inference computing centers
The AI industry is shifting its focus from model parameters to computational efficiency, with "token economics" emerging as a new value unit. This transition is driving demand for "token factories" – intelligent computi…
-
New MTP technique speeds AI token generation but needs more VRAM
A new method called MTP (Multi-Token Prediction) has been developed to accelerate token generation in AI models. This technique involves predicting multiple future tokens simultaneously and then having the main model ve…
-
AI chip investor prioritizes product definition over tech
Li Yang, a partner at SenseTime Guoxiang Capital, discusses the AI chip investment landscape, emphasizing that product definition and future use cases are more critical than technology alone. He highlights the shift fro…
-
GPUs Emerge as New Data Storage Paradigm, Mirroring Early Database Challenges
The article posits that GPUs are becoming the new databases, drawing parallels to the early days of database management. Just as teams fumbled through early database adoption, they are now navigating the complexities of…
-
vLLM production guide details key config decisions for performance
This article provides a guide for optimizing vLLM deployments, focusing on three critical configuration decisions that impact performance and cost. It details how static KV cache allocation can lead to GPU out-of-memory…
-
AI development costs rise, potentially slowing industry boom
The escalating costs of AI development, particularly for advanced hardware like GPUs, are beginning to strain the rapid expansion of the AI industry. This price surge, driven by high demand and limited supply, could pot…
-
NYSE owner plans futures market for AI computing power as AI reshapes jobs
Intercontinental Exchange, the parent company of the New York Stock Exchange, is planning to launch futures contracts for computing power, specifically focusing on GPUs. This initiative, in partnership with Ornn, aims t…
-
Alibaba Cloud pivots to high-margin AI token revenue amid investor scrutiny
Alibaba's cloud division is facing scrutiny over its AI strategy, with investors closely monitoring its token revenue growth as a key indicator of future profitability. While AI compute sales offer high revenue, they yi…
-
Mahjong RL simulator Mahjax achieves 2M steps/sec on GPUs
Researchers have developed Mahjax, a new GPU-accelerated simulator for the complex game of Riichi Mahjong, implemented in JAX. This tool is designed to facilitate reinforcement learning research, particularly for agents…
-
New framework models pump deterioration for targeted infrastructure management
Researchers have developed a new framework for causal discovery in infrastructure management, focusing on pump equipment deterioration. This method combines Bayesian hierarchical hazard modeling with causal discovery to…
-
GEM framework optimizes MoE AI model GPU mapping for faster inference
Researchers have developed GEM, a framework designed to optimize the mapping of experts to GPUs in Mixture-of-Expert (MoE) AI models. This new approach accounts for variability in GPU performance, aiming to reduce infer…
-
Baidu CFO: AI infrastructure too hard to build, cloud providers to profit
Baidu's CFO stated that building AI infrastructure is prohibitively difficult, leading to cloud providers capitalizing on the situation. This difficulty stems from the high costs and complexity associated with AI hardwa…
-
AI adoption faces infrastructure, legal, and automation hurdles globally
Manus has launched Scheduled Tasks 2.0, transforming basic reminders into intelligent agents capable of maintaining context and autonomously updating web applications. Meanwhile, the United Arab Emirates aims to generat…
-
KV Cache Optimization Solves LLM GPU Memory Bottleneck
Large language models (LLMs) face a significant bottleneck in serving efficiency due to the memory demands of KV cache, which stores intermediate attention calculations. This KV cache, essential for enabling faster resp…
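To give a sense of scale, the KV cache footprint can be estimated with simple arithmetic. The sketch below uses the common sizing formula (keys plus values, per layer, per KV head, per token); the function name and the model shape are assumptions chosen for illustration, not figures from the article.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_value=2):
    """Estimate KV cache size in bytes (hypothetical helper).

    The leading factor of 2 counts one key tensor and one value tensor.
    bytes_per_value=2 assumes 16-bit storage (fp16 or bf16).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)

# Assumed, illustrative model shape: 32 layers, 32 KV heads of dimension 128,
# one sequence of 4096 tokens stored in 16-bit precision.
single = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                        seq_len=4096, batch_size=1)
print(single / 2**30, "GiB per sequence")

# The cache grows linearly with the number of concurrent sequences.
batch_of_16 = kv_cache_bytes(32, 32, 128, 4096, batch_size=16)
print(batch_of_16 / 2**30, "GiB for 16 sequences")
```

Under these assumptions a single 4096-token sequence needs 2 GiB and sixteen need 32 GiB, which is why the cache, rather than the weights, often limits how many requests a GPU can serve at once.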
-
US eyes non-GPU hardware for supercomputers amid AI security concerns
The US government is exploring alternative hardware for its next major supercomputer, potentially moving beyond traditional GPUs. This exploration is driven by the accelerating adoption of AI and the associated security…
-
AI investment shifts from GPU training to inference infrastructure
The AI industry's investment focus is shifting from GPU manufacturing for model training to the infrastructure required for inference. As AI tools become more integrated into daily operations, the demand for continuous …
-
Nvidia releases open Ising quantum AI models for qubit calibration
Nvidia has released open-source Ising quantum AI models designed to automate and improve the calibration of quantum processors. These models, which include a vision-language model for proposing calibration actions and C…