ENTITY graphics processing unit

PulseAugur coverage of the graphics processing unit (GPU): every cluster mentioning GPUs across labs, papers, and developer communities, ranked by signal.

Total · 30d

223

223 over 90d

Releases · 30d

0 over 90d

Papers · 30d

69 over 90d

TIER MIX · 90D

significant 13
research 49
tool 114
commentary 39
meme 8

SENTIMENT · 30D

29 days with sentiment data

RECENT · PAGE 8/10 · 200 TOTAL

COMMENTARY · CL_25098 · May 10 · 14:17

Commentator calls AI boom a 'giant con' reliant on hyperscalers

Tech commentator Ed Zitron argues that the current AI boom, particularly for companies like OpenAI and Anthropic, is an unsustainable "con" propped up by hyperscalers. He believes this reliance on massive infrastructure…

COMMENTARY · CL_25028 · May 10 · 13:03

GPU memory bandwidth crucial for local LLM speed, outweighing VRAM capacity

For running large language models locally, GPU memory bandwidth is a more critical factor than VRAM capacity. Higher bandwidth allows the GPU to process data more quickly, preventing it from being bottlenecked while wai…

TOOL · CL_27741 · May 9 · 08:27

New GPU solver cuRegOT accelerates optimal transport for machine learning

Researchers have developed cuRegOT, a new GPU-accelerated solver designed to overcome the computational challenges of optimal transport (OT) in large-scale machine learning applications. The solver addresses the limitat…

TOOL · CL_23767 · May 9 · 04:08

Mac mini outperforms expensive workstations running large AI models

A $1,999 Mac mini equipped with Apple Silicon can run a 70-billion parameter AI model, outperforming a $4,000 Windows workstation. This is attributed to Apple's unified memory architecture, which eliminates VRAM and PCI…

SIGNIFICANT · CL_22646 · May 8 · 08:12

Kunluncore files for dual IPO, touts China's first 32K GPU AI cluster

Kunluncore, an AI chip spinoff from Baidu, has officially filed for an IPO on Shanghai's STAR Market, alongside a concurrent filing for a Hong Kong listing on January 1st. The company announced its P800 GPU cluster, fea…

TOOL · CL_21942 · May 8 · 04:00

HCInfer system enables LLMs on resource-constrained devices with error compensation

Researchers have developed HCInfer, a novel inference system designed to enable large language models (LLMs) to run efficiently on devices with limited memory. This system offloads parts of the model's compensation mech…

SIGNIFICANT · CL_21710 · May 8 · 01:45

Rongxin Zhiyuan raises hundreds of millions for GPU-centric AI architecture

Rongxin Zhiyuan, an AI infrastructure company founded by Tsinghua University alumni, has secured hundreds of millions of yuan in an angel funding round. The company is developing its novel AGC architecture, which positi…

COMMENTARY · CL_21661 · May 8 · 00:43

Galaxy Securities: token consumption to surge, benefiting AIDC, telcos, fiber optics, and optical modules

Galaxy Securities predicts a significant increase in token consumption, driven by the growing demand for AI inference and rapid iteration of large language models. This surge is expected to accelerate growth across four…

TOOL · CL_21330 · May 7 · 15:59

AWS offers EC2 Capacity Blocks for short-term GPU needs

Amazon Web Services (AWS) is introducing EC2 Capacity Blocks for Machine Learning (ML) and SageMaker training plans to address the scarcity of GPU capacity. These new options allow customers to secure short-term GPU res…

RESEARCH · CL_23761 · May 6 · 17:45

Modal boosts multimodal inference performance by over 10% with Python dict

Modal has identified a performance bottleneck in multimodal inference engines like SGLang, which can hinder GPU utilization. By profiling the scheduler, they discovered that expensive bookkeeping for shared GPU memory c…

RESEARCH · CL_20462 · May 6 · 14:18

New benchmark reveals LLM-generated GPU kernels struggle with correctness and efficiency

A new benchmark called KernelBench-X has been developed to evaluate the capabilities of large language models in generating GPU kernels. The benchmark, which covers 176 tasks across 15 categories, reveals that task stru…

TOOL · CL_19446 · May 6 · 13:58

AMD EPYC CPUs show competitive performance for LLM and TTS inference workloads

A recent analysis by Leaseweb benchmarks the performance of AMD EPYC 9334 CPUs for Large Language Model (LLM) and Text-to-Speech (TTS) inference workloads. The study reveals that while GPUs offer higher throughput, CPUs…

TOOL · CL_19402 · May 6 · 12:56

AI assists in developing Pascal version of LAPACK, aiming for GPU acceleration

A user on Mastodon is collaborating with GitHub Copilot to develop a Pascal version of the LAPACK numerical library, which is approximately 30% complete. They anticipate reaching 80% completion within two days and plan …

RESEARCH · CL_20517 · May 6 · 10:02

New tool cuts GPU memory use in AI training by optimizing optimizer states

Researchers have developed a Budget-Aware Optimizer Configurator (BAOC) to address the significant GPU memory consumption during large-scale model training. BAOC intelligently assigns different optimizer configurations …

TOOL · CL_19074 · May 6 · 09:22

AI image generation: CPU vs GPU performance and scaling insights

This article explores the performance differences between CPUs and GPUs when generating AI-created images and videos. The author shares their experience using these components for digital art creation, highlighting that…

RESEARCH · CL_19066 · May 6 · 08:30

Memory giants push new MRDIMM standard for AI, HPC servers

Major memory manufacturers Samsung Electronics, SK Hynix, and Micron are nearing completion of the next-generation server DRAM module standard, MRDIMM. This new standard is optimized for AI and high-performance computin…

TOOL · CL_18835 · May 6 · 04:00

New Polar Express method accelerates matrix decomposition for deep learning

Researchers have developed a new GPU-friendly algorithm called Polar Express for computing matrix decompositions, which is crucial for the Muon optimizer used in training deep neural networks. This method optimizes for …

TOOL · CL_18603 · May 6 · 04:00

VUDA system enables spatial sharing of compute and graphics on GPUs

Researchers have developed VUDA, a system designed to enhance GPU utilization by enabling simultaneous execution of CUDA compute and Vulkan graphics workloads. This is achieved by breaking down the isolation between the…

RESEARCH · CL_18441 · May 6 · 03:49

Lumentum CEO: AI component demand outstrips supply, orders booked until 2028

Lumentum, a major US optical module manufacturer, reported a record-breaking third fiscal quarter with revenue soaring 90% year-over-year to $808 million. The company also saw significant improvements in profitability, …

RESEARCH · CL_18429 · May 6 · 03:15

AI boom creates volatile market for video game hardware

The burgeoning AI industry is creating unprecedented demand for high-end graphics cards, significantly impacting the video game hardware market. This surge in demand is leading to shortages and price increases for GPUs,…
