ENTITY graphics processing unit

graphics processing unit

PulseAugur coverage of graphics processing unit — every cluster mentioning graphics processing unit across labs, papers, and developer communities, ranked by signal.


Total · 30d

223

223 over 90d

Releases · 30d

0 over 90d

Papers · 30d

69 over 90d

TIER MIX · 90D

significant 13
research 49
tool 114
commentary 39
meme 8

SENTIMENT · 30D

30 day(s) with sentiment data

RECENT · PAGE 5/10 · 200 TOTAL

RESEARCH · CL_54692 · May 27 · 13:00

Broadcom, FuriosaAI partner on Ethernet AI inference platform

Broadcom and FuriosaAI have partnered to develop a rack-scale inference platform that aims to move AI infrastructure away from GPU-centric designs. This collaboration integrates FuriosaAI's processor architecture with B…
RESEARCH · CL_54587 · May 27 · 11:50

Loongson Technology to raise 2.3B yuan for advanced chip R&D

Loongson Technology plans to raise up to 2.3 billion yuan to fund the R&D and industrialization of chips using Xnm process technology. The funds will also support the development of key CPU and GPU technologies. This in…
TOOL · CL_54589 · May 27 · 11:39

Sifangda begins small-batch supply of diamond heat sinks after client tests

Sifangda has successfully passed testing for its diamond heat dissipation sheets with an overseas client and has begun small-batch supply. These CVD diamond sheets boast a thermal conductivity exceeding 2000W/(m·K), mak…
COMMENTARY · CL_54300 · May 27 · 09:12

AI infrastructure advances target GPU savings and agent system standards

This ML digest covers advances in AI infrastructure, focusing on cutting GPU costs by a factor of 2.5 and optimizing AI for backend operations. It explores new standards for agent systems and addresses challenges in depl…
TOOL · CL_53742 · May 27 · 04:00

New Qrita Algorithm Boosts LLM Sampling Efficiency

Researchers have developed Qrita, a novel algorithm designed to enhance the efficiency of Top-k and Top-p sampling in large language models. By employing Gaussian-based sigma-truncation and a quaternary pivot search, Qr…
RESEARCH · CL_53149 · May 26 · 20:39

AI infrastructure buyers reserve $4B in cooling capacity years ahead

Modine has secured a significant deal worth over $4 billion with an unnamed AI infrastructure customer, which includes a $165 million upfront payment to finance manufacturing expansion. This agreement, extending through…
RESEARCH · CL_51883 · May 26 · 07:42

Kubernetes enhances GPU management with Dynamic Resource Allocation

Kubernetes has evolved its GPU management capabilities beyond simply counting devices. The new Dynamic Resource Allocation (DRA) feature allows for more granular control, enabling specific resource profiles, memory allo…
RESEARCH · CL_51723 · May 26 · 05:57

Sakura Internet boosts AI-driven capex amid Japan's growing demand

Sakura Internet, a Japanese data center and cloud service provider, is significantly increasing its capital expenditure for the 2026 fiscal year. This boost, potentially reaching up to 30 billion yen, is driven by the s…
TOOL · CL_50074 · May 25 · 19:54

Detecting GPU Waste in Kubernetes Clusters

This article discusses how to identify and address GPU waste within Kubernetes clusters, a problem that often goes unnoticed due to seemingly healthy utilization metrics. It highlights that inefficient GPU usage can occ…
RESEARCH · CL_46675 · May 24 · 05:00

Chinese AI startups secure over $15B in Q1 funding

In the first quarter, China's AI sector raised over 110 billion yuan, with funding for domestic large language models surging. Companies such as Moonshot AI and Jieyue Xingchen (StepFun) secured over 30 billion yuan…
COMMENTARY · CL_45778 · May 23 · 11:03

LLM inference: CPU vs GPU trade-offs detailed for local deployments

This article explores the practical differences between CPU and GPU inference for large language models (LLMs) using the llama.cpp framework. It highlights that while GPUs offer superior speed, CPUs can be a viable alte…
COMMENTARY · CL_45250 · May 22 · 22:06

Anyscale details Ray Data for scaling multimodal AI data pipelines

Anyscale's blog post details challenges in scaling multimodal AI data pipelines, where preprocessing often starves GPUs, leading to underutilization. The article explains that traditional staged batch execution, which i…
COMMENTARY · CL_45145 · May 22 · 19:40

AI pricing shifts to flexible models amid rising hardware and operational costs

The existing fixed pricing models for AI services are becoming unsustainable due to rising inference costs and increased usage. Surging prices for GPUs and High Bandwidth Memory (HBM), coupled with higher power and cool…
TOOL · CL_44371 · May 22 · 16:01

Modal launches autoscaling GPUs for AI research agents

Modal has introduced an autoscaling feature for GPUs designed to support AI research agents. This new capability allows agents to dynamically provision and release compute resources as needed, addressing the challenge o…
TOOL · CL_44370 · May 22 · 16:01

Modal achieves serverless GPUs for AI inference in seconds

Modal has developed a system to achieve truly serverless GPUs for AI inference, addressing the challenge of rapidly scaling resources to meet variable demand. Their approach involves maintaining cloud buffers of idle GP…
SIGNIFICANT · CL_44315 · May 22 · 14:17

Anker launches AI chip, Poland seeks EU AI factory, California passes automation law

Anker is entering the processor market with its new Thus chip, which uses compute-in-memory architecture to deliver 150x more AI processing power for its upcoming Soundcore headphones. Meanwhile, Poland is vying for a B…
RESEARCH · CL_43614 · May 22 · 07:29

Shenmou targets wireless cameras with ultra-low-power chips

Shenmou, led by Yang Zuoxing, is developing ultra-low-power chip designs to free cameras from wires, envisioning a future with billions of smart visual terminals. Their first-generation chip achieves one-third the indus…
RESEARCH · CL_43418 · May 22 · 05:38

Stanford's ThunderKittens DSL optimizes AI kernel performance

A new article details ThunderKittens, a compact domain-specific language (DSL) developed at Stanford's Hazy Research Lab for creating high-performance AI kernels. The DSL aims to strike a balance between research produc…
RESEARCH · CL_43372 · May 22 · 04:22

LLM reliability and cost-efficiency drive new infrastructure solutions

The integration of Large Language Models (LLMs) into professional workflows is shifting from experimental use to essential tooling, emphasizing collaboration rather than automation. However, the reliability of these LLM…
TOOL · CL_45008 · May 22 · 04:00

WarmServe system prewarms GPUs for faster multi-LLM serving

Researchers have developed WarmServe, a new system designed to improve the efficiency of serving multiple large language models (LLMs) on shared GPU clusters. WarmServe utilizes a one-for-many GPU prewarming strategy, p…
