Entity: graphics processing unit

graphics processing unit

PulseAugur coverage of graphics processing unit — every cluster mentioning graphics processing unit across labs, papers, and developer communities, ranked by signal.


Total · 30 days

134

Last 90 days: 134

Releases · 30 days

Last 90 days: 0

Papers · 30 days

Last 90 days: 48

Tier distribution · 90 days

significant 9
research 35
tool 64
commentary 23
meme 3

Relationships

Sentiment · 30 days

18 days with sentiment data

Recent · Page 5 of 7 · 134 items total

RESEARCH · CL_15670 · May 5 · 04:00

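New HERMES and DSCache methods improve streaming video understanding through KV caching

Researchers have developed new methods to make multimodal large language models (MLLMs) more efficient at understanding streaming video. One method, HERMES, conceptualizes the KV cache as a hierarchical memory system, achieving faster processing and higher accuracy with less memory. The other, DSCache, decouples past and present KV caches and uses position-agnostic encoding to handle unbounded streams and to generalize to sequences longer than those seen during training.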
RESEARCH · CL_15158 · May 4 · 23:15

Zyphra's TSP strategy boosts LLM training throughput by 2.6x

Zyphra has developed a new technique called Tensor and Sequence Parallelism (TSP) designed to optimize the training and inference of large transformer models. This hardware-aware strategy combines aspects of Tensor Para…
RESEARCH · CL_14976 · May 4 · 21:05

NVIDIA cuOpt and OpenAI achieve breakthroughs in supply chain and voice AI

NVIDIA is enhancing supply chain decision systems with its cuOpt technology, which combines agentic AI with GPU acceleration for real-time, large-scale planning. Separately, OpenAI has achieved low-latency voice AI, del…
TOOL · CL_14833 · May 4 · 16:05

AWS SageMaker adds automatic instance fallback for AI endpoints

Amazon SageMaker has introduced a new feature called capacity-aware instance pools for AI inference endpoints. This enhancement allows users to define a prioritized list of instance types, enabling SageMaker to automati…
RESEARCH · CL_16299 · May 4 · 13:49

Coral and CoRAL systems optimize LLM serving and robotic control

Researchers have developed two distinct systems named Coral and CoRAL. Coral is an adaptive system designed for cost-efficient serving of multiple large language models across heterogeneous cloud GPUs, aiming to optimiz…
MEME · CL_14555 · May 4 · 08:04

Mastodon users criticize energy consumption of AI hardware

The user is expressing frustration about the energy consumption associated with specialized hardware, drawing a parallel to the cryptocurrency industry. They note that ASICs have largely replaced GPUs in certain applica…
SIGNIFICANT · CL_13762 · May 3 · 15:36

ODMs transition from manufacturing to AI infrastructure partners for complex racks

Original Design Manufacturers (ODMs) are transitioning from traditional hardware production to becoming key partners in AI infrastructure. This evolution is spurred by the increasing complexity of AI hardware, particula…
TOOL · CL_13684 · May 3 · 13:06

GitHub tool measures GPU 'useful' work amid AI and security buzz

A new GitHub tool called Utilyze has been released, designed to monitor GPU performance for "useful" work. The tool aims to track computational tasks beyond entertainment, incorporating buzzwords like AI, workflow autom…
RESEARCH · CL_13590 · May 3 · 09:58

Sasha Rush releases Autodiff Puzzles to teach automatic differentiation

Sasha Rush has released "Autodiff Puzzles," an interactive Google Colab notebook designed to teach automatic differentiation. Similar to his previous puzzle series on Tensors and GPUs, these challenges guide users throu…
TOOL · CL_17313 · May 1 · 09:00

Next-gen chips promise data centers greater efficiency and AI power

Next-generation chip designs, including those optimized for AI, energy efficiency, and heat tolerance, have the potential to significantly alter data center infrastructure. Innovations in packaging, memory, and offload …
SIGNIFICANT · CL_11581 · May 1 · 04:07

Datavault AI raises $120M to build nationwide GPU network for AI compute

Datavault AI has secured $120 million in funding from Scilex Holding to establish a nationwide GPU network. This initiative aims to provide increased computing power for companies engaged in artificial intelligence deve…
RESEARCH · CL_11925 · May 1 · 04:00

FluxMoE system decouples expert weights for faster LLM serving

Researchers have developed FluxMoE, a new system designed to improve the efficiency of serving Mixture-of-Experts (MoE) models. FluxMoE addresses the challenge of large parameter sizes in MoE models by decoupling expert…
RESEARCH · CL_11722 · May 1 · 04:00

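RoundPipe enables efficient LLM fine-tuning on consumer GPUs

Researchers have developed RoundPipe, a new pipeline scheduling method designed to make fine-tuning large language models on consumer-grade GPUs more efficient. It addresses the limitations of existing methods by dynamically scheduling computation stages across devices in a round-robin fashion, effectively eliminating pipeline bubbles and increasing throughput. Evaluations show significant speedups over current baselines, making it possible to fine-tune very large models on a single server. RoundPipe is also available as an open-source library.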
RESEARCH · CL_14183 · Apr 30 · 21:35

Study finds switchless networks more cost-effective for MoE LLM serving

A new paper analyzes network topologies for Mixture-of-Experts (MoE) Large Language Model (LLM) serving, finding that lower-cost, switchless networks can be more cost-effective than expensive scale-up infrastructures. T…
RESEARCH · CL_14104 · Apr 30 · 20:48

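VkSplat pipeline boosts 3D Gaussian Splatting training performance with Vulkan compute

Researchers have developed VkSplat, a new pipeline that uses Vulkan compute for 3D Gaussian Splatting (3DGS) training, improving both performance and compatibility. Compared with conventional CUDA and PyTorch approaches, it runs 3.3x faster and uses 33% less VRAM. Notably, VkSplat is the first fully Vulkan-based 3DGS training pipeline to achieve state-of-the-art results across different GPU vendors.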
RESEARCH · CL_14105 · Apr 30 · 19:49

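Researchers combine DPUs and GPUs to accelerate neural network inference

Researchers have developed a novel approach that accelerates neural network inference by splitting convolutional neural network (CNN) computation between a deep learning processing unit (DPU) and a graphics processing unit (GPU). This "split CNN inference" approach runs the initial layers on a DPU near the data source and the subsequent layers on a GPU, significantly reducing latency. A graph neural network (GNN) model is also introduced to predict the optimal layer partition for various CNN architectures, reaching 96.27% accuracy.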
MEME · CL_10938 · Apr 30 · 18:15

AI tools read code, not minds; Chinese GPU maker revenue hits $423M

A Chinese GPU maker, Cambricon, reported its first-quarter revenue at $423 million. Separately, a blog post discusses how AI tools can read code but not minds, and another mentions AI breaking Silicon Valley's global pl…
RESEARCH · CL_11513 · Apr 30 · 17:55

Strait system enhances ML inference serving with priority-aware scheduling

Researchers have developed Strait, a new system designed to improve the efficiency of machine learning inference serving, particularly in on-premises environments. Strait addresses limitations in task prioritization and…
COMMENTARY · CL_23141 · Apr 30 · 17:15

China Dominates Critical Minerals for AI Supply Chain

Six critical chokepoints in the AI supply chain, from raw materials to finished chips, are dominated by China. The country processes 90% of rare earths, highlighting its significant control over the production of GPUs, …
SIGNIFICANT · CL_09991 · Apr 30 · 01:37

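T-Head launches Panmai 920 SmartNIC, completing its AI infrastructure chip lineup

Alibaba subsidiary T-Head has launched its first SmartNIC, the "Panmai 920", aimed at bottlenecks in AI computing infrastructure. The new NIC uses PCIe 5.0 and 112G PAM4 Ethernet technology to deliver high throughput and high packet-processing rates, with the goal of significantly improving GPU utilization in large-scale AI clusters. With this launch, T-Head completes its core data center chip portfolio spanning compute, networking, and storage, allowing it to offer a comprehensive AI infrastructure solution.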
