Brief

last 24h

[2/2] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
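
The feed does not publish its scoring formula, so the following is only a minimal sketch of how four such signals could combine into a 0–100 score. The weights, the 24-hour half-life, the `Story` type, and the choice to apply time decay as a multiplier are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Story:
    authority: float         # source authority, normalized to 0..1 (assumed scale)
    cluster_strength: float  # share of sources covering the story, 0..1 (assumed scale)
    headline_signal: float   # headline relevance, 0..1 (assumed scale)
    age_hours: float         # hours since first publication


# Hypothetical weights; they sum to 1.0 so the undecayed score tops out at 100.
W_AUTHORITY = 0.40
W_CLUSTER = 0.35
W_HEADLINE = 0.25
HALF_LIFE_HOURS = 24.0  # assumed: a story loses half its score every 24 hours


def score(story: Story) -> int:
    """Return a 0-100 score: weighted sum of signals, scaled by exponential decay."""
    base = (
        W_AUTHORITY * story.authority
        + W_CLUSTER * story.cluster_strength
        + W_HEADLINE * story.headline_signal
    )
    decay = 0.5 ** (max(story.age_hours, 0.0) / HALF_LIFE_HOURS)
    # Clamp so out-of-range inputs cannot push the result outside 0..100.
    return max(0, min(100, round(100 * base * decay)))
```

Under these assumptions a story with maximal signals scores 100 when fresh and 50 one day later; a real ranking could just as well treat decay as a fourth weighted term instead of a multiplier.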

RESEARCH · X — Perplexity English(EN) · 1w · [4 sources]

We published new research on how we serve post-trained Qwen3 235B models on NVIDIA GB200 NVL72 Blackwell racks.

Perplexity has published research describing how it serves post-trained Qwen3 235B models on NVIDIA's GB200 NVL72 Blackwell racks. The findings indicate that the GB200 platform improves on previous NVIDIA hardware for large-model inference, with lower latency and higher throughput. The research highlights the GB200's capabilities for both training and high-throughput inference, particularly for Mixture-of-Experts (MoE) models.

IMPACT NVIDIA's GB200 Blackwell platform shows significant gains in LLM inference speed and cost-efficiency, potentially accelerating deployment of large models.
- NVIDIA
- Perplexity
- Hopper
- H200
- Blackwell
- Qwen3 235B
- GB200 NVL72

TOOL · Together AI blog English(EN) · 4mo

Learn how Cursor partnered with Together AI to deliver real-time, low-latency inference at scale

Cursor, an AI-powered coding platform, has partnered with Together AI to optimize its real-time inference. The collaboration focuses on keeping responses low-latency within the editor's feedback loop, which is crucial for Cursor's predictive and refactoring features. The partnership uses NVIDIA's Blackwell architecture, specifically the GB200 NVL72, to improve performance and reduce response times for developers.

IMPACT Enables faster, more responsive AI coding assistance by optimizing inference infrastructure, potentially improving developer productivity.
