NVIDIA H200
PulseAugur coverage of NVIDIA H200 — every cluster mentioning NVIDIA H200 across labs, papers, and developer communities, ranked by signal.
-
Zhipu AI's GLM-5.2 model deployed on serverless GPUs
Zhipu AI has released GLM-5.2, a 700B Mixture-of-Experts (MoE) model that excels in complex reasoning and software engineering tasks, reportedly matching or surpassing proprietary models like Claude 3.5 Sonnet and GPT-4…
-
Superhuman AI agent dominates Generals.io using self-play RL
A new research paper details the creation of a superhuman AI agent for the real-time strategy game Generals.io. Trained for four days on high-end GPUs, the agent achieved the top rank among over 5,000 human players and …
-
MegaFold system boosts training efficiency for 3D attention protein models
Researchers have developed MegaFold, a new system designed to make training large-scale 3D attention protein models more efficient. This approach addresses the significant computational and memory challenges posed by mo…
-
MIMONet uses neural operators for virtual sensing in inaccessible systems
Researchers have developed MIMONet, a novel operator-based virtual sensing framework designed for real-time monitoring of inaccessible or unmeasurable parameters in safety-critical systems, such as nuclear-grade thermal…
-
Flash-KMeans accelerates GPU k-means clustering over 200x
Researchers from UC Berkeley and UT Austin have developed Flash-KMeans, an open-source library that significantly accelerates the k-means clustering algorithm for modern AI pipelines. By optimizing data movement on GPUs…
-
Old smartphones repurposed into low-cost computing platforms
Researchers from the University of California San Diego, in collaboration with Google, have developed a method to repurpose old smartphones into functional computing platforms. By stripping down devices like Pixel phone…
-
Avataar AI launches Varya video model for India at fraction of global cost
Avataar AI has launched Varya, a new video generation model optimized for the Indian market. Built using distillation from Alibaba's Wan 2.2, Varya is significantly faster and cheaper, costing $0.005 per second of video…
-
ArcOS engine tackles AI persona drift with deterministic logic
Persona drift in conversational AI, where models lose their intended character over time, is an architectural issue rather than a prompting problem. The ArcOS engine addresses this by separating language model tasks fro…
-
AIGaitor enables on-device, private motion analysis via smartphones
Researchers have developed AIGaitor, a novel system that enables privacy-preserving, cloud-free motion analysis directly on consumer smartphones. This system utilizes on-device neural accelerators to perform markerless …
-
Neural operator VIRSO enables real-time sensing on edge devices
Researchers have developed VIRSO, a novel neural operator designed for real-time sensing of inaccessible physical fields. This system uniquely integrates a spatial-spectral architecture optimized for edge deployment, si…
-
Kog AI achieves 3,000 tokens/s LLM inference on standard GPUs
Kog AI has launched a tech preview of its Kog Inference Engine (KIE), demonstrating significantly faster real-time LLM inference speeds on standard datacenter GPUs. The engine achieves 3,000 output tokens per second on …
-
Financial giants launch AI token and GPU rental futures markets
Financial institutions are developing new markets for AI tokens, similar to how gold and oil are traded. The Shanghai Futures Exchange is designing a derivatives market for AI tokens, while CME Group and Intercontinenta…
-
OSP-Next video model achieves 83.73% VBench score with efficiency gains
Researchers have introduced OSP-Next, a novel text-to-video generation model designed for enhanced efficiency and quality. The model integrates sparse attention mechanisms, a novel Sparse Sequence Parallelism (SSP) tech…
-
Together AI adds 1,000 H100/H200 GPUs for inference
Together AI has significantly expanded its GPU capacity by adding one thousand NVIDIA H100 and H200 instances. These powerful GPUs are now available through Together's on-demand GPU clusters and dedicated endpoint servi…
-
AIGaitor enables on-device, private motion analysis via smartphone
Researchers have developed AIGaitor, a novel system for motion analysis that operates entirely on a smartphone, eliminating the need for cloud processing. This approach addresses key barriers in clinical motion capture,…
-
Hugging Face and AWS Detail Foundation Model Infrastructure
Hugging Face and AWS have collaborated to detail the infrastructure required for training and running large foundation models. The blog post outlines a layered architecture, emphasizing the interplay between AWS's compu…
-
Hallo-Live achieves real-time audio-video avatar generation at 20 FPS
Researchers have developed Hallo-Live, a novel framework for real-time text-driven audio-visual avatar generation. This system utilizes an asynchronous dual-stream diffusion approach combined with human-centric preferen…
-
DeepSeek v3 leads open-weight models, Baseten enables mission-critical inference
DeepSeek v3, a new 671B-parameter Mixture-of-Experts model, has been released and is currently the top-performing open-weight model available. Serving such large models presents significant challenges, but inference st…