PulseAugur

Nvidia B200

PulseAugur coverage of Nvidia B200 — every cluster mentioning Nvidia B200 across labs, papers, and developer communities, ranked by signal.

Total · 30 days
6
Within 90 days: 6
Releases · 30 days
0
Within 90 days: 0
Papers · 30 days
0
Within 90 days: 0
Tier distribution · 90 days
Sentiment · 30 days

3 days with sentiment data

Recent · Page 1 of 1 · 6 items
  1. TOOL · CL_44370 ·

    Modal achieves serverless GPUs for AI inference in seconds

    Modal has developed a system to achieve truly serverless GPUs for AI inference, addressing the challenge of rapidly scaling resources to meet variable demand. Their approach involves maintaining cloud buffers of idle GP…

  2. RESEARCH · CL_44358 ·

    Together AI releases FlashAttention-3 and -4 for faster LLM processing

    Together AI has released FlashAttention-3 and FlashAttention-4, significant upgrades to their GPU-accelerated attention mechanism for large language models. FlashAttention-3, designed for Hopper GPUs, achieves up to 75%…

  3. SIGNIFICANT · CL_44550 ·

    Cohere releases open-source Command A+ AI model for enterprise agents

    Cohere has released Command A+, an open-source, multimodal AI model designed for enterprise use and agentic tasks. This new model integrates reasoning, vision, and multilingual capabilities, supporting 48 languages and …

  4. RESEARCH · CL_39357 ·

    AMD MI355 cheaper than Nvidia B200 for GLM5 serving

    AMD's MI355 accelerator is now 40% cheaper than Nvidia's B200 for serving GLM5. The cost reduction comes 14 weeks after the initial launch of GLM5, which supports non-MTP as well as other configurations.

  5. RESEARCH · CL_28769 ·

    Nvidia B200 GPUs deployed in cost-saving inference clusters

    Nvidia's B200 GPUs are being deployed in large clusters, utilizing RoCEv2 Ethernet and Tomahawk switches for efficient inference. This setup allows for significant cost savings as more machines are added, indicating a t…

  6. TOOL · CL_17917 ·

    IonRouter launches AI inference service with custom IonAttention engine

    IonRouter has launched a new inference service designed for high throughput and low cost, utilizing its proprietary IonAttention engine. This engine is capable of multiplexing multiple models on a single GPU, enabling r…