PulseAugur

Cerebras unveils new Wafer-Scale Engine for faster, cheaper AI inference

Cerebras Systems has announced new hardware and software optimizations aimed at making AI inference more efficient and cost-effective. Its latest offerings include enhanced Wafer-Scale Engine (WSE) processors and accompanying software designed to accelerate model deployment. The company claims these advances will deliver faster processing and lower operating costs for AI workloads.

Ranking rationale: Announcement of new hardware and software optimizations for AI inference by a specialized AI hardware company.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

  1. Smol AINews · TIER_1 · English (EN)

    Cerebras Inference: Faster, Better, AND Cheaper

    **Groq** led early 2024 with superfast LLM inference speeds, achieving ~450 tokens/sec for Mixtral 8x7B and 240 tokens/sec for Llama 2 70B. **Cursor** introduced a specialized code edit model hitting 1000 tokens/sec. Now, **Cerebras** claims the fastest inference with their wafer…