PulseAugur

Zyphra's ZAYA1-8B MoE model trained on AMD hardware outperforms larger rivals

Zyphra AI has released ZAYA1-8B, a Mixture of Experts (MoE) language model with 760 million active parameters and 8.4 billion total parameters. Trained on AMD hardware, the model is competitive with larger models on math and coding benchmarks and uses innovations such as Compressed Convolutional Attention and an MLP-based router. ZAYA1-8B is available under an Apache 2.0 license and as a serverless endpoint, and is positioned for efficient on-device deployment and lower-latency inference.
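To put the two parameter counts in proportion, the short Python sketch below computes the share of the model that is active for each token. It uses only the 760 million and 8.4 billion figures quoted above. The constant and function names are illustrative and do not come from Zyphra.

```python
# Figures quoted in the summary; the names below are illustrative only.
ACTIVE_PARAMS = 760e6   # parameters used for each token
TOTAL_PARAMS = 8.4e9    # parameters stored in the full model


def active_fraction(active: float, total: float) -> float:
    """Return the share of total parameters that is active per token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 9.0%"
```

Roughly one parameter in eleven takes part in any single token's computation. This ratio describes per-token compute only: in a typical MoE deployment all expert weights must still be held in memory, so it should not be read as a memory saving.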

Impact: Offers a more efficient alternative for reasoning tasks, potentially lowering inference costs and enabling on-device LLM applications.

Ranking rationale: Release of a new open-weight language model with novel architecture and training infrastructure.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. MarkTechPost · English (EN) · Asif Razzaq

    Zyphra Releases ZAYA1-8B: A Reasoning MoE Trained on AMD Hardware That Punches Far Above Its Weight Class

    Zyphra releases ZAYA1-8B, a reasoning Mixture of Experts model with only 760M active parameters that outperforms open-weight models many times its size on math and coding benchmarks, closing in on DeepSeek-V3.2 and surpassing Claude 4.5 Sonnet on HMMT'25 with its novel Markov…