Zyphra Releases ZAYA1-8B: A Reasoning MoE Trained on AMD Hardware That Punches Far Above Its Weight Class

Zyphra's ZAYA1-8B MoE Model, Trained on AMD Hardware, Outperforms Larger Models

By the PulseAugur editorial team · [1 source] · 2026-05-07 05:44

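Zyphra AI has released ZAYA1-8B, a Mixture of Experts (MoE) language model with 760 million active parameters and 8.4 billion total parameters. Trained on AMD hardware, the model performs on par with much larger models on math and coding benchmarks, and it adopts techniques such as Compressed Convolutional Attention and an MLP-based router. ZAYA1-8B is available under the Apache 2.0 license and is also offered through a serverless endpoint, with the aim of efficient deployment and low-latency inference for on-device applications.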
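To put the two parameter counts in perspective, the following is a minimal arithmetic sketch using only the figures quoted in the summary. The function names are hypothetical, and the 2-bytes-per-parameter storage figure is an assumption for illustration (typical of 16-bit weights), not something the source states.

```python
# Figures quoted in the summary above.
ACTIVE_PARAMS = 760e6   # active parameters
TOTAL_PARAMS = 8.4e9    # total parameters


def active_fraction(active: float, total: float) -> float:
    """Return the share of all parameters that is active."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Rough weight footprint in GB (1 GB = 1e9 bytes).

    The default of 2 bytes per parameter is an assumption (16-bit weights).
    """
    return params * bytes_per_param / 1e9


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"active share of parameters: {fraction:.1%}")
print(f"weights at 2 bytes/param: {weight_memory_gb(TOTAL_PARAMS):.1f} GB")
```

Under these assumptions, roughly 9% of the parameters are active, while the full 8.4 billion still have to be stored. In MoE models generally, compute per token tends to track the active count and memory tends to track the total.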

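Impact: Offers a more efficient alternative for reasoning tasks, potentially lowering inference costs and enabling on-device LLM applications.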

Ranking rationale: Release of an open-weight language model with a novel architecture and training infrastructure.


AI-generated summary · Google Gemini · from 1 source.

Coverage sources [1]

MarkTechPost · Asif Razzaq · 2026-05-07 05:44

Zyphra Releases ZAYA1-8B: A Reasoning MoE Trained on AMD Hardware That Punches Far Above Its Weight Class

Zyphra releases ZAYA1-8B, a reasoning Mixture of Experts model with only 760M active parameters that outperforms open-weight models many times its size on math and coding benchmarks, closing in on DeepSeek-V3.2 and surpassing Claude 4.5 Sonnet on HMMT'25 with its novel Markov…

