Zyphra AI has released ZAYA1-8B, a Mixture of Experts (MoE) language model with 760 million active parameters out of 8.4 billion total. Trained on AMD hardware, the model performs competitively with larger models on math and coding benchmarks, using architectural changes such as Compressed Convolutional Attention and an MLP-based router. ZAYA1-8B is available under an Apache 2.0 license and as a serverless endpoint; its small active-parameter count suits on-device deployment and lower-latency inference.
Impact: Offers a more efficient alternative for reasoning tasks, potentially lowering inference costs and enabling on-device LLM applications.
Ranking rationale: Release of a new open-weight language model with a novel architecture and training infrastructure.
- AMD
- AMD Instinct MI300X
- Claude 4.5 Sonnet
- DeepSeek-R1-0528
- Gemini-2.5-Pro
- GPT-5-High
- Hugging Face
- IBM
- Mixture of Experts
- ZAYA1-8B
- Zyphra AI
- Zyphra Cloud
AI-generated summary · Google Gemini · from 1 source.