Zyphra
PulseAugur coverage of Zyphra — every cluster mentioning Zyphra across labs, papers, and developer communities, ranked by signal.
- 2026-05-19 funding AI startup Zyphra is raising $500 million at a $5 billion valuation.
Sentiment data available for 3 days
-
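Zyphra releases ZAYA1-8B MoE model with fewer than 1 billion active parameters
Zyphra has released ZAYA1-8B, a Mixture-of-Experts model with 8.4 billion parameters that activates only about 760 million parameters per token. The architecture lets it reach performance on math and coding benchmarks comparable to much larger models, including Claude 4.5 Sonnet. The model incorporates architectural improvements such as Compressed Convolutional Attention and an MLP-based expert-selection router, and on a large number of AMD Insti…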
-
Zyphra's ZAYA1 model boosts AI inference speed by up to 7.7x
Zyphra has introduced the ZAYA1-8B-Diffusion-Preview model, which transforms autoregressive MoE language models into discrete diffusion models. This innovation reportedly achieves up to a 7.7x inference speedup without …
-
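New large language models are too big or too complex to run in a homelab
The author explains why three recently released large language models (DeepSeek V4-Pro, DeepSeek V4-Flash, and Zyphra ZAYA1-8B) cannot currently run on typical homelab hardware. DeepSeek V4-Pro is too large at 805 GB and requires data-center scale. DeepSeek V4-Flash is smaller but still needs a large amount of memory and lacks broad software support. Zyphra ZAYA1-8B is a suitable size but uses a novel architecture for which inference software has not yet been developed.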
-
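Zyphra's ZAYA1-8B model rivals larger models with 700 million active parameters
Zyphra has released ZAYA1-8B, a reasoning-focused Mixture-of-Experts model with 700 million active parameters. The model was trained from scratch on an AMD compute platform and uses a novel four-stage reinforcement learning cascade. Through its reasoning-focused training approach and an answer-preserving pruning scheme, ZAYA1-8B is competitive on math and coding benchmarks, even against much larger models.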
-
Zyphra's ZAYA1-8B model matches top AI benchmarks with under 1B active parameters
Zyphra has released ZAYA1-8B, an open-source model that achieves performance comparable to DeepSeek-R1 on math benchmarks. The model also demonstrates competitive reasoning capabilities against Claude Sonnet 4.5 and app…
-
Zyphra's TSP strategy boosts LLM training throughput by 2.6x
Zyphra has developed a new technique called Tensor and Sequence Parallelism (TSP) designed to optimize the training and inference of large transformer models. This hardware-aware strategy combines aspects of Tensor Para…