Better Models, Faster Training: Sigmoid Attention for single-cell Foundation Models

Sigmoid attention improves biological foundation models, enabling faster and more stable training

By the PulseAugur editorial team · [1 source] · 2026-05-01 04:00

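Researchers report that Sigmoid Attention, used as a drop-in replacement for softmax attention, offers significant improvements when training biological foundation models. Compared with conventional softmax attention, the approach learns better representations, achieving 25% higher cell-type separation and better cohesion metrics. Sigmoid Attention also trains faster, with models finishing up to 10% sooner, and improves stability by mitigating problems inherent to softmax attention. The team has also released TritonSigmoid, an efficient GPU kernel that outperforms existing solutions on H100 GPUs.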
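The difference between the two mechanisms can be sketched in a few lines of plain Python. This is a minimal illustration only, not the paper's implementation and not the TritonSigmoid kernel. The function names are hypothetical, and the default bias of -log(n), where n is the number of keys, is an assumption made here to keep the total weight per query in a similar range to softmax. The key contrast is that softmax normalizes each row so the weights compete and sum to 1, whereas sigmoid scores every query-key pair independently.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function (avoids overflow for large |x|).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _scores(q_row, keys):
    # Scaled dot-product scores of one query against every key.
    d = len(q_row)
    return [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d) for k_row in keys]


def softmax_weights(scores):
    # Row-wise normalization: weights are coupled and always sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_weights(scores, bias=None):
    # Element-wise gate: each weight depends only on its own score.
    # The default bias of -log(n) is an illustrative assumption.
    if bias is None:
        bias = -math.log(len(scores))
    return [_sigmoid(s + bias) for s in scores]


def attend(queries, keys, values, weight_fn):
    # Single-head attention: weighted sum of value rows for each query.
    out = []
    for q_row in queries:
        w = weight_fn(_scores(q_row, keys))
        out.append([
            sum(wj * v_row[c] for wj, v_row in zip(w, values))
            for c in range(len(values[0]))
        ])
    return out
```

Calling `attend(q, k, v, softmax_weights)` or `attend(q, k, v, sigmoid_weights)` swaps the mechanism without touching the rest of the computation, which is what "drop-in replacement" means in this setting. Because the sigmoid weights are not normalized across keys, one very large score does not suppress the weights on the other keys.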

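Impact: Introduces a more stable and efficient attention mechanism for biological foundation models, which may accelerate research and development in the field.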

Ranking rationale: Academic paper introducing a novel attention mechanism, with empirical results and open-source code.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Vijay Sadashivaiah, Georgios Dasoulas, Judith Mueller, Soumya Ghosh · 2026-05-01 04:00

Better Models, Faster Training: Sigmoid Attention for single-cell Foundation Models

arXiv:2604.27124v1. Abstract: Training stable biological foundation models requires rethinking attention mechanisms: we find that using sigmoid attention as a drop-in replacement for softmax attention a) produces better learned representations: on six diverse si…
