q0: Primitives for Hyper-Epoch Pretraining

New q0 pretraining method improves LLM data efficiency

By the PulseAugur editorial team · [2 sources] · 2026-06-02 17:27

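Researchers have introduced a new pretraining method called q0 that aims to improve the data efficiency of large language models. The technique shifts the focus from optimizing a single model to training a diverse population of models and aggregating their predictions. q0 combines cyclic schedules, chain distillation, and learned priors, and it is reported to achieve significant gains in data efficiency over conventional ensembling methods.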
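To make "aggregating their predictions" concrete, here is a minimal sketch of one common aggregation rule: uniformly averaging the next-token probability distributions of several models. This is an illustrative assumption, not the aggregation rule described in the q0 paper, and the names `aggregate_predictions` and `population_logits` and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_predictions(per_model_logits):
    """Uniformly average the distributions predicted by each model.

    Assumption: uniform probability averaging, chosen for illustration only.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    vocab_size = len(probs[0])
    return [sum(p[i] for p in probs) / n_models for i in range(vocab_size)]


def nll(probs, target):
    """Negative log-likelihood of the target token under a distribution."""
    return -math.log(probs[target])


# Toy population: three hypothetical models scoring a 4-token vocabulary.
population_logits = [
    [2.0, 0.5, 0.1, -1.0],
    [1.5, 1.0, 0.0, -0.5],
    [0.2, 2.2, 0.3, -1.2],
]
ensemble = aggregate_predictions(population_logits)
```

By Jensen's inequality, the loss of the averaged distribution on any target token is never worse than the mean loss of the individual members, which is one reason aggregating a diverse population can help even when no single member improves.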

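Impact: Introduces a novel pretraining strategy that significantly improves data efficiency, potentially reducing the compute cost of training future large language models.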

Ranking rationale: This cluster contains a paper detailing a new method for language model pretraining.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Bishwas Mandal, Shmuel Berman, Akshay Vegesna, Samip Dahal · 2026-06-03 04:00

q0: Primitives for Hyper-Epoch Pretraining

arXiv:2606.03938v1 Announce Type: cross Abstract: Multi-epoch training is becoming the standard now that compute is growing faster than the supply of high-quality text. But pretraining a single model saturates within a few passes, long before the compute budget is exhausted. We a…
arXiv cs.AI TIER_1 English(EN) · Samip Dahal · 2026-06-02 17:27

q0: Primitives for Hyper-Epoch Pretraining

Multi-epoch training is becoming the standard now that compute is growing faster than the supply of high-quality text. But pretraining a single model saturates within a few passes, long before the compute budget is exhausted. We argue this calls for a conceptual shift from traini…