When Is Next-Token Prediction Useful? Marginalization, Ergodicity, Mixture Identifiability, Local Sufficiency, RAG, Tools, and Programming

Paper questions the effectiveness of next-token prediction in LLMs

By the PulseAugur editorial team · [2 sources] · 2026-05-22 06:34

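A new paper on arXiv examines the limits of next-token prediction in language models. It argues that models trained on observed sequences do not fully capture the conditional law of language generation, because they omit non-textual factors such as intent and context. The study suggests that for next-token prediction to be genuinely useful, the observed text must be a sufficient statistic for these latent factors, a condition that heterogeneous training corpora often fail to meet.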
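The marginalization point can be illustrated with a minimal sketch. It is not taken from the paper: the toy corpus, the latent labels "formal" and "casual", and the helper `next_token_distribution` are all hypothetical, chosen only to show how a model fitted to text alone averages over a hidden factor that the text does not reveal.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration): each two-token
# sequence was produced under a hidden intent z that is not part of the text.
corpus = [
    ("formal", ["dear", "sir"]),
    ("formal", ["dear", "sir"]),
    ("formal", ["dear", "madam"]),
    ("casual", ["dear", "friend"]),
]

def next_token_distribution(pairs):
    """Estimate P(next | key) by relative frequency over (key, next) pairs."""
    counts = defaultdict(Counter)
    for key, nxt in pairs:
        counts[key][nxt] += 1
    return {
        key: {tok: n / sum(c.values()) for tok, n in c.items()}
        for key, c in counts.items()
    }

# Text-only estimate: the latent intent is marginalized out.
marginal = next_token_distribution((seq[0], seq[1]) for _, seq in corpus)

# Estimate that also conditions on the latent intent.
full = next_token_distribution(((z, seq[0]), seq[1]) for z, seq in corpus)

print(marginal["dear"])            # mixture over both intents
print(full[("formal", "dear")])    # conditional on z = "formal"
print(full[("casual", "dear")])    # conditional on z = "casual"
```

In this toy setup the prefix "dear" is not a sufficient statistic for the intent, so the text-only estimate is a mixture of the two intent-specific distributions rather than either one of them.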

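Impact: The paper challenges a basic assumption of LLM training and suggests that new approaches beyond plain next-token prediction are needed to achieve genuine contextual understanding.

Ranking rationale: The cluster contains an academic paper discussing theoretical limitations of language model training.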


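AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv stat.ML (TIER_1, English) · Francesco Corielli · 2026-05-25 04:00

When Is Next-Token Prediction Useful? Marginalization, Ergodicity, Mixture Identifiability, Local Sufficiency, RAG, Tools, and Programming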

arXiv:2605.23278v1 · Announce Type: cross · Abstract: Language models trained on observed sequences are often described as learning the conditional distribution of the next token given previous tokens. This description is only conditionally correct. A model trained on realized token …

arXiv stat.ML (TIER_1, English) · Francesco Corielli · 2026-05-22 06:34

When Is Next-Token Prediction Useful? Marginalization, Ergodicity, Mixture Identifiability, Local Sufficiency, RAG, Tools, and Programming

Language models trained on observed sequences are often described as learning the conditional distribution of the next token given previous tokens. This description is only conditionally correct. A model trained on realized token trajectories does not observe full conditional law…
