EndPrompt: Efficient Long-Context Extension via Terminal Anchoring

EndPrompt method efficiently extends LLM context windows

By the PulseAugur editorial team · [1 source] · 2026-06-03 04:00

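Researchers have developed a method called EndPrompt that extends the context window of large language models without extensive training on long sequences. Training uses a short initial segment together with a short terminal prompt, which introduces the positional information the model needs. EndPrompt shows significant improvements on benchmarks such as LongBench, outperforming other methods while consuming far less compute.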
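As a rough illustration of the idea summarized above, the following Python sketch shows one way a short training sequence could expose a model to positions near the end of a long target window. It is an assumption, not the paper's confirmed procedure: the function names `anchored_position_ids` and `attention_pairs`, the parameters `head_len`, `tail_len`, and `target_len`, and the choice to place the terminal prompt at the last positions of the target window are all hypothetical. The second helper only restates the standard fact that full self-attention scores every query-key pair, so cost grows with the square of sequence length.

```python
def anchored_position_ids(head_len: int, tail_len: int, target_len: int) -> list[int]:
    """Hypothetical position layout for a short training sequence.

    Head tokens keep positions 0 .. head_len - 1. Terminal-prompt tokens are
    assigned the last tail_len positions of the target window, so the model
    sees large position indices without processing a full-length sequence.
    This layout is an illustrative assumption, not taken from the paper.
    """
    if head_len < 0 or tail_len < 0 or head_len + tail_len > target_len:
        raise ValueError("head and tail must fit inside the target window")
    head = list(range(head_len))
    tail = list(range(target_len - tail_len, target_len))
    return head + tail


def attention_pairs(seq_len: int) -> int:
    """Number of (query, key) pairs scored by full self-attention."""
    return seq_len * seq_len


if __name__ == "__main__":
    # 6 head tokens and 2 terminal tokens standing in for a 32-position window.
    print(anchored_position_ids(head_len=6, tail_len=2, target_len=32))
    # Ratio of attention pairs between a 32768-token and a 4096-token sequence.
    print(attention_pairs(32768) // attention_pairs(4096))
```

Under this assumed layout the training sequence contains only `head_len + tail_len` tokens, which is where the compute saving would come from; how the actual method assigns positions should be checked against the paper.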

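Impact: The method could substantially reduce the computational cost of adapting LLMs to longer contexts, which may accelerate their deployment in applications that process large amounts of information.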

Ranking rationale: This cluster contains one research paper that details a new method for extending LLM context windows.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Han Tian, Luxuan Chen, Xinran Chen, Rui Kong, Fang Wang, Jiamin Chen, Jinman Zhao, Yuchen Li, Jiashu Zhao, Shuaiqiang Wang, Haoyi Xiong, Linghe Kong, Dawei Yin · 2026-06-03 04:00

EndPrompt: Efficient Long-Context Extension via Terminal Anchoring

arXiv:2605.14589v2 Announce Type: replace Abstract: Extending the context window of large language models typically requires training on sequences at the target length, incurring quadratic memory and computational costs that make long-context adaptation expensive and difficult to…
