Improving Long-Context Retrieval with Multi-Prefix Embedding

New Multi-Prefix Embedding Method Improves Long-Context Retrieval

By the PulseAugur editorial team · [1 source] · 2026-06-22 17:31

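Researchers have introduced Multi-Prefix Embedding (MPE), a technique for improving long-context retrieval in information retrieval systems. MPE addresses the trade-off between the loss of fine-grained detail in single-vector embeddings and the high storage cost of token-level multi-vector methods. By partitioning a document into chunks and extracting embeddings at prefix boundaries, MPE preserves cross-chunk context and enables efficient chunk-level matching using only document-level relevance labels.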
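The prefix-boundary idea can be illustrated with a toy sketch. This is not the paper's implementation: the hash-based token vectors, the running-sum "encoder", the `<eos>` marker string, and the max-similarity scoring rule are all assumptions chosen only to make the mechanics visible. The point it shows is that a single left-to-right pass can emit one vector per chunk boundary, where each vector depends on everything before that boundary.

```python
import hashlib
import math

DIM = 16          # toy embedding width (assumption)
EOS = "<eos>"     # hypothetical chunk-boundary marker


def token_vector(token):
    """Deterministic toy token embedding; stands in for a real model."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:DIM]]


def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def prefix_embeddings(chunks):
    """Encode all chunks in one pass and emit one vector per boundary.

    The running sum mimics a causal encoder: the state at each EOS
    depends on every token before it, so vector i summarizes the
    prefix made of chunks 0..i rather than chunk i in isolation.
    """
    running = [0.0] * DIM
    out = []
    for chunk in chunks:
        for token in chunk.lower().split() + [EOS]:
            running = [r + v for r, v in zip(running, token_vector(token))]
        out.append(normalize(running))
    return out


def embed_query(query):
    """Embed a query with the same toy token vectors."""
    total = [0.0] * DIM
    for token in query.lower().split():
        total = [t + v for t, v in zip(total, token_vector(token))]
    return normalize(total)


def score(query_vec, prefix_vecs):
    """Return (best similarity, index of best prefix); assumed scoring rule."""
    sims = [sum(q * p for q, p in zip(query_vec, pv)) for pv in prefix_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return sims[best], best


chunks = [
    "single vector embeddings lose fine grained detail",
    "token level multi vector methods need large storage",
    "prefix boundaries keep context across chunks",
]
doc_vecs = prefix_embeddings(chunks)
best_sim, best_idx = score(embed_query("storage cost of multi vector methods"), doc_vecs)
```

In this sketch the index stores one vector per chunk instead of one per token, which is the storage trade-off the summary describes. How the real method trains the encoder from document-level labels is not covered here.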

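Impact: This new embedding method could improve the efficiency and accuracy of information retrieval systems that handle large documents.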

Ranking rationale: This is an academic paper detailing a new method for information retrieval.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.IR (Information Retrieval) · Tier 1 · English · Jimmy Lin · 2026-06-22 17:31

Improving Long-Context Retrieval with Multi-Prefix Embedding

Long-context retrieval exposes a tension: single-vector embeddings lose fine-grained detail, while token-level multi-vector methods incur prohibitive storage. We propose Multi-Prefix Embedding (MPE), which partitions a document into chunks separated by EOS tokens, encodes the ful…