Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

Causal2Vec improves decoder-only LLMs as embedding models without changing their architecture

By the PulseAugur editorial team · [1 source] · 2026-05-05 04:00

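Researchers have introduced Causal2Vec, a method that improves decoder-only large language models (LLMs) on embedding tasks without changing their core architecture. The input text is first pre-encoded into a single "Contextual token", which is then prepended to the LLM's input sequence. To mitigate recency bias, Causal2Vec builds the final embedding by combining the representations of the Contextual token and the EOS token. The paper reports state-of-the-art results on the MTEB benchmark among models trained on retrieval datasets.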
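The data flow described above can be illustrated with a toy sketch. This is not the authors' implementation: `pre_encode` and `causal_hidden_states` are hypothetical stand-ins (simple averaging) for the pre-encoder and the decoder-only LLM, and combining the two final representations by concatenation is an assumption made for illustration. The sketch only shows the order of operations: pre-encode, prepend, run under a causal constraint, then combine the Contextual and EOS positions.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def pre_encode(token_vecs: List[Vector]) -> Vector:
    """Hypothetical pre-encoder: one vector that sees the whole input at once."""
    return mean_pool(token_vecs)


def causal_hidden_states(seq: List[Vector]) -> List[Vector]:
    """Toy stand-in for a decoder-only LLM: position i only sees positions 0..i."""
    return [mean_pool(seq[: i + 1]) for i in range(len(seq))]


def causal2vec_style_embedding(token_vecs: List[Vector], eos_vec: Vector) -> Vector:
    """Prepend a Contextual token, then combine Contextual and EOS hidden states."""
    contextual = pre_encode(token_vecs)
    # The Contextual token goes first, so every later position can see it
    # even though no position looks ahead.
    seq = [contextual] + token_vecs + [eos_vec]
    hidden = causal_hidden_states(seq)
    # Assumption: "combining" is done here by concatenating the two vectors.
    return hidden[0] + hidden[-1]


tokens = [[1.0, 0.0], [0.0, 1.0]]  # two toy token vectors
eos = [0.0, 0.0]                   # toy EOS vector
embedding = causal2vec_style_embedding(tokens, eos)
print(embedding)  # [0.5, 0.5, 0.375, 0.375]
```

Because the Contextual token sits at position 0, its hidden state depends only on the pre-encoded summary, while the EOS position sees the entire sequence. Using both, rather than the last token alone, is how the summary above describes the method counteracting recency bias.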

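Impact: Introduces a new technique for improving LLM embedding performance without architectural changes, which may reduce computational cost for some tasks.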

Ranking rationale: An academic paper introducing a new method for LLM embedding models.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Ailiang Lin, Zhuoyun Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura · 2026-05-05 04:00

Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

arXiv:2507.23386v3 Announce Type: replace Abstract: Decoder-only large language models (LLMs) have been increasingly adopted to build embedding models for diverse tasks. To overcome the inherent limitations of causal attention in representation learning, many existing methods mod…
