Language Models Without a Trainable Input Embedding Table: Learning from Fixed Minimal Binary Token Codes

Language models replace trainable input embeddings with fixed binary codes

By the PulseAugur editorial team · [1 source] · 2026-05-10 21:00

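Researchers have developed a language-model approach that eliminates the need for a trainable input embedding table. Instead of a large learnable matrix, the input uses fixed, minimal binary token codes, and the researchers report performance comparable to standard models. The approach significantly reduces the number of trainable parameters and could lead to more efficient model architectures.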

Impact: By removing a major component, this research points to a potential path toward more parameter-efficient language models.

Ranking rationale: An academic paper proposing a new architectural change to language models.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL · A. Bochkov · 2026-05-10 21:00

Language Models Without a Trainable Input Embedding Table: Learning from Fixed Minimal Binary Token Codes

Trainable input embedding tables are a standard component of modern language models. We ask whether they are actually necessary at the input interface. For a vocabulary of size $V$, exact token identity requires only $K=\lceil \log_2 V\rceil$ bits. We replace the usual trainable …
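
The bit-count claim in the abstract can be illustrated with a minimal sketch. It assumes each token id is simply written out as its $K$-bit binary representation; the vocabulary size and the function names `code_width` and `binary_code` are hypothetical, chosen only for illustration. The abstract is truncated before it describes how the paper feeds these codes into the model, so the sketch stops at the codes themselves.

```python
from typing import List


def code_width(vocab_size: int) -> int:
    """Return K = ceil(log2(V)) using exact integer arithmetic."""
    if vocab_size < 1:
        raise ValueError("vocab_size must be positive")
    # (V - 1).bit_length() equals ceil(log2(V)) for every V >= 1
    # and avoids floating-point rounding near powers of two.
    return (vocab_size - 1).bit_length()


def binary_code(token_id: int, vocab_size: int) -> List[int]:
    """Return the fixed K-bit code for token_id, most significant bit first."""
    if not 0 <= token_id < vocab_size:
        raise ValueError("token_id out of range")
    k = code_width(vocab_size)
    return [(token_id >> shift) & 1 for shift in range(k - 1, -1, -1)]


# Hypothetical vocabulary size, not taken from the paper.
V = 50_257
K = code_width(V)

# Every token id maps to a distinct K-bit vector, so the codes preserve
# exact token identity without any trainable parameters.
example_codes = {t: binary_code(t, V) for t in (0, 1, 5)}
```

For this assumed vocabulary, `K` is 16, so each token is identified by 16 fixed bits, whereas a conventional trainable table stores one learned vector per vocabulary entry.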