Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

LLM activation spikes identified as structured bias vectors

By the PulseAugur editorial team · [1 source] · 2026-06-02 04:00

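Researchers report that massive activation spikes in large language models (LLMs) are not merely scalar biases but are driven by structured bias vectors carried by specific tokens. After normalization, these tokens converge to constant vectors that influence the attention and value mechanisms. To address this, the authors developed INSERTQUANT, a post-training quantization framework that clamps the spikes and uses precomputed template vectors, enabling robust, high-fidelity low-bit quantization across modalities.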
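To make the dynamic-range problem concrete, here is a minimal NumPy sketch of why a single massive spike hurts low-bit quantization and why clamping it helps. It is an illustration under stated assumptions, not the INSERTQUANT implementation: the symmetric per-tensor quantizer, the spike magnitude, the clamp threshold, and all function names are hypothetical, and the precomputed template-vector step is not modeled.

```python
import numpy as np

def fake_quantize(x, bits=4):
    """Symmetric per-tensor quantize-then-dequantize (a generic scheme, assumed for illustration)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax          # the largest magnitude sets the step size
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def error_on_ordinary_values(bits=4, spike=2000.0, clamp=6.0, seed=0):
    """Compare reconstruction error on non-spike entries, with and without clamping.

    `spike` and `clamp` are made-up values chosen only to show the effect.
    """
    rng = np.random.default_rng(seed)
    acts = rng.normal(0.0, 1.0, size=4096)    # ordinary activations
    spike_idx = 7
    acts[spike_idx] = spike                   # one massive activation spike

    ordinary = np.ones(acts.size, dtype=bool)
    ordinary[spike_idx] = False

    # Naive: the spike stretches the range, so ordinary values collapse onto few levels.
    naive = fake_quantize(acts, bits)
    mse_naive = np.mean((naive[ordinary] - acts[ordinary]) ** 2)

    # Clamped: limit the range first, then quantize.
    clamped = fake_quantize(np.clip(acts, -clamp, clamp), bits)
    mse_clamped = np.mean((clamped[ordinary] - acts[ordinary]) ** 2)
    return mse_naive, mse_clamped

mse_naive, mse_clamped = error_on_ordinary_values()
print(f"MSE on ordinary values, naive:   {mse_naive:.4f}")
print(f"MSE on ordinary values, clamped: {mse_clamped:.4f}")
```

Clamping alone discards the information carried by the spike itself. According to the summary, the framework handles that part with precomputed template vectors, which this sketch deliberately leaves out.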

Impact: Introduces a novel quantization method that can improve efficiency and reduce model size without sacrificing performance.

Ranking rationale: This is an academic paper detailing a new approach to understanding and improving LLM quantization.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Yung-Chin Chen, Chung Peng Lee, Ze-Wei Liou, Naveen Verma · 2026-06-02 04:00

Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

arXiv:2606.02288v1. Abstract: Massive activation spikes in Large Language Models (LLMs) severely degrade quantization by stretching dynamic ranges. While prior hypotheses characterize these as high-level scalar biases, we argue that they are merely the scalar in…

