PulseAugur
Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

LLM activation spikes identified as structural vector biases

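Researchers found that massive activation spikes in large language models (LLMs) are not simple scalar biases but structural vector biases within specific tokens. These vectors are preserved by the model's projection weights and position embeddings, even under perturbation. Because the spikes stretch dynamic ranges and degrade quantization, the authors developed a new post-training quantization framework called INSERTQUANT. The method clamps the spikes and restores their function, enabling high-fidelity, robust low-bit quantization across modalities.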
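To illustrate why a single massive activation hurts low-bit quantization, here is a minimal sketch in plain Python. It is not INSERTQUANT: the symmetric per-tensor quantizer, the synthetic spike value of 1000, the clip threshold of 4.0, and all function names are assumptions chosen for illustration. It shows only the clamping idea; how the clamped spike's function is restored is not described in the summary, so that step is omitted.

```python
import random


def quantize_dequantize(values, num_bits, clip=None):
    """Symmetric per-tensor uniform quantization, then dequantization.

    If `clip` is given, values are first clamped to [-clip, clip]
    (a hypothetical threshold, not taken from the paper).
    """
    if clip is not None:
        values = [max(-clip, min(clip, v)) for v in values]
    max_abs = max(abs(v) for v in values)
    levels = 2 ** (num_bits - 1) - 1          # 7 positive levels for 4 bits
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(v / scale) * scale for v in values]


def mean_abs_error(reference, approx):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


rng = random.Random(0)
normal = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # ordinary activations
spiked = normal + [1000.0]                           # one synthetic spike

# Without clamping, the spike sets the scale and stretches the dynamic range.
plain = quantize_dequantize(spiked, num_bits=4)
# With clamping, the scale is set by the ordinary activations instead.
clamped = quantize_dequantize(spiked, num_bits=4, clip=4.0)

# Measure the error on the ordinary activations only (exclude the spike).
err_plain = mean_abs_error(normal, plain[:-1])
err_clamped = mean_abs_error(normal, clamped[:-1])

print(f"4-bit error without clamping: {err_plain:.3f}")
print(f"4-bit error with clamping:    {err_clamped:.3f}")
```

In this toy setting the unclamped quantizer rounds every ordinary activation to zero, because one quantization step spans roughly 143 units. Clamping recovers resolution for the ordinary values but discards the spike's magnitude, which is why a method of this kind also needs a way to restore what the spike was doing.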

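Impact: Could enable more efficient low-bit quantization of LLMs, potentially reducing the compute cost and memory requirements of deployment.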

Ranking rationale: This cluster contains an academic paper detailing a new method for LLM quantization.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

  1. arXiv cs.LG TIER_1 English(EN) · Yung-Chin Chen, Chung Peng Lee, Ze-Wei Liou, Naveen Verma ·

    Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

    arXiv:2606.02288v1. Abstract: Massive activation spikes in Large Language Models (LLMs) severely degrade quantization by stretching dynamic ranges. While prior hypotheses characterize these as high-level scalar biases, we argue that they are merely the scalar in…

  2. arXiv cs.LG TIER_1 English(EN) · Naveen Verma ·

    Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

    Massive activation spikes in Large Language Models (LLMs) severely degrade quantization by stretching dynamic ranges. While prior hypotheses characterize these as high-level scalar biases, we argue that they are merely the scalar intermediates of rigid, structural vector biases i…