ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training

New ReQAT framework enables 4-bit quantized LLMs to match full-precision reasoning

By the PulseAugur editorial team · [1 source] · 2026-06-16 04:00

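Researchers have developed ReQAT, a novel training framework designed to let large reasoning models (LRMs) reach full-precision reasoning accuracy even when quantized to a 4-bit floating-point format. Existing quantization methods struggle with low-entropy tokens such as numbers and operators, which degrades reasoning ability. ReQAT addresses this with Trace-Aligned QAT, selective entropy minimization, and Q-FIT initialization, which together focus training on critical decisions and stabilize it. The approach not only recovers the accuracy of standard fine-tuning but exceeds it, while significantly increasing inference speed and lowering hardware requirements.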
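To make "4-bit floating-point" concrete, here is a minimal sketch of block-scaled FP4 fake quantization (quantize, then dequantize) in plain Python. It is not ReQAT's method and is not taken from the paper. It only illustrates the number format, assuming the common E2M1 value grid and one power-of-two scale shared by each block of 32 values. The function names, the block size default, and the scale-selection rule are all illustrative assumptions.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (the sign bit is handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def fake_quant_fp4_block(block):
    """Quantize one block to E2M1 with a shared power-of-two scale, then dequantize."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0 for _ in block]
    # Assumption: pick the smallest power-of-two scale for which amax / scale
    # does not exceed the largest grid value (6.0), so nothing is clipped.
    scale = 2.0 ** math.ceil(math.log2(amax / E2M1_GRID[-1]))
    out = []
    for x in block:
        mag = abs(x) / scale
        # Round the scaled magnitude to the nearest representable grid point.
        q = min(E2M1_GRID, key=lambda g: abs(g - mag))
        out.append(math.copysign(q, x) * scale)
    return out


def fake_quant_fp4(values, block_size=32):
    """Apply block-wise FP4 fake quantization to a flat list of floats."""
    out = []
    for i in range(0, len(values), block_size):
        out.extend(fake_quant_fp4_block(values[i:i + block_size]))
    return out
```

With only eight magnitudes per block, small perturbations can move a value to a different grid point. This coarse rounding is the kind of error that quantization-aware training methods in general are meant to compensate for.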

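Impact: Enables more efficient deployment of large reasoning models, potentially lowering hardware costs and increasing inference speed.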

Ranking rationale: This is a paper detailing a new quantization method for large language models.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Janghwan Lee, Sihwa Lee, Jinseok Kim, Yongjik Kim, Jieun Lim, Jinwook Oh, Jungwook Choi · 2026-06-16 04:00

ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training

arXiv:2606.15682v1. Abstract: Large Reasoning Models (LRMs) achieve strong problem-solving through long chain-of-thought, but their deployment is constrained by the high cost of full-precision inference and growing KV cache footprints. Microscaled FP4 formats en…
