Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

Confidence training for the Gemma 3 4B LLM yields mixed results; a post-hoc approach improves accuracy

By PulseAugur Editorial · [2 sources] · 2026-04-27 05:53

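A study of the Gemma 3 4B model examines how to improve the verbal confidence it reports for its responses. An initial attempt at confidence-conditioned supervised fine-tuning (CSFT) on a filtered dataset failed and degraded performance instead. However, an exploratory approach that removed the filter and trained on all calibration items significantly improved how well the model's verbal confidence predicts correctness, reaching a Type-2 AUROC of 0.774 on TriviaQA.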
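The AUROC figure quoted in the summary is, per the paper's abstract, a Type-2 AUROC: a measure of how well stated confidence separates correct answers from incorrect ones. The sketch below illustrates that metric only. It is not the authors' evaluation code; the function name `type2_auroc` and the toy inputs are hypothetical and chosen for illustration.

```python
from typing import Sequence


def type2_auroc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Probability that a randomly chosen correct answer received higher
    confidence than a randomly chosen incorrect one (ties count as 0.5)."""
    # Split confidences by whether the underlying answer was correct.
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect item")

    # Compare every (correct, incorrect) pair; this is the Mann-Whitney
    # formulation of AUROC, fine for small illustrative inputs.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented): a model that always reports the same confidence
# carries no information about correctness, so the score sits at chance.
degenerate = type2_auroc([0.99, 0.99, 0.99, 0.99], [True, False, True, False])

# Toy data (invented): confidence that partly tracks correctness.
informative = type2_auroc([0.9, 0.4, 0.6, 0.2], [True, True, False, False])
```

A value of 0.5 corresponds to chance and 1.0 to perfect separation, which is why near-constant ceiling confidence yields a near-chance score regardless of how accurate the answers themselves are.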

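Impact: Demonstrates a potential method for improving confidence calibration in small LLMs, which bears on their reliability in downstream applications.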

Ranking rationale: This is a research paper detailing experimental results on the performance of a specific model.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Jon-Paul Cacioli · 2026-04-28 04:00

Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

arXiv:2604.24070v1 Announce Type: new Abstract: Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tunin…
arXiv cs.CL TIER_1 English(EN) · Jon-Paul Cacioli · 2026-04-27 05:53

Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tuning (CSFT) with self-consistency-derived targets c…
