PulseAugur

Gemma 3 4B LLM confidence training shows mixed results: pre-registered method fails, post-hoc variant improves confidence discrimination

A study of the Gemma 3 4B model investigated whether fine-tuning can make its verbal confidence more informative. The pre-registered approach, confidence-conditioned supervised fine-tuning (CSFT) on a filtered dataset, produced a negative result and decreased performance. A post-hoc exploratory variant that removed the filter and trained on all calibration items improved how well the model's verbal confidence discriminates correct from incorrect answers, reaching a Type-2 AUROC of 0.774 on TriviaQA.
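The AUROC figure quoted above is a Type-2 AUROC, the term used in the source abstract: the probability that a randomly chosen correct answer carries higher stated confidence than a randomly chosen incorrect one, with ties counted as half. The following is a minimal sketch of that metric, not the paper's evaluation code; the function name `type2_auroc` and the toy confidence values are hypothetical and chosen only for illustration.

```python
def type2_auroc(confidences, correct):
    """Type-2 AUROC: how well confidence separates correct from incorrect answers.

    confidences: list of floats (stated confidence per answer)
    correct:     list of bools (whether each answer was right)
    Computed as the pairwise win rate (Mann-Whitney form), ties count 0.5.
    """
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not results from the paper.
# Every answer gets the same high confidence: the score sits at chance.
degenerate = type2_auroc([0.95, 0.95, 0.95, 0.95], [True, False, True, False])
# Confidence is higher on every correct answer: perfect discrimination.
informative = type2_auroc([0.9, 0.4, 0.8, 0.6], [True, False, True, False])
```

A model that reports the same ceiling confidence for nearly every answer scores close to 0.5 on this metric regardless of its accuracy, which is why the metric measures the usefulness of the confidence signal rather than the correctness of the answers.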

Impact: Demonstrates a potential method for improving verbal confidence in smaller LLMs, which could affect their reliability in downstream applications.

Ranking rationale: This is a research paper detailing experimental results on a specific model's performance.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · Based on 2 sources.


Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Jon-Paul Cacioli ·

    Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

    arXiv:2604.24070v1. Abstract: Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tunin…

  2. arXiv cs.CL TIER_1 English(EN) · Jon-Paul Cacioli ·

    Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

    Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tuning (CSFT) with self-consistency-derived targets c…