Gemma 3 4B LLM confidence training shows mixed results; unfiltered retraining rescues verbal calibration post hoc

A study of the Gemma 3 4B model investigated methods to improve the quality of its verbal confidence statements. The pre-registered attempt, confidence-conditioned supervised fine-tuning (CSFT) on a filtered dataset, yielded a negative result and decreased performance. However, an exploratory follow-up that removed the filter and trained on all calibration items substantially improved how well the model's verbal confidence discriminates correct from incorrect answers, reaching a Type-2 AUROC of 0.774 on TriviaQA.
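
Type-2 AUROC here measures how well the model's stated confidence separates its correct answers from its incorrect ones. A minimal sketch of that computation, assuming verbal confidences have already been parsed into floats in [0, 1] (the data and variable names below are illustrative, not from the paper):

```python
# Sketch: Type-2 AUROC of verbal confidence as a predictor of answer correctness.
# Data here is made up for illustration; the paper evaluates on TriviaQA.
from sklearn.metrics import roc_auc_score

# 1 = the model's answer was correct, 0 = incorrect (judged against gold answers)
correct = [1, 0, 1, 1, 0, 1, 0, 0]
# The model's stated confidence for each answer, parsed from its verbal response
confidence = [0.9, 0.8, 0.95, 0.7, 0.9, 0.85, 0.6, 0.75]

# Type-2 AUROC: the probability that a randomly chosen correct answer received
# higher confidence than a randomly chosen incorrect one (0.5 = chance).
auroc = roc_auc_score(correct, confidence)
print(f"Type-2 AUROC: {auroc:.3f}")  # the paper reports 0.774 on TriviaQA
```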

Summary written by gemini-2.5-flash-lite.

IMPACT Demonstrates a potential method for improving confidence calibration in smaller LLMs, which could make them more reliable in downstream applications.

RANK_REASON This is a research paper detailing experimental results on a specific model's performance.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Jon-Paul Cacioli

    Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

    arXiv:2604.24070v1 · Abstract: Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tuning (CSFT) with self-consistency-derived targets c…
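
The "self-consistency-derived targets" in the abstract refer to using agreement across repeated samples as the confidence label for fine-tuning. A minimal sketch of how such targets could be constructed, with a stub generator standing in for the model (function names, the sampling stub, and k are illustrative assumptions, not the paper's code):

```python
# Sketch: deriving a confidence target from self-consistency.
# All names and numbers are illustrative; `generate` stands in for sampling
# one answer from the model and is stubbed so the example runs standalone.
import random
from collections import Counter
from typing import Callable

def self_consistency_target(question: str,
                            generate: Callable[[str], str],
                            k: int = 16) -> tuple[str, float]:
    """Sample k answers; return the majority answer and its agreement rate,
    which serves as the verbal-confidence target for CSFT."""
    answers = [generate(question) for _ in range(k)]
    majority, count = Counter(answers).most_common(1)[0]
    return majority, count / k

def stub_model(question: str) -> str:
    # Simulates sampling noise: answers "Paris" about 75% of the time.
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

answer, target = self_consistency_target("Capital of France?", stub_model)
print(answer, target)  # e.g. ('Paris', 0.75)
```

Each (question, majority answer, target) triple would then become a CSFT training item, with the target confidence verbalised in the response the model is trained to produce.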