PulseAugur
Language models learn to abstain from answering when unsure, improving correctness

Researchers have developed Conformal Abstention (CA), a post-hoc framework that helps language models decide when to abstain from answering a query. The method aims to reduce hallucinations while providing finite-sample guarantees on both the probability that the model answers (participation) and the correctness of the answers it gives. CA uses prediction confidence, calibrated by the geometry of the model's internal representations, to measure how much knowledge is involved in generating a response. Experiments show that the approach improves selective answering, reaching 75 percent conditional correctness.
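To make the idea of a calibrated abstention rule concrete, here is a minimal sketch of generic split-conformal thresholding. It is not the paper's geometry-calibrated method, and it assumes a scalar confidence score per query and a held-out calibration set of queries the model answered incorrectly. The function names, the toy scores, and the choice of alpha are all hypothetical. Under exchangeability, a new incorrectly answered query passes the threshold with probability at most alpha.

```python
import math


def conformal_threshold(wrong_scores, alpha):
    """Pick a confidence threshold from calibration queries answered wrongly.

    Assumption: a new wrong answer is exchangeable with the calibration ones.
    Then its score exceeds the returned threshold with probability <= alpha.
    """
    n = len(wrong_scores)
    # Conformal rank with the usual (n + 1) finite-sample correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: abstain on everything.
        return math.inf
    return sorted(wrong_scores)[k - 1]


def answer_or_abstain(confidence, tau):
    """Answer only when confidence is strictly above the threshold."""
    return "answer" if confidence > tau else "abstain"


# Hypothetical confidence scores of calibration queries the model got wrong.
wrong_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70]
tau = conformal_threshold(wrong_scores, alpha=0.25)

print(tau)
print(answer_or_abstain(0.90, tau))
print(answer_or_abstain(0.60, tau))
```

A lower alpha raises the threshold and makes the model abstain more often, which is the trade-off between participation and correctness that the summary describes. The guarantee sketched here bounds how often wrong answers slip through. It is a different quantity from the conditional correctness reported for CA.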

Impact: Introduces a method to improve language model reliability by enabling models to admit ignorance, potentially reducing hallucinations and increasing trust in their outputs.

Ranking rationale: This is a research paper detailing a new framework for language models.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · From 2 sources. How we write summaries →

Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Rui Xu, Yi Chen, Sihong Xie, Hui Xiong

    Geometry-Calibrated Conformal Abstention for Language Models

    arXiv:2604.27914v1 · Abstract: When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraining models to reward admitting ig…

  2. arXiv cs.CL TIER_1 English(EN) · Hui Xiong

    Geometry-Calibrated Conformal Abstention for Language Models

    When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraining models to reward admitting ignorance can lead to overly conservative behavior…