Assessing and Mitigating Miscalibration in LLM-Based Social Science Measurement

LLM Confidence Miscalibration Affects Social Science Research

By the PulseAugur editorial team · [1 source] · 2026-06-03 04:00

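A new paper examines miscalibration in large language models (LLMs) used for social science research. The study finds that the confidence scores LLMs report often fail to reflect how likely their outputs are to be correct, which in turn affects downstream analyses. The researchers propose a soft-label distillation method to improve the calibration of smaller models and report a significant reduction in calibration error.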
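The two ideas in the summary, calibration error and soft-label training, can be illustrated with a minimal Python sketch. It rests on assumptions: the summary does not say which calibration metric or distillation loss the paper uses, so the sketch shows the commonly used binned expected calibration error (ECE) and a generic cross-entropy against a teacher's probability distribution. The function names `expected_calibration_error` and `soft_label_cross_entropy` are hypothetical and are not taken from the paper.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the size-weighted mean of |accuracy - mean confidence| per bin.

    Assumption: a generic, widely used metric, not necessarily the paper's.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / n) * abs(accuracy - avg_conf)
    return ece


def soft_label_cross_entropy(student_probs, teacher_probs, eps=1e-12):
    """Cross-entropy of a student's distribution against a teacher's soft labels.

    Assumption: one generic form of a soft-label distillation loss.
    """
    return -sum(
        t * math.log(max(s, eps)) for s, t in zip(student_probs, teacher_probs)
    )


# A model that reports 0.9 confidence but is right only half the time.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, False, True, False]
)
# A model that reports 0.5 confidence and is right half the time.
calibrated = expected_calibration_error([0.5, 0.5], [True, False])
```

In this toy data the overconfident model has an ECE of about 0.4, while the calibrated one has an ECE of 0. Training against soft labels penalizes a student whose distribution is sharper than the teacher's, which is one intuition for why such training might reduce overconfidence.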

Impact: Underscores the need for better LLM calibration in research settings to ensure reliable data extraction and analysis.

Ranking rationale: An academic paper detailing a specific problem with the use of LLMs in research.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Jinyuan Wang, Ningyuan Deng, Yi Yang · 2026-06-03 04:00

Assessing and Mitigating Miscalibration in LLM-Based Social Science Measurement

arXiv:2605.11954v2 Abstract: Large language models (LLMs) are increasingly used in social science as scalable measurement tools for converting unstructured text into variables that can enter standard empirical designs. Measurement validity demands more than…