Evaluating Commercial AI Chatbots as News Intermediaries

AI chatbots struggle with news accuracy, regional bias, and false premises

By the PulseAugur editorial team · [2 sources] · 2026-05-21 17:42

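A new study evaluates how accurately six leading AI chatbots report emerging news facts. The top models exceed 90% accuracy on multiple-choice questions, but their performance drops markedly in free-response formats, and most of all on questions built on a false premise. The study also finds significant accuracy gaps between languages: results for Hindi queries are lower, which suggests a bias toward English-language sources.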

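Impact: The study highlights key limitations of AI news intermediaries, including regional bias and susceptibility to misinformation, both of which affect the spread of reliable information.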

Ranking rationale: This cluster contains an academic paper evaluating how well AI chatbots report facts.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Mirac Suzgun, Emily Shen, Federico Bianchi, Alexander Spangher, Thomas Icard, Daniel E. Ho, Dan Jurafsky, James Zou · 2026-05-22 04:00

Evaluating Commercial AI Chatbots as News Intermediaries

arXiv:2605.22785v1 Announce Type: new Abstract: AI chatbots are rapidly shaping how people encounter the news, yet no prior study has systematically measured how accurately these systems, with their proprietary search integrations and retrieval-synthesis pipelines, handle emergin…

arXiv cs.CL TIER_1 English(EN) · James Zou · 2026-05-21 17:42

Evaluating Commercial AI Chatbots as News Intermediaries

AI chatbots are rapidly shaping how people encounter the news, yet no prior study has systematically measured how accurately these systems, with their proprietary search integrations and retrieval-synthesis pipelines, handle emerging facts across languages and regions. We present…