It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

Study finds geopolitical bias in LLMs stems from post-training, not data

By the PulseAugur editorial team · [2 sources] · 2026-05-22 16:29

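A new study posted to arXiv finds that geopolitical bias in large language models (LLMs) originates mainly in the post-training alignment stage rather than in the initial training data. The researchers tested seven pairs of LLMs, each consisting of a base model and its chat counterpart, and found that six of the pairs showed bias favoring their developer's home region after post-training. The effect was especially pronounced in Alibaba's Qwen 2.5, where the odds of favoring China rose 18-fold after post-training. The study also notes that the language of the prompt amplifies these biases: the French-made Mistral model, for example, showed a pro-France leaning only when prompted in French.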

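Impact: Highlights that the alignment process of LLMs, not just the raw data, shapes geopolitical bias, which calls for greater transparency in model development.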

Ranking rationale: Academic paper detailing new findings on LLM behavior.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Stuart Bladon, Brinnae Bent · 2026-05-25 04:00

It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

arXiv:2605.23825v1 Announce Type: cross Abstract: It has generally been assumed that geopolitical bias in language models originates from the training data used during the pre-training phase. We tested seven open-weight LLM pairs consisting of the base model (pre-training only) a…

arXiv cs.AI TIER_1 English(EN) · Brinnae Bent · 2026-05-22 16:29

It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

It has generally been assumed that geopolitical bias in language models originates from the training data used during the pre-training phase. We tested seven open-weight LLM pairs consisting of the base model (pre-training only) and the chat model (pre-training and post-training)…
