PulseAugur
New research highlights English bias in LLMs, calls for per-language investment

A new paper finds that large language models are heavily biased toward English, even when adapted to other languages. The researchers report that continual pre-training does not improve cultural understanding in a target language cost-effectively compared with training from scratch. This suggests that future LLM development may require dedicated investment in per-language resources rather than solely expanding English-centric ones.

Impact: Suggests a shift toward dedicated per-language LLM development, potentially increasing costs and complexity for non-English applications.

Ranking rationale: Academic paper analyzing LLM language bias.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.CL · TIER_1 · English (EN) · Ukyo Honda

    Toward LLMs Beyond English-Centric Development

    Through an analysis of sequences generated by open-weight large language models (LLMs), we demonstrate that LLMs are heavily biased toward English. While continual pre-training is commonly used to adapt LLMs to a target language, we show that it does not offer a cost advantage ov…