PulseAugur

New research details how LLMs develop cross-lingual abilities in two phases

Researchers have investigated the emergence of cross-lingual generalization in large language models during multilingual pretraining. By analyzing a 1.7B parameter model trained on nine languages with fine-grained checkpoints, they observed that linguistic capabilities and token-level copying develop concurrently. Translation skills emerge in two phases: an initial stage reliant on copying and surface similarities, followed by a second phase where more generalized translation mechanisms are formed while copying is refined. This study offers a detailed perspective on the development of cross-lingual abilities in multilingual models.

IMPACT Provides a fine-grained view of how cross-lingual generalization develops during multilingual pretraining, informing future model architectures and training strategies.

RANK_REASON Academic paper detailing a novel dataset and analysis of multilingual pretraining dynamics.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English(EN) · Felicia Körner, Maria Matveev, Florian Eichin, Gitta Kutyniok, Barbara Plank, Michael A. Hedderich

    Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining

    arXiv:2604.17633v2. Abstract: Large language models exhibit impressive cross-lingual capabilities. However, prior work analyzes this phenomenon through isolated factors and at sparse points during training, limiting our understanding of how cross-lingual gen…