Improving Cross-Lingual Factual Recall via Consistency-Driven Reinforcement Learning
Researchers have developed a method to improve how consistently large language models recall the same facts across languages. They built a dataset, PolyFact, with 100,000 facts across 12 languages to study and address cross-lingual factual inconsistency. Their reinforcement learning approach, based on GRPO (Group Relative Policy Optimization), significantly outperformed standard fine-tuning at improving factual recall and generalizing to new languages.
AI IMPACT: Enhances LLM reliability in multilingual applications by improving cross-lingual factual consistency.
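
To make the group-relative idea behind GRPO concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the exact-match reward, the function names, and the sample answers are all hypothetical, and the summary does not say how the actual reward is defined. The only part taken from GRPO itself is the normalization step, where each sampled response's reward is compared against the mean and standard deviation of its own group.

```python
from statistics import mean, pstdev


def consistency_rewards(answers, gold):
    """Hypothetical reward: 1.0 if an answer matches the gold fact, else 0.0.

    Matching is case-insensitive and ignores surrounding whitespace.
    The real reward used in the work is not described in the summary.
    """
    target = gold.strip().casefold()
    return [1.0 if a.strip().casefold() == target else 0.0 for a in answers]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: answers sampled for the same fact asked in several languages.
answers = ["Paris", "paris ", "Lyon", "Paris"]
rewards = consistency_rewards(answers, gold="Paris")
advantages = group_relative_advantages(rewards)

for answer, adv in zip(answers, advantages):
    print(f"{answer!r}: advantage {adv:+.3f}")
```

In this toy group, the answers that agree with the gold fact receive a positive advantage and the inconsistent one receives a negative advantage, which is the signal a policy update would then reinforce or discourage.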