A new study evaluated the machine translation capabilities of four large language models (LLMs) for Hausa and Fongbe, two West African languages. It found that Hausa translations reached acceptable quality with models such as GPT-4o mini, while Fongbe translations were poor across all evaluated systems. Model rankings differed between the two languages: Gemini 2.5 Flash led for Fongbe and GPT-4o mini for Hausa, indicating that performance on one low-resource language does not predict performance on another. The study also highlighted problems with standard automatic metrics, which correlated only weakly with human judgment for Hausa, and found that neural metrics were limited by embedding collapse for both languages.
AI IMPACT Highlights the limitations of current LLMs for low-resource languages and the unreliability of standard translation metrics, both of which call for careful evaluation.
RANK_REASON Academic paper evaluating LLM performance on specific languages and metrics.
- BERTScore
- BLEU
- chrF++
- Claude Sonnet 4
- COMET
- English
- Fongbe
- Gemini 2.5 Flash
- GPT-4o mini
- Hausa
- Mahounan Pericles Adjovi
- qwen2.5:7b
- Terapevticheskii arkhiv
AI-generated summary · Google Gemini · from 1 source.