AfroScope: A Framework for Studying the Linguistic Landscape of Africa
Researchers have developed AfroScope, a framework for studying the linguistic landscape of Africa. It consists of a large dataset, AfroScope-Data, covering 640 African languages, and a suite of language identification models, AfroScope-Models. To improve accuracy on closely related languages, AfroScope-Models use a hierarchical classification approach together with a specialized embedding model, AfroScope-Mirror, which improves macro-F1 by 1.57 points on subsets of easily confused languages. The project also investigates how cross-lingual transfer and domain shift affect language identification performance, with the aim of enabling large-scale measurement of Africa's digital linguistic diversity.
AI IMPACT: Enhances NLP capabilities for African languages, enabling broader digital inclusion and research.
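For readers unfamiliar with the metric cited above, macro-F1 is the unweighted mean of per-language F1 scores, so a low-resource language counts as much as a high-resource one. The following is a minimal sketch of the standard definition, not AfroScope's evaluation code; the function name `macro_f1` and the toy labels are assumptions made for illustration and do not come from AfroScope-Data.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[g] += 1  # gold label gains a false negative
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); equivalent to the harmonic mean
        # of precision and recall for this label.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Invented toy example: language codes are used only as label strings.
gold = ["zul", "zul", "xho", "xho", "swh", "swh"]
pred = ["zul", "xho", "xho", "xho", "swh", "swh"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

In this toy case a single confusion between two closely related labels lowers the F1 of both, which is why gains on confusable subsets show up clearly in macro-F1.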