Do LLMs Know What Luxembourgish Borrows? Probing Lexical Neology in Low-Resource Multilingual Models
Researchers have developed a new benchmark, LexNeo-Bench, to evaluate how well large language models understand lexical borrowing in low-resource languages like Luxembourgish. The benchmark, derived from a Luxembourgish news corpus, labels tokens as native or borrowed from French, German, or English. When prompted with a linguistic knowledge graph, LLMs showed significantly improved accuracy in classifying borrowed words, narrowing the performance gap between smaller and larger models. AI
IMPACT Enhances LLM evaluation for low-resource languages, potentially improving writing assistance tools for diverse linguistic communities.