Brief · PulseAugur

RESEARCH · arXiv cs.CL English(EN) · 6d · [2 sources]

Do LLMs Know What Luxembourgish Borrows? Probing Lexical Neology in Low-Resource Multilingual Models

Researchers have introduced LexNeo-Bench, a benchmark for evaluating how well large language models recognize lexical borrowing in low-resource languages such as Luxembourgish. Built from a Luxembourgish news corpus, it labels tokens as native or as borrowed from French, German, or English. When prompts included a linguistic knowledge graph, LLMs classified borrowed words significantly more accurately, and the performance gap between smaller and larger models narrowed.
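To make the task concrete, here is a minimal Python sketch of a four-way token classification setup of the kind the summary describes, with an accuracy score and an optional block of knowledge-graph hints added to the prompt. Everything in it is an assumption made for illustration: the names `Item`, `LABELS`, `build_prompt`, and `accuracy`, the prompt wording, and the placeholder tokens are hypothetical and are not taken from LexNeo-Bench or the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical label set mirroring the four categories named in the summary.
LABELS = ("native", "french", "german", "english")


@dataclass(frozen=True)
class Item:
    """One benchmark entry: a token and its gold origin label (illustrative)."""
    token: str
    gold: str  # expected to be one of LABELS


def build_prompt(token: str, hints: Optional[Sequence[str]] = None) -> str:
    """Build a classification prompt; `hints` stands in for knowledge-graph facts.

    The wording is invented for this sketch, not the prompt used in the paper.
    """
    lines = [
        f"Classify the Luxembourgish token '{token}' as one of: "
        + ", ".join(LABELS) + "."
    ]
    if hints:
        lines.append("Known facts:")
        lines.extend(f"- {hint}" for hint in hints)
    return "\n".join(lines)


def accuracy(items: Sequence[Item], predictions: Sequence[str]) -> float:
    """Fraction of items whose predicted label equals the gold label."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions) if item.gold == pred)
    return correct / len(items)


# Placeholder tokens only; these are not real benchmark entries.
items = [
    Item("tok_a", "native"),
    Item("tok_b", "french"),
    Item("tok_c", "german"),
    Item("tok_d", "english"),
]
predictions = ["native", "french", "english", "english"]
score = accuracy(items, predictions)
prompt_with_hints = build_prompt("tok_b", hints=["tok_b: attested in French"])
```

Comparing `accuracy` for prompts built with and without `hints`, across models of different sizes, is one simple way to measure the kind of gap the summary refers to.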

IMPACT Enhances LLM evaluation for low-resource languages, potentially improving writing assistance tools for diverse linguistic communities.

large language models
English
French
German
Luxembourgish
Nina Hosseini-Kivanani
LexNeo-Bench
ENEOLI COST Action