Sakura at BEA 2026 Shared Task 1: What Makes Vocabulary Difficult?
Researchers developed two models to predict vocabulary difficulty, one of which achieved top results in the shared task. The high-accuracy model is a fine-tuned LLM trained with a soft-target loss function and reaches a correlation above 0.91. The explainable model also performs strongly, with a correlation above 0.77, and offers insight into factors that influence word difficulty beyond production ease alone, such as spelling complexity and test item construction.
AI IMPACT: Demonstrates advanced LLM application in linguistic analysis and educational tooling.
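
The summary does not say how the soft-target loss is defined. As a rough illustration only, one common way to build such a loss is to spread each gold difficulty score over a set of discrete bins and train with cross-entropy against that distribution. Everything in the sketch below is an assumption made for illustration and not the authors' method: the bin centers, the Gaussian smoothing width `sigma`, and the function names `soft_targets`, `soft_target_loss`, and `expected_score`.

```python
import math

def soft_targets(score, bin_centers, sigma=0.5):
    """Spread a continuous score over bins with a Gaussian kernel (assumed scheme)."""
    weights = [math.exp(-0.5 * ((score - c) / sigma) ** 2) for c in bin_centers]
    total = sum(weights)
    return [w / total for w in weights]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def soft_target_loss(logits, target):
    """Cross-entropy between a soft target distribution and the model's logits."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_probs))

def expected_score(logits, bin_centers):
    """Turn bin probabilities back into one score (probability-weighted mean)."""
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    return sum(p * c for p, c in zip(probs, bin_centers))

# Hypothetical five-level difficulty scale and a gold score of 2.3.
bins = [1.0, 2.0, 3.0, 4.0, 5.0]
target = soft_targets(2.3, bins)

near = [0.0, 4.0, 1.0, 0.0, 0.0]   # logits peaked near the gold score
far = [0.0, 0.0, 0.0, 0.0, 4.0]    # logits peaked far from it
print(soft_target_loss(near, target) < soft_target_loss(far, target))
```

Compared with a one-hot target, a soft target of this kind penalizes a prediction in a neighboring bin less than one several bins away, which suits an ordinal quantity such as difficulty. In an actual fine-tuning setup the logits would come from the LLM's output head rather than hand-written lists.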