Researchers have developed SiNFluD, a new dataset for classifying figurative language in Sindhi. The dataset was compiled from various online sources and annotated by native speakers, achieving high inter-annotator agreement. Several models, including mBERT, XLM-RoBERTa, and SetFit, were evaluated, with XLM-RoBERTa-XL demonstrating the best performance.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a new benchmark dataset for figurative language classification in Sindhi, enabling further research and model development for low-resource languages.
RANK_REASON This is a research paper introducing a new dataset and evaluating models.