XLM-RoBERTa
PulseAugur coverage of XLM-RoBERTa: every cluster mentioning XLM-RoBERTa across labs, papers, and developer communities, ranked by signal.
-
Multilingual slur detection framework improves F1 score with language-specific thresholds
Researchers have developed a multi-stage framework to detect reclaimed slurs in multilingual social media, focusing on LGBTQ+-related terms in English, Spanish, and Italian. The approach tackles data scarcity and class …
-
XLM-RoBERTa model improves hope speech detection in Tulu
Researchers developed an XLM-RoBERTa-based system for detecting hope speech in code-mixed Tulu social media comments. Their organically adapted model showed improved performance over a baseline on a development set. Whi…
-
New benchmark study explores neural network performance on Tajik POS tagging
This paper introduces the first benchmark for part-of-speech tagging in the Tajik language, evaluating various neural network architectures. The study utilized the TajPersParallel corpus, focusing on context-independent…
-
Teams leverage LLMs and ensemble methods for multilingual online polarization detection at SemEval-2026
Researchers have developed systems for SemEval-2026 Task 9, a multilingual polarization detection challenge across 22 languages. One approach fine-tuned Gemma 3 models using Low-Rank Adaptation (LoRA) and augmented data…
-
New Sindhi figurative language dataset SiNFluD released with XLM-RoBERTa-XL benchmark
Researchers have developed SiNFluD, a new dataset for classifying figurative language in Sindhi. The dataset was compiled from various online sources and annotated by native speakers, achieving a high inter-annotator ag…
-
Researchers create Naamah, a large synthetic Sanskrit NER dataset using LLMs
Researchers have developed Naamah, a synthetic dataset of over 100,000 Sanskrit sentences designed to improve Named Entity Recognition (NER) for classical Sanskrit literature. The dataset was generated by combining enti…
-
XITE technique boosts cross-lingual transfer for language models up to 81%
Researchers have introduced XITE, a novel data augmentation technique designed to improve cross-lingual transfer in multilingual language models. This method leverages embedding similarities to identify and adapt labels…