RoIt-XMASA: Multi-Domain Multilingual Sentiment Analysis Dataset for Romanian and Italian
Researchers have introduced RoIt-XMASA, a multi-domain dataset for multilingual sentiment analysis in Romanian and Italian. It contains 36,000 labeled reviews spanning books, movies, and music, plus more than 200,000 unlabeled samples. To address cross-lingual and cross-domain challenges, the authors developed a multi-target adversarial training framework that achieved an F1-score of 66.23% with XLM-R, surpassing the baseline by 4.64%.
AI IMPACT: Enhances multilingual NLP capabilities, particularly for under-resourced languages like Romanian and Italian.