PulseAugur

New dataset RoIt-XMASA aids Romanian and Italian sentiment analysis

Researchers have introduced RoIt-XMASA, a multilingual sentiment analysis dataset for Romanian and Italian. It contains 36,000 labeled reviews across three domains (books, movies, and music), along with over 200,000 unlabeled samples. To address cross-lingual and cross-domain transfer, the authors developed a multi-target adversarial training framework that achieved an F1-score of 66.23% with XLM-R, surpassing the baseline by 4.64%.
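The summary does not say whether the 4.64% gain is an absolute difference in F1 points or a relative improvement over the baseline. As a rough sketch only, the snippet below computes the baseline F1 implied by each reading; the function names are hypothetical, and neither baseline value is quoted from the paper.

```python
# Illustrative arithmetic only: the baseline F1 is NOT stated in the summary.
# Both readings of "surpassing the baseline by 4.64%" are assumptions.

def implied_baseline_points(score: float, gain_points: float) -> float:
    """Baseline if the gain is an absolute difference in F1 points."""
    return score - gain_points


def implied_baseline_relative(score: float, gain_percent: float) -> float:
    """Baseline if the gain is a relative improvement over the baseline."""
    return score / (1 + gain_percent / 100)


reported_f1 = 66.23    # F1-score (%) reported for XLM-R with the framework
reported_gain = 4.64   # gain over the baseline, as stated in the summary

abs_baseline = implied_baseline_points(reported_f1, reported_gain)
rel_baseline = implied_baseline_relative(reported_f1, reported_gain)

print(f"Baseline if gain is in F1 points: {abs_baseline:.2f}")
print(f"Baseline if gain is relative:     {rel_baseline:.2f}")
```

The two readings differ by more than a point and a half of F1, so the paper itself is the place to confirm which one is meant.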

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Enhances multilingual NLP capabilities, particularly for under-resourced languages like Romanian and Italian.

RANK_REASON The cluster describes a new academic paper introducing a dataset and a novel training framework for sentiment analysis.


COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Andrei-Marius Avram, Aureliu Valentin Antonie, Cosmin-Mircea Croitoru, Vlad Andrei Muntean, Dumitru-Clementin Cercel

    RoIt-XMASA: Multi-Domain Multilingual Sentiment Analysis Dataset for Romanian and Italian

    arXiv:2604.17134v2. Abstract: We present RoIt-XMASA, a multilingual dataset that extends the Cross-lingual Multi-domain Amazon Sentiment Analysis to Italian and Romanian, comprising 36,000 labeled reviews across three domains (books, movies, and music) and 2…