Researchers have developed a new metric, the Opposition Index, to quantify and predict disagreement among human annotators in graded rating tasks. The study examines patterns of annotation variation in perceptions of inappropriate language, such as hate speech and toxic content. The findings indicate that the degree of annotation disagreement can be predicted from textual features, with a moderate correlation between estimated and actual annotation variance. The research also shows that items with a high Opposition Index are harder for models to predict accurately.
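As a rough illustration of how such a disagreement score could be computed, the sketch below scores an item by the fraction of annotator pairs whose graded ratings fall on opposite sides of the scale midpoint. The pairwise definition, the 1-5 scale, and the `opposition_index` name are assumptions made for this example, not the paper's actual formula.

```python
from itertools import combinations

def opposition_index(ratings, midpoint=3.0):
    """Hypothetical opposition-style disagreement score for one item.

    Returns the fraction of annotator pairs whose graded ratings fall
    on opposite sides of the scale midpoint; an illustrative stand-in,
    not the formula from the paper.
    """
    pairs = list(combinations(ratings, 2))
    if not pairs:
        return 0.0
    opposed = sum(1 for a, b in pairs if (a - midpoint) * (b - midpoint) < 0)
    return opposed / len(pairs)

# Example: annotator ratings on an assumed 1-5 inappropriateness scale.
consensus_item = [4, 5, 4, 4]   # annotators broadly agree
contested_item = [1, 5, 2, 5]   # annotators split across the midpoint

print(opposition_index(consensus_item))   # 0.0  -- no opposed pairs
print(opposition_index(contested_item))   # ~0.67 -- most pairs opposed
```

Items scoring high on such a measure are precisely the contested cases the study identifies as hardest for models to predict.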
IMPACT: Introduces a new metric for evaluating model performance on subjective tasks, potentially improving content moderation systems.
RANK_REASON: Academic paper published on arXiv detailing a new metric for quantifying disagreement in human ratings.