Measuring a hate speech spectrum with faceted Rasch item response theory and perspective-aware, explainable-by-design deep learning
Researchers have developed a system that measures hate speech on a continuous spectrum, ranging from genocidal to supportive language. The approach combines supervised deep learning with faceted Rasch item response theory. It decomposes hate speech into 10 ordinal labels, then models those labels probabilistically to produce an interval-scale outcome measure while accounting for each annotator's perspective. The system is applied to a dataset of 50,070 social media comments from YouTube, Twitter, and Reddit, annotated by over 11,000 Mechanical Turk workers, and uses a RoBERTa-based model that demonstrates improved accuracy over existing methods.
AI IMPACT: Introduces a new paradigm for NLP that encourages measuring continuous constructs and incorporates annotator perspective and model explainability.
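To make the faceted Rasch idea concrete, here is a minimal sketch of how an ordinal rating can depend on a comment's position on the spectrum, the difficulty of a label, and the severity of the annotator. It assumes a generic rating-scale formulation of a many-facet Rasch model. The function names, parameter names, and numeric values are all hypothetical illustrations and are not taken from the paper.

```python
import math


def rating_probabilities(theta, item_difficulty, annotator_severity, thresholds):
    """Return P(rating = k) for k = 0..len(thresholds).

    theta              -- comment's location on the latent spectrum (assumed name)
    item_difficulty    -- difficulty of the ordinal label being rated (assumed name)
    annotator_severity -- how strict the annotator is (assumed name)
    thresholds         -- step thresholds between adjacent rating categories
    """
    # Category 0 has logit 0; each higher category adds one more step term.
    logits = [0.0]
    for tau in thresholds:
        step = theta - item_difficulty - annotator_severity - tau
        logits.append(logits[-1] + step)

    # Softmax over the cumulative logits, shifted for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_rating(probs):
    """Mean rating implied by a list of category probabilities."""
    return sum(k * p for k, p in enumerate(probs))


# Illustrative values only: one comment and one label, rated by two annotators.
thresholds = [-1.0, 0.0, 1.0, 2.0]
lenient = rating_probabilities(1.0, 0.0, -0.5, thresholds)
severe = rating_probabilities(1.0, 0.0, 0.5, thresholds)
```

In this sketch the same comment receives a lower expected rating from the more severe annotator, which is the kind of annotator effect a faceted model separates from the comment's own score.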