Researchers have introduced ArabDiscrim, a corpus of 293,000 Arabic Facebook posts about racism and discrimination, spanning a decade (2014-2024). The dataset incorporates engagement metrics such as reactions and shares, alongside page metadata, so that language can be analyzed together with audience interaction. It also includes 200 curated terms related to racism and discrimination, 20 distinct discrimination axes, and explicit attribution patterns, with the aim of advancing fairness-oriented Arabic natural language processing.
AI IMPACT Provides a foundational resource for developing fairer and more context-aware Arabic NLP models, particularly for analyzing social issues.
AI-generated summary · Google Gemini · from 2 sources.