A new paper analyzes annotation variation in NLP datasets, focusing on harmful language detection. The research combines annotator characteristics with linguistic properties of the data to explain labeling disagreements. It finds that interactions between annotator traits and item features, particularly lexical cues and annotator attitudes, are crucial, but the patterns vary significantly across datasets, which cautions against overgeneralization.
Impact: Highlights the importance of considering both annotator and data characteristics for reliable NLP model training.
Ranking rationale: The cluster contains an academic paper published on arXiv.
AI-generated summary · Google Gemini · from 2 sources.