PulseAugur

New corpora reveal human variation in coreference and discourse annotation

Researchers have introduced two new corpora of Czech texts, Hlava Cor and Hlava AD, for studying human label variation in coreference and discourse relations. Hlava Cor contains 1,024 contexts annotated by three individuals and focuses on identifying coreference across different linguistic elements. Hlava AD contains 512 contexts annotated by five individuals and focuses on discourse relations. Inter-annotator agreement in both corpora is around 60-65%. Agreement is lower in cases where automatic coreference resolution models also struggle, which suggests that these cases are ambiguous for human annotators as well.
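The summary does not say which agreement measure lies behind the 60-65% figure. As a rough illustration only, the sketch below computes raw pairwise observed agreement, one common way to report agreement among several annotators. The function name `pairwise_agreement`, the label strings, and the toy data are all hypothetical and are not taken from the corpora.

```python
from itertools import combinations


def pairwise_agreement(labels_per_item):
    """Fraction of annotator pairs, pooled over all items, that chose the same label.

    labels_per_item: list of lists, one inner list of labels per annotated context.
    This is raw observed agreement; it is NOT chance-corrected.
    """
    agreeing_pairs = 0
    total_pairs = 0
    for labels in labels_per_item:
        # Compare every unordered pair of annotators on this item.
        for first, second in combinations(labels, 2):
            total_pairs += 1
            if first == second:
                agreeing_pairs += 1
    if total_pairs == 0:
        raise ValueError("need at least one item with two or more annotations")
    return agreeing_pairs / total_pairs


# Hypothetical toy data: three contexts, three annotators each.
toy_annotations = [
    ["coref", "coref", "coref"],    # all 3 pairs agree
    ["coref", "none", "coref"],     # 1 of 3 pairs agrees
    ["coref", "none", "bridging"],  # 0 of 3 pairs agree
]

agreement = pairwise_agreement(toy_annotations)
print(f"pairwise agreement: {agreement:.3f}")
```

Raw agreement of this kind does not correct for agreement expected by chance, so it is not directly comparable to chance-corrected coefficients such as Cohen's kappa or Krippendorff's alpha.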

IMPACT Highlights challenges in natural language understanding tasks, potentially guiding future model development for coreference and discourse.

RANK_REASON The cluster contains a research paper detailing new corpora for studying linguistic annotation variation.


AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. arXiv cs.CL TIER_1 English(EN) · Anna Nedoluzhko, Šárka Zikánová, Jiří Mírovský, Milan Straka, Eva Hajičová ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    arXiv:2606.25383v1. Abstract: As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annot…

  2. arXiv cs.CL TIER_1 English(EN) · Eva Hajičová ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…