New corpora reveal human variation in coreference and discourse annotation

By PulseAugur Editorial · [3 sources] · 2026-06-24 04:32

Researchers have introduced two new corpora of Czech texts, Hlava Cor and Hlava AD, designed to study human label variation in coreference and discourse relations. Hlava Cor contains 1,024 contexts, each annotated by three people, and focuses on identifying coreference across different linguistic elements. Hlava AD contains 512 contexts, each annotated by five people, and focuses on discourse relations. Inter-annotator agreement in both corpora is roughly 60-65%. Agreement is lower in cases where automatic coreference resolution models also struggle, which suggests that these cases are ambiguous for human annotators as well.
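To make the agreement figure concrete, here is a minimal sketch of one common way to compute inter-annotator agreement: the average share of annotator pairs that chose the same label for a context. The summary does not say which agreement measure the authors used, so the function `pairwise_agreement` and the toy labels below are illustrative assumptions, not the paper's method or data.

```python
from itertools import combinations


def pairwise_agreement(labels_per_item):
    """Mean fraction of agreeing annotator pairs, averaged over items.

    labels_per_item: list of lists, one inner list of labels per context.
    This is plain observed agreement (no chance correction); the measure
    used in the paper is not stated in the summary.
    """
    per_item = []
    for labels in labels_per_item:
        pairs = list(combinations(labels, 2))
        if not pairs:
            continue  # skip contexts with fewer than two annotations
        same = sum(1 for a, b in pairs if a == b)
        per_item.append(same / len(pairs))
    return sum(per_item) / len(per_item) if per_item else 0.0


# Hypothetical toy data: four contexts, three annotators each.
toy = [
    ["A", "A", "A"],  # all three pairs agree
    ["A", "A", "B"],  # one of three pairs agrees
    ["A", "B", "C"],  # no pair agrees
    ["B", "B", "B"],  # all three pairs agree
]
score = pairwise_agreement(toy)
print(round(score, 3))  # 0.583
```

With three annotators there are three pairs per context, and with five there are ten, so the same function applies to both annotation setups described above.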

IMPACT Highlights challenges in natural language understanding tasks, potentially guiding future model development for coreference and discourse.

RANK_REASON The cluster contains a research paper detailing new corpora for studying linguistic annotation variation.


AI-generated summary · Google Gemini · from 3 sources.


COVERAGE [3]

arXiv cs.CL TIER_1 English(EN) · Anna Nedoluzhko, Šárka Zikánová, Jiří Mírovský, Milan Straka, Eva Hajičová · 2026-06-25 04:00

Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

arXiv:2606.25383v1. Abstract: As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annot…

arXiv cs.CL TIER_1 English(EN) · Eva Hajičová · 2026-06-24 04:32

Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-24 04:32

Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…
