PulseAugur
Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

New corpora reveal human variation in coreference and discourse annotation

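Researchers have introduced two new corpora, Hlava Cor and Hlava AD, for studying variation in human annotation of coreference and discourse relations. Hlava Cor contains 1,024 contexts annotated by three individuals and focuses on identifying coreference across different linguistic elements. Hlava AD contains 512 contexts annotated by five individuals and focuses on discourse relations. Inter-annotator agreement in both corpora is about 60-65%. Agreement is lower in cases that automatic coreference resolution models also struggle with, which suggests that these cases are ambiguous for human annotators as well.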
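The summary does not say which agreement measure the authors used. As a rough illustration only, the sketch below computes one common choice, mean pairwise observed agreement, on toy data. The function name `pairwise_agreement`, the label names, and the example items are all hypothetical and are not taken from the corpora.

```python
from itertools import combinations


def pairwise_agreement(labels_per_item):
    """Mean pairwise observed agreement.

    For each item, take every pair of annotators and count the share of
    pairs that chose the same label; then average that share over items.
    This is an assumed metric for illustration, not the paper's method.
    """
    per_item = []
    for labels in labels_per_item:
        pairs = list(combinations(labels, 2))
        if not pairs:
            continue  # an item with fewer than two labels has no pairs
        agreeing = sum(a == b for a, b in pairs)
        per_item.append(agreeing / len(pairs))
    if not per_item:
        raise ValueError("need at least one item with two or more labels")
    return sum(per_item) / len(per_item)


# Hypothetical toy data: three annotators label three contexts.
toy_items = [
    ["coref", "coref", "coref"],     # all 3 pairs agree
    ["coref", "none", "coref"],      # 1 of 3 pairs agrees
    ["none", "none", "bridging"],    # 1 of 3 pairs agrees
]

score = pairwise_agreement(toy_items)
print(f"mean pairwise agreement: {score:.3f}")
```

Raw agreement of this kind does not correct for chance; chance-corrected measures such as Cohen's or Fleiss' kappa would give different numbers on the same data.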

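Impact: Highlights challenges in natural language understanding tasks and may guide the future development of coreference and discourse models.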

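Ranking rationale: This cluster contains a research paper that describes in detail new corpora for studying variation in linguistic annotation.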

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

  1. arXiv cs.CL TIER_1 English(EN) · Anna Nedoluzhko, Šárka Zikánová, Jiří Mírovský, Milan Straka, Eva Hajičová ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    arXiv:2606.25383v1 · Abstract: As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annot…

  2. arXiv cs.CL TIER_1 English(EN) · Eva Hajičová ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

    As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple annotations of Czech texts, accompanied by annotators…