PulseAugur
New Czech dependency treebanks released for natural language processing research · Tracking 4 sources

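Researchers have released two papers detailing advances in Czech language processing resources. The first presents the Prague Dependency Treebank -- Consolidated 2.0 (PDT-C 2.0), a large, uniformly annotated Czech corpus of nearly 4 million tokens. Developed over three decades, the resource aims to systematically integrate multiple linguistic layers, including inter-sentential phenomena such as coreference and discourse relations. The second presents UD_Czech-PDTC, a large, genre-rich treebank converted to Universal Dependencies, and highlights the conversion process and the differences between the two annotation schemes.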

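Impact: These new, large, genre-diverse Czech treebanks will support the development and evaluation of natural language processing tools, particularly for Czech, and will facilitate cross-lingual comparison.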

Ranking rationale: This cluster contains two academic papers published on arXiv that detail new language resources for natural language processing.

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. arXiv cs.CL · Tier 1 · English · Marie Mikulová, Jiří Mírovský, Milan Straka, Pavlína Synková, Jan Štěpánek, Barbora Štěpánková, Jan Hajič

    Prague Dependency Treebank -- Consolidated 2.0: Enriching a Complex Annotation Scheme

    arXiv:2606.24324v1. Abstract: The Prague Dependency Treebank framework is unique in its attempt to systematically include and link different layers of language, including a meaning representation with several types of inter-sentential phenomena, especially coref…

  2. arXiv cs.CL · Tier 1 · English · Marie Mikulová, Barbora Štěpánková, Daniel Zeman, Jan Štěpánek, Milan Straka, Jan Hajič

    Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies

    arXiv:2606.24337v1. Abstract: Czech has been part of Universal Dependencies since its first release in 2015. It has also been one of the best represented languages, with the Prague Dependency Treebank being order of magnitude larger than most other UD treebanks.…

  3. arXiv cs.CL · Tier 1 · English · Jan Hajič

    Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies

    Czech has been part of Universal Dependencies since its first release in 2015. It has also been one of the best represented languages, with the Prague Dependency Treebank being order of magnitude larger than most other UD treebanks. More recently, three other datasets from the Pr…

  4. arXiv cs.CL · Tier 1 · English · Jan Hajič

    Prague Dependency Treebank -- Consolidated 2.0: Enriching a Complex Annotation Scheme

    The Prague Dependency Treebank framework is unique in its attempt to systematically include and link different layers of language, including a meaning representation with several types of inter-sentential phenomena, especially coreference and discourse relations. We present its s…