New GiLT Model Augments Transformer Language Models with Dependency Graphs

By the PulseAugur editorial team · [1 source] · 2026-05-15 03:08

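Researchers have developed GiLT, a new Transformer language model that integrates dependency graphs to improve syntactic generalization. Unlike earlier approaches that add structural tokens, GiLT incorporates linguistic information by modifying attention weights according to features drawn from incrementally constructed dependency graphs. Experiments indicate that GiLT, particularly when using semantic dependency graphs, outperforms standard Transformer models on both syntactic generalization and perplexity. The model can also be fine-tuned from a pretrained model to improve performance on downstream tasks.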
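To make the idea of graph-conditioned attention concrete, here is a minimal NumPy sketch. It is not the paper's formulation: the function name `graph_biased_attention`, the scalar `edge_bias`, and the choice of an additive bias on attention logits are all assumptions made for illustration. It only shows one plausible way a dependency edge between an earlier and a later token could raise the attention weight between them while keeping the model causal.

```python
import numpy as np


def graph_biased_attention(q, k, v, edges, edge_bias=1.0):
    """Single-head causal attention with an additive bias on dependency edges.

    Hypothetical illustration, not the GiLT implementation.
    q, k, v: arrays of shape (n_tokens, d).
    edges: iterable of (head, dependent) token index pairs.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)

    # Assumed graph feature: 1 where the two tokens share a dependency edge.
    # The bias goes on the row of the later token, since an edge can only be
    # known once both of its endpoints have been read.
    bias = np.zeros((n, n))
    for head, dep in edges:
        later, earlier = max(head, dep), min(head, dep)
        bias[later, earlier] += edge_bias
    scores = scores + bias

    # Causal mask: token i may only attend to positions j <= i.
    causal = np.tril(np.ones((n, n), dtype=bool))
    scores = np.where(causal, scores, -np.inf)

    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
# Hypothetical edge: token 0 is the head of token 2.
output, weights = graph_biased_attention(q, k, v, edges=[(0, 2)])
```

With a positive `edge_bias`, token 2 places more attention on token 0 than it would without the edge, and rows for tokens that have no edge are unchanged.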

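Impact: Introduces a novel approach to integrating linguistic structure into large language models, which may improve their handling of syntax and their generalization.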

Ranking rationale: This cluster describes an academic paper that details a new model architecture (GiLT) and its experimental results.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Kewei Tu · 2026-05-15 03:08

GiLT: Augmenting Transformer Language Models with Dependency Graphs

Augmenting Transformers with linguistic structures effectively enhances the syntactic generalization performance of language models. Previous work in this direction focuses on syntactic tree structures of languages, in particular constituency tree structures. We propose Graph-Inf…
