PulseAugur
The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

New datasets tackle semantic change in slang and standard word senses

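Researchers have introduced the BD-LSC and ST-WSD datasets for benchmarking models on lexical semantic change detection, focusing on words that have both slang and standard senses. The datasets support the study of how senses are gained, lost, or remain stable over time. Although GPT-4o performs strongly in few-shot settings on metrics such as exact sense matching, its overall Macro-F1 scores indicate that identifying rare slang senses remains a major challenge.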
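A strong matching score can coexist with a weak Macro-F1 when one sense is rare. The sketch below is a minimal illustration on invented toy labels, not data or code from the paper. It uses plain accuracy as a stand-in for a matching-style metric, which is an assumption, since this summary does not give the paper's exact metric definitions. The function names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1, so rare labels count as much as common ones."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (invented for illustration): nine standard usages, one slang usage.
gold = ["standard"] * 9 + ["slang"]
# A hypothetical model that never predicts the rare slang sense.
pred = ["standard"] * 10

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro-F1 = {macro_f1(gold, pred):.3f}")
```

In this toy case the model is right on nine of ten items, yet its F1 on the slang label is zero, which pulls the macro average below 0.5. That is the general pattern by which Macro-F1 can expose weak performance on rare senses.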

Impact: The new datasets may improve LLMs' understanding of nuanced language, especially slang.

Ranking rationale: Research paper introducing new datasets for benchmarking NLP models.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CL · TIER_1 · English (EN) · Afnan Aloraini, Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

    The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

    arXiv:2606.16560v1. Abstract: Automatic semantic change detection aims to identify how word meanings shift over time, offering insights into both linguistic and societal change. Despite recent progress in computational lexical semantic change (LSC), existing ben…

  2. arXiv cs.CL · TIER_1 · English (EN) · Riza Batista-Navarro

    The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

    Automatic semantic change detection aims to identify how word meanings shift over time, offering insights into both linguistic and societal change. Despite recent progress in computational lexical semantic change (LSC), existing benchmarks and methods struggle to capture bi-direc…