PulseAugur

New datasets tackle slang and standard word meaning shifts

Researchers have introduced the BD-LSC and ST-WSD datasets for benchmarking models on lexical semantic change detection, focusing on words that carry both slang and standard meanings. The datasets support studying how word senses are gained, lost, or remain stable over time. GPT-4o performed strongly in few-shot settings on metrics such as Exact Sense Match, but overall Macro-F1 scores indicate that identifying rare slang senses remains a significant challenge.
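To illustrate why Macro-F1 can expose weakness on rare senses even when headline numbers look strong, here is a minimal sketch. The labels and counts are invented toy data, not drawn from BD-LSC or ST-WSD, and the helper names (`per_class_f1`, `macro_f1`, `accuracy`) are hypothetical; this is not the paper's evaluation code.

```python
def per_class_f1(gold, pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(gold) | set(pred))
    scores = {}
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Assumed toy data: nine standard-sense usages and one slang usage.
gold = ["standard"] * 9 + ["slang"]
# A model that always predicts the frequent sense.
pred = ["standard"] * 10

print("accuracy:", accuracy(gold, pred))
print("per-class F1:", per_class_f1(gold, pred))
print("macro-F1:", macro_f1(gold, pred))
```

In this toy case the model is right on nine of ten usages, yet the slang class scores an F1 of zero, which pulls the macro average to roughly half. Averaging per class rather than per instance is what makes the metric sensitive to rare senses.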

IMPACT New datasets may improve LLM understanding of nuanced language, especially slang.

RANK_REASON Research paper introducing new datasets for benchmarking NLP models.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Afnan Aloraini, Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

    The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

    arXiv:2606.16560v1. Automatic semantic change detection aims to identify how word meanings shift over time, offering insights into both linguistic and societal change. Despite recent progress in computational lexical semantic change (LSC), existing ben…

  2. arXiv cs.CL TIER_1 English(EN) · Riza Batista-Navarro

    The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

    Automatic semantic change detection aims to identify how word meanings shift over time, offering insights into both linguistic and societal change. Despite recent progress in computational lexical semantic change (LSC), existing benchmarks and methods struggle to capture bi-direc…