Researchers have introduced the BD-LSC and ST-WSD datasets to benchmark how well models detect lexical semantic change, particularly for words that carry both slang and standard meanings. The datasets support the study of sense gain, loss, and stability over time. GPT-4o performed strongly in few-shot settings on metrics such as Exact Sense Match, but overall Macro-F1 scores indicate that identifying rare slang senses remains a significant challenge.
Impact: The new datasets may improve LLM understanding of nuanced language, especially slang.
Ranking rationale: Research paper introducing new datasets for benchmarking NLP models.
- arXiv
- BD-LSC Dataset
- GPT-4o
- SlangTrack Word Sense Disambiguation
AI-generated summary · Google Gemini · from 2 sources.