
Large language models struggle with cognitive interference on the Stroop task

By the PulseAugur editorial team · [1 source] · 2026-06-03 16:15

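Large language models struggle with the Stroop task, a test of cognitive interference. When a word itself names a different color, they cannot consistently identify the color of the word. The difficulty increases as word lists grow longer and when matched and mismatched words are presented together.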

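Impact: The finding highlights the limits of large language models in handling cognitive interference and suggests possible challenges in real-world applications that require nuanced understanding.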

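Ranking rationale: This cluster describes the findings of a published academic paper detailing how large language models perform on a specific cognitive test. [lever_c_demoted from research: ic=1 ai=1.0]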


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-03 16:15


LLMs fail the Stroop task: they are unable to reliably name the color of a word when the word names a different color. They get worse as word lists get longer, and when there are both mismatched and non-mismatched words. Summary: https://www.eurekalert.org/news-releases/1129812…

Links: eurekalert.org/…/1129812

