Can Crowdsourcing Survive the LLM Era? A Community Survey on Human Data Collection

Researchers Find Large Language Models Challenge the Quality of Crowdsourced Data

By the PulseAugur editorial team · [1 source] · 2026-06-04 04:00

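A recent survey of 155 researchers in natural language processing and related fields finds that the growing use of large language models (LLMs) poses a significant challenge to the quality of crowdsourced data. Although 44% of respondents had observed LLM use in the data they collected, many remain unsure which mitigation strategies are effective. The research community considers current detection methods, such as spotting stylistic patterns and unusually fast completion times, insufficient to fully address the problem.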

Impact: LLM use during data collection threatens the integrity of research datasets, so new detection and mitigation strategies are needed.

Ranking rationale: This cluster contains one academic paper that details survey findings on a specific research challenge.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Aswathy Velutharambath, Neele Falk, Sofie Labat, Tarun Tater, Amelie Wuehrl · 2026-06-04 04:00

Can Crowdsourcing Survive the LLM Era? A Community Survey on Human Data Collection

arXiv:2606.04924v1 Announce Type: new Abstract: The widespread use of Large Language Models (LLMs) as writing tools challenges the validity of crowdsourced data, as crowdworkers may outsource tasks to models. To better understand how this is addressed, we surveyed 155 researchers…