LLMs challenge crowdsourced data quality, researchers find

By PulseAugur Editorial · [1 source] · 2026-06-04 04:00

A recent survey of 155 researchers in NLP and related fields finds that growing LLM use among crowdworkers poses a significant challenge to the quality of crowdsourced data. While 44% of respondents have observed LLM usage in their collected data, many are unsure how to mitigate it effectively. The research community considers current detection methods, such as identifying stylistic patterns and rapid completion times, insufficient to fully address the issue.

IMPACT LLM use in data collection threatens the integrity of research datasets, necessitating new detection and mitigation strategies.

RANK_REASON The cluster contains an academic paper detailing survey results on a specific research challenge.

paper
safety

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Aswathy Velutharambath, Neele Falk, Sofie Labat, Tarun Tater, Amelie Wuehrl · 2026-06-04 04:00

Can Crowdsourcing Survive the LLM Era? A Community Survey on Human Data Collection

arXiv:2606.04924v1 · Announce Type: new

Abstract: The widespread use of Large Language Models (LLMs) as writing tools challenges the validity of crowdsourced data, as crowdworkers may outsource tasks to models. To better understand how this is addressed, we surveyed 155 researchers…
