A new paper investigates social-desirability bias in LLM annotators used for computational social science (CSS). The researchers found that three open-source models (Zephyr, Mistral-Instruct, and Qwen2.5-Instruct) exhibit different types of bias, such as leniency or overcorrection when labeling harmful content. The study also found that common prompting techniques do not effectively mitigate these biases and can sometimes exacerbate them, highlighting the need for more robust validation methods in CSS research.
AI-generated summary · Google Gemini · from 1 source.