Brief · PulseAugur

RESEARCH · Hugging Face Daily Papers · 2w · [4 sources]

Human Psychometric Questionnaires Mischaracterize LLM Behavior

Research indicates that traditional psychometric self-report questionnaires, such as those built on the Big-5 personality framework, do not reliably predict large language model (LLM) behavior. Studies suggest that more specific, behavior-oriented frameworks, such as the Theory of Planned Behavior, can achieve human-level coherence with LLM responses, but only under certain conditions, such as a shared conversational context. An LLM-native psychometric instrument derived from behavioral affordances also failed to predict LLM behavior, which points to potential confounds in LLM self-reporting and to the limits of current evaluation methods.

IMPACT Current psychometric evaluation methods for LLMs are insufficient; safe deployment calls for more robust, behavior-specific assessment tools.

LLM
Hugging Face
BFI-44/10
PVQ-40/21
alignment
human psychometric questionnaires
generation-based profiling
theory of planned behavior
Big-5
arXiv