A new study evaluated large language models, specifically Gemini Pro, against mental health professionals in diagnosing personality disorders from autobiographical narratives. The LLMs achieved higher overall diagnostic scores, particularly for Borderline Personality Disorder, but significantly underdiagnosed Narcissistic Personality Disorder. The models gave detailed, pattern-focused justifications, whereas the human experts took a more concise, patient-centered approach. The findings highlight potential biases and reliability concerns in LLM clinical assessments.
AI impact: LLMs show potential in clinical narrative analysis but require careful validation due to bias and reliability issues.
Ranking rationale: Academic paper evaluating LLM performance against human experts on a specific clinical task.
AI-generated summary · Google Gemini · from 1 source.