Researchers have developed an agentic LLM framework for large-scale mental health screening that uses a policy-guided evaluation system to ensure trustworthiness and adaptability in clinical settings. A separate study evaluated the reliability of existing LLMs for mental health screening, testing their consistency, robustness to speech recognition errors, and faithfulness to evidence. The findings indicate that some models, such as Phi-4 and Gemma-2-9B, maintain high consistency and predictive validity even with speech recognition inaccuracies, while others, such as Llama-3.1-8B, are significantly more fragile.
Impact: LLMs show potential for scalable mental health screening but require careful validation, because their reliability and robustness to errors vary from model to model.
Ranking rationale: Two academic papers presenting novel research and evaluations of LLMs for mental health applications.
AI-generated summary · Google Gemini · from 2 sources.