PulseAugur

LLMs show promise in PTSD severity estimation with enhanced context

A new study posted to arXiv evaluates 11 large language models (LLMs) on estimating PTSD severity from clinical narratives. The models performed best when given detailed contextual information, such as subscale definitions and interview questions, and greater reasoning effort improved accuracy. Open-weight models such as Llama and DeepSeek plateaued beyond 70B parameters, while closed-weight models such as o3-mini and GPT-5 continued to improve with newer generations. The study also demonstrated that LLMs could differentiate PTSD severity from other conditions and predict future healthcare expenditure.

IMPACT LLMs demonstrate potential for clinical utility in mental health assessment, particularly with enhanced contextual knowledge and reasoning strategies.

RANK_REASON Research paper published on arXiv detailing LLM performance on a specific task.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English(EN) · Panagiotis Kaliosis, Adithya V Ganesan, Oscar N. E. Kjell, Whitney Ringwald, Scott Feltman, Melissa A. Carr, Dimitris Samaras, Camilo Ruggero, Benjamin J. Luft, Roman Kotov, Andrew H. Schwartz

    A Systematic Evaluation of Large Language Models for PTSD Severity Estimation: The Role of Contextual Knowledge and Modeling Strategies

    arXiv:2602.06015v2. Abstract: Large language models (LLMs) are increasingly being used in a zero-shot (generative) fashion to assess mental health conditions, yet we have limited knowledge on what factors affect their accuracy. In this study, we use a clinic…