A new study published on arXiv reports that large language models exhibit gender-based bias in medical triage recommendations. When presented with identical neurological symptoms, models such as Gemini 3.5 Flash, Claude Sonnet 4.6, and GPT-5.4-mini assigned lower urgency to young women than to age-matched men. The study attributes the disparity to diagnostic substitution, in which models favor gender-associated conditions and therefore recommend less urgent care for female patients despite comparable symptom severity.
IMPACT Reveals critical biases in AI medical tools, necessitating careful design to avoid perpetuating health inequities.
RANK_REASON The cluster contains a research paper detailing findings on LLM bias.
AI-generated summary · Google Gemini · from 2 sources.