PulseAugur

LLMs struggle with zero-shot ECG diagnosis, CNNs outperform

A comparative study evaluated zero-shot multimodal large language models (LLMs) against Convolutional Neural Network (CNN) based models for classifying 12-lead ECG images. LLMs such as GPT-5.2, GPT-4.1, and Gemini-2.5 Pro could generate plausible ECG narratives, but their zero-shot diagnostic discrimination was near chance (ROC-AUC around 0.5). In contrast, a custom physiology-aware CNN, LeadGroupECG, discriminated stably and reliably, achieving ROC-AUC scores of 0.92-0.94 in internal evaluation and 0.85-0.86 in external evaluation. The results highlight the continued need for domain-specific architectures in clinical AI applications.
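To make the "near-chance" reading of a ROC-AUC around 0.5 concrete, here is a minimal Python sketch. It uses synthetic labels and scores only, not data from the study, and `roc_auc` is a hypothetical helper written for this illustration. It computes ROC-AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so a scorer whose outputs are unrelated to the labels lands near 0.5 while an informative scorer lands well above it.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison: P(positive score > negative score).

    Ties between a positive and a negative score count as half a win.
    Hypothetical helper for illustration; not taken from the study.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic binary labels (assumption: roughly balanced classes).
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(1000)]

# Uninformative scorer: scores are drawn independently of the labels.
chance_scores = [rng.random() for _ in labels]

# Informative scorer: positives are shifted upward by 1.5 standard deviations.
informative_scores = [rng.gauss(1.5 * y, 1.0) for y in labels]

auc_chance = roc_auc(labels, chance_scores)
auc_informative = roc_auc(labels, informative_scores)

print(f"uninformative scorer ROC-AUC: {auc_chance:.3f}")
print(f"informative scorer ROC-AUC:   {auc_informative:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based computation would be the usual choice for larger evaluation sets.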

IMPACT Domain-specific CNN architectures remain essential for reliable AI-based ECG interpretation, as current zero-shot multimodal LLMs show limited diagnostic discrimination.

RANK_REASON The cluster reports on a paper presenting a comparative study of LLM and CNN performance on a specific task (ECG classification).

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Hugging Face Daily Papers TIER_1 English(EN)

    Physiology-Aware CNN and Zero-Shot Multimodal LLMs for ECG Image Classification: A Comparative Study

    Multimodal large language models (LLMs) are increasingly adopted to interpret 12-lead ECG images, though the interpretations often lack validation. However, ECG image understanding significantly differs from general images as it depends on precise waveform morphology, lead relati…