Brief · PulseAugur

TOOL · arXiv cs.CV · 3d

Seizure-Semiology-Suite (S3): A Clinically Multimodal Dataset, Benchmark, and Models for Seizure Semiology Understanding

Researchers have developed the Seizure-Semiology-Suite (S3), a dataset and benchmark for evaluating how well multimodal large language models (MLLMs) understand complex seizure semiology from video. The S3 dataset contains 438 seizure videos with over 35,000 labels and supports a seven-task benchmark spanning visual perception through clinical reporting. Initial evaluations of 11 open-weight MLLMs revealed significant weaknesses in areas such as laterality reasoning and temporal localization, though seizure-specific fine-tuning showed promise for improvement.

IMPACT: Establishes a new benchmark for evaluating multimodal AI on safety-critical medical video analysis, which could guide model development toward clinical reliability.
