Brief · PulseAugur

TOOL · arXiv cs.CV · 8h

HSQ-VLM: A Novel Spatially-Constrained Quadrant Segmentation VLM Model for Explainability in Diabetic Retinopathy

Researchers have developed HSQ-VLM, a vision-language model designed to improve the explainability of AI diagnostics for diabetic retinopathy. The model uses a novel quadrant segmentation pipeline that combines Landmark-Anchored Cartesian Cross-Attention with Topological Latent Partitioning to align retinal features with a fovea-centered coordinate system. HSQ-VLM generates natural language reports that quantify pathology by anatomical location, achieving high sensitivity in detecting hemorrhages and microaneurysms on a dataset of 3,500 fundus images.
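To make the idea of a fovea-centered coordinate system concrete, here is a minimal sketch of how lesion locations could be binned into four quadrants around the fovea and counted per pathology type. It is an illustration only, not the HSQ-VLM architecture: the function names, the pixel coordinates, the image-space quadrant labels, and the sample detections are all assumptions, and the sketch does not implement the attention or latent partitioning mechanisms named in the paper.

```python
from collections import Counter


def quadrant(x, y, fovea_x, fovea_y):
    """Return the image-space quadrant of pixel (x, y) relative to the fovea.

    Assumes image coordinates: x grows rightward, y grows downward.
    Labels are plain image directions, not clinical nasal/temporal terms,
    which would depend on which eye is imaged.
    """
    vertical = "upper" if y < fovea_y else "lower"
    horizontal = "left" if x < fovea_x else "right"
    return f"{vertical}-{horizontal}"


def count_by_quadrant(lesions, fovea):
    """Count lesions per (quadrant, kind) pair."""
    fx, fy = fovea
    return Counter((quadrant(x, y, fx, fy), kind) for x, y, kind in lesions)


# Hypothetical detections as (x, y, kind), in pixels.
lesions = [
    (100, 80, "hemorrhage"),
    (300, 90, "microaneurysm"),
    (120, 310, "hemorrhage"),
    (110, 60, "hemorrhage"),
]
counts = count_by_quadrant(lesions, fovea=(200, 200))

# Turn the counts into short report-style sentences.
for (quad, kind), n in sorted(counts.items()):
    print(f"{n} {kind}(s) in the {quad} quadrant")
```

Per-quadrant counts like these are the kind of structured, location-aware quantity that a report generator could verbalize; how HSQ-VLM actually derives and uses them is described in the paper itself.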

IMPACT This research offers a path toward more interpretable AI diagnostics in healthcare, potentially increasing trust and adoption of AI in clinical settings for conditions like diabetic retinopathy.

vision-language model
diabetic retinopathy
Fundus images analysis using deep features for detection of exudates, hemorrhages and microaneurysms
HSQ-VLM
Landmark-Anchored Cartesian Cross-Attention
Topological Latent Partitioning
bleeding