RESEARCH · arXiv cs.AI · English (EN) · 22h · [2 sources]

Decoupling Semantics from Distortions: Multi-Scale Two-Stream Vision-Language Alignment for AI-Generated Image Quality Assessment

Researchers have introduced MST-CLIPIQA, a multi-scale two-stream framework for AI-generated image quality assessment. The method decouples semantic understanding from perceptual sensitivity, using dual CLIP encoders with different patch granularities to capture both global coherence and fine-grained artifact patterns. An adaptive fusion mechanism then distills the information from both streams, and the framework reports state-of-the-art results on five benchmarks covering both image quality and text-image correspondence.
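
To make "different patch granularities" concrete, the sketch below counts the tokens a ViT-style encoder sees at two patch sizes and blends two feature vectors with a softmax gate. The 224-pixel input, the 32- and 16-pixel patch sizes, the function names, and the gating rule are all illustrative assumptions; the brief does not specify the paper's actual encoder configurations or fusion mechanism.

```python
import math


def num_patches(image_size: int, patch_size: int) -> int:
    """Count the non-overlapping square patches a ViT-style encoder sees."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(coarse, fine, gate_scores):
    """Blend two equal-length feature vectors with softmax gate weights.

    Hypothetical stand-in for an adaptive fusion step: gate_scores would be
    learned in a real model, here they are passed in by hand.
    """
    if len(coarse) != len(fine):
        raise ValueError("feature vectors must have the same length")
    w_coarse, w_fine = softmax(gate_scores)
    return [w_coarse * c + w_fine * f for c, f in zip(coarse, fine)]


# Assumed 224x224 input: larger patches give fewer, coarser tokens.
coarse_tokens = num_patches(224, 32)
fine_tokens = num_patches(224, 16)

# Equal gate scores weight both streams equally.
fused = fuse([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

With these assumed sizes the coarse stream sees a quarter as many tokens as the fine stream, which is one way a model could trade global context against local detail. A learned gate would replace the hand-set scores.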

IMPACT Establishes a new state of the art in AI-generated image quality assessment, potentially improving the evaluation of generative models.
