New method improves zero-shot classification in VLMs by addressing spurious correlations

By the PulseAugur editorial team · [1 source] · 2026-06-02 04:00

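Researchers have introduced Density-Aware Translation (DAT), a method for improving zero-shot classification in vision-language models (VLMs). DAT addresses spurious correlations by refining image-text similarity scores with a local geometric density term derived from a reference set. Recalibrating scores according to embedding density improves accuracy on underrepresented groups and makes multimodal models more reliable overall.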
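
To make the idea of density-based recalibration concrete, here is a minimal sketch. It is not the authors' DAT formulation, which this summary does not spell out. It assumes a simple hubness-style penalty: each class prompt's cosine score is reduced by that prompt's mean similarity to its k nearest embeddings in a reference set. All function names, `k`, and the weight `lam` are hypothetical, and the reference set is assumed to be non-empty.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def local_density(query, reference, k=3):
    """Mean cosine similarity between `query` and its k nearest reference embeddings."""
    sims = sorted((cosine(query, r) for r in reference), reverse=True)
    nearest = sims[:k]
    return sum(nearest) / len(nearest)


def raw_scores(image_emb, text_embs):
    """Plain zero-shot scores: cosine similarity of the image to each class prompt."""
    img = normalize(image_emb)
    return [cosine(img, normalize(t)) for t in text_embs]


def density_adjusted_scores(image_emb, text_embs, reference, k=3, lam=0.5):
    """Assumed recalibration: subtract lam * local density of each class prompt."""
    img = normalize(image_emb)
    ref = [normalize(r) for r in reference]
    scores = []
    for t in text_embs:
        t = normalize(t)
        # Prompts sitting in a crowded region of the reference set are penalized more.
        scores.append(cosine(img, t) - lam * local_density(t, ref, k))
    return scores


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])
```

In a toy setting where most reference embeddings cluster near one class prompt, a penalty of this kind can shift the prediction toward the less crowded prompt, which is the intuition behind helping underrepresented groups. How DAT actually defines and applies its density term should be taken from the paper itself.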

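Impact: Improves the reliability of zero-shot classification in multimodal models, with the potential to improve performance on niche or underrepresented data.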

Ranking rationale: An academic paper introducing a new method for improving existing models.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG TIER_1 English(EN) · Afsaneh Hasanebrahimi, Hanxun Huang, Christopher Leckie, Sarah Erfani · 2026-06-02 04:00

Density-Aware Translation of Spurious Correlations in Zero-Shot VLMs

arXiv:2606.01710v1 Announce Type: cross Abstract: Vision-Language models (VLMs), such as CLIP, achieve powerful zero-shot classification. However, their predictions remain sensitive to spurious correlations, where contextual cues dominate over semantic content. Earlier solutions …
