UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

New Benchmark Shows VLMs Struggle with Fine Details in High-Resolution Earth Observation Imagery

By the PulseAugur Editorial Team · [1 source] · 2026-05-12 15:07

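Researchers have introduced UHR-Micro, a new benchmark designed to evaluate how well vision-language models (VLMs) perceive small but critical details in ultra-high-resolution Earth observation imagery. Current VLMs often suffer from a "resolution illusion," in which high input resolution does not translate into reliable perception of micro-scale targets. The benchmark comprises more than 11,000 instructions and 1,200 images, and it reveals major deficiencies in existing models' spatial grounding and evidence parsing. To address the problem, the team developed the Micro-evidence Active Perception (MAP) agent, which improves perception by focusing reasoning on local observations rather than on the entire high-resolution image.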
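The idea of reasoning over local observations instead of the full frame can be illustrated with a minimal tiling sketch. This is not the paper's MAP agent: the names `tile_boxes`, `select_crops`, and `score_crop` are hypothetical, and `score_crop` stands in for whatever relevance judgment a real system would obtain from a model.

```python
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int, overlap: int = 0) -> List[Box]:
    """Cover a width x height image with crops of side `tile` (edge crops are clipped)."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    boxes: List[Box] = []
    # Stop before a crop would start inside the region the previous crop already covers.
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def select_crops(
    boxes: List[Box],
    score_crop: Callable[[Box], float],  # hypothetical relevance scorer (assumption)
    top_k: int = 3,
) -> List[Box]:
    """Return the top_k boxes with the highest score, best first."""
    return sorted(boxes, key=score_crop, reverse=True)[:top_k]


if __name__ == "__main__":
    boxes = tile_boxes(4096, 4096, tile=1024)

    # Toy scorer: prefer the crop that contains a known target pixel.
    def contains_target(box: Box) -> float:
        left, top, right, bottom = box
        return 1.0 if left <= 3000 < right and top <= 500 < bottom else 0.0

    print(select_crops(boxes, contains_target, top_k=1))
```

The boxes use the (left, top, right, bottom) convention, so each one could be passed to an image library's crop routine and the resulting patch sent to a model at native resolution, rather than downsampling the whole scene.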

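Impact: Highlights the limitations of current VLMs in perceiving critical micro-scale details in high-resolution imagery, and motivates research on more evidence-centric reasoning agents.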

Ranking rationale: This cluster describes a new academic paper that introduces a benchmark and a proposed agent for evaluating AI models.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Bo Du · 2026-05-12 15:07

UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

Vision-Language Models (VLMs) increasingly operate on ultra-high-resolution (UHR) Earth observation imagery, yet they remain vulnerable to a severe scale mismatch between large-scale scene context and micro-scale targets. We refer to this empirical gap as a "resolution illusion":…

