Benchmarking and Enhancing VLM for Compressed Image Understanding

New benchmark evaluates vision-language models on compressed images

By the PulseAugur editorial team · [1 source] · 2026-05-25 04:00

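Researchers have developed a new benchmark for evaluating how well vision-language models (VLMs) understand images compressed at low bitrates. The study attributes the performance drop to two causes: information lost during compression and the VLMs' failure to generalize. To address this, the authors propose a universal VLM adapter, which is reported to improve VLM performance by 10-30% across a range of compression codecs and bitrates.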
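To make "low bitrate" concrete, the sketch below computes the bitrate of a compressed image in bits per pixel (bpp). It assumes bpp as the unit, which is common in image compression, but the summary does not say which unit the benchmark uses. The function name and the example numbers are hypothetical and do not come from the paper.

```python
def bits_per_pixel(compressed_bytes: int, width: int, height: int) -> float:
    """Return the bitrate of a compressed image in bits per pixel (bpp)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # 8 bits per byte, divided by the number of pixels in the image.
    return 8 * compressed_bytes / (width * height)


# Hypothetical example: a 512x512 image compressed down to 4,096 bytes.
low_rate = bits_per_pixel(4096, 512, 512)

# For comparison: the same image stored as raw 24-bit RGB (3 bytes per pixel).
raw_rate = bits_per_pixel(3 * 512 * 512, 512, 512)

print(f"compressed: {low_rate:.3f} bpp, raw RGB: {raw_rate:.1f} bpp")
```

In this illustrative case the compressed file carries 0.125 bpp against 24 bpp for raw RGB, which gives a sense of how much information a codec has to discard at low bitrates.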

Impact: This research could improve the efficiency and applicability of VLMs in scenarios that require image compression.

Ranking rationale: Academic paper introducing a new benchmark for evaluating VLM performance on compressed images, together with an enhancement method.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV TIER_1 English(EN) · Zifu Zhang, Tongda Xu, Siqi Li, Shengxi Li, Yue Zhang, Mai Xu, Yan Wang · 2026-05-25 04:00

Benchmarking and Enhancing VLM for Compressed Image Understanding

arXiv:2512.20901v2 Announce Type: replace Abstract: With the rapid development of Vision-Language Models (VLMs) and the growing demand for their applications, efficient compression of the image inputs has become increasingly important. Existing VLMs predominantly digest and under…