Sentinel2Cap: A Human-Annotated Benchmark Dataset for Multimodal Remote Sensing Image Captioning

Deep learning models enhance satellite data for forecasting and image captioning

By the PulseAugur editorial team · [3 sources] · 2026-05-04 22:16

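Researchers have introduced Sentinel2Cap, a new human-annotated dataset designed for multimodal remote sensing image captioning. The dataset comprises Sentinel-1 SAR and Sentinel-2 multispectral image patches, filling a gap in existing captioning resources for satellite data. A preliminary evaluation with the Qwen3-VL-8B-Instruct model indicates that captioning performance is better on RGB images, while SAR images pose a greater challenge for current vision-language models.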

Impact: Introduces a new dataset to advance research on multimodal remote sensing image captioning, particularly for SAR data.

Ranking rationale: This cluster describes a new benchmark dataset for multimodal remote sensing image captioning published on arXiv.


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

arXiv cs.CV TIER_1 English(EN) · Véronique Defonte, Dawa Derksen, Alexandre Constantin, Bastien Nespoulous · 2026-05-07 04:00

Densification and forecasting of Sentinel-2 time series from multimodal SAR and optical satellite data using deep generative models

arXiv:2605.04239v1 · Abstract: Optical satellite image time series are extensively used in many Earth observation applications, including agriculture, climate monitoring, and land surface analysis. However, clouds and swath edges result in irregular sampling alon…

arXiv cs.CV TIER_1 English(EN) · Lucrezia Tosato, Gianluca Lombardi, Ronny Hansch · 2026-05-06 04:00

Sentinel2Cap: A Human-Annotated Benchmark Dataset for Multimodal Remote Sensing Image Captioning

arXiv:2605.03189v1 · Abstract: Image captioning has become an important task in computer vision, enabling models to generate natural language descriptions of visual content. While several datasets exist for natural images and high-resolution optical remote sensin…

arXiv cs.CV TIER_1 English(EN) · Ronny Hansch · 2026-05-04 22:16

Sentinel2Cap: A Human-Annotated Benchmark Dataset for Multimodal Remote Sensing Image Captioning

Image captioning has become an important task in computer vision, enabling models to generate natural language descriptions of visual content. While several datasets exist for natural images and high-resolution optical remote sensing imagery, the availability of captioning datase…
