FusionRS: A Large-Scale RGB-Infrared Remote Sensing Dataset for Dual-Modal Vision-Language Foundation Models
Researchers have introduced FusionRS, a novel large-scale dataset designed to advance dual-modal vision-language foundation models in remote sensing. This dataset uniquely combines RGB and infrared imagery with corresponding text captions, addressing the under-exploration of infrared data in current models. Experiments using FusionRS demonstrate improved performance in tasks like RGB-IR alignment and captioning, highlighting the value of modality-specific textual supervision.
AI IMPACT: This dataset could enable more sophisticated remote sensing analysis by integrating thermal and visual data, potentially improving applications in environmental monitoring and urban planning.