TOOL · arXiv cs.CV English(EN) · 7h

FUSAR-GPT: A Spatiotemporal Feature-Embedded and Two-Stage Decoupled Visual Language Model for SAR Imagery

Researchers have developed FUSAR-GPT, a Visual Language Model (VLM) designed specifically for Synthetic Aperture Radar (SAR) imagery. Existing VLMs interpret SAR data poorly; FUSAR-GPT addresses this by incorporating a geospatial baseline model as a source of world knowledge and by embedding spatiotemporal remote-sensing features. A two-stage strategy decouples knowledge injection from task execution. The authors report state-of-the-art performance on remote sensing benchmarks, outperforming current models by over 10%.

IMPACT Enhances AI capabilities for all-weather, day-and-night remote sensing and opens new avenues for SAR data interpretation.

remote sensing
Visual Language Models
FUSAR-GPT
Xiaokun Zhang
SAR imagery