Brief · PulseAugur

TOOL · arXiv cs.AI English(EN) · 10h

TerraMind: Large-Scale Generative Multimodality for Earth Observation

Researchers have introduced TerraMind, a multimodal foundation model for Earth observation. The model combines token-level and pixel-level data representations, which lets it capture both high-level contextual information and fine-grained spatial detail. TerraMind shows strong zero-shot and few-shot capabilities and introduces "Thinking-in-Modalities" (TiM), a technique for data augmentation during fine-tuning and inference. It achieves state-of-the-art performance on benchmarks such as PANGAEA. The model, its pretraining dataset, and the associated code are publicly available under a permissive license.

IMPACT Introduces a new multimodal foundation model for Earth observation that could advance geospatial data analysis and its applications.

Hugging Face
arXiv
DagsHub
alphaXiv
ScienceCast
CatalyzeX
Gotit.pub
TerraMind
Pangaea
Johannes Jakubik
Thinking-in-Modalities