Global Cross-Modal Geo-Localization: A Million-Scale Dataset and a Physical Consistency Learning Framework
Researchers have introduced CORE, a dataset of more than one million cross-view images from six continents, designed to advance cross-modal geo-localization. It aims to overcome the limitations of previous studies by covering a wide range of global architectural styles and topographic features. To exploit this data, the researchers also developed PLANET, a physical-law-aware network that uses contrastive learning to improve the accuracy of matching text descriptions with geo-tagged aerial imagery.
AI IMPACT: Enhances AI's ability to precisely locate objects and places worldwide using visual and textual data.
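
To make the contrastive matching of text descriptions and aerial images more concrete, here is a minimal sketch of a generic symmetric InfoNCE objective in NumPy. It is not PLANET's actual loss, which the summary does not specify. The function names (`symmetric_info_nce`, `retrieve`), the temperature of 0.07, and the assumption that row i of each embedding matrix describes the same location are all illustrative choices.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_info_nce(text_emb, image_emb, temperature=0.07):
    """Generic symmetric contrastive loss (assumed here, not taken from the paper).

    Row i of text_emb and row i of image_emb are treated as a matching pair;
    every other row in the batch serves as a negative.
    """
    t = l2_normalize(text_emb)
    v = l2_normalize(image_emb)
    logits = t @ v.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    loss_t2i = -log_softmax(logits)[idx, idx].mean()    # text -> image direction
    loss_i2t = -log_softmax(logits.T)[idx, idx].mean()  # image -> text direction
    return 0.5 * (loss_t2i + loss_i2t)


def retrieve(text_emb, image_emb):
    """For each text query, return the index of the most similar image."""
    sims = l2_normalize(text_emb) @ l2_normalize(image_emb).T
    return np.argmax(sims, axis=1)
```

In this sketch, a batch whose text and image embeddings are correctly paired yields a lower loss than the same batch with the pairs shuffled, and `retrieve` returns, for each description, the index of the geo-tagged image it most resembles. In a real system the embeddings would come from trained text and image encoders rather than being supplied directly.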