Researchers have developed MSEarth, a multimodal benchmark for evaluating how well multimodal large language models (MLLMs) reason about Earth science. The dataset comprises over 289,000 figures with detailed captions and contextual discussions, drawn from open-access scientific publications across the five major Earth science spheres. MSEarth supports figure captioning, multiple-choice questions, and open-ended reasoning tasks, and aims to provide a high-fidelity resource for advancing MLLMs in scientific discovery.
AI impact: Establishes a new benchmark for MLLMs in scientific reasoning, potentially accelerating AI applications in Earth science research.
Ranking rationale: This is a research paper introducing a new benchmark dataset for evaluating multimodal large language models in Earth science.
AI-generated summary · Google Gemini · From 1 source.