Researchers have introduced LithoBench, a benchmark for evaluating how well large multimodal models interpret lithology from remote sensing data. It comprises 10,000 expert-annotated instances across 12 lithological categories, organized into five cognitive levels that range from basic identification to complex reasoning. Experiments on LithoBench reveal significant limitations in current large multimodal models, particularly on higher-order geological explanation, application, and reasoning tasks.
Impact: This benchmark could help researchers identify and address the shortcomings of large multimodal models in specialized domains such as geology.
Ranking rationale: The cluster contains a new academic paper introducing a novel benchmark for evaluating AI models.
AI-generated summary · Google Gemini · from 1 source.