PulseAugur
New LithoBench benchmark reveals large multimodal model limitations

Researchers have introduced LithoBench, a benchmark for evaluating how well large multimodal models interpret geological lithology from remote sensing data. It comprises 10,000 expert-annotated instances across 12 lithological categories, organized into five cognitive levels that range from basic identification to complex reasoning. Experiments on LithoBench reveal significant limitations in current large multimodal models, particularly on higher-order geological explanation, application, and reasoning tasks.
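
To make the per-level comparison concrete, here is a minimal sketch of how accuracy could be broken down by cognitive level. It is not LithoBench's evaluation code. The record layout (`level`, `answer`, `prediction`), the level names, and the rows themselves are hypothetical placeholders, not data or results from the paper.

```python
from collections import defaultdict

# Hypothetical toy rows: the summary does not describe LithoBench's file
# format, so these field names and values are illustrative assumptions only.
records = [
    {"level": "identification", "answer": "granite", "prediction": "granite"},
    {"level": "identification", "answer": "basalt", "prediction": "granite"},
    {"level": "reasoning", "answer": "B", "prediction": "C"},
    {"level": "reasoning", "answer": "A", "prediction": "A"},
    {"level": "reasoning", "answer": "D", "prediction": "A"},
]


def accuracy_by_level(rows):
    """Return {level: fraction of rows whose prediction matches the answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row["level"]] += 1
        if row["prediction"] == row["answer"]:
            correct[row["level"]] += 1
    return {level: correct[level] / total[level] for level in total}


scores = accuracy_by_level(records)
for level, score in scores.items():
    print(f"{level}: {score:.2f}")
```

Reporting one score per level, rather than a single overall accuracy, is what lets a gap between basic identification and higher-order reasoning show up at all.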

Impact: This benchmark will help researchers identify and address the shortcomings of large multimodal models in specialized domains such as geology.

Ranking rationale: The cluster contains a new academic paper introducing a novel benchmark for evaluating AI models.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →

Sources [1]

  1. arXiv cs.CV TIER_1 English (EN) · Wei Han

    LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation

    Remote sensing lithology interpretation is fundamental to geological surveys, mineral exploration, and regional geological mapping. Unlike general land-cover recognition, lithology interpretation is a knowledge-intensive task that requires experts to infer rock types from various…