PulseAugur
GeoDisaster: Benchmarking Orchestrated Agents for Operational Disaster Geo-Intelligence

New GeoDisaster benchmark tests AI agents' capabilities in disaster response

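Researchers have introduced GeoDisaster, a new benchmark designed to evaluate and improve orchestrated agents for operational disaster geo-intelligence. The benchmark comprises 2,921 instances across five task families and integrates diverse Earth-observation and GIS data for tasks such as hazard detection and damage assessment. An accompanying multi-agent framework uses a novel alignment technique called Role-Contract Expectation Alignment (RCEA) to improve tool use and decision-making in disaster-response scenarios.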

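Impact: The benchmark is expected to advance the capabilities of AI agents in real-world applications such as disaster response and geo-intelligence.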

Ranking rationale: This cluster describes a new academic benchmark and an associated framework for evaluating AI agents, published on arXiv.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.MA (Multiagent) TIER_1 English(EN) · Biplab Banerjee ·

    GeoDisaster: Benchmarking Orchestrated Agents for Operational Disaster Geo-Intelligence

    Remote-sensing vision-language models (RS-VLMs) have advanced Earth-observation analysis toward visual interpretation and instruction-following, yet fall short of operational geo-intelligence, which demands tool-grounded spatial reasoning and structured, evidence-backed decisions…

  2. arXiv cs.CV TIER_1 English(EN) · Maram Hasan, Aman Verma, Savitra Roy, Hariseetharam Gunduboina, Daksh Jain, Muhammad Haris Khan, Subhasis Chaudhuri, Biplab Banerjee ·

    GeoDisaster: Benchmarking Orchestrated Agents for Operational Disaster Geo-Intelligence

    arXiv:2606.17246v1. Abstract: Remote-sensing vision-language models (RS-VLMs) have advanced Earth-observation analysis toward visual interpretation and instruction-following, yet fall short of operational geo-intelligence, which demands tool-grounded spatial rea…