Brief · PulseAugur

TOOL · arXiv cs.CV · 9h

RescueBench: Can Embodied Agents Save Lives in the Wild?

Researchers have introduced RescueBench, a benchmark for evaluating embodied agents in realistic search-and-rescue scenarios. It simulates a four-stage pipeline (exploration, target rescue, memory-guided return, and handoff) to assess how failures compound across a multi-stage workflow. Current baseline agents struggle significantly: autonomous exploration is identified as the primary failure mode, and spatial memory as a secondary bottleneck.
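
To make the idea of compounding failures concrete, here is a minimal sketch. It is not RescueBench's scoring method. The stage names come from the summary above; the success rates are made-up placeholders, and the model assumes (hypothetically) that stages run in order, that a failed stage ends the episode, and that stage outcomes are independent.

```python
from math import prod

# Stage names are taken from the brief.
STAGES = ["exploration", "target rescue", "memory-guided return", "handoff"]

# Hypothetical per-stage success rates, chosen only for illustration.
# These are NOT results reported for RescueBench.
hypothetical_success = {
    "exploration": 0.5,
    "target rescue": 0.8,
    "memory-guided return": 0.6,
    "handoff": 0.9,
}


def end_to_end_success(rates, stages=STAGES):
    """Chance of finishing every stage, assuming independent, sequential stages."""
    return prod(rates[stage] for stage in stages)


def weakest_stage(rates, stages=STAGES):
    """Stage with the lowest success rate, i.e. the largest single bottleneck."""
    return min(stages, key=lambda stage: rates[stage])


if __name__ == "__main__":
    print(f"end-to-end success: {end_to_end_success(hypothetical_success):.3f}")
    print(f"weakest stage: {weakest_stage(hypothetical_success)}")
```

Under these placeholder numbers, no single stage succeeds less than half the time, yet the full pipeline succeeds only about 22% of the time, which is why a weak early stage such as exploration can dominate the end-to-end result.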

IMPACT This benchmark could drive progress in embodied AI for complex, real-world applications like disaster response.
