PulseAugur
PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

New benchmark tests the proactive safety warning ability of AI video models

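Researchers have developed PaSBench-Video, a new benchmark for evaluating the proactive safety warning ability of video multimodal large language models (MLLMs). The benchmark contains 740 videos spanning driving, healthcare, daily life, and industrial production, each annotated with risk onset and accident boundaries. Tests of 13 MLLMs show that current models struggle with temporal calibration and false alarm rates, which suggests they rely on scene-level cues rather than genuine hazard reasoning.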
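To make the two failure modes concrete, here is a minimal sketch of how a single warning could be scored against the annotated window between risk onset and accident. It is an illustration only, not the benchmark's actual protocol: the `Clip` schema, the label names, the inclusion of hazard-free clips, and the two summary rates are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clip:
    """One annotated clip. Hypothetical schema, not the benchmark's own format."""
    risk_onset: Optional[float]  # seconds; None if the clip contains no hazard
    accident: Optional[float]    # seconds; None if the clip contains no hazard
    warning: Optional[float]     # time of the model's first warning; None if silent


def classify(clip: Clip) -> str:
    """Label a clip by where the model's first warning falls."""
    if clip.risk_onset is None:
        # Hazard-free clip: any warning at all is a false alarm.
        return "false_alarm" if clip.warning is not None else "correct_silence"
    if clip.warning is None:
        return "missed"
    if clip.warning < clip.risk_onset:
        return "premature"  # warned before any visible sign of danger
    if clip.warning < clip.accident:
        return "timely"     # inside the window where intervention is possible
    return "late"


def summarize(clips: List[Clip]) -> dict:
    """Compute two assumed summary rates over a set of clips."""
    hazardous = [classify(c) for c in clips if c.risk_onset is not None]
    safe = [classify(c) for c in clips if c.risk_onset is None]
    return {
        "timely_rate": hazardous.count("timely") / len(hazardous) if hazardous else 0.0,
        "false_alarm_rate": safe.count("false_alarm") / len(safe) if safe else 0.0,
    }
```

Under this framing, a model that reacts to scene-level cues (for example, warning whenever a road or a machine is visible) would tend to score as premature on hazardous clips and as a false alarm on safe ones, which matches the pattern the summary describes.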

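Impact: Highlights the limits of current AI video analysis in safety applications and points to the need for models that reason about potential hazards rather than only about scene activity.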

Ranking rationale: This cluster contains a research paper introducing a new benchmark for evaluating AI models.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.AI · Yusong Zhao, Yuejin Xie, Youliang Yuan, Junjie Hu, Jitian Guo, Yujiu Yang, Pinjia He

    PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

    arXiv:2606.02443v1 · Announce type: cross · Abstract: Between the first visible sign of danger and the moment an accident occurs, there is often a window where intervention remains possible. Video-capable multimodal large language models (MLLMs) could serve as always-on safety monito…

  2. arXiv cs.AI · Pinjia He

    PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

    Between the first visible sign of danger and the moment an accident occurs, there is often a window where intervention remains possible. Video-capable multimodal large language models (MLLMs) could serve as always-on safety monitors that issue warnings during this window. Yet cur…