Confidence-Aware Tool Orchestration for Robust Video Understanding

New framework Robust-TO tackles the "Blind Trust Problem" in video understanding · 3 sources tracked

By the PulseAugur editorial team · [3 sources] · 2026-06-25 00:00

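Researchers have developed Robust-TO, a new framework that aims to improve video understanding models by addressing the "Blind Trust Problem." The problem arises when a model fails to recognize that the quality of its input has degraded, which causes a significant drop in accuracy. Robust-TO integrates a per-frame trustworthiness score into its reasoning process, so it can weight evidence more effectively and maintain performance even when inputs are corrupted. In evaluations, Robust-TO outperformed open-source baselines and Gemini 2.5 Pro, showing a smaller accuracy drop under realistic perturbations.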
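To make the idea of trust-weighted evidence concrete, here is a minimal Python sketch. It is not the Robust-TO implementation, which the sources above do not describe in detail. The `FrameEvidence` type, the `aggregate_answers` function, and the example trust scores are all hypothetical, chosen only to contrast equal-weight voting with voting scaled by a per-frame trust score.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FrameEvidence:
    """Hypothetical container: what one frame suggests and how far to trust it."""
    answer: str   # candidate answer read off this frame
    trust: float  # assumed per-frame trustworthiness score in [0, 1]


def aggregate_answers(frames, use_trust=True):
    """Return the answer with the highest total vote.

    With use_trust=False every frame counts as 1.0 (the "blind trust"
    baseline). With use_trust=True each vote is scaled by the frame's
    trust score, so degraded frames contribute less.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    scores = defaultdict(float)
    for frame in frames:
        if not 0.0 <= frame.trust <= 1.0:
            raise ValueError(f"trust must be in [0, 1], got {frame.trust}")
        scores[frame.answer] += frame.trust if use_trust else 1.0
    # On a tie, the answer that was seen first wins.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Three blurred frames (low trust) disagree with two clean frames.
    frames = [
        FrameEvidence("cat", 0.1),
        FrameEvidence("cat", 0.2),
        FrameEvidence("cat", 0.1),
        FrameEvidence("dog", 0.9),
        FrameEvidence("dog", 0.8),
    ]
    print(aggregate_answers(frames, use_trust=False))  # cat (3 votes vs 2)
    print(aggregate_answers(frames, use_trust=True))   # dog (1.7 vs 0.4)
```

In this toy example the equal-weight vote follows the three degraded frames, while the trust-weighted vote follows the two clean ones. How Robust-TO actually computes and calibrates its per-frame scores is not specified in the excerpts here.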

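Impact: This research could lead to more reliable AI systems in applications that require video analysis, especially in environments where visual conditions are unpredictable.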

Ranking rationale: This cluster describes a recent research paper on a novel framework for video understanding.


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.AI TIER_1 English(EN) · Yangfan He, Yujin Choi, Jaehong Yoon · 2026-06-26 04:00

Confidence-Aware Tool Orchestration for Robust Video Understanding

arXiv:2606.26904v1 · Abstract: Video reasoning language models implicitly assume that every input frame is equally reliable. This leads to what we term the Blind Trust Problem: under realistic perturbations such as motion blur, glare, or occlusion, frontier vid…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-25 00:00

Confidence-Aware Tool Orchestration for Robust Video Understanding

Robust-TO addresses the Blind Trust Problem in video reasoning by integrating per-frame trustworthiness into an agentic framework that improves accuracy under realistic perturbations through calibrated evidence weighting and reliability-aware reasoning.

arXiv cs.CV TIER_1 English(EN) · Jaehong Yoon · 2026-06-25 11:37

Confidence-Aware Tool Orchestration for Robust Video Understanding

Video reasoning language models implicitly assume that every input frame is equally reliable. This leads to what we term the Blind Trust Problem: under realistic perturbations such as motion blur, glare, or occlusion, frontier video reasoning models can suffer 15-30%p accuracy dr…

