ProMSA: Progressive Multimodal Search Agents for Knowledge-Based Visual Question Answering

ProMSA agent advances knowledge-based visual question answering

By the PulseAugur editorial team · [3 sources] · 2026-06-26 00:00

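Researchers have developed ProMSA, a new agent for knowledge-based visual question answering (KB-VQA). Unlike prior methods that rely on a fixed retrieval pipeline, ProMSA adaptively chooses among image search, text search, or stopping, taking the tool-call budget and deduplication into account. The agent is trained with rejection-sampling SFT and a sequence-level RL objective called TN-GSPO. Experiments on the E-VQA and InfoSeek datasets show that ProMSA improves both retrieval and end-to-end accuracy over existing RAG and agent baselines.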
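The adaptive choice among image search, text search, and stopping can be pictured as a budgeted loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `run_agent`, `toy_policy`, and `toy_tools`, the action strings, and the rule of halting on a repeated call are all hypothetical. In ProMSA the decision would come from the trained model, which is replaced here by a hand-written stand-in.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical action names; the summary only says the agent picks
# image search, text search, or stop.
IMAGE_SEARCH, TEXT_SEARCH, STOP = "image_search", "text_search", "stop"

# A policy maps (evidence so far, remaining budget) to (action, query).
Policy = Callable[[List[str], int], Tuple[str, str]]
Tool = Callable[[str], List[str]]


def run_agent(policy: Policy, tools: Dict[str, Tool], budget: int) -> Tuple[List[str], int]:
    """Run a budgeted search loop and return (evidence, tool_calls_used)."""
    evidence: List[str] = []
    seen_calls: Set[Tuple[str, str]] = set()  # deduplicate (action, query) pairs
    seen_docs: Set[str] = set()               # deduplicate retrieved items
    calls_used = 0
    while calls_used < budget:
        action, query = policy(evidence, budget - calls_used)
        if action == STOP:
            break
        if (action, query) in seen_calls:
            # Assumption: a repeated call ends the episode instead of
            # spending more of the budget on it.
            break
        seen_calls.add((action, query))
        calls_used += 1
        for doc in tools[action](query):
            if doc not in seen_docs:
                seen_docs.add(doc)
                evidence.append(doc)
    return evidence, calls_used


def toy_policy(evidence: List[str], remaining: int) -> Tuple[str, str]:
    """Hand-written stand-in for the learned policy (illustrative only)."""
    if not evidence:
        return IMAGE_SEARCH, "query-image"
    if len(evidence) < 3:
        return TEXT_SEARCH, evidence[0]
    return STOP, ""


# Stub tools that return fixed results, standing in for real retrievers.
toy_tools: Dict[str, Tool] = {
    IMAGE_SEARCH: lambda q: ["Eiffel Tower"],
    TEXT_SEARCH: lambda q: ["Eiffel Tower", "Completed in 1889"],
}

evidence, used = run_agent(toy_policy, toy_tools, budget=4)
print(evidence, used)
```

With these stubs, the loop issues one image search and one text search, drops the duplicate result, and halts when the policy repeats a call, so it finishes under budget.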

Impact: Advances agent-based reasoning for multimodal tasks, with the potential to improve complex information retrieval systems.

Ranking rationale: A research paper was published detailing a new AI agent and its methodology.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.AI TIER_1 English(EN) · ZhengXian Wu, Hangrui Xu, Kai Shi, Zhuohong Chen, Yunyao Yu, Chuanrui Zhang, Zirui Liao, Jun Yang, Zhenyu Yang, Haonan Lu, Haoqian Wang · 2026-06-29 04:00

ProMSA: Progressive Multimodal Search Agents for Knowledge-Based Visual Question Answering

arXiv:2606.27974v1 Announce Type: cross Abstract: Knowledge-based Visual Question Answering (KB-VQA) requires models to combine image understanding with external knowledge. Most prior methods use a fixed retrieve-then-generate pipeline with a pre-selected retriever and a static t…
arXiv cs.AI TIER_1 English(EN) · Haoqian Wang · 2026-06-26 11:23

ProMSA: Progressive Multimodal Search Agents for Knowledge-Intensive Visual Question Answering

Knowledge-based Visual Question Answering (KB-VQA) requires models to combine image understanding with external knowledge. Most prior methods use a fixed retrieve-then-generate pipeline with a pre-selected retriever and a static top-k setting, which is not adaptive during reasoni…
Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-26 00:00

ProMSA: Progressive Multimodal Search Agents for Knowledge-Based Visual Question Answering

A progressive multimodal search agent for knowledge-based visual question answering that adaptively selects search strategies and optimizes through sequence-level reinforcement learning.

