New ODE framework boosts multimodal search agents, beats Gemini Pro

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Researchers have developed a new framework called On-policy Data Evolution (ODE) to improve multimodal deep search agents. This system allows agents to reuse intermediate visual information from search results and dynamically refines training data based on the agent's current learning progress. ODE enhances agent performance across various benchmarks, with significant improvements shown for Qwen3-VL models, surpassing Gemini-2.5 Pro in complex agent-workflow settings. AI

Summary written by gemini-2.5-flash-lite from 1 source. How we write summaries →

IMPACT Enhances multimodal search agent capabilities by enabling better data evolution and visual context reuse, potentially improving performance on complex tasks.

RANK_REASON The cluster contains a research paper detailing a new framework and its performance on benchmarks. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.CL →

COVERAGE [1]

arXiv cs.CL TIER_1 · Yi R. Fung · 2026-05-11 16:49

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

Multimodal deep search requires an agent to solve open-world problems by chaining search, tool use, and visual reasoning over evolving textual and visual context. Two bottlenecks limit current systems. First, existing tool-use harnesses treat images returned by search, browsing, …

COVERAGE [1]

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

RELATED ENTITIES

RELATED TOPICS