Researchers have developed On-policy Data Evolution (ODE), a framework for improving multimodal deep search agents. It lets agents reuse intermediate visual information from search results and dynamically refines the training data according to the agent's current learning progress. ODE improves agent performance across various benchmarks, with significant gains reported for Qwen3-VL models, which surpass Gemini-2.5 Pro in complex agent-workflow settings.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Enhances multimodal search agent capabilities by enabling better data evolution and visual context reuse, potentially improving performance on complex tasks.
RANK_REASON The cluster contains a research paper detailing a new framework and its performance on benchmarks.