Baidu's DuMate agent has taken the top ranking on two benchmarks, PinchBench and DeepResearch Bench. On PinchBench, which evaluates multi-step reasoning and tool use in real-world scenarios, DuMate secured the top two positions, ahead of models from Anthropic and OpenAI. The result is attributed to its end-to-end collaborative Harness architecture, which handles tasks locally or in the cloud and optimizes context assembly. DuMate also led DeepResearch Bench, a benchmark designed for complex research tasks, a result that reflects its information retrieval and analysis capabilities.
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Demonstrates advanced agent capabilities, potentially setting new standards for AI task execution and research.
RANK_REASON Product release and benchmark performance announcement for an AI agent.