Brief · PulseAugur

TOOL · The Decoder English(EN) · 2d

AI search agents often confirm what they already know instead of actually researching the web

New research indicates that popular AI search agents, including GPT-5.4 and Kimi K2.6, frequently fail to conduct genuine web research and instead confirm information already present in their training data. LiveBrowseComp, a new benchmark designed to test knowledge of recent events, revealed significant performance drops when models could not rely on memorized knowledge, and it reshuffled existing performance rankings.

IMPACT · Highlights limitations in current AI search capabilities, suggesting a need for models that can genuinely access and synthesize real-time information.

GPT-5.4
Kimi K2.6
Harbin Institute of Technology
LiveBrowseComp