AI search agents often confirm what they already know instead of actually researching the web
New research indicates that popular AI search agents, including GPT-5.4 and Kimi K2.6, frequently fail to conduct genuine web research and instead tend to confirm information already present in their training data. A new benchmark, LiveBrowseComp, is designed to test knowledge of recent events. It revealed significant performance drops when models could not rely on what they had already memorized, and the drops reshuffled existing performance rankings.
IMPACT: Highlights limitations in current AI search capabilities, suggesting a need for models that can genuinely access and synthesize real-time information.