Brief · PulseAugur

TOOL · arXiv cs.CL · English (EN) · 4d

LLM Agents Already Know When to Call Tools -- Even Without Reasoning

Researchers have introduced When2Tool, a benchmark for evaluating when large language model (LLM) agents should call external tools. The benchmark shows that LLMs internally encode whether a tool is needed, a signal detectable in their hidden states, but fail to act on it during generation. The proposed method, Probe&Prefill, uses this internal signal to significantly reduce unnecessary tool calls with minimal accuracy loss, outperforming existing baselines.
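The brief does not describe how Probe&Prefill works internally, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the probe is a linear (logistic) classifier over a hidden-state vector and that "prefill" means seeding the model's response with a short prefix reflecting the probe's decision. The names `probe_score`, `choose_prefill`, the prefix strings, the weights, and the toy vectors are all hypothetical.

```python
import math

# Hypothetical prefixes that would be placed at the start of the model's reply.
PREFILL_TOOL = "I need to call a tool for this."
PREFILL_DIRECT = "I can answer this directly."


def probe_score(hidden_state, weights, bias):
    """Logistic probe: estimated probability that a tool call is needed."""
    z = sum(h * w for h, w in zip(hidden_state, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def choose_prefill(hidden_state, weights, bias, threshold=0.5):
    """Pick the response prefix based on the probe's estimate."""
    p = probe_score(hidden_state, weights, bias)
    return PREFILL_TOOL if p >= threshold else PREFILL_DIRECT


# Toy probe parameters and hidden states (illustrative values only;
# a real probe would be trained on hidden states from the model).
weights = [2.0, -1.0, 0.5]
bias = -0.5
needs_tool = [1.0, 0.0, 1.0]  # z = 2.0, probe leans toward a tool call
no_tool = [0.0, 1.0, 0.0]     # z = -1.5, probe leans toward a direct answer

print(choose_prefill(needs_tool, weights, bias))
print(choose_prefill(no_tool, weights, bias))
```

In this sketch the threshold controls the trade-off the brief mentions: raising it suppresses more tool calls at the risk of skipping ones that were needed.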

IMPACT Improves LLM agent efficiency by reducing unnecessary tool calls, potentially lowering costs and latency for AI applications.

LLM Agents
Probe&Prefill
When2Tool
Chung En Sun