Internalizing Tool Knowledge in Small Language Models via QLoRA Fine-Tuning
Researchers have developed a method that uses QLoRA fine-tuning to internalize tool knowledge in small language models, reducing the need to include explicit tool schemas in prompts. After training models such as Gemma 4 E4B and Qwen3-4B on tool-use examples, they achieved better planning scores than a baseline that received full tool descriptions. The approach significantly reduces input length and inference overhead while maintaining or improving tool-planning quality, though it may come at some cost to general knowledge retention.
AI IMPACT: Enables more efficient use of smaller models in agentic systems by reducing prompt token overhead.
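To make the prompt-overhead point concrete, the following is a minimal sketch, not the researchers' code. It compares a planning prompt that carries tool schemas with one that omits them, as a fine-tuned model would allow. The tool definitions, the `build_prompt` helper, and the whitespace-based token count are all assumptions made for illustration; a real measurement would use the model's own tokenizer.

```python
import json

# Hypothetical tool schemas, for illustration only (not from the source).
TOOLS = [
    {
        "name": "search_web",
        "description": "Search the web and return the top results.",
        "parameters": {"query": "string", "max_results": "integer"},
    },
    {
        "name": "read_file",
        "description": "Read a text file from disk.",
        "parameters": {"path": "string"},
    },
]


def build_prompt(task, tools=None):
    """Return a planning prompt, optionally prefixed with tool schemas."""
    parts = []
    if tools:
        parts.append("Available tools:\n" + json.dumps(tools, indent=2))
    parts.append("Task: " + task)
    return "\n\n".join(parts)


def rough_token_count(text):
    """Whitespace split as a crude stand-in for a real tokenizer (assumption)."""
    return len(text.split())


task = "Find the latest release notes and summarize them."
with_schemas = build_prompt(task, TOOLS)   # baseline: schemas in the prompt
without_schemas = build_prompt(task)       # fine-tuned model: task only

saved = rough_token_count(with_schemas) - rough_token_count(without_schemas)
print(f"Rough tokens saved per request: {saved}")
```

The saving grows with the number and verbosity of tool schemas and is paid on every request, which is why moving that knowledge into the model weights can reduce inference overhead.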