Brief · PulseAugur

RESEARCH · arXiv cs.IR (Information Retrieval) · 3d · [2 sources]

HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval

Researchers have developed HARNESS-LM (HLM), a three-phase training framework that transfers the capabilities of large language models into compact, efficient models for sponsored search retrieval. The recipe first trains a high-performance "teacher" model, then distills its knowledge into a smaller "student" encoder, and finally refines the student for retrieval. According to the researchers, HLM recovers over 98% of the teacher model's precision while significantly reducing latency and increasing throughput, and A/B tests on Bing Ads support its practical value.
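The brief does not state which distillation objective HLM uses. As a rough illustration of the teacher-to-student phase only, the sketch below assumes one common choice: training the student so that its query-ad cosine similarities match the teacher's. The function names and the mean-squared-error objective are hypothetical and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_matrix(queries, ads):
    """Cosine similarity of every query embedding against every ad embedding."""
    q = [normalize(v) for v in queries]
    a = [normalize(v) for v in ads]
    return [[sum(x * y for x, y in zip(qi, aj)) for aj in a] for qi in q]


def similarity_distillation_loss(teacher_q, teacher_a, student_q, student_a):
    """Mean squared error between teacher and student similarity matrices.

    Assumed objective for illustration; HLM's actual loss is not given here.
    """
    t = cosine_matrix(teacher_q, teacher_a)
    s = cosine_matrix(student_q, student_a)
    diffs = [(ti - si) ** 2 for tr, sr in zip(t, s) for ti, si in zip(tr, sr)]
    return sum(diffs) / len(diffs)
```

Because only the similarity matrices are compared, the student's embeddings may have a smaller dimension than the teacher's, which is what makes this kind of objective a natural fit for compressing a large encoder into a compact one.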

IMPACT Lets compact models distilled from large language models serve latency-sensitive applications such as sponsored search, cutting latency and raising throughput while retaining most of the teacher's precision.

Bing Ads
HARNESS-LM
NVIDIA A100 GPUs
Qwen3-Embedding-4B/8B
Small Language Models