HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval
Researchers have developed HARNESS-LM (HLM), a three-phase training framework that transfers the capabilities of large language models into compact, efficient models for sponsored search retrieval. The recipe trains a high-performance "teacher" model, distills its knowledge into a smaller "student" encoder, and then refines the student for retrieval. HLM recovers over 98% of the teacher model's precision while significantly reducing latency and increasing throughput, and A/B tests on Bing Ads demonstrate its practical efficacy.
AI IMPACT: Enables the deployment of powerful language models in latency-sensitive applications, improving efficiency and performance in areas like sponsored search.
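
The distillation phase described above can be illustrated with a small sketch. The summary does not say which objective HLM uses, so this assumes one common choice for embedding distillation: push each student embedding toward the teacher embedding of the same text by minimizing one minus their cosine similarity. The function names `cosine` and `distill_loss` are hypothetical, not taken from the work.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def distill_loss(student_vecs, teacher_vecs):
    """Mean (1 - cosine) over matching student/teacher embedding pairs.

    Assumed objective for illustration only: 0 when the student
    reproduces the teacher's direction, 1 when they are orthogonal.
    """
    losses = [1.0 - cosine(s, t) for s, t in zip(student_vecs, teacher_vecs)]
    return sum(losses) / len(losses)


# Toy embeddings: the first pair is aligned, the second is orthogonal.
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[2.0, 0.0], [1.0, 0.0]]
loss = distill_loss(student, teacher)
```

In a real training loop the vectors would come from the teacher and student encoders, and the loss would be minimized with respect to the student's parameters only, with the teacher held fixed.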