Wisdom of Committee: Diverse Distillation from Large Foundation Models and Domain Experts
Researchers have developed DiverseDistill, a framework for distilling knowledge from large foundation models into smaller, domain-specific models. The method draws on a committee of diverse teachers, comprising the foundation model and domain experts, and generates teacher-conditioned queries. Heterogeneous teacher outputs are aligned into the student's representation space, which reportedly recovers a substantial portion of the performance gap between the student and teacher models. The teachers remain frozen, the framework adds no inference overhead, and a dynamic teacher importance mechanism reduces training costs.
IMPACT This research could make it more efficient to bring the capabilities of large AI models to specialized, resource-constrained applications.
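The summary does not give DiverseDistill's actual loss or architecture, so the following is only a minimal sketch of the general idea it describes: frozen teachers with differently sized outputs are projected into the student's representation space, and a per-teacher importance weight scales each teacher's contribution. Every name here (`committee_loss`, `project`, `importance_logits`) is hypothetical, and both the mean-squared-error alignment and the softmax weighting are assumptions chosen for illustration, not details taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw importance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(matrix, vector):
    """Linearly map a teacher output into the student's dimension.

    matrix has one row per student dimension and one column per
    teacher dimension (an assumed, illustrative alignment layer).
    """
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def committee_loss(student_repr, teacher_outputs, projections, importance_logits):
    """Importance-weighted alignment loss over a committee of frozen teachers.

    teacher_outputs are treated as constants: nothing here updates them.
    Returns the scalar loss and the per-teacher weights.
    """
    weights = softmax(importance_logits)
    loss = 0.0
    for output, matrix, weight in zip(teacher_outputs, projections, weights):
        aligned = project(matrix, output)  # now in student space
        loss += weight * mse(student_repr, aligned)
    return loss, weights


# Hypothetical committee: a 3-dim foundation model and a 2-dim domain expert.
student = [0.5, -0.5]
teachers = [[1.0, 0.0, 1.0], [0.5, -0.5]]
projections = [
    [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]],  # 3-dim teacher -> 2-dim student space
    [[1.0, 0.0], [0.0, 1.0]],            # identity for the 2-dim teacher
]
loss, weights = committee_loss(student, teachers, projections, [0.0, 0.0])
print(loss, weights)  # 0.0625 [0.5, 0.5]
```

In a real training setup the projection matrices and importance scores would presumably be learned alongside the student, while the teacher outputs are computed without gradients; this sketch fixes them by hand to keep the arithmetic easy to follow.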