Cross-Modal Knowledge Distillation without Paired Data: Theoretical Foundation and Algorithm
Researchers have proposed a framework for cross-modal knowledge distillation (CMKD) that does not require paired data. The framework establishes a distributional relationship between the teacher and student models and identifies feature alignment and label alignment as key to effective distillation. Because it aligns distributions rather than individual samples, it comes with a theoretical guarantee of effective knowledge transfer, and it is reported to yield significant improvements in both paired and unpaired data scenarios across various benchmarks.
AI IMPACT: Enables more efficient training of smaller models from larger ones, even when aligned data is scarce.
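
To illustrate what aligning distributions rather than individual samples can look like, here is a minimal Python sketch. It is not the authors' algorithm. It assumes NumPy, uses a biased maximum mean discrepancy (MMD) estimate with an RBF kernel as a stand-in for feature alignment, and uses a KL divergence between batch-averaged class probabilities as a stand-in for label alignment. All function names, the kernel bandwidth, and the weight `lam` are hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between the rows of x and the rows of y."""
    # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two unpaired samples."""
    return float(
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def label_marginal_kl(teacher_logits, student_logits, eps=1e-12):
    """KL divergence between batch-averaged class distributions."""
    p = softmax(teacher_logits).mean(axis=0)  # teacher label marginal
    q = softmax(student_logits).mean(axis=0)  # student label marginal
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def unpaired_alignment_loss(t_feat, s_feat, t_logits, s_logits, lam=1.0):
    """Feature-alignment term plus a weighted label-alignment term.

    Assumption for this sketch: the two terms are simply added with weight lam.
    """
    return mmd2(t_feat, s_feat) + lam * label_marginal_kl(t_logits, s_logits)


# Usage: the teacher and student batches come from different modalities,
# have different sizes, and share no sample-level correspondence.
rng = np.random.default_rng(0)
teacher_features = rng.normal(size=(64, 2))
student_features = rng.normal(loc=0.5, size=(48, 2))
teacher_logits = rng.normal(size=(64, 5))
student_logits = rng.normal(size=(48, 5))

loss = unpaired_alignment_loss(
    teacher_features, student_features, teacher_logits, student_logits
)
print(f"unpaired alignment loss: {loss:.4f}")
```

Both terms depend only on batch-level statistics, which is why no teacher sample has to be matched to a specific student sample. A practical system would compute such a loss inside an automatic-differentiation framework so that the student can be trained against it.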