Routing by Analogy: kNN-Augmented Expert Assignment for Mixture-of-Experts
Researchers have developed kNN-MoE, a new routing mechanism for Mixture-of-Experts (MoE) models. The approach keeps a memory of past routing decisions and uses it to assign tokens to experts dynamically, which improves robustness under distribution shift. For each token, the system retrieves the k nearest neighbors among similar past cases and uses their similarity as a confidence score when mixing expert assignments. Experiments indicate that kNN-MoE outperforms standard zero-shot methods and is comparable to supervised fine-tuning.
AI IMPACT: Enhances efficiency and robustness of MoE models by improving token routing mechanisms.
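The retrieve-then-mix idea in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `knn_route`, the use of cosine similarity, the mean-similarity confidence score, and the linear interpolation with the router's softmax output are all assumptions made for illustration, since the summary does not specify these details.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def knn_route(h, router_w, mem_keys, mem_routes, k=4):
    """Hypothetical kNN-augmented routing for one token.

    h          : (d,) token hidden state
    router_w   : (d, n_experts) weights of the ordinary learned router
    mem_keys   : (m, d) hidden states stored from past cases
    mem_routes : (m, n_experts) expert distributions stored with those cases
    Returns (mixed expert distribution, confidence in [0, 1]).
    """
    # Ordinary router prediction, used as the fallback.
    p_router = softmax(h @ router_w)

    # Cosine similarity between the token and every memory key (assumption).
    h_unit = h / (np.linalg.norm(h) + 1e-8)
    keys_unit = mem_keys / (np.linalg.norm(mem_keys, axis=1, keepdims=True) + 1e-8)
    sims = keys_unit @ h_unit

    # Keep the k most similar past cases; negative similarities count as zero.
    k = min(k, sims.shape[0])
    idx = np.argsort(-sims)[:k]
    top = np.clip(sims[idx], 0.0, 1.0)

    if top.sum() == 0.0:
        # Nothing relevant in memory: rely entirely on the router.
        return p_router, 0.0

    # Similarity-weighted average of the neighbors' stored assignments.
    p_mem = (top[:, None] * mem_routes[idx]).sum(axis=0) / top.sum()

    # Mean neighbor similarity acts as the mixing coefficient (assumption).
    conf = float(top.mean())
    return (1.0 - conf) * p_router + conf * p_mem, conf

# Toy usage: three experts, two remembered cases.
mem_keys = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
mem_routes = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
router_w = np.zeros((3, 3))  # uninformative router, uniform output
probs, conf = knn_route(np.array([1.0, 0.0, 0.0]), router_w, mem_keys, mem_routes, k=1)
print(probs, conf)
```

In this sketch, a token that closely matches a stored case inherits most of that case's expert assignment, while a token with no similar neighbors falls back to the plain router output.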