STAR: Rethinking MoE Routing as Structure-Aware Subspace Learning
Researchers have introduced STAR, an approach to Mixture-of-Experts (MoE) routing that treats routing as a structure-aware subspace learning problem. Whereas traditional MoE routers rely on limited linear projections, STAR maintains an evolving principal subspace that tracks the dominant structure of the inputs, with the aim of making routing more stable and experts more specialized. The method is reported to improve performance on language and vision tasks, and optional test-time subspace updates may add further robustness.
AI IMPACT: Improves routing stability and performance in MoE models, potentially leading to more efficient and capable AI systems.
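
The summary gives no equations, so the following is only a minimal sketch of the general idea it describes: a router that scores experts from the raw input plus the input's coordinates in a tracked principal subspace. It is not the STAR authors' implementation. The Oja-style update with Gram-Schmidt re-orthonormalization is a stand-in technique chosen for illustration, and every name here (update_subspace, route, w_linear, w_subspace, lr, top_k) is hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def orthonormalize(rows):
    """Gram-Schmidt: return orthonormal rows spanning the same subspace."""
    out = []
    for r in rows:
        v = list(r)
        for u in out:
            c = dot(v, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(dot(v, v))
        out.append([vi / norm for vi in v])
    return out


def update_subspace(basis, x, lr=0.05):
    """One Oja-style step (an assumption, not the paper's rule):
    pull each basis row toward x in proportion to its response,
    then re-orthonormalize."""
    y = [dot(w, x) for w in basis]
    moved = [[wi + lr * yi * xi for wi, xi in zip(w, x)]
             for w, yi in zip(basis, y)]
    return orthonormalize(moved)


def route(x, w_linear, w_subspace, basis, top_k=2):
    """Score experts from x and from its subspace coordinates,
    apply a softmax, and keep the top_k experts with renormalized weights."""
    z = [dot(u, x) for u in basis]  # coordinates of x in the tracked subspace
    logits = [dot(wl, x) + dot(ws, z) for wl, ws in zip(w_linear, w_subspace)]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    chosen = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


if __name__ == "__main__":
    rng = random.Random(0)
    dim, rank, experts = 4, 2, 3
    basis = orthonormalize(
        [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(rank)])
    w_linear = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(experts)]
    w_subspace = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(experts)]
    for _ in range(200):
        # Synthetic inputs whose variance is concentrated in the first coordinate.
        x = [rng.gauss(0, 3.0)] + [rng.gauss(0, 0.1) for _ in range(dim - 1)]
        basis = update_subspace(basis, x)
    print(route(x, w_linear, w_subspace, basis))
```

In this sketch, calling update_subspace during inference as well as training would correspond loosely to the optional test-time subspace updates mentioned in the summary. How STAR actually combines the subspace with the router is not specified here.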