HASTE: Hardware-Aware Dynamic Sparse Training for Large Output Spaces
Researchers have developed HASTE, a dynamic sparse training method for extreme multi-label classification (XMC) models. HASTE addresses the bottleneck in XMC with group-shared fixed fan-in sparsity, in which semantically related labels share the same sparse input pattern. The shared patterns improve hardware utilization and allow efficient GPU execution through custom CUDA kernels, yielding significant speedups in the forward and backward passes over existing sparse methods.
AI IMPACT: Introduces a new technique to improve efficiency and performance in extreme multi-label classification tasks.
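
The idea of group-shared fixed fan-in sparsity can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the HASTE implementation: the function names, the sizes, and the random choice of input positions are all hypothetical, and the summary does not say how labels are grouped or how the CUDA kernels are written. The sketch only shows that when every label in a group reads the same k input positions, the inputs are gathered once per group and each label then needs just a small dense dot product.

```python
import random


def group_shared_forward(x, group_indices, group_weights):
    """Compute logits for one input vector x (hypothetical helper).

    group_indices[g]    : the k input positions shared by every label in group g
    group_weights[g][j] : the k weights of label j in group g (fan-in is exactly k)
    """
    logits = []
    for idx, weights in zip(group_indices, group_weights):
        # Gather the shared inputs once; every label in the group reuses them.
        gathered = [x[i] for i in idx]
        for w in weights:
            logits.append(sum(wi * xi for wi, xi in zip(w, gathered)))
    return logits


def dense_reference(x, group_indices, group_weights):
    """Same computation with each label expanded to a full dense weight row."""
    d = len(x)
    logits = []
    for idx, weights in zip(group_indices, group_weights):
        for w in weights:
            row = [0.0] * d
            for i, wi in zip(idx, w):
                row[i] = wi
            logits.append(sum(r * xi for r, xi in zip(row, x)))
    return logits


# Assumed toy sizes: d inputs, G label groups, m labels per group, fan-in k.
random.seed(0)
d, G, m, k = 32, 4, 5, 3
group_indices = [random.sample(range(d), k) for _ in range(G)]
group_weights = [
    [[random.gauss(0, 1) for _ in range(k)] for _ in range(m)]
    for _ in range(G)
]
x = [random.gauss(0, 1) for _ in range(d)]

sparse_out = group_shared_forward(x, group_indices, group_weights)
dense_out = dense_reference(x, group_indices, group_weights)
```

In this toy setup the layer stores G * m * k weights rather than G * m * d, and the two functions return the same logits up to floating-point rounding. Because the gather is shared within a group, the per-group work looks like a small dense matrix product, which is the kind of regular access pattern that tends to map well onto GPUs.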