Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees
Researchers have developed a method to distill large tabular foundation models (TFMs) into smaller, faster gradient-boosted tree models that run on CPUs. The technique addresses the inference latency of TFMs, which are too slow for real-time applications such as fraud scoring. Teacher labels are produced with stratified out-of-fold labeling, so each training row is labeled by a teacher that was not fit on that row, which prevents label leakage. The distilled models perform close to the original TFMs with significantly lower inference times.
IMPACT Enables real-time AI applications by significantly reducing inference latency for complex tabular models.