Researchers distill large AI models into faster CPU-ready gradient-boosted trees

By PulseAugur Editorial · [2 sources] · 2026-05-18 17:00

Researchers have developed a method to distill large tabular foundation models (TFMs) into small gradient-boosted tree models (XGBoost or CatBoost) that run natively on CPU. The technique targets TFM latency: the best TFMs take 151-1,275 ms on GPU, while a real-time application such as fraud scoring needs an answer in under 2 ms. The TFM is distilled offline, using stratified out-of-fold teacher labeling to prevent label leakage, and the distilled students achieve performance close to the original TFMs at a fraction of the inference time.
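The out-of-fold labeling step can be illustrated with a minimal sketch. This is not the paper's implementation: the teacher here is a toy k-nearest-neighbour scorer standing in for a TFM, and every name (`make_toy_data`, `stratified_folds`, `toy_teacher`, `out_of_fold_soft_labels`) is hypothetical. The idea it shows is that a model which predicts from a labeled context can simply echo a row's own label if that row sits in the context, so each row is scored by a teacher whose context excludes that row's fold.

```python
import random
from collections import defaultdict


def make_toy_data(n=60, seed=0):
    """Synthetic binary task with noisy labels (illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1), rng.gauss(0, 1)]
        X.append(row)
        y.append(1 if row[0] + row[1] + rng.gauss(0, 0.5) > 0 else 0)
    return X, y


def stratified_folds(labels, n_folds, seed=0):
    """Assign each row to a fold so class counts per fold differ by at most 1."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [0] * len(labels)
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            fold_of[idx] = pos % n_folds
    return fold_of


def toy_teacher(context_X, context_y, query_x, k=3):
    """Hypothetical stand-in for a TFM: P(y=1) from the k nearest context rows."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(cx, query_x)), cy)
        for cx, cy in zip(context_X, context_y)
    )
    nearest = scored[:k]
    return sum(cy for _, cy in nearest) / len(nearest)


def out_of_fold_soft_labels(X, y, n_folds=5, k=3, seed=0):
    """Score every row with a teacher whose context excludes the row's own fold."""
    fold_of = stratified_folds(y, n_folds, seed)
    soft = [None] * len(X)
    for fold in range(n_folds):
        context = [i for i in range(len(X)) if fold_of[i] != fold]
        context_X = [X[i] for i in context]
        context_y = [y[i] for i in context]
        for i in range(len(X)):
            if fold_of[i] == fold:
                soft[i] = toy_teacher(context_X, context_y, X[i], k)
    return soft


X, y = make_toy_data()
soft_labels = out_of_fold_soft_labels(X, y)

# Leaky variant for contrast: each row is in its own context, so with k=1
# the toy teacher finds the row itself and returns its training label.
leaky_labels = [toy_teacher(X, y, row, k=1) for row in X]
```

In a full pipeline, `soft_labels` would presumably serve as the training targets for the XGBoost or CatBoost student; that step is omitted here. The `leaky_labels` list shows the failure mode being avoided: labels copied from the training set carry no information from the teacher.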

IMPACT Enables real-time AI applications by significantly reducing inference latency for complex tabular models.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Pratinav Seth · 2026-05-18 17:00

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

A fraud scorer needs to answer in under 2 ms. The best tabular foundation models (TFMs) take 151-1,275 ms on GPU. We close this gap by distilling the TFM offline into an XGBoost or CatBoost student that runs natively on CPU. The central obstacle is specific to in-context learning…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-18 17:00

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees
