SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling
Researchers have developed SCALE, a deep reinforcement learning scheduler for agentic LLM systems that manages tasks across heterogeneous clusters of varying sizes. Unlike previous schedulers that require retraining for different cluster configurations, SCALE uses a cross-attention pointer network to generalize to unseen cluster scales without fine-tuning. By incorporating Structured Representation Regularization (SRR), which combines a decorrelation loss with a KL penalty, SCALE keeps feature statistics stable and achieves an 8.9% reduction in average response time when tested on clusters larger than those it was trained on.
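To make the two ideas above concrete, here is a minimal NumPy sketch of why a cross-attention pointer can score any number of nodes with the same weights, and of what a decorrelation loss and a KL penalty on features might look like. The summary does not give SCALE's actual formulas, so everything here is an assumption for illustration: the function names, the dimensions, the scaled dot-product scoring, the correlation-based decorrelation term, and the choice of a standard normal as the KL target are all hypothetical and may differ from the paper.

```python
import numpy as np

def pointer_scores(task_query, node_feats, W_q, W_k):
    """Cross-attention pointer (assumed form): one task query attends over
    N node embeddings and returns a probability distribution of length N.
    The weights do not depend on N, so N can change at test time."""
    q = task_query @ W_q                    # (d,)
    k = node_feats @ W_k                    # (N, d)
    logits = k @ q / np.sqrt(q.shape[0])    # scaled dot-product, (N,)
    logits = logits - logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def decorrelation_loss(feats):
    """Assumed form: mean squared off-diagonal entry of the feature
    correlation matrix, computed over a batch of shape (B, d)."""
    z = (feats - feats.mean(0)) / (feats.std(0) + 1e-8)
    corr = z.T @ z / feats.shape[0]
    off_diag = corr - np.diag(np.diag(corr))
    d = feats.shape[1]
    return (off_diag ** 2).sum() / (d * (d - 1))

def kl_to_standard_normal(feats):
    """Assumed form: KL(N(mu, diag(var)) || N(0, I)) of the batch feature
    statistics, averaged over feature dimensions."""
    mu = feats.mean(0)
    var = feats.var(0) + 1e-8
    return 0.5 * np.mean(var + mu ** 2 - 1.0 - np.log(var))

# Hypothetical sizes; random weights stand in for a trained policy.
rng = np.random.default_rng(0)
d_task, d_node, d = 6, 5, 8
W_q = rng.normal(size=(d_task, d))
W_k = rng.normal(size=(d_node, d))
task = rng.normal(size=d_task)

# The same weights produce a valid distribution for 4 nodes or 32 nodes.
for n_nodes in (4, 32):
    nodes = rng.normal(size=(n_nodes, d_node))
    probs = pointer_scores(task, nodes, W_q, W_k)
    print(n_nodes, probs.shape, int(probs.argmax()))
```

The point of the sketch is structural: the pointer's output length follows the number of nodes it is given, which is what allows evaluation on cluster sizes not seen in training, while the two regularizers penalize features that become correlated or whose mean and variance drift.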
IMPACT: This scheduling method could improve the efficiency of LLM-based agentic systems by letting them adapt to varying computational resources without retraining.