Brief · PulseAugur

TOOL · arXiv cs.AI · English (EN) · 9h

Online LLM Selection via Constrained Bandits with Time-Varying Demand

Researchers have developed an online learning algorithm for selecting the best Large Language Model (LLM) for each user task in edge-cloud inference systems. The algorithm handles time-varying task demand and operates under resource constraints such as monetary budgets and latency guarantees. By combining confidence bounds with demand predictions, it aims to maximize reward while satisfying the constraints over the long run. The authors provide theoretical guarantees of sublinear regret and constraint violation, measured against offline methods.
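To make the idea of a constrained bandit concrete, here is a minimal Python sketch of a generic primal-dual UCB selector. It is not the paper's algorithm: the function name, the Bernoulli reward and cost models, the `budget_per_round` parameter, and the step size are all assumptions made for illustration, and the sketch leaves out demand prediction entirely. Each "arm" stands in for one candidate LLM, and a dual variable raises the price of cost whenever recent spending runs above the budget.

```python
import math
import random


def run_constrained_ucb(mean_reward, mean_cost, budget_per_round,
                        horizon, step_size=0.05, seed=0):
    """Generic primal-dual UCB sketch (hypothetical, not the paper's method).

    mean_reward / mean_cost: assumed true Bernoulli means per arm (per LLM),
    used here only to simulate feedback.
    """
    rng = random.Random(seed)
    n_arms = len(mean_reward)
    counts = [0] * n_arms
    reward_sum = [0.0] * n_arms
    cost_sum = [0.0] * n_arms
    dual = 0.0  # price on cost; grows while spending exceeds the budget
    total_reward = 0.0
    total_cost = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialise the estimates
        else:
            def score(a):
                bonus = math.sqrt(2.0 * math.log(t) / counts[a])
                # Optimistic reward (upper bound) and optimistic cost (lower bound).
                ucb_reward = min(1.0, reward_sum[a] / counts[a] + bonus)
                lcb_cost = max(0.0, cost_sum[a] / counts[a] - bonus)
                return ucb_reward - dual * lcb_cost

            arm = max(range(n_arms), key=score)

        # Simulated feedback: Bernoulli reward and Bernoulli cost.
        reward = 1.0 if rng.random() < mean_reward[arm] else 0.0
        cost = 1.0 if rng.random() < mean_cost[arm] else 0.0

        counts[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        total_reward += reward
        total_cost += cost

        # Projected dual ascent on the observed constraint violation.
        dual = max(0.0, dual + step_size * (cost - budget_per_round))

    return {
        "counts": counts,
        "avg_reward": total_reward / horizon,
        "avg_cost": total_cost / horizon,
        "dual": dual,
    }
```

For example, `run_constrained_ucb([0.9, 0.5], [0.8, 0.2], 0.5, 2000)` simulates two hypothetical models, one stronger but costlier, under an average cost budget of 0.5 per round. Because the dual variable penalizes the expensive arm once spending overshoots, the selector ends up mixing the two arms rather than always choosing the higher-reward one.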

Hugging Face
arXiv
large-language models
machine learning
Constrained Bandits
Time-Varying Demand