New algorithm uses user retrials to optimize LLM routing and scheduling

By PulseAugur Editorial · [1 source] · 2026-06-30 04:00

Researchers have developed ACQB (anytime CQB), an algorithm for routing and scheduling queries to Large Language Models (LLMs). Instead of relying on explicit ratings, the algorithm uses implicit feedback from user retrials to learn user preferences and decide which LLM should handle each query. ACQB aims to keep server queues stable while reducing cumulative regret in conversational LLM services. The authors report promising results in experiments on synthetic data, offline datasets, and real user logs.
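To make the idea of learning from retrials concrete, here is a minimal toy sketch. It is not ACQB and is not taken from the paper: it swaps in a plain, context-free UCB1 bandit for routing and first-in-first-out order for scheduling, and it treats "the user did not retry" as reward 1 and "the user retried" as reward 0. The names RetrialRouter, simulate, accept_prob, and arrival_prob are hypothetical and exist only for this illustration.

```python
import math
import random
from collections import deque


class RetrialRouter:
    """Toy UCB1 router (an assumption, not ACQB): no retrial = reward 1, retrial = reward 0."""

    def __init__(self, n_models):
        self.counts = [0] * n_models     # how often each model served a query
        self.successes = [0] * n_models  # responses accepted without a retrial

    def mean(self, model):
        # Empirical acceptance rate; 0.0 for a model that was never tried.
        if self.counts[model] == 0:
            return 0.0
        return self.successes[model] / self.counts[model]

    def choose(self):
        # Try every model once before trusting any estimate.
        for model, count in enumerate(self.counts):
            if count == 0:
                return model
        total = sum(self.counts)

        def ucb(model):
            bonus = math.sqrt(2 * math.log(total) / self.counts[model])
            return self.mean(model) + bonus

        return max(range(len(self.counts)), key=ucb)

    def update(self, model, retried):
        # Implicit feedback only: we observe whether the user retried.
        self.counts[model] += 1
        if not retried:
            self.successes[model] += 1


def simulate(accept_prob, n_steps=3000, arrival_prob=0.3, seed=0):
    """Serve at most one queued query per step; a retried query rejoins the queue."""
    rng = random.Random(seed)
    router = RetrialRouter(len(accept_prob))
    queue = deque()
    for _ in range(n_steps):
        if rng.random() < arrival_prob:
            queue.append("query")          # a new query arrives
        if not queue:
            continue
        query = queue.popleft()            # FIFO scheduling (assumption)
        model = router.choose()            # routing decision
        retried = rng.random() >= accept_prob[model]
        router.update(model, retried)
        if retried:
            queue.append(query)            # retrial keeps the query in the system
    return router, len(queue)
```

Calling simulate([0.9, 0.3, 0.2]) returns the trained router and the remaining backlog. In this toy setup, routing to models that trigger fewer retrials also drains the queue faster, which is the informal link between preference learning and queue stability that the summary describes.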

IMPACT This research could lead to more efficient and stable LLM services by optimizing query handling and reducing user wait times.

RANK_REASON The cluster contains a research paper detailing a new algorithm for LLM routing and scheduling.

paper
infra

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.LG TIER_1 English(EN) · Seoungbin Bae, Junyoung Son, Dabeen Lee · 2026-06-30 04:00

Learning to Route and Schedule LLMs from User Retrials via Contextual Queueing Bandits

arXiv:2602.02061v2 Announce Type: replace Abstract: Explosive demands for LLMs often cause user queries to accumulate in server queues, requiring efficient routing (query-LLM matching) and scheduling (query prioritization) mechanisms. Several online algorithms are being deployed,…
