Online Learning for Supervisory Switching Control
Researchers have developed a novel algorithm for supervisory switching control in partially-observed linear dynamical systems. This data-driven approach adapts multi-armed bandit algorithms to a control setting, with the goal of identifying and deploying the correct controller from a pool of candidates. The algorithm comes with finite-time guarantees: it identifies the appropriate controller within $O(N \log^2 N)$ steps while simultaneously achieving finite $L_2$-gain.
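To make the idea of picking a controller from a candidate pool concrete, here is a minimal sketch. It is not the researchers' algorithm. It assumes a scalar, fully observed plant $x_{t+1} = a x_t + b u_t + w_t$ rather than a partially-observed one, and it replaces the bandit-style statistic with a simple growth test: each candidate gain is applied for a short episode and discarded if the state magnitude grows by more than a margin. All names and parameter values (`run_episode`, `eliminate_candidates`, `margin`, the plant constants, the gains) are hypothetical and chosen only for illustration.

```python
import random


def run_episode(x, gain, a, b, steps, noise_std, rng):
    """Simulate x <- a*x + b*u + w with u = -gain*x; return the final state."""
    for _ in range(steps):
        u = -gain * x
        x = a * x + b * u + rng.gauss(0.0, noise_std)
    return x


def eliminate_candidates(gains, a=1.2, b=1.0, steps=5, margin=0.05,
                         noise_std=0.001, max_rounds=20, seed=0):
    """Return indices of candidate gains that survive the growth test.

    Assumption (illustrative only): a candidate is discarded when the state
    magnitude at the end of its episode exceeds the magnitude at the start
    by more than `margin`.
    """
    rng = random.Random(seed)
    x = 1.0                              # state carries over between episodes
    active = list(range(len(gains)))     # candidates not yet discarded
    for _ in range(max_rounds):
        if len(active) <= 1:
            break
        for i in list(active):           # iterate over a copy while removing
            x_start = x
            x = run_episode(x, gains[i], a, b, steps, noise_std, rng)
            if abs(x) > abs(x_start) + margin:
                active.remove(i)
    return active


# Hypothetical pool: closed-loop poles a - b*K are 1.2, 1.1, 0.5 and -1.3,
# so only the gain 0.7 stabilizes this plant.
gains = [0.0, 0.1, 0.7, 2.5]
surviving = eliminate_candidates(gains)
print(surviving)
```

In this toy setting the state keeps evolving while candidates are being tested, which is why a switching scheme has to care about both how fast it discards bad controllers and how large the state grows in the meantime. The finite $L_2$-gain guarantee mentioned above addresses the latter.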