Brief

last 24h

[4/4] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Online Convex Optimization with Sublinear Noisy Probes

Researchers propose a framework for Online Convex Optimization (OCO) that improves worst-case regret using only a limited, noisy budget of pairwise probes. The method unifies sublinear best-expert queries and pairwise feedback, showing that a sublinear, noisy probe budget provably reduces regret in the full-feedback OCO regime. The analysis quantifies the benefit of probing through variance reduction and a second-order analysis of Continuous Exponential Weights, yielding tight regret guarantees.
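
For readers unfamiliar with exponential weights, the following is a minimal sketch of the classical finite-expert version (often called Hedge). It is not the paper's method: it has no probes, uses a finite expert set rather than a continuous domain, and the function name `hedge`, the learning rate `eta`, and the loss values are assumptions chosen for illustration only.

```python
import math


def hedge(loss_rounds, eta):
    """Run exponential weights over a finite set of experts.

    loss_rounds: list of rounds, each a list of per-expert losses in [0, 1].
    eta: learning rate (an illustrative assumption, not taken from the paper).
    Returns (learner's expected cumulative loss, best expert's cumulative loss).
    """
    n = len(loss_rounds[0])
    log_w = [0.0] * n          # log-weights, kept in log space for stability
    cum = [0.0] * n            # cumulative loss of each expert
    learner_loss = 0.0
    for losses in loss_rounds:
        # Normalize weights into a probability distribution over experts.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        total = sum(w)
        probs = [x / total for x in w]
        # The learner suffers the expected loss under that distribution.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        # Multiplicative update: penalize each expert by exp(-eta * loss).
        for i, l in enumerate(losses):
            log_w[i] -= eta * l
            cum[i] += l
    return learner_loss, min(cum)
```

Regret here is the learner's cumulative loss minus that of the best single expert in hindsight. With `eta = sqrt(8 ln(n) / T)`, the standard analysis bounds it by `sqrt(T ln(n) / 2)` for losses in [0, 1].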
RESEARCH · arXiv stat.ML English(EN) · 3d · [2 sources]

Optimal Hidden-Target Learning for Online Inventory Optimization on General Convex Sets

Researchers have developed a principle for online inventory optimization (OIO) that achieves optimal performance on general convex sets. The method maintains a hidden target and projects it onto the feasible order-up-to set, improving regret guarantees for OIO and giving the first polylogarithmic regret for strongly convex losses. The analysis introduces a 'norm alignment' principle that reduces the problem to one-dimensional queue control. The approach has been validated through experiments on both synthetic and real-world inventory data.

IMPACT This research advances theoretical understanding in online learning and optimization, potentially impacting future inventory management systems.
- Online Convex Optimization
- Online Inventory Optimization
RESEARCH · arXiv cs.AI English(EN) · 2w · [2 sources]

Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

Two new research papers explore advanced methods for improving AI agent decision-making and learning. The first paper, "Trivium," introduces temporal regret as a key objective for causal-memory controllers, aiming to log and correct errors more effectively than outcome-based methods. The second paper, "Parameter-free Dynamic Regret," presents a novel algorithm for online convex optimization that handles time-varying movement costs, delayed feedback, and memory, achieving improved dynamic regret bounds.

IMPACT These papers propose new theoretical frameworks for AI agents, potentially leading to more robust and efficient learning systems that can better handle complex, dynamic environments.
TOOL · arXiv cs.LG English(EN) · 1mo

Polyhedral Instability Governs Regret in Online Learning

Researchers have developed a new theoretical framework for understanding regret in online learning problems involving combinatorial actions. Their work introduces the concept of 'polyhedral instability,' which quantifies the number of changes in the active region during decision-making. This instability is shown to govern the regret rate, interpolating between existing expert-like and dimension-dependent bounds.

IMPACT Introduces a new theoretical lens for analyzing online learning algorithms, potentially improving their efficiency in combinatorial decision problems.
