Thompson sampling
PulseAugur coverage of Thompson sampling: every cluster mentioning Thompson sampling across labs, papers, and developer communities, ranked by signal.
4 days with sentiment data
-
New algorithm balances user reward with statistical accuracy in experiments
Researchers have developed a new algorithm called TS-PostDiff that aims to improve the balance between user benefit and statistical accuracy in online experiments. Traditional methods like uniform random assignment are …
-
New research advances contextual bandit algorithms for dynamic and complex environments
Researchers are exploring advanced techniques for contextual bandit problems, focusing on improving regret bounds and handling dynamic environments. One paper introduces a retry-aware bandit algorithm that aims to optim…
-
New 'Delight-gated exploration' algorithm optimizes vast action spaces
Researchers have introduced Delight-gated exploration (DE), a novel algorithm designed to optimize decision-making in scenarios with vast action spaces. DE prioritizes exploratory actions based on their potential "delig…
-
New algorithm Anchor-TS improves offline-to-online learning
Researchers have developed a new algorithm called Sample-Mean Anchored Thompson Sampling (Anchor-TS) to improve offline-to-online learning. This method addresses the challenge of distribution shift between offline and o…
-
New methods boost LLM code generation efficiency and theory
Researchers have developed new methods for improving Large Language Model (LLM) code generation efficiency. One approach, Planning-after-Trial (PaT), adaptively invokes a planner only when an initial generation attempt …
-
DARTS method optimizes covariate acquisition for budget-constrained sequential experiments
Researchers have developed DARTS (Dynamic Adaptive Rerandomization via Thompson Sampling), a novel method for optimizing covariate acquisition in budget-constrained sequential experiments. This approach treats the proce…
-
New algorithm tackles scalable policy learning under network interference
Researchers have developed a new Thompson sampling algorithm designed to optimize policy impact in dynamic networks where interference occurs. This algorithm addresses the scalability limitations of existing methods, wh…
-
New AI framework 'Bayesian Reflex' unifies online learning with autonomic nervous system analogy
A new paper introduces the "Bayesian reflex" as a framework for online learning in AI, drawing an analogy to the autonomic nervous system. This approach uses probabilistic representations, Bayes' theorem for sequential …
-
Thompson Sampling for Bayesian Optimization with Preferential Feedback Analyzed
Researchers have developed a new Thompson Sampling approach for Bayesian optimization that utilizes preferential feedback, such as pairwise comparisons, instead of scalar scores. This method models comparisons through a…
-
Eugene Yan recaps RecSys conferences, highlighting AI advancements in recommendation systems
Eugene Yan's RecSys 2022 recap highlights a significant increase in industry submissions and a focus on algorithmic advancements and real-world applications. Key papers explored efficient training for sequential recomme…