Policy Regret for Embedding Model Routing: Contextual Bandits with Low-Rank Experts

New algorithm optimizes embedding model routing in recommendation systems

By the PulseAugur editorial team · [2 sources] · 2026-06-12 20:09

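A new research paper introduces the Hypentropy Policy Gradient (HPG) algorithm for optimizing embedding model routing in recommendation systems. The paper formalizes the problem as an adversarial contextual linear bandit with low-rank experts, addressing challenges such as adversarial queries and limited model observability. HPG is designed to adapt to the unknown low-rank structure, achieves a policy regret of $\tilde{\mathcal O}(s\sqrt{MT})$, and admits an efficient, parameter-free implementation.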
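To make the setting concrete, the following is a minimal sketch of a routing problem with low-rank experts under bandit feedback. It is not the paper's HPG algorithm: the router here is plain EXP3, a standard adversarial bandit baseline that ignores the query context, swapped in purely for illustration. The function names, the reward scaling to [0, 1], the random (rather than adversarial) queries, and all parameter values are assumptions and do not come from the paper.

```python
import numpy as np


def make_low_rank_experts(num_models, dim, rank, rng):
    """Build a (num_models, dim) parameter matrix of rank at most `rank`.

    Hypothetical helper: each row is one embedding model's linear reward
    parameter, and all rows lie in a shared `rank`-dimensional subspace.
    """
    left = rng.normal(size=(num_models, rank))
    right = rng.normal(size=(rank, dim))
    theta = left @ right
    # Scale so every row has norm <= 1, keeping rewards bounded
    # for unit-norm query vectors.
    return theta / np.linalg.norm(theta, axis=1).max()


def exp3_route(theta, contexts, lr, rng):
    """Route each query to one model with EXP3 (context-blind baseline)."""
    num_models = theta.shape[0]
    log_weights = np.zeros(num_models)
    total_reward = 0.0
    for x in contexts:
        # Softmax over log-weights, shifted for numerical stability.
        probs = np.exp(log_weights - log_weights.max())
        probs /= probs.sum()
        arm = rng.choice(num_models, p=probs)
        # Bandit feedback: only the chosen model's reward is observed.
        # The linear reward in [-1, 1] is mapped to [0, 1] (an assumption).
        reward = 0.5 * (1.0 + theta[arm] @ x)
        total_reward += reward
        # Importance-weighted loss estimate for the chosen arm only.
        log_weights[arm] -= lr * (1.0 - reward) / probs[arm]
    return total_reward


rng = np.random.default_rng(0)
theta = make_low_rank_experts(num_models=8, dim=16, rank=2, rng=rng)
# Random unit-norm queries stand in for the adversarial query sequence.
contexts = rng.normal(size=(500, 16))
contexts /= np.linalg.norm(contexts, axis=1, keepdims=True)
total = exp3_route(theta, contexts, lr=0.05, rng=rng)
```

A baseline like this treats the models as unrelated arms. The summary above indicates that HPG instead adapts to the shared low-rank structure, which this sketch makes no attempt to exploit.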

Ranking rationale: this cluster contains a research paper posted on arXiv that details a new algorithm and its theoretical analysis.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Yan Dai, Negin Golrezaei, Patrick Jaillet · 2026-06-16 04:00

Policy Regret for Embedding Model Routing: Contextual Bandits with Low-Rank Experts

arXiv:2606.14929v1 Announce Type: cross Abstract: Modern recommendation systems increasingly rely on dynamically routing diverse queries to multiple embedding models. Despite its practical significance, this problem remains poorly understood under realistic conditions like advers…

arXiv stat.ML TIER_1 English(EN) · Patrick Jaillet · 2026-06-12 20:09

Policy Regret for Embedding Model Routing: Contextual Bandits with Low-Rank Experts

Modern recommendation systems increasingly rely on dynamically routing diverse queries to multiple embedding models. Despite its practical significance, this problem remains poorly understood under realistic conditions like adversarial queries, bandit feedback, and limited observ…

