Off-Policy Evaluation with Strategic Agents via Local Disclosure

New OPE method accounts for the behavior of strategic agents

By PulseAugur Editorial · [2 sources] · 2026-06-05 14:24

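Researchers have developed a new off-policy evaluation (OPE) method that accounts for strategic agents who modify their behavior in response to the decision maker's policy. The method addresses policy-dependent covariate shift, which breaks standard OPE assumptions. The proposed technique uses local disclosure via post-hoc explanations to reveal pre-strategic covariates, enabling the construction of a doubly robust estimator of the policy value.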
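For readers unfamiliar with the term, the following is a minimal sketch of the textbook doubly robust estimator for a contextual bandit with a small discrete action set. It is not the paper's estimator: it assumes covariates do not react to the policy, which is exactly the assumption the paper says strategic behavior breaks, and it does not model local disclosure. The names `doubly_robust_value`, `target_policy`, `reward_model`, and the log layout are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# One logged interaction: (covariate x, action a, reward r, logging propensity p_b(a | x)).
# Assumption: x is NOT strategically modified; this is the standard, non-strategic setting.
LoggedSample = Tuple[float, int, float, float]


def doubly_robust_value(
    logs: Sequence[LoggedSample],
    target_policy: Callable[[float, int], float],  # pi_e(a | x), hypothetical name
    reward_model: Callable[[float, int], float],   # fitted outcome model q_hat(x, a)
    actions: Sequence[int],
) -> float:
    """Textbook doubly robust estimate of the target policy's value."""
    total = 0.0
    for x, a, r, propensity in logs:
        # Direct-method term: model-based expected reward under the target policy.
        direct = sum(target_policy(x, b) * reward_model(x, b) for b in actions)
        # Importance-weighted correction of the model's residual on the logged action.
        weight = target_policy(x, a) / propensity
        total += direct + weight * (r - reward_model(x, a))
    return total / len(logs)


# Toy usage: two logged samples, target policy that always plays action 1.
logs = [(0.0, 1, 1.0, 0.5), (1.0, 0, 0.0, 0.5)]
always_one = lambda x, a: 1.0 if a == 1 else 0.0
perfect_model = lambda x, a: float(a)  # reward equals the action in this toy example
value = doubly_robust_value(logs, always_one, perfect_model, actions=[0, 1])
```

The estimator stays consistent if either the propensities or the reward model is correct, which is what "doubly robust" refers to. With an all-zero reward model it reduces to plain inverse propensity weighting.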

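Impact: Introduces a novel statistical method for evaluating policies in settings with strategic agents, potentially improving decision-making in complex systems.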

Ranking rationale: This cluster contains an academic paper detailing a new method for off-policy evaluation.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Kiet Q. H. Vo, Abbavaram Gowtham Reddy, Julian Rodemann, Siu Lun Chau, Krikamol Muandet · 2026-06-08 04:00

Off-Policy Evaluation with Strategic Agents via Local Disclosure

arXiv:2606.07308v1. Abstract: We study off-policy evaluation (OPE) under strategic behavior where decision subjects (or agents) respond to a decision maker's policy by strategically modifying their covariates. Such behavior induces a policy-dependent covariate s…
arXiv cs.AI TIER_1 English(EN) · Krikamol Muandet · 2026-06-05 14:24

Off-Policy Evaluation with Strategic Agents via Local Disclosure

We study off-policy evaluation (OPE) under strategic behavior where decision subjects (or agents) respond to a decision maker's policy by strategically modifying their covariates. Such behavior induces a policy-dependent covariate shift, breaking the standard assumption in existi…