Algorithms for Deciding the Safety of States in Fully Observable Non-deterministic Problems: Technical Report

New iPI algorithm improves how state safety is decided in non-deterministic AI problems

By the PulseAugur editorial team · [1 source] · 2026-06-29 04:00

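Researchers have developed iPI, a new policy iteration algorithm that improves on existing methods for deciding whether states are safe in non-deterministic sequential decision-making problems. The current leading algorithm, TarjanSafe, is effective on benchmarks, but its worst-case running time can be exponential. A linear-time alternative exists, but it is slower in practice. iPI guarantees polynomial worst-case running time while matching TarjanSafe's best-case performance, and it scales better on certain classes of problems.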
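To make the decision problem concrete, here is a minimal Python sketch of a naive fixed-point check. It is not iPI, TarjanSafe, or the linear-time algorithm from the paper. It assumes a state is safe when some policy can avoid a designated set of unsafe states under every possible action outcome. The names `safe_states`, `transitions`, and `unsafe`, and the toy state space, are hypothetical.

```python
from typing import Dict, Hashable, List, Set

State = Hashable
# transitions[state][action] = list of possible successor states.
# Assumption: every state appears as a key, even if it has no actions.
Transitions = Dict[State, Dict[str, List[State]]]


def safe_states(transitions: Transitions, unsafe: Set[State]) -> Set[State]:
    """Return the states from which some policy avoids `unsafe` under all outcomes.

    Naive fixed-point iteration: repeatedly mark a state as bad when every
    available action has at least one outcome that is already bad.
    """
    bad = set(unsafe)
    changed = True
    while changed:
        changed = False
        for state, actions in transitions.items():
            if state in bad:
                continue
            # Assumption: a state with no actions is treated as safe,
            # since no further outcome can lead it into `unsafe`.
            if actions and all(
                any(succ in bad for succ in succs) for succs in actions.values()
            ):
                bad.add(state)
                changed = True
    return set(transitions) - bad


# Toy problem (hypothetical): action "a" in s0 may fail, action "b" is safe.
transitions = {
    "s0": {"a": ["s1", "fail"], "b": ["s2"]},
    "s1": {"a": ["s1"]},
    "s2": {"a": ["s2", "s3"], "b": ["s1"]},
    "s3": {"a": ["fail"]},
    "fail": {},
}
result = safe_states(transitions, {"fail"})
print(sorted(result))
```

In this toy problem, `s3` is unsafe because its only action always reaches `fail`, while `s0` and `s2` remain safe because each has at least one action whose outcomes all stay inside the safe set. Each pass of the loop scans every state, so this sketch is polynomial but makes no attempt at the performance characteristics the summary attributes to the published algorithms.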

Impact: Introduces a more scalable algorithm for ensuring the safety of AI decision-making processes.

Ranking rationale: This cluster contains an academic paper detailing a new algorithm for a specific AI problem.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Johannes Schmalz, Chaahat Jain · 2026-06-29 04:00

Algorithms for Deciding the Safety of States in Fully Observable Non-deterministic Problems: Technical Report

arXiv:2603.15282v2 Announce Type: replace Abstract: Learned action policies are increasingly popular in sequential decision-making, but suffer from a lack of safety guarantees. Recent work introduced a pipeline for testing the safety of such policies under initial-state and actio…
