One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

New reinforcement learning policy enables scalable, persona-driven NPCs in games

By the PulseAugur editorial team · [2 sources] · 2026-05-22 14:04

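Researchers have developed a new reinforcement learning policy, called pcsp, aimed at scalable and controllable non-player characters (NPCs) in life-simulation games. A single shared policy is conditioned on LLM embeddings of persona descriptions, so that each NPC behaves distinctly and consistently with its persona. On a 300-persona benchmark, the method reaches zero-shot persona identification up to 17x above chance and runs inference 22x faster than an LLM-as-policy baseline, which the summary takes as evidence that it is feasible in commercial game engines.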
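To make the idea of one shared policy conditioned on a persona embedding concrete, here is a minimal sketch in Python. It is not the paper's architecture: the action set, the dimensions, the single linear layer, and the hash-based `toy_persona_embedding` (a stand-in for a real LLM text embedding) are all assumptions chosen for illustration. The point it shows is only that the same weights can yield different action distributions when the persona vector changes.

```python
import hashlib
import math
import random

ACTIONS = ["eat", "sleep", "work", "socialize"]  # hypothetical action set
STATE_DIM = 3   # hypothetical size of the game-state feature vector
EMBED_DIM = 8   # hypothetical size of the persona embedding


def toy_persona_embedding(description: str, dim: int = EMBED_DIM) -> list:
    """Stand-in for an LLM embedding: a deterministic pseudo-random
    vector derived from a hash of the persona text (assumption)."""
    digest = hashlib.sha256(description.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


class SharedPolicy:
    """One set of weights used for every NPC; only the persona vector differs."""

    def __init__(self, seed: int = 0):
        rng = random.Random(seed)
        in_dim = STATE_DIM + EMBED_DIM
        # One weight row per action; untrained random values for illustration.
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in ACTIONS
        ]

    def action_probs(self, state, persona_vec):
        # Condition on the persona by concatenating it with the state.
        x = list(state) + list(persona_vec)
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        # Numerically stable softmax over the action logits.
        top = max(logits)
        exps = [math.exp(value - top) for value in logits]
        total = sum(exps)
        return [e / total for e in exps]


policy = SharedPolicy()
state = [0.2, 0.9, 0.1]  # hypothetical state features
for desc in ("a shy librarian who loves routine",
             "an outgoing chef who hates mornings"):
    probs = policy.action_probs(state, toy_persona_embedding(desc))
    print(desc, dict(zip(ACTIONS, [round(p, 3) for p in probs])))
```

In a trained system the weights would be learned with reinforcement learning and the embedding would come from an actual language model; adding a new NPC would then only require a new persona description, not a new policy.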

Impact: Makes in-game NPCs more dynamic and controllable, which may improve player immersion and widen game-design possibilities.

Ranking rationale: An academic paper was published detailing a new approach to game agents.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Yoosung Hong · 2026-05-25 04:00

One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

arXiv:2605.23652v1 Announce Type: new Abstract: On a 300-persona life-simulation benchmark, pcsp achieves compositional zero-shot persona identification up to 17x above chance, Spearman rho approx 0.73 semantic-behavioral alignment, and 22x faster inference than an LLM-as-policy …

arXiv cs.AI TIER_1 English(EN) · Yoosung Hong · 2026-05-22 14:04

One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

On a 300-persona life-simulation benchmark, pcsp achieves compositional zero-shot persona identification up to 17x above chance, Spearman rho approx 0.73 semantic-behavioral alignment, and 22x faster inference than an LLM-as-policy baseline. Life simulation games require hundreds…

