One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents
Researchers have developed pcsp, a reinforcement learning policy designed to enable scalable and controllable non-player characters (NPCs) in life-simulation games. A single shared policy is conditioned on LLM embeddings of persona descriptions, so one model can produce distinct and consistent behavior for each NPC. The method significantly outperforms chance in zero-shot persona identification and runs inference faster than LLM-based policies, demonstrating its viability in commercial game engines.
AI IMPACT: Enables more dynamic and controllable NPCs in games, potentially enhancing player immersion and game design possibilities.
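
The idea of conditioning one shared policy on a persona embedding can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the action set, the class `SharedPolicy`, and the function `toy_persona_embedding` are all hypothetical names chosen for illustration. The embedding function is a deterministic stand-in for a real LLM embedding, and the weights are random rather than trained with RL.

```python
import hashlib
import math
import random

EMBED_DIM = 8   # hypothetical; real LLM embeddings are much larger
STATE_DIM = 4   # hypothetical number of game-state features
ACTIONS = ["eat", "sleep", "work", "socialize"]  # hypothetical action set


def toy_persona_embedding(description):
    """Stand-in for an LLM embedding (assumption, carries no semantics).

    The same text always maps to the same vector, which is the only
    property this sketch relies on.
    """
    digest = hashlib.sha256(description.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


class SharedPolicy:
    """One set of weights for every NPC; the persona is an extra input."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        in_dim = STATE_DIM + EMBED_DIM
        # Untrained random weights; an RL algorithm would learn these.
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in ACTIONS
        ]

    def action_probs(self, state, persona_vec):
        # Concatenate game state and persona embedding into one input.
        x = list(state) + list(persona_vec)
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        # Numerically stable softmax over the action logits.
        peak = max(logits)
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


policy = SharedPolicy()
state = [0.2, 0.9, 0.1, 0.5]
shy = toy_persona_embedding("A shy baker who avoids crowds")
bold = toy_persona_embedding("An outgoing merchant who loves gossip")

probs_shy = policy.action_probs(state, shy)
probs_bold = policy.action_probs(state, bold)
```

In this sketch, the same game state yields a different action distribution for each persona while the policy weights stay fixed, and repeating a persona description reproduces the same distribution. Adding an NPC therefore costs one embedding lookup rather than a new model, which is the scaling property the summary describes.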