New WA* framework achieves zero-shot generalization in AI planning

By PulseAugur Editorial · [2 sources] · 2026-05-25 11:25

Researchers have developed a self-improving planning framework called WA* that combines Q-learning with a value heuristic represented by a Relational Graph Neural Network. The heuristic guides search, and the data produced by that search is used to update the heuristic, which can then serve as a general policy. The framework shows strong zero-shot generalization, solving new problem instances without search, an improvement over traditional Deep Reinforcement Learning (DRL) methods in sparse-reward domains. It has been evaluated successfully on Sokoban, PushWorld, The Witness, and benchmarks from the 2023 International Planning Competition.
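The search-then-learn loop described in the summary can be illustrated with a toy sketch. It is not the authors' implementation and rests on several assumptions: WA* is read as weighted A* search, a one-parameter linear heuristic (`theta` times Manhattan distance) stands in for the Relational Graph Neural Network, a least-squares fit stands in for the Q-learning update, and the open grid domain and every function name are hypothetical.

```python
import heapq
from itertools import count

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def neighbors(state, size):
    """4-connected moves inside an open size x size grid (toy domain, an assumption)."""
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny)

def weighted_a_star(start, goal, size, h, w=2.0):
    """Weighted A*: always pop the node with the smallest f = g + w * h(state)."""
    tie = count()  # unique tie-breaker so the heap never compares states
    frontier = [(w * h(start), next(tie), 0, start)]
    parent, best_g = {start: None}, {start: 0}
    while frontier:
        _, _, g, state = heapq.heappop(frontier)
        if state == goal:
            path = []
            while state is not None:  # walk parent links back to the start
                path.append(state)
                state = parent[state]
            return path[::-1]
        if g > best_g[state]:
            continue  # stale heap entry, a cheaper route was already found
        for nxt in neighbors(state, size):
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt], parent[nxt] = g + 1, state
                heapq.heappush(frontier, (g + 1 + w * h(nxt), next(tie), g + 1, nxt))
    return None

def fit_theta(path, goal):
    """Fit h(s) = theta * manhattan(s, goal) to the cost-to-go seen along the plan.
    Stand-in for the learning step; the paper uses Q-learning with a GNN instead."""
    feats = [manhattan(s, goal) for s in path[:-1]]
    targets = [len(path) - 1 - i for i in range(len(path) - 1)]
    return sum(f * t for f, t in zip(feats, targets)) / sum(f * f for f in feats)

def greedy_policy(start, goal, size, h, max_steps=1000):
    """Use the heuristic as a policy: step to the neighbor with the lowest h, no search."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(min(neighbors(path[-1], size), key=h))
    return path

# 1) Search with an uninformed heuristic (theta = 0) on a small training instance.
theta = 0.0
train_goal = (3, 3)
plan = weighted_a_star((0, 0), train_goal, 4, lambda s: theta * manhattan(s, train_goal))

# 2) Learn from the search result.
theta = fit_theta(plan, train_goal)

# 3) Apply the learned heuristic greedily to a larger, unseen instance.
test_goal = (11, 7)
rollout = greedy_policy((0, 0), test_goal, 12, lambda s: theta * manhattan(s, test_goal))
```

In this toy setting a single search is enough to recover a useful heuristic, and the greedy rollout reaches the goal on the larger grid without expanding a search tree. A real planning domain would need a far richer function approximator and many search and update rounds.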

IMPACT Achieves strong zero-shot generalization in planning tasks, potentially overcoming limitations of current DRL methods.

RANK_REASON The cluster contains an academic paper detailing a new AI research framework and its performance on benchmarks.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Michael Aichmüller, Yannik Hesse, Hector Geffner · 2026-05-26 04:00

Learning to Search and Searching to Learn for Generalization in Planning

arXiv:2605.25720v1 · Abstract: Combinatorial generalization remains a central challenge in Deep Reinforcement Learning (DRL). Classical planning provides a simple yet challenging setting to study this problem through explicit relational descriptions, without requ…

arXiv cs.AI TIER_1 English(EN) · Hector Geffner · 2026-05-25 11:25

Learning to Search and Searching to Learn for Generalization in Planning

Combinatorial generalization remains a central challenge in Deep Reinforcement Learning (DRL). Classical planning provides a simple yet challenging setting to study this problem through explicit relational descriptions, without requiring learning from perception. In sparse-reward…
