New JAMER benchmark tests AI code generation on game engines

By PulseAugur Editorial · [3 sources] · 2026-06-18 00:00

Researchers have introduced JAMER, a dataset and benchmark for evaluating AI models on project-level code generation in professional game engines. Built from game jam competition data, JAMER focuses on the Godot engine and includes 8,133 verified projects. The benchmark covers tasks such as theme-driven generation and code completion, scored with metrics including compilation pass rate, Structural Completeness Score, and Behavioral Alignment Score. Initial evaluations show model performance dropping significantly as project complexity increases, pointing to architectural design as a key bottleneck.

IMPACT Highlights limitations in current AI code generation for complex project-level tasks, particularly in game development.

RANK_REASON The cluster describes a new dataset and benchmark for AI code generation, presented in an arXiv paper.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

arXiv cs.CL TIER_1 English(EN) · Jianwen Sun, Chuanhao Li, Zizhen Li, Yukang Feng, Fanrui Zhang, Yifei Huang, Yu Dai, Kaipeng Zhang · 2026-06-19 04:00

JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

arXiv:2606.19830v1 Announce Type: cross Abstract: Current AI-driven game development has made substantial progress in asset generation, gameplay design, and web-based game coding, yet project-level code engineering on professional game engines remains largely unexplored due to th…

arXiv cs.CL TIER_1 English(EN) · Kaipeng Zhang · 2026-06-18 06:17

JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

Current AI-driven game development has made substantial progress in asset generation, gameplay design, and web-based game coding, yet project-level code engineering on professional game engines remains largely unexplored due to the absence of large-scale datasets and deterministi…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-18 00:00

JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

Game development frameworks and benchmarks were created using data from game jam competitions to evaluate code generation and project-level programming capabilities.

