PulseAugur

New JAMER benchmark tests AI code generation on game engines

Researchers have introduced JAMER, a dataset and benchmark for evaluating AI models on project-level code generation in professional game engines. Drawing on data from game jam competitions, JAMER targets the Godot engine and includes 8,133 verified projects. The benchmark covers tasks such as theme-driven generation and code completion, scored with metrics including compilation pass rate, Structural Completeness Score, and Behavioral Alignment Score. Initial evaluations show model performance dropping significantly as project complexity increases, pointing to architectural design as a key bottleneck.

IMPACT Highlights limitations in current AI code generation for complex project-level tasks, particularly in game development.

RANK_REASON The cluster describes a new dataset and benchmark for AI code generation, presented in an arXiv paper.


AI-generated summary · Google Gemini · from 3 sources.


COVERAGE [3]

  1. arXiv cs.CL TIER_1 English(EN) · Jianwen Sun, Chuanhao Li, Zizhen Li, Yukang Feng, Fanrui Zhang, Yifei Huang, Yu Dai, Kaipeng Zhang

    JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

    arXiv:2606.19830v1 · Abstract: Current AI-driven game development has made substantial progress in asset generation, gameplay design, and web-based game coding, yet project-level code engineering on professional game engines remains largely unexplored due to th…

  2. arXiv cs.CL TIER_1 English(EN) · Kaipeng Zhang

    JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

    Current AI-driven game development has made substantial progress in asset generation, gameplay design, and web-based game coding, yet project-level code engineering on professional game engines remains largely unexplored due to the absence of large-scale datasets and deterministi…

  3. Hugging Face Daily Papers TIER_1 English(EN)

    JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines

    Game development frameworks and benchmarks were created using data from game jam competitions to evaluate code generation and project-level programming capabilities.