Distilling Game Code World Model Generation into Lightweight Large Language Models

LLM distillation for code generation; benchmarks evaluate execution potential

By the PulseAugur editorial team · [4 sources] · 2026-05-26 04:00

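Researchers are exploring ways to distill the code generation capabilities of large language models (LLMs) into smaller, more accessible models. One study focuses on generating "Game Code World Models" (GameCWMs) for AI agents, using a curated dataset and a novel training pipeline to improve smaller models such as Qwen2.5-3B-Instruct. Another paper reviews trends, challenges, and future directions in LLM-based code generation tasks, highlighting problems with real-world generalization, robustness, and evaluation validity. A third work introduces SURGE, a benchmark designed to assess the potential of LLMs as general-purpose surrogate code executors across a range of programming tasks and complexity levels.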

Impact: New benchmarks and distillation methods could make advanced code generation more accessible and reliable in AI development.

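Ranking rationale: This cluster contains three arXiv papers discussing LLM capabilities in code generation, world model creation, and execution prediction, including new benchmarks and distillation techniques.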


AI-generated summary · Google Gemini · based on 4 sources.

Sources [4]

arXiv cs.AI TIER_1 Dansk(DA) · Yash Akhauri, Xingyou Song, Arissa Wongpanich, Bryan Lewandowski, Mohamed S. Abdelfattah · 2026-05-28 04:00

Regression Language Models for Code

arXiv:2509.26476v2 Announce Type: replace-cross Abstract: We study code-to-metric regression: predicting numeric outcomes of code executions, a challenging task due to the open-ended nature of programming languages. While prior methods have resorted to heavy and domain-specific f…
arXiv cs.AI TIER_1 Deutsch(DE) · Tyrone Serapio, Arjun Prakash, Haoyang Xu, Kevin Wang, Amy Greenwald · 2026-05-26 04:00

Distilling Game Code World Model Generation into Lightweight Large Language Models

arXiv:2605.24375v1 Announce Type: new Abstract: Large Language Models (LLMs) have shown great ability in generating executable code from natural language, opening the possibility of automatically constructing environments for AI agents. Recent work on Code World Models (CWMs) dem…
arXiv cs.AI TIER_1 English(EN) · Muslim Chochlov, Michael English, Jim Buckley · 2026-05-26 04:00

A Tertiary Review of Large Language Model Code-Generating Tasks: Trends, Challenges, and Future Directions

arXiv:2605.25536v1 Announce Type: cross Abstract: Context. Large language models (LLMs) are increasingly applied to code-generating tasks (CGTs) in software engineering. While reported results are promising, the broader effects of such application and their integration into real-…
arXiv cs.CL TIER_1 English(EN) · Bohan Lyu, Siqiao Huang, Zichen Liang · 2026-05-26 04:00

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

arXiv:2502.11167v5 Announce Type: replace-cross Abstract: Neural surrogate models are powerful and efficient tools in data mining. Meanwhile, large language models (LLMs) have demonstrated remarkable capabilities in code-related tasks, such as generation and understanding. Howeve…
