PulseAugur
Distilling Game Code World Model Generation into Lightweight Large Language Models

LLM distillation for code generation; a benchmark evaluates LLMs' potential as code executors

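Researchers are exploring ways to distill the code generation capabilities of large language models (LLMs) into smaller, more accessible models. One study focuses on generating "Game Code World Models" (GameCWMs) for AI agents, using a curated dataset and a novel training pipeline to improve smaller models such as Qwen2.5-3B-Instruct. Another paper reviews trends, challenges, and future directions in LLM-based code-generating tasks, highlighting problems with real-world generalization, robustness, and evaluation validity. A third work introduces SURGE, a benchmark designed to evaluate the potential of LLMs as general-purpose surrogate code executors across a range of programming tasks and complexity levels.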

Impact: New benchmarks and distillation methods could make advanced code generation more accessible and reliable in AI development.

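Ranking rationale: This cluster contains three arXiv papers that discuss LLM capabilities in code generation, world model creation, and execution prediction, including new benchmarks and distillation techniques.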


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

  1. arXiv cs.AI TIER_1 Deutsch(DE) · Tyrone Serapio, Arjun Prakash, Haoyang Xu, Kevin Wang, Amy Greenwald

    Distilling Game Code World Model Generation into Lightweight Large Language Models

    arXiv:2605.24375v1 Announce Type: new Abstract: Large Language Models (LLMs) have shown great ability in generating executable code from natural language, opening the possibility of automatically constructing environments for AI agents. Recent work on Code World Models (CWMs) dem…

  2. arXiv cs.AI TIER_1 English(EN) · Muslim Chochlov, Michael English, Jim Buckley

    A Tertiary Review of Large Language Model-Based Code Generating Tasks: Trends, Challenges, and Future Directions

    arXiv:2605.25536v1 Announce Type: cross Abstract: Context. Large language models (LLMs) are increasingly applied to code-generating tasks (CGTs) in software engineering. While reported results are promising, the broader effects of such application and their integration into real-…

  3. arXiv cs.CL TIER_1 English(EN) · Bohan Lyu, Siqiao Huang, Zichen Liang

    SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

    arXiv:2502.11167v5 Announce Type: replace-cross Abstract: Neural surrogate models are powerful and efficient tools in data mining. Meanwhile, large language models (LLMs) have demonstrated remarkable capabilities in code-related tasks, such as generation and understanding. Howeve…