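Chinese (ZH) ICML 2026: Automatically Generating Programs from Input-Output Examples - Reinforcement Learning Provides Reasoning Process Supervision for Large Model Programming-By-Example Tasks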

New Framework Enhances LLMs for Program Synthesis from Examples

By PulseAugur Editorial · [1 source] · 2026-06-16 05:47

Researchers have developed PRM-PBE, a framework that improves how large language models (LLMs) handle Programming-by-Example (PBE) tasks. Current LLMs often struggle to infer the underlying program logic from a limited set of input-output examples, because they receive no fine-grained supervision on their intermediate reasoning. PRM-PBE trains a process reward model (PRM) on feedback-guided reasoning trees to evaluate the reliability of intermediate steps, and combines it with a three-stage curriculum learning approach and PPO optimization for program synthesis. Experiments across multiple benchmarks showed significant improvements over existing methods, even when using advanced models such as DeepSeek-Coder-V2 and Claude-3.5-Sonnet.
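To make the task concrete, here is a minimal Python sketch of a PBE setting with a process-reward-style ranking step. It is an illustration under stated assumptions, not the PRM-PBE implementation: the examples, the candidate programs, the step scores, and the helper names `satisfies` and `chain_score` are all hypothetical, and scoring a reasoning chain by its weakest step is only one possible aggregation, not necessarily the one the paper uses.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (input, expected output)


def satisfies(program: Callable[[str], str], examples: List[Example]) -> bool:
    """Return True if the program reproduces every example output."""
    for inp, expected in examples:
        try:
            if program(inp) != expected:
                return False
        except Exception:
            # A candidate that crashes on an example is rejected.
            return False
    return True


def chain_score(step_scores: List[float]) -> float:
    """Score a reasoning chain by its weakest step (an assumed aggregation)."""
    return min(step_scores) if step_scores else 0.0


# Hypothetical PBE task: infer "take the initials, uppercased".
examples: List[Example] = [("hello world", "HW"), ("large language model", "LLM")]

# Hypothetical candidate programs a model might propose.
candidates: Dict[str, Callable[[str], str]] = {
    "upper": lambda s: s.upper(),
    "initials": lambda s: "".join(w[0].upper() for w in s.split()),
}

consistent = [name for name, prog in candidates.items() if satisfies(prog, examples)]
print(consistent)  # ['initials']

# Hypothetical per-step reliability scores, standing in for a process reward model.
chains: Dict[str, List[float]] = {
    "chain_a": [0.9, 0.8, 0.2],
    "chain_b": [0.7, 0.7, 0.6],
}
best_chain = max(chains, key=lambda name: chain_score(chains[name]))
print(best_chain)  # chain_b
```

The first half checks whole programs only against final outputs, which is the coarse signal the summary describes as insufficient. The second half shows the kind of per-step signal a process reward model adds: `chain_a` has a higher average score but contains one unreliable step, so it ranks below `chain_b`.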

IMPACT Enhances LLM program synthesis by providing intermediate reasoning supervision, potentially improving reliability in complex coding tasks.

RANK_REASON The cluster describes a new research paper and framework for improving LLM performance on a specific task (PBE), including experimental validation.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

雷峰网 (Leiphone) TIER_1 Chinese (ZH) · 2026-06-16 05:47

ICML 2026: Automatically Generating Programs from Input-Output Examples - Reinforcement Learning Provides Reasoning Process Supervision for Large Model Programming-By-Example Tasks
