Your Self-Play Algorithm is Secretly an Adversarial Imitator: Understanding LLM Self-Play through the Lens of Imitation Learning

LLM Self-Play Linked to Adversarial Imitation Learning

By the PulseAugur editorial team · [1 source] · 2026-06-09 04:00

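Researchers have linked self-play fine-tuning methods for large language models to adversarial imitation learning. They formulate the fine-tuning process as a min-max game, which unifies self-play imitation and preference alignment. The theoretical framework shows that self-play fine-tuning converges to an equilibrium, and it motivates a new algorithm that outperforms existing methods in both stability and performance.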
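For readers unfamiliar with the min-max framing, the following is a generic GAN-style adversarial imitation objective, written for prompt-response pairs. It is an illustrative sketch only and is not taken from the paper: the symbols $\rho$ (prompt distribution), $\pi_{\text{data}}$ (the distribution of reference responses), $\pi_\theta$ (the model being fine-tuned), and $D$ (a discriminator) are assumptions introduced here, and the paper's actual formulation may differ.

```latex
\min_{\pi_\theta} \; \max_{D} \;
  \mathbb{E}_{x \sim \rho,\; y \sim \pi_{\text{data}}(\cdot \mid x)}
    \bigl[ \log D(x, y) \bigr]
  \; + \;
  \mathbb{E}_{x \sim \rho,\; y \sim \pi_\theta(\cdot \mid x)}
    \bigl[ \log \bigl( 1 - D(x, y) \bigr) \bigr]
```

In this generic form, the discriminator $D$ tries to tell reference responses apart from the model's own generations, while the model $\pi_\theta$ is updated to make its generations indistinguishable from the references. An equilibrium of such a game is reached when the two distributions match.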

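Impact: Provides a theoretical foundation for self-play fine-tuning, which could lead to more stable and effective LLM alignment techniques.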

Ranking rationale: This is a research paper that details a new theoretical framework and algorithm for LLM fine-tuning.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.LG TIER_1 · Shangzhe Li, Xuchao Zhang, Chetan Bansal, Weitong Zhang · 2026-06-09 04:00

Your Self-Play Algorithm is Secretly an Adversarial Imitator: Understanding LLM Self-Play through the Lens of Imitation Learning

arXiv:2602.01357v2 Announce Type: replace Abstract: Self-play post-training methods have emerged as an effective approach for fine-tuning large language models, turning a weak language model into a strong one without preference data. However, the theoretical foundations …