Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement

New classification method uses reinforcement learning to refine predictions

By the PulseAugur editorial team · [2 sources] · 2026-04-23 23:06

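Researchers have introduced a novel approach called Reinforced Iterative Classification (RIC), which shifts from imitating labels to using reinforcement learning for classification tasks. The method uses a recurrent agent that iteratively refines its predictions, earns rewards for improving accuracy, and supports anytime classification. On image classification benchmarks, RIC matches the accuracy of supervised methods while showing better calibration and adaptive allocation of compute.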
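The refinement loop described above can be pictured with a toy sketch. This is not the paper's implementation: the update rule `refine`, the reward (the per-step gain in probability assigned to the true label), the confidence-based stopping rule, and all parameter names are assumptions made purely for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def refine(belief_logits, evidence, step_size=0.5):
    """Hypothetical update rule: nudge the belief logits toward the evidence.

    In a learned system this would be a recurrent policy; here it is a
    fixed additive step so the loop stays easy to follow.
    """
    return [b + step_size * e for b, e in zip(belief_logits, evidence)]


def run_episode(evidence, true_label, max_steps=5, confidence_threshold=0.9):
    """Iteratively refine a belief and collect per-step rewards.

    Reward (assumed): increase in probability on the true label.
    Stopping (assumed): halt once the top class exceeds a confidence threshold,
    which gives an anytime prediction and input-dependent compute.
    """
    logits = [0.0] * len(evidence)  # start from a uniform belief
    belief = softmax(logits)
    rewards = []
    steps = 0
    for _ in range(max_steps):
        if max(belief) >= confidence_threshold:
            break  # confident enough, stop spending compute
        new_logits = refine(logits, evidence)
        new_belief = softmax(new_logits)
        rewards.append(new_belief[true_label] - belief[true_label])
        logits, belief = new_logits, new_belief
        steps += 1
    prediction = max(range(len(belief)), key=belief.__getitem__)
    return prediction, belief, rewards, steps


if __name__ == "__main__":
    hard = run_episode([2.0, 0.0, -1.0], true_label=0)
    easy = run_episode([6.0, 0.0, 0.0], true_label=0)
    print("hard input steps:", hard[3])
    print("easy input steps:", easy[3])
```

In this toy setting an input with strong evidence stops after fewer refinement steps than an ambiguous one, which is the intuition behind adaptive compute allocation; the belief held at any step can be read out as the current prediction.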

Impact: Introduces a new classification paradigm that could improve model efficiency and reliability.

Ranking rationale: This cluster contains an academic paper detailing a new classification method.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Mahdi Kallel, Johannes Tölle, Ahmed Hendawy, Carlo D'Eramo · 2026-04-27 04:00

Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement

arXiv:2604.22110v1. Abstract: Standard supervised classification trains models to imitate the exact labels provided by a perfect oracle. This imitation happens in a single pass, restricting the model to a fixed compute budget even when inputs vary in complexity.…

arXiv cs.LG TIER_1 English(EN) · Carlo D'Eramo · 2026-04-23 23:06

Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement

Standard supervised classification trains models to imitate the exact labels provided by a perfect oracle. This imitation happens in a single pass, restricting the model to a fixed compute budget even when inputs vary in complexity. Moreover, the rigid training objective forces t…