PulseAugur
SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation

New reinforcement learning frameworks advance machine translation with self-rewarding and neologism-aware methods

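Researchers have developed SSR-Zero, a reinforcement learning framework for machine translation that removes the need for external human-annotated data or pretrained reward models. Using self-evaluated rewards and a Qwen-2.5-7B backbone, SSR-Zero outperforms existing models on English-Chinese translation tasks. Adding external supervision, as in SSR-X-Zero-7B, yields state-of-the-art performance that surpasses both open-source and closed-source alternatives.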

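Impact: Introduces self-rewarding reinforcement learning for machine translation, which may reduce reliance on costly human supervision and improve translation quality.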

Ranking rationale: This cluster covers new academic papers on novel machine translation frameworks and datasets.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CL · Wenjie Yang, Mao Zheng, Mingyang Song, Zheng Li, Sitong Wang

    SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation

    arXiv:2505.16637v4. Abstract: Large language models (LLMs) have recently demonstrated remarkable capabilities in machine translation (MT). However, most advanced MT-specific LLMs heavily rely on external supervision signals during training, such as human-ann…

  2. arXiv cs.CL · Zhongtao Miao, Kaiyan Zhao, Masaaki Nagata, Yoshimasa Tsuruoka

    NeoAMT: Neologism-Aware Agentic Machine Translation with Reinforcement Learning

    arXiv:2601.03790v3. Abstract: Neologism-aware machine translation aims to translate source sentences containing neologisms into target languages. This field remains underexplored compared with general machine translation (MT). In this paper, we propose an ag…