Researchers have developed a novel post-hoc framework called Decoupled Test-time Synthesis (DoTS) to integrate Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) for large language models. This method addresses the challenges of catastrophic forgetting and gradient conflicts that arise from sequential or joint training of these two paradigms. DoTS synthesizes the capabilities of independently trained SFT and RLHF checkpoints at inference time using task vector arithmetic, significantly reducing computational cost and avoiding parameter updates.
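As a rough illustration of what task vector arithmetic over two checkpoints can look like, the sketch below uses the generic formulation: subtract a shared base model from each fine-tuned checkpoint, then add the scaled differences back to the base. It is not DoTS's actual procedure, which the summary does not spell out. The shared base checkpoint, the coefficients `alpha` and `beta`, and the function names `task_vector` and `synthesize` are all assumptions made for this example.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flattened weight values.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Per-parameter difference between a fine-tuned checkpoint and its base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def synthesize(
    base: Weights,
    sft: Weights,
    rl: Weights,
    alpha: float = 1.0,  # hypothetical weight on the SFT task vector
    beta: float = 1.0,   # hypothetical weight on the RLHF task vector
) -> Weights:
    """Combine two independently trained checkpoints without any gradient step."""
    tau_sft = task_vector(sft, base)
    tau_rl = task_vector(rl, base)
    return {
        name: [
            b + alpha * s + beta * r
            for b, s, r in zip(base[name], tau_sft[name], tau_rl[name])
        ]
        for name in base
    }


# Tiny worked example with a single two-element parameter.
base = {"w": [1.0, 2.0]}
sft = {"w": [1.5, 2.0]}   # SFT moved the first weight
rl = {"w": [1.0, 3.0]}    # RLHF moved the second weight
merged = synthesize(base, sft, rl)
print(merged)
```

Because the merge is plain arithmetic on stored weights, no training run is involved; in this sketch the only tunable quantities are the two scaling coefficients.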
Impact: Enables more efficient integration of SFT and RLHF, potentially improving LLM performance on diverse tasks without extensive retraining.
Ranking rationale: The cluster contains an arXiv preprint detailing a new method for integrating SFT and RLHF.
AI-generated summary · Google Gemini · from 2 sources.