A new research paper published on arXiv investigates the effectiveness of phase-localized curation for filtering manipulation demonstrations in reinforcement learning. The study found that applying curation metrics within specific temporal phases of a task did not improve performance and, in some cases, led to worse results than applying the metrics globally. The research suggests that a defect signal concentrated in a single phase can be diluted when scores are aggregated across defect-free phases, and that per-phase metric selection does not transfer across tasks.
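The dilution effect mentioned above can be illustrated with a small arithmetic sketch. It is not the paper's method: the per-phase scores are invented, the four-phase split is hypothetical, and mean aggregation is an assumed stand-in for whatever global metric the study used. It shows only why a defect confined to one phase separates less clearly from a clean demonstration once scores are averaged over all phases; it does not reproduce the paper's finding that phase-localized curation failed to help in practice.

```python
from statistics import mean

def global_score(phase_scores):
    """Average per-phase defect scores into one demonstration-level score.

    Mean aggregation is an assumption made for illustration only.
    """
    return mean(phase_scores)

# Hypothetical per-phase defect scores (0 = clean, 1 = severe defect).
clean_demo = [0.1, 0.1, 0.1, 0.1]
defective_demo = [0.1, 0.1, 0.9, 0.1]  # defect confined to the third phase

# Gap between the two demonstrations after global (mean) aggregation.
global_gap = global_score(defective_demo) - global_score(clean_demo)

# Gap when only the defective phase is compared.
phase_gap = defective_demo[2] - clean_demo[2]

print(f"global gap: {global_gap:.2f}, phase-local gap: {phase_gap:.2f}")
```

With these made-up numbers the phase-local gap is four times the global gap, because the three defect-free phases contribute equally to the average.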
Ranking rationale: The cluster contains a research paper published on arXiv detailing experimental results.
AI-generated summary · Google Gemini · from 1 source.