Researchers have developed AesRM, a new family of reward models designed to improve the aesthetic quality of generated videos. The system breaks video aesthetics down into three dimensions (Visual Aesthetics, Visual Fidelity, and Visual Plausibility) covering 15 specific criteria. AesRM is trained on expert feedback from a dataset of 2,500 video pairs, so the models can both predict which video in a pair is preferred and generate interpretable reasoning for that choice. Training follows a three-stage process that includes atomic aesthetic capability learning and reinforcement learning, and the resulting models show improved performance and robustness compared to existing methods. AesRM has also been used to enhance the video generation model Wan2.2, resulting in noticeable aesthetic improvements.
Impact: Introduces a new framework and models for evaluating and improving video generation aesthetics, potentially impacting content creation tools.
Ranking rationale: This is a research paper detailing a new model and benchmark for video aesthetics.
AI-generated summary · Google Gemini · from 2 sources.