Researchers have introduced HuM-Eval, a new framework designed to better evaluate the quality of human motion in generated videos. Current metrics often miss fine-grained human details, leading to evaluations that don't align with human preferences. HuM-Eval employs a coarse-to-fine approach, first using a Vision Language Model for a general assessment and then analyzing 2D pose for anatomical correctness and 3D motion for stability. This method achieved a 58.2% correlation with human judgment, surpassing existing benchmarks.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a more accurate method for evaluating human motion in generated videos, potentially guiding future improvements in text-to-video models.
RANK_REASON Academic paper introducing a new evaluation framework for video generation models.