PulseAugur

Meituan-Longcat releases open-source avatar video generator

Meituan-Longcat has released LongCat-Video-Avatar 1.5, an open-source framework for audio-driven human video generation. The upgrade uses an improved Whisper-Large audio encoder for more natural lip-syncing and adds stability improvements that keep identity consistent and motion temporally coherent. The model supports tasks such as audio-text-to-video (AT2V) and audio-text-image-to-video (ATI2V), generalizes to diverse styles including anime and animals, and offers efficient 8-step inference.

IMPACT Enables creation of diverse avatar videos from audio, potentially impacting content creation and virtual interactions.

RANK_REASON The cluster describes the release of an open-source model framework with technical details and evaluation metrics.

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →

COVERAGE [2]

  1. Hugging Face Trending Models TIER_1 (CA) · meituan-longcat

    meituan-longcat/LongCat-Video-Avatar-1.5

    0 downloads · 132 likes

  2. r/StableDiffusion TIER_2 English (EN) · /u/Turbulent_Corner9895

    LongCat-Video-Avatar 1.5 Release

    https://www.reddit.com/r/StableDiffusion/comments/1tm5oxh/longcatvideoavatar_15_release/