Brief · PulseAugur

RESEARCH · Hugging Face Trending Models (CA) · 5d · [2 sources]

meituan-longcat/LongCat-Video-Avatar-1.5

Meituan-Longcat has released LongCat-Video-Avatar 1.5, an open-source framework for audio-driven human video generation. The upgraded version uses an improved Whisper-Large audio encoder for more natural lip-syncing and adds stability improvements for consistent identity and temporal coherence. The model supports tasks such as AT2V (audio-text-to-video) and ATI2V (audio-text-image-to-video), generalizes to diverse styles including anime and animals, and offers efficient 8-step inference.

IMPACT Enables creation of diverse avatar videos from audio, which could change how content is produced and how virtual interactions are staged.

Hugging Face
meituan-longcat
Whisper-Large
LongCat-Video-Avatar 1.5