RESEARCH · arXiv cs.CV English(EN) · 1mo

EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence

Researchers have developed EAD-Net, a diffusion model for generating expressive talking head videos with accurate lip synchronization and emotional facial expressions. The model uses SyncNet supervision and Temporal Representation Alignment to prevent lip-sync degradation when semantic information is integrated. EAD-Net also includes a Spatio-Temporal Directional Attention mechanism that captures global motion in long videos and a Temporal Frame graph Reasoning Module that enforces frame-to-frame coherence.

IMPACT Introduces a method for generating semantically richer and more temporally coherent talking head videos, which could benefit virtual avatars and content creation.

EAD-Net
SyncNet
MEAD