Brief

last 24h

[3/3] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.CV · English (EN) · 8h

SpatialAvatar-0: High-Quality 4D Head Avatar with Multi-Stage Reconstruction

Researchers have introduced SpatialAvatar-0, a method for generating high-quality 4D head avatars from limited source images. It uses a shared FLAME-mesh-bound Gaussian representation that supports both generalizable feed-forward prediction and efficient per-subject refinement. SpatialAvatar-0 achieves state-of-the-art results on cross-domain benchmarks, outperforming existing methods such as GAGAvatar and GeoAvatar with significantly lower computational requirements.

IMPACT Advances 4D head avatar generation, potentially improving telepresence and AR/VR applications with more efficient and higher-quality results.

RESEARCH · arXiv cs.CV · English (EN) · 2w · [2 sources]

CogPortrait: Fine-Grained Eye-Region Control in Portrait Animation via Hierarchical Agent Planning

Researchers have introduced CogPortrait, a two-stage framework for generating portrait animations with fine-grained control over the eye region. In the first stage, three chain-of-thought multimodal large language model (MLLM) agents translate high-level labels into detailed facial keypoints. In the second, a DiT-based video generation backbone synthesizes the animation, with additional techniques aimed at visual quality and identity consistency, particularly in challenging boundary cases.

IMPACT Offers more precise control over facial features such as the eyes in portrait animation, potentially improving the realism and expressiveness of AI-generated characters.

RESEARCH · arXiv cs.CV · English (EN) · 1mo

EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence

Researchers have developed EAD-Net, a diffusion model for generating expressive talking head videos with accurate lip synchronization and emotional facial expressions. The model uses SyncNet supervision and Temporal Representation Alignment to prevent lip-sync degradation when semantic information is integrated. EAD-Net also includes a Spatio-Temporal Directional Attention mechanism for capturing global motion in long videos and a Temporal Frame graph Reasoning Module for frame-to-frame coherence.

IMPACT Introduces a new method for generating more semantically rich and temporally coherent talking head videos, potentially improving applications in virtual avatars and content creation.
- EAD-Net
- SyncNet
- MEAD
