New Ex-Omni Model Integrates 3D Facial Animation with LLMs

By PulseAugur Editorial · [1 source] · 2026-06-12 04:00

Researchers have developed Ex-Omni, an open-source model that adds 3D facial animation generation to omni-modal large language models (OLLMs). The model addresses the gap between LLMs' discrete reasoning and the continuous dynamics of facial motion by using speech units to provide temporal structure and hidden speech representations to supply facial cues. Ex-Omni aims to improve human-computer interaction by letting OLLMs produce synchronized speech and 3D facial animation, and it demonstrates faster generation and better audio-visual synchronization than existing cascaded methods.

IMPACT Enables more natural human-computer interaction by synchronizing LLM-generated speech with 3D facial animations.

RANK_REASON Research paper detailing a new model for multimodal generation.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Haoyu Zhang, Zhipeng Li, Yiwen Guo, Tianshu Yu · 2026-06-12 04:00

Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models

arXiv:2602.07106v2 · Abstract: Omni-modal large language models (OLLMs) aim to unify multimodal understanding and generation, yet extending them to jointly produce speech and 3D facial animation remains largely unexplored despite its importance for natu…
