Brief

last 24h

222 sources

AI news from multiple sources, clustered, deduplicated, and scored 0–100 on four factors: authority, cluster strength, headline signal, and time decay.
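The brief does not publish its scoring formula, so the following is only a minimal sketch of how four such factors could be combined into a 0–100 score. Everything in it is an assumption made for illustration: the `Story` fields, the `WEIGHTS` values, the 48-hour half-life, and the choice to apply time decay as a multiplier rather than as a fourth additive term.

```python
from dataclasses import dataclass

# Hypothetical weights (assumption): the brief does not state its real weighting.
WEIGHTS = {"authority": 0.4, "cluster": 0.3, "headline": 0.3}
HALF_LIFE_HOURS = 48.0  # assumed half-life for the time-decay factor


@dataclass
class Story:
    authority: float  # source authority, normalized to 0..1
    cluster: float    # cluster strength, normalized to 0..1
    headline: float   # headline signal, normalized to 0..1
    age_hours: float  # hours since publication


def time_decay(age_hours: float, half_life: float = HALF_LIFE_HOURS) -> float:
    """Exponential decay: 1.0 for a new story, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life)


def score(story: Story) -> int:
    """Weighted sum of the clamped factors, scaled by time decay, on 0..100."""
    base = sum(
        weight * min(max(getattr(story, name), 0.0), 1.0)
        for name, weight in WEIGHTS.items()
    )
    return round(100 * base * time_decay(story.age_hours))
```

Under these assumptions, a story with maximal factor values scores 100 when new and 50 once it is one half-life old; a real ranking system would likely tune the weights and decay rate against editorial judgment.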

SIGNIFICANT · Hugging Face Trending Models Deutsch(DE) · 6d

ernie-research/NAVA

Baidu's ERNIE team has released NAVA, a 6.3-billion-parameter model that generates synchronized audio and video from a single text prompt. NAVA uses an Align-then-Fuse MMDiT architecture and achieves state-of-the-art audio-visual synchronization and video quality on benchmarks such as Verse-Bench. It can generate one minute of 720p video with synchronized audio in approximately one minute, and it offers precise multi-timbre control and camera control described in natural language.

IMPACT Sets new SOTA on audio-visual synchronization benchmarks with a smaller parameter count, potentially lowering the barrier for high-quality AV generation.

TOOL · Hugging Face Trending Models Bahasa(ID) · 6d

baidu/NAVA

Baidu has released NAVA, a 6.3-billion-parameter model that generates synchronized audio and video from a single text prompt. The model uses an Align-then-Fuse MMDiT architecture and achieves state-of-the-art performance on audio-visual synchronization benchmarks. NAVA can produce one-minute 720p videos with stereo audio in approximately one minute, and it offers precise control over speaker voice timbre.

IMPACT Sets new SOTA on audio-visual synchronization benchmarks with a significantly smaller parameter count.