MM-DiT
PulseAugur coverage of MM-DiT — every cluster mentioning MM-DiT across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
New method improves AI portrait generation by balancing alignment, realism, and aesthetics
Researchers have developed a new method to improve human portrait generation in text-to-image diffusion models, addressing the common trade-offs between text-image alignment, realism, and aesthetics. Their approach uses…
-
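Galaxy General's LDA-1B model unifies diverse data, ushering in a GPT-2 moment for embodied AI
Galaxy General LDA has released LDA-1B, a 1.6-billion-parameter model designed to unify the use of diverse data sources for embodied AI. The model uses a novel world-action fusion approach that lets it learn from a wide range of data, including virtual simulation, real-world footage, and even noisy or unlabeled inputs. By breaking down data silos, LDA-1B aims to overcome the limitations of earlier embodied AI models and usher in an era of scalable, general-purpose robot intelligence.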
-
UniSonate model unifies speech, music, and sound effect generation
Researchers have developed UniSonate, a novel unified framework for generating speech, music, and sound effects using natural language instructions. This model addresses the fragmentation in generative audio by reconcil…