Diffusion Transformer
PulseAugur coverage of Diffusion Transformer: every cluster mentioning Diffusion Transformer across labs, papers, and developer communities, ranked by signal.
3 days with sentiment data
-
Omni2Sound model unifies video-to-audio and text-to-audio generation with new dataset
Researchers have developed Omni2Sound, a unified diffusion model capable of generating audio from video, text, or a combination of both. The model addresses challenges in data scarcity and cross-task competition by intr…
-
New Keyframe-Driven Method Enhances Video Virtual Try-On Realism
Researchers have introduced KeyTailor, a new framework designed to improve video virtual try-on (VVT) by addressing challenges in capturing garment dynamics and maintaining background consistency. The method utilizes a …
-
X-WAM model unifies robotic action and 4D world synthesis with asynchronous denoising
Researchers have developed X-WAM, a novel Unified 4D World Model designed to integrate real-time robotic action execution with high-fidelity 4D world synthesis. This framework addresses limitations in previous models by…
-
UniSER foundation model unifies soft effects removal in images
Researchers have developed UniSER, a novel foundation model designed to address a variety of soft visual degradations in digital images, such as lens flare, haze, shadows, and reflections. Unlike previous specialized mo…
-
MetaSR framework uses Diffusion Transformer for adaptive metadata in generative super-resolution
Researchers have developed MetaSR, a novel framework for generative super-resolution that adaptively selects and injects relevant metadata to enhance image and video quality. This Diffusion Transformer-based approach is…
-
Audio-Omni framework unifies audio generation, editing, and understanding
Researchers have introduced Audio-Omni, a novel framework designed to unify audio understanding, generation, and editing across diverse domains like speech, music, and general sounds. This system integrates a frozen Mul…
-
New REDEdit framework enables mask-free local image editing with diffusion transformers
Researchers have developed REDEdit, a novel adapter framework designed to enhance the precision of local image editing in large diffusion transformers (DiTs). This system retrofits existing DiTs without altering their c…