diffusion model
PulseAugur coverage of diffusion model — every cluster mentioning diffusion model across labs, papers, and developer communities, ranked by signal.
6 days with sentiment data
-
Complete-muE framework optimizes hyperparameter transfer for MoE models
Researchers have introduced Complete-muE, a novel framework designed to optimize hyperparameter transfer for Mixture-of-Experts (MoE) models. This system addresses the limitations of existing tools by enabling effective…
-
GlowGS improves 3D Gaussian Splatting for nighttime scenes
Researchers have developed GlowGS, a novel method for improving 3D Gaussian Splatting (3DGS) in nighttime scenes, particularly in areas with glow. Existing 3DGS methods struggle with low-light conditions due to a lack o…
-
Diffusion model speedup hinges on overhead reduction, not just fewer steps
Single-image diffusion model inference is bottlenecked by kernel launch overhead and attention memory traffic rather than by raw compute. Optimizing with `torch.compile` in `reduce-overhead` mode, employing a fused …
-
New diffusion model erases video subtitles in one step
Researchers have developed SEDiT, a novel one-stage diffusion transformer model designed for mask-free video subtitle erasure. This approach directly removes subtitles without requiring a pre-extracted mask, improving u…
-
Diffusion model and LSTM optimize radiotherapy plans
Researchers have developed a novel diffusion model and LSTM-based approach for optimizing radiotherapy plans, specifically for Volumetric Modulated Arc Therapy (VMAT). This method aims to significantly reduce the planni…
-
Cold diffusion tackles percussive audio dereverberation
Researchers have developed a novel cold diffusion framework to address the challenge of dereverberating percussive audio signals, such as drums, which have been largely overlooked in favor of speech processing. This new…
-
New theory resolves instability in MeanFlow generative models
Researchers have developed a theoretical framework to address instability issues in MeanFlow training, a one-step generative modeling technique. They identified that the conditional velocity field is misused in the loss…
-
New BRIDGE method improves local image editing by controlling mask influence
Researchers have developed a new method called BRIDGE for local image editing, which aims to modify specific regions of an image while keeping the background intact. This approach tackles the issue of "mask-shape bias,"…
-
X-Cache accelerates world model inference for autonomous driving simulations
Researchers have developed X-Cache, a novel method to accelerate the inference of autoregressive world models used in autonomous driving simulations. This technique caches residual computations across generation chunks …
-
AI researchers explore learning the integral of diffusion models
A new paper explores the mathematical concept of integrating diffusion models, which are foundational to many generative AI systems. The research delves into the theoretical underpinnings of these models, potentially le…
-
StyleShield framework evades AI content detectors with controllable style transfer
Researchers have developed StyleShield, a novel framework that manipulates text style in the continuous token embedding space to evade AI-generated content detectors. This method utilizes a DiT backbone with cross-atten…
-
Ortho-Hydra paper introduces new method to improve LoRA fine-tuning for diffusion transformers
Researchers have introduced Ortho-Hydra, a novel re-parameterization technique designed to improve LoRA fine-tuning for diffusion transformers (DiT) on multi-style data. This method addresses the issue of 'style bleed' …
-
Mamoda2.5 model integrates multimodal AI with efficient DiT-MoE for top video editing
Researchers have introduced Mamoda2.5, a unified AR-Diffusion framework designed for multimodal understanding and generation. This model utilizes a Diffusion Transformer backbone enhanced with a Mixture-of-Experts (MoE)…
-
New AI methods enhance time series forecasting accuracy and interpretability
Researchers have introduced several new methods for time-series forecasting, aiming to improve accuracy and generalization. MeLISA, a latent-free autoregressive model, enhances rollout efficiency and long-horizon statis…
-
Video Generation with Predictive Latents
Researchers have developed several new methods to improve the efficiency and quality of visual generative models. DC-DiT introduces dynamic chunking to Diffusion Transformers, adaptively compressing visual data for fast…
-
YOSE framework speeds up video object removal with token selection
Researchers have developed YOSE, a new framework designed to significantly speed up video object removal using Diffusion Transformer (DiT) models. YOSE achieves this efficiency by adaptively selecting only the essential…
-
Researchers release TripVVT dataset and framework for in-the-wild video virtual try-on
Researchers have introduced TripVVT, a new framework for in-the-wild video virtual try-on, addressing limitations caused by scarce data and improper mask usage. The system utilizes a Diffusion Transformer and a stable h…
-
Omni2Sound model unifies video, text to audio generation with new dataset
Researchers have developed Omni2Sound, a unified diffusion model capable of generating audio from video, text, or a combination of both. The model addresses challenges in data scarcity and cross-task competition by intr…
-
New keyframe-driven method enhances video virtual try-on realism
Researchers have introduced KeyTailor, a new framework designed to improve video virtual try-on (VVT) by addressing challenges in capturing garment dynamics and maintaining background consistency. The method utilizes a …
-
X-WAM model unifies robotic action and 4D world synthesis with asynchronous denoising
Researchers have developed X-WAM, a novel Unified 4D World Model designed to integrate real-time robotic action execution with high-fidelity 4D world synthesis. This framework addresses limitations in previous models by…