New diffusion model erases video subtitles in one step

By the PulseAugur editorial team · [1 source] · 2026-05-14 14:37

Researchers have introduced SEDiT, a one-stage diffusion transformer for mask-free video subtitle erasure. It removes subtitles directly, without a pre-extracted mask, whereas existing two-stage methods depend on the precision of a separate segmentation step. SEDiT generates its output in a single step, a choice theoretically justified by Lipschitz continuity, and it uses a hybrid training strategy with first-frame conditioning to maintain long-term temporal consistency. Chunk-wise streaming inference lets the model handle high-resolution, long-duration videos efficiently.
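To make the chunk-wise streaming idea concrete, here is a minimal Python sketch of the control flow only. It is not the paper's implementation: `erase_chunk` is a hypothetical stand-in for the one-step model, frames are toy lists of pixel values, and the choice to condition each chunk on the last cleaned frame of the previous chunk is an assumption about how first-frame conditioning might carry context across chunk boundaries.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def erase_chunk(chunk: Sequence[Frame], reference: Optional[Frame]) -> List[Frame]:
    """Hypothetical placeholder for a one-step, mask-free eraser.

    A real model would run a single generation step over the chunk. This
    stub only zeroes "subtitle" pixels (values above 0.9) so the streaming
    loop can be exercised. The reference frame is accepted but unused here.
    """
    return [[0.0 if p > 0.9 else p for p in frame] for frame in chunk]


def stream_erase(
    frames: Sequence[Frame],
    chunk_size: int,
    model: Callable[[Sequence[Frame], Optional[Frame]], List[Frame]] = erase_chunk,
) -> List[Frame]:
    """Process a long video chunk by chunk, passing a reference frame forward."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    output: List[Frame] = []
    reference: Optional[Frame] = None  # no conditioning frame for the first chunk

    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]
        cleaned = model(chunk, reference)
        output.extend(cleaned)
        # Assumption: the last cleaned frame conditions the next chunk.
        reference = cleaned[-1]

    return output


if __name__ == "__main__":
    video = [[0.1, 0.95], [0.2, 1.0], [0.3, 0.5]]
    print(stream_erase(video, chunk_size=2))
```

Because only one chunk and one reference frame are held at a time, memory use in this sketch depends on `chunk_size` rather than on the total video length, which is the property that makes streaming attractive for long videos.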

Impact: Introduces a more efficient and effective method for video editing tasks such as subtitle removal.

Ranking rationale: Publication of an academic paper detailing a new AI model and methodology.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · TIER_1 · English (EN) · Yunlong Bai · 2026-05-14 14:37

SEDiT: Mask-Free Video Subtitle Erasure via One-step Diffusion Transformer

Recent breakthroughs in video diffusion models have significantly accelerated the development of video editing techniques. However, existing methods often rely on inpainting video frames based on masked input, which requires extracting the target video mask in advance, and the pr…

