Brief

last 24h

[2/2] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
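
As a rough illustration of how four such signals could combine into a single 0–100 score, here is a minimal sketch. The weights, the 24-hour half-life, and the choice to apply time decay as a multiplier are all assumptions made for this example; the brief does not state its actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights for the three quality signals (they sum to 1.0).
WEIGHTS = {"authority": 0.40, "cluster_strength": 0.35, "headline_signal": 0.25}
# Hypothetical half-life: a story loses half its score every 24 hours.
HALF_LIFE_HOURS = 24.0


@dataclass
class StorySignals:
    authority: float         # source authority, normalized to 0..1
    cluster_strength: float  # how strongly sources cluster on the story, 0..1
    headline_signal: float   # headline quality signal, 0..1
    age_hours: float         # time since publication


def score(story: StorySignals) -> float:
    """Weighted sum of the quality signals, scaled by exponential time decay."""
    base = sum(weight * getattr(story, name) for name, weight in WEIGHTS.items())
    decay = 0.5 ** (story.age_hours / HALF_LIFE_HOURS)
    # Clamp so out-of-range inputs cannot push the score outside 0..100.
    return max(0.0, min(100.0, 100.0 * base * decay))


fresh = score(StorySignals(0.9, 0.6, 0.7, age_hours=1.0))
stale = score(StorySignals(0.9, 0.6, 0.7, age_hours=48.0))
print(f"fresh={fresh:.1f} stale={stale:.1f}")
```

Under these assumptions, two stories with identical signals differ only by age, so the older one always ranks lower.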

RESEARCH · arXiv cs.LG · English (EN) · 1d · [2 sources]

Video-Rate Streaming Stylization on a Vision-Aware MLLM-Conditioned Edit Diffusion: Asymmetric Batched Inference on a Distilled UNet + MLLM Text Encoder

Researchers present a streaming pipeline for video stylization built on an edit diffusion model, pairing a distilled U-Net with a vision-aware MLLM text encoder. Asymmetric pipelining and batched inference remove per-frame bottlenecks, enabling real-time video editing on consumer hardware. The system sustains over 27 frames per second on an RTX 3090 Ti and significantly higher rates on more powerful GPUs.

IMPACT Achieves video-rate throughput for stylization, potentially enabling real-time AI-powered video editing tools.
- RTX 4090
- Qwen3-VL
- RTX 5090
- RTX 3090 Ti
- DAVIS-2017

TOOL · arXiv cs.AI · English (EN) · 2d

Wavelet as Tokenizer: Preliminary Results on a Shared Wavelet Token Schema for Natural Signals

Researchers propose wavelets as a common tokenization method for audio, images, and video, replacing modality-specific latent grids. Their preliminary model, built on a Haar DWT/IDWT frontend and a shared coefficient-token layout, reports PSNR results on benchmark datasets for speech, images, and video. The study suggests that a unified wavelet token schema could be viable, and further experiments indicate that sparse training and energy selection methods offer efficient compression strategies.

IMPACT Proposes a unified tokenization approach for multi-modal AI, potentially simplifying model architectures and improving efficiency.
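
To make the Haar DWT/IDWT idea concrete, here is a minimal sketch of a single-level 1-D Haar transform, a simple keep-the-largest-coefficients selection, and a PSNR check. It is not the paper's model: the function names, the toy signal, the top-k rule standing in for "energy selection", and the flat approx-then-detail token layout are all assumptions made for illustration.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def keep_top_energy(coeffs, k):
    """Zero every coefficient except the k with the largest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for an exact match."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Toy signal with values in [0, 1] (hypothetical data, not from the paper).
signal = [0.1, 0.3, 0.2, 0.8, 0.9, 0.7, 0.4, 0.4]
approx, detail = haar_dwt(signal)

# Hypothetical token layout: approximation coefficients, then detail coefficients.
tokens = approx + detail
sparse = keep_top_energy(tokens, k=5)
half = len(tokens) // 2
reconstruction = haar_idwt(sparse[:half], sparse[half:])

print(f"PSNR with 5 of 8 coefficients kept: {psnr(signal, reconstruction):.2f} dB")
```

Because the transform is orthonormal, dropping the smallest coefficients removes the least signal energy, which is why magnitude-based selection is a natural baseline for sparse coefficient tokens.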
