Brief

[2/2] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · Hugging Face Daily Papers English(EN) · 2w

Linearizing Vision Transformer with Test-Time Training

Researchers have developed a method to convert pretrained Vision Transformer models into linear-complexity Test-Time Training (TTT) architectures. The approach aligns architectural and representational properties so that weights transfer efficiently from Softmax attention models. Applied to Stable Diffusion 3.5, it yields SD3.5-T^5, which achieves comparable image quality with significantly faster inference after minimal fine-tuning.

IMPACT Enables faster inference for large vision models by adapting existing architectures.

RESEARCH · arXiv cs.CV English(EN) · 1mo · [2 sources]

Linearizing Vision Transformer with Test-Time Training

Researchers have developed a method to adapt pretrained Softmax attention models to linear-complexity architectures using Test-Time Training (TTT). The approach addresses the representational gap between the attention mechanisms through architectural and representational alignment. Applied to Stable Diffusion 3.5, it yields a new model, SD3.5-T^5, which achieves comparable image quality with significantly faster inference after only one hour of fine-tuning.

IMPACT Accelerates inference for diffusion models by enabling efficient adaptation of pretrained weights to linear-complexity architectures.