PulseAugur

MUSE framework resolves visual tokenization trade-offs with topological orthogonality

Researchers have introduced MUSE, a framework designed to resolve manifold misalignment in visual tokenization. It uses Topological Orthogonality to decouple optimization within Transformers, so that structural gradients refine the attention topology while semantic gradients update feature values. According to the paper, MUSE breaks the trade-off between reconstruction fidelity and semantic abstraction, achieving state-of-the-art generation quality and improving linear probing performance.
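The summary does not say how the gradients are routed, so the following is only a toy illustration of the general idea, not the paper's implementation. It assumes a single attention layer whose logits stand in for the "attention topology" and whose value rows stand in for the "feature values". The names `decoupled_step`, `structural_target`, and `semantic_target` are hypothetical, and both losses are plain squared errors chosen for simplicity.

```python
import math

def softmax(row):
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(logits, values):
    # Row-wise softmax over the logits gives the attention pattern;
    # each output row is a weighted mix of the value rows.
    weights = [softmax(r) for r in logits]
    dim = len(values[0])
    return [[sum(w * v[j] for w, v in zip(wr, values)) for j in range(dim)]
            for wr in weights]

def sq_error(out, target):
    return sum((o - t) ** 2 for ro, rt in zip(out, target) for o, t in zip(ro, rt))

def numeric_grad(fn, matrix, eps=1e-5):
    # Central finite differences: fine for a toy, far too slow for real models.
    grad = [[0.0] * len(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, original in enumerate(row):
            row[j] = original + eps
            up = fn()
            row[j] = original - eps
            down = fn()
            row[j] = original  # restore the entry exactly
            grad[i][j] = (up - down) / (2 * eps)
    return grad

def decoupled_step(logits, values, structural_target, semantic_target, lr=0.1):
    # Assumption: the structural loss is only allowed to move the attention logits.
    g_logits = numeric_grad(
        lambda: sq_error(attend(logits, values), structural_target), logits)
    # Assumption: the semantic loss is only allowed to move the value features.
    g_values = numeric_grad(
        lambda: sq_error(attend(logits, values), semantic_target), values)
    new_logits = [[x - lr * g for x, g in zip(r, gr)] for r, gr in zip(logits, g_logits)]
    new_values = [[x - lr * g for x, g in zip(r, gr)] for r, gr in zip(values, g_values)]
    return new_logits, new_values
```

In this sketch, changing `semantic_target` leaves the updated logits untouched, and changing `structural_target` leaves the updated values untouched, which is the sense in which the two objectives are decoupled here. In an autodiff framework the same routing would more likely be written with stop-gradient operations than with finite differences.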

Impact: Introduces a new method for visual tokenization that could improve generative models and downstream perception tasks.

Ranking rationale: This is a research paper detailing a new framework and methodology for visual tokenization.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →


Sources [1]

  1. arXiv cs.CV (English) · Panqi Yang, Haodong Jing, Jiahao Chao, Tingyan Xiang, Li Lin, Yao Hu, Yang Luo, Yongqiang Ma

    MUSE: Resolving Manifold Misalignment in Visual Tokenization via Topological Orthogonality

    arXiv:2605.05646v1. Abstract: Unified visual tokenization faces a fundamental trade-off between high-fidelity pixel reconstruction (spatial equivariance) and semantic abstraction (conceptual invariance). We attribute this conflict to Manifold Misalignment: naive…