PulseAugur
Adaptive Tokenisation Via Temporal Redundancy Masking And Latent Inpainting

New methods enable adaptive image and video tokenization

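Researchers have developed new adaptive tokenization methods for images and video that let models allocate compute dynamically according to visual complexity. AdaTok is a self-budgeting discrete 1D tokenizer that learns to adjust the number of tokens per image, achieving competitive fidelity with substantially fewer tokens on average. Separately, a new framework for adaptive video tokenization uses temporal redundancy masking and latent inpainting to achieve efficient, content-driven token allocation, yielding significant speedups at inference time.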

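Impact: These adaptive tokenization techniques could lead to more efficient AI models for image and video processing, lowering compute costs and improving inference speed.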

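Ranking rationale: This cluster contains two distinct research papers that introduce new adaptive tokenization methods for computer vision tasks.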


AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. arXiv cs.CV TIER_1 English(EN) · Xiaocheng Lu, Yuxi Chen, Jie Zhang, Jian Liu, Jingcai Guo, Fangqi Zhu, Tao Han, Song Guo ·

    AdaTok: Self-Budgeting Image Tokenization with Quality-Preserving Dynamic Tokens

    arXiv:2606.07185v1. Image tokenizers, from 2D grids to recent 1D sequences, typically encode every image with the same fixed number of tokens. Yet visual complexity is highly heterogeneous, so a uniform budget overspends on simple inputs and underserve…

  2. arXiv cs.CV TIER_1 English(EN) · Song Guo ·

    AdaTok: Self-Budgeting Image Tokenization with Quality-Preserving Dynamic Tokens

    Image tokenizers, from 2D grids to recent 1D sequences, typically encode every image with the same fixed number of tokens. Yet visual complexity is highly heterogeneous, so a uniform budget overspends on simple inputs and underserves complex ones. Existing elastic tokenizers expo…

  3. arXiv cs.CV TIER_1 English(EN) · Kevin Dave, Sai Aditya Patkuri, Chhaya Kumar Das, Gouranga Bala, R. Venkatesh Babu, Rajeshkumar SA ·

    Adaptive Tokenisation Via Temporal Redundancy Masking And Latent Inpainting

    arXiv:2606.06158v1. Adaptive video tokenisation seeks to dynamically allocate token budgets based on the underlying visual complexity of a sequence. Current continuous-regime approaches achieve this via iterative binarised searches or trained neural re…

  4. arXiv cs.CV TIER_1 English(EN) · Rajeshkumar SA ·

    Adaptive Tokenisation Via Temporal Redundancy Masking And Latent Inpainting

    Adaptive video tokenisation seeks to dynamically allocate token budgets based on the underlying visual complexity of a sequence. Current continuous-regime approaches achieve this via iterative binarised searches or trained neural regressors, while discrete methods often require a…