PulseAugur
English(EN) RT-Counter: Real-Time Text-Guided Open-Vocabulary Object Counting

New AI models tackle object counting efficiently and robustly

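Researchers have introduced MambaCount, a new framework for text-guided open-vocabulary object counting that uses Spatial Sparse State Space Duality (S^4D) blocks to overcome the limitations of Transformers in dense scenes with large scale variations. MambaCount addresses the causal-modeling issue in Mamba and the high entropy in spatial token responses, and achieves state-of-the-art performance on the FSC-147 dataset with linear complexity. Meanwhile, RT-Counter offers a real-time solution for the same task, balancing accuracy and efficiency through a visual prototype textualization module and woven Transformer layers; it achieves competitive results while being faster and more parameter-efficient. In addition, a new benchmark, Robust-TOOC, is proposed for evaluating object counting under adverse conditions, along with Dual-TTT, a test-time training framework designed to improve robustness without changing existing architectures.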

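Impact: These advances in object counting could improve AI's ability to understand and interact with complex visual scenes, with implications for applications such as robotics, autonomous driving, and image analysis.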

Ranking rationale: Multiple research papers introduce new models and benchmarks in computer vision.


AI-generated summary · Google Gemini · from 6 sources.

Sources [6]

  1. arXiv cs.CL TIER_1 English(EN) · Hao-Yuan Ma, Li Zhang, Minjie Qiang, Jie Gao ·

    MambaCount: Efficient Text-guided Open-vocabulary Object Counting with Spatial Sparse State Space Duality Block

    arXiv:2606.17650v1 Announce Type: cross Abstract: Text-guided Open-vocabulary Object Counting (TOOC) aims to estimate the number of objects described by text prompts, which is particularly challenging in dense scenes with large scale variations. Existing TOOC approaches predomina…

  2. arXiv cs.CL TIER_1 English(EN) · Jie Gao ·

    MambaCount: Efficient Text-guided Open-vocabulary Object Counting with Spatial Sparse State Space Duality Block

    Text-guided Open-vocabulary Object Counting (TOOC) aims to estimate the number of objects described by text prompts, which is particularly challenging in dense scenes with large scale variations. Existing TOOC approaches predominantly rely on Transformers, whose quadratic complex…

  3. arXiv cs.CV TIER_1 English(EN) · Hao-Yuan Ma, Li Zhang, Zhiwei Zhu, Jie Gao ·

    RT-Counter: Real-Time Text-Guided Open-Vocabulary Object Counting

    arXiv:2606.17561v1 Announce Type: new Abstract: Text-guided open-vocabulary object counting (TOOC) aims to count objects belonging to the categories specified by natural language descriptions. Although vision-language pre-trained models have been successfully applied to TOOC tasks,…

  4. arXiv cs.CV TIER_1 English(EN) · Hao-Yuan Ma, Yuda Zou, Li Zhang, Yongchao Xu ·

    Test-Time Training for Robust Text-Guided Open-Vocabulary Object Counting

    arXiv:2606.17601v1 Announce Type: new Abstract: Text-guided Open-vocabulary Object Counting (TOOC) enables counting arbitrary object categories specified by text prompts, offering substantially greater flexibility than conventional closed-set counting. However, existing TOOC meth…

  5. arXiv cs.CV TIER_1 English(EN) · Yongchao Xu ·

    Test-Time Training for Robust Text-Guided Open-Vocabulary Object Counting

    Text-guided Open-vocabulary Object Counting (TOOC) enables counting arbitrary object categories specified by text prompts, offering substantially greater flexibility than conventional closed-set counting. However, existing TOOC methods are developed and evaluated primarily on ide…

  6. arXiv cs.CV TIER_1 English(EN) · Jie Gao ·

    RT-Counter: Real-Time Text-Guided Open-Vocabulary Object Counting

    Text-guided open-vocabulary object counting (TOOC) aims to count objects belonging to the categories specified by natural language descriptions. Although vision-language pre-trained models have been successfully applied to TOOC tasks, they still struggle with fine-grained spatial u…