New model HieraCount improves object counting with multi-grained approach

By PulseAugur Editorial Team · [1 source] · 2026-05-11 17:32

Researchers have introduced a framework for open-world object counting that targets a known weakness of current vision-language models (VLMs): they often fail to count the objects a user actually intends. The authors recast counting as a multi-grained problem in which visual exemplars and detailed text prompts, including negative prompts, together specify the target's appearance and semantic granularity. To address the lack of training data for this setting, they built an automated pipeline that combines 3D synthesis with VLM filtering to create KubriCount, which the summary describes as the largest dataset for counting tasks. Their model, HieraCount, uses both text and visual exemplars and is reported to significantly improve multi-grained counting accuracy and to generalize to real-world scenarios.
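To make the idea of counting granularity concrete, here is a minimal toy sketch in Python. It is not HieraCount's implementation: it assumes each object in a scene already carries ground-truth attributes, whereas the actual model works from images, text prompts, and visual exemplars. The names `SceneObject`, `CountQuery`, `matches`, and `count` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """A toy object with ground-truth labels (an assumption of this sketch)."""
    category: str  # coarse label, e.g. "mug"
    color: str     # finer-grained attribute


@dataclass
class CountQuery:
    """Hypothetical query: attributes to require, plus negative prompts."""
    include: dict                                # attributes the target must match
    exclude: list = field(default_factory=list)  # attribute dicts to reject


def matches(obj: SceneObject, attrs: dict) -> bool:
    """True when the object has every attribute value in attrs."""
    return all(getattr(obj, key) == value for key, value in attrs.items())


def count(scene: list, query: CountQuery) -> int:
    """Count objects that match the query and match no negative prompt."""
    return sum(
        1
        for obj in scene
        if matches(obj, query.include)
        and not any(matches(obj, neg) for neg in query.exclude)
    )


scene = [
    SceneObject("mug", "red"),
    SceneObject("mug", "red"),
    SceneObject("mug", "blue"),
    SceneObject("bowl", "red"),
]

# Coarse granularity: every mug counts.
coarse = count(scene, CountQuery(include={"category": "mug"}))
# Fine granularity: only red mugs count.
fine = count(scene, CountQuery(include={"category": "mug", "color": "red"}))
# Negative prompt: mugs, but not red ones.
negated = count(scene, CountQuery(include={"category": "mug"}, exclude=[{"color": "red"}]))

print(coarse, fine, negated)  # prints "3 2 1"
```

The same scene yields three different counts depending on how the target is specified, which is the ambiguity that making granularity explicit is meant to resolve.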

Impact: Introduces a more robust method for object counting, potentially improving applications that rely on visual scene understanding and quantification.

Ranking rationale: The cluster contains a research paper detailing a new model and dataset for object counting.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV TIER_1 (TL) · Weidi Xie · 2026-05-11 17:32

Count Anything at Any Granularity

Open-world object counting remains brittle: despite rapid advances in vision-language models (VLMs), reliably counting the objects a user intends is far from solved. We argue that a central reason is that counting granularity is left implicit; users may refer to a specific identi…

