New model HieraCount improves object counting with multi-grained approach

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Researchers have introduced a new framework for open-world object counting, addressing the brittleness of current vision-language models in accurately identifying and counting objects based on user intent. They propose redefining counting as a multi-grained problem, where both visual examples and detailed text prompts, including negative prompts, specify the target appearance and semantic granularity. To overcome the data limitations for this approach, they developed an automated pipeline using 3D synthesis and VLM filtering to create KubriCount, the largest dataset for counting tasks. Their new model, HieraCount, leverages both text and visual exemplars to significantly improve multi-grained counting accuracy and generalize to real-world scenarios. AI

Summary written by gemini-2.5-flash-lite from 1 source. How we write summaries →

IMPACT Introduces a more robust method for object counting, potentially improving applications that rely on visual scene understanding and quantification.

RANK_REASON The cluster contains a research paper detailing a new model and dataset for object counting. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.CV →

COVERAGE [1]

arXiv cs.CV TIER_1 (TL) · Weidi Xie · 2026-05-11 17:32

Count Anything at Any Granularity

Open-world object counting remains brittle: despite rapid advances in vision-language models (VLMs), reliably counting the objects a user intends is far from solved. We argue that a central reason is that counting granularity is left implicit; users may refer to a specific identi…

COVERAGE [1]

Count Anything at Any Granularity

RELATED ENTITIES

RELATED TOPICS