A new AI model called "Count Anything" can identify and count objects in any image using only text prompts, halving error rates compared to existing systems. The breakthrough addresses a persistent challenge in computer vision, though dense crowds and ambiguous terms still pose problems.
Researchers have developed "Count Anything," an AI model designed to count objects across virtually any visual context—from pedestrian crowds to microscopic cell samples—using simple text instructions.
The model represents a significant advance in object counting, a task that sounds straightforward but requires sophisticated understanding of visual complexity. In head-to-head testing, "Count Anything" reduces error rates by 50% compared to previous counting systems.
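To make the 50% figure concrete, here is a minimal sketch of how a relative reduction in counting error could be computed. The article does not say which metric was used. Mean absolute error (MAE) is assumed here because it is a common choice for counting benchmarks, and every number below is invented for illustration rather than taken from the model's evaluation.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and true object counts."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical counts for four images; these are NOT results from the model.
true_counts = [10, 20, 30, 40]
baseline_counts = [14, 16, 34, 36]  # each prediction is off by 4
improved_counts = [12, 18, 32, 38]  # each prediction is off by 2

baseline_mae = mean_absolute_error(baseline_counts, true_counts)
improved_mae = mean_absolute_error(improved_counts, true_counts)

# Relative reduction: the share of the baseline error that was removed.
relative_reduction = (baseline_mae - improved_mae) / baseline_mae

print(f"baseline MAE: {baseline_mae}")
print(f"improved MAE: {improved_mae}")
print(f"relative reduction: {relative_reduction:.0%}")
```

Under this assumed metric, halving the error means the average miss per image drops from four objects to two. It does not mean the model gets twice as many images exactly right.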
This improvement matters for practical applications. Researchers, medical professionals, and security analysts currently rely on manual counting or task-specific tools that only work for predetermined object types. A universal counting system could streamline workflows across industries, from epidemiology to retail inventory management.
The text-prompt interface makes the tool more accessible than traditional computer vision approaches. Instead of building separate models for different counting tasks, users simply describe what they want counted, and the system adapts.
However, the model has clear limitations. Extremely dense clusters of objects, such as large crowds or tightly packed particles, remain challenging. The system also struggles with ambiguous language, where a counting instruction could be interpreted in more than one way.
These constraints suggest the technology is production-ready for specific use cases but not yet a complete replacement for human expertise in edge cases. Developers will likely focus on refining performance in high-density scenarios and improving how the model interprets nuanced counting instructions.
The release follows a broader trend of AI models gaining versatility through natural language interfaces. Similar "anything" models have recently emerged for image generation, segmentation, and video understanding, each expanding what single AI systems can accomplish across varying contexts.
"Count Anything" adds to this momentum by tackling a quantification problem that bridges multiple scientific and commercial domains. As the model matures, expect adoption in research labs and enterprises where counting currently demands specialized expertise or manual labor.