TOP AI MODELS HALVE PERFORMANCE ON COMPLEX CHARTS

AI DESK ■ 2 MIN READ

SUN, APR 19, 2026

■ AI-SUMMARIZED FROM 1 SOURCE

A new benchmark reveals that even the best AI models struggle significantly with complicated visualizations. The RealChart2Code test shows leading proprietary models lose nearly 50% of their performance when handling complex charts built from real-world data.

Researchers have introduced RealChart2Code, a benchmark designed to measure how well AI models can interpret and process complex data visualizations. The test evaluates 14 leading AI models against charts and graphics constructed from actual datasets, revealing a substantial performance gap compared to simpler chart interpretation tasks.

The findings expose a critical weakness in current AI capabilities. While models excel at basic chart analysis, their ability to handle real-world complexity drops dramatically. This performance degradation affects both open-source and proprietary models, with even top-tier commercial systems experiencing roughly 50% accuracy loss.

The implications are significant for practical applications. Businesses and organizations often work with intricate visualizations, such as multi-layered dashboards, overlapping datasets, and complex annotations, that current AI models struggle to parse accurately. This gap between simple and complex chart interpretation could limit AI adoption in data analysis and business intelligence roles.

RealChart2Code addresses a notable blind spot in existing benchmarks. Previous tests typically focus on simplified or standardized visualizations that don't reflect the messy reality of production data. The new benchmark uses real-world datasets to create charts that more accurately represent actual use cases.

The benchmark's findings suggest that improving AI performance on complex visualizations should be a priority for model developers. Enhanced chart understanding could unlock valuable applications in data science, financial analysis, scientific research, and report generation.

These results highlight the gap between AI capabilities on controlled tasks versus real-world scenarios. As organizations increasingly rely on AI for data interpretation, addressing this performance drop will be essential for building trustworthy systems that can handle production environments.

■ SOURCES

► The Decoder

■ SUMMARY WRITTEN BY AI FROM THE LINKS ABOVE

