Google DeepMind released Gemma 4 12B, an open-source multimodal model that runs on laptops with just 16GB of VRAM or unified memory. The model processes text, images, and audio natively while nearly matching the benchmark performance of Google's larger 26B model.
Google DeepMind's new Gemma 4 12B model brings multimodal AI capabilities to consumer hardware. The 12-billion-parameter model handles text, images, and audio without separate encoders for each modality; this encoder-free architecture reduces computational overhead.
The model runs on any laptop with 16GB of VRAM or unified memory, making it accessible to developers and users without high-end GPUs. This represents a significant shift toward practical, on-device AI even as many companies pursue increasingly large models.
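To put the 16GB figure in perspective, the back-of-the-envelope sketch below estimates how much memory the weights of a 12-billion-parameter model would occupy at different numeric precisions. The precisions shown are assumptions for illustration: the release details above do not say which weight format Gemma 4 12B ships in, and the estimate ignores activations, context cache, and operating system overhead. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


PARAMS_BILLIONS = 12  # parameter count stated in the article
BUDGET_GB = 16        # memory figure stated in the article

# Assumed precisions; the actual shipped format is not stated in the article.
for bits in (16, 8, 4):
    gb = weight_memory_gb(PARAMS_BILLIONS, bits)
    verdict = "fits" if gb < BUDGET_GB else "does not fit"
    print(f"{bits}-bit weights: {gb:.1f} GB ({verdict} in {BUDGET_GB} GB)")
```

Under these assumptions, 16-bit weights alone would need about 24 GB, while 8-bit and 4-bit weights would need about 12 GB and 6 GB, which suggests that some form of reduced-precision weights is likely involved in fitting a 16GB machine.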
Performance and Efficiency
Gemma 4 12B nearly matches the performance of Google's larger 26B model in benchmarks, achieving comparable results with fewer than half the parameters. This efficiency comes from optimized encoding schemes and improved token prediction methods, which raise output quality without a proportional increase in parameter count.
Licensing and Availability
The model ships under an Apache 2.0 license, which permits commercial use, modification, and redistribution, subject to the license's attribution and notice requirements. This open-source approach allows developers to deploy the model locally, avoiding cloud service dependencies and keeping data processing on-device.
Market Context
While many AI providers focus on scaling up to larger, more powerful models, Google continues investing in the efficiency side of the market. This dual approach addresses different use cases, from enterprise deployments requiring maximum capability to edge and consumer applications needing reasonable performance with minimal infrastructure.
The release underscores ongoing competition in open-source AI, where model efficiency and local deployment have become key differentiators. As multimodal AI becomes more common, the ability to run these systems on standard consumer hardware removes barriers to adoption and deployment.