Google released DiffusionGemma, a 26-billion-parameter open model that generates text through diffusion rather than token-by-token prediction, achieving roughly 1,000 tokens per second on a single H100 GPU. The approach trades output quality for speed, positioning it as an experimental tool for developers.
DiffusionGemma applies diffusion techniques, traditionally used in image generation, to text production. Instead of predicting one token at a time like standard autoregressive models, the system starts from noise and iteratively refines it into coherent output, much as image diffusion models turn random noise into pictures.
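The refinement loop described above can be illustrated with a toy sketch. This is not DiffusionGemma's actual algorithm, which the article does not detail. It assumes a masked-token style of diffusion in which every position starts as a placeholder and a few positions are committed in parallel at each step. The names `toy_denoiser` and `diffusion_decode` are hypothetical, and the "model" simply looks up a fixed target sentence and attaches a random confidence score.

```python
import random

MASK = "<mask>"

def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a trained model.

    For every still-masked position it proposes a token and a confidence.
    Here the proposal is read from a fixed target sentence and the
    confidence is random; a real model would predict both.
    """
    return {i: (target[i], rng.random())
            for i, tok in enumerate(tokens) if tok == MASK}

def diffusion_decode(target, steps=4, seed=0):
    """Fill a fully masked sequence over a fixed number of refinement steps."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)      # start from pure "noise"
    history = [list(tokens)]           # snapshot after each step
    for step in range(steps):
        proposals = toy_denoiser(tokens, target, rng)
        if not proposals:
            break
        remaining_steps = steps - step
        # Commit an even share of the remaining positions (ceiling division),
        # choosing the proposals with the highest confidence.
        k = -(-len(proposals) // remaining_steps)
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
        history.append(list(tokens))
    return tokens, history

if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the dog".split()
    final, snapshots = diffusion_decode(sentence, steps=4, seed=0)
    for n, snap in enumerate(snapshots):
        print(n, " ".join(snap))
```

The point of the sketch is the control flow: the number of model passes is set by `steps`, not by the length of the output, which is where a parallel decoder can gain speed over one-token-at-a-time generation.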
Performance metrics show significant speed gains. On a single Nvidia H100 GPU, DiffusionGemma reaches approximately 1,000 tokens per second, roughly four times faster than comparable autoregressive models. This speed advantage could make parallel text generation practical for applications requiring rapid output.
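As a rough illustration of what those figures mean in practice, the arithmetic can be worked through in a few lines. The 250 tokens-per-second baseline is only implied by the article's "roughly four times faster" claim and is not a measured number, and the 2,000-token response length is a hypothetical chosen for the example.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to emit num_tokens at a steady throughput."""
    return num_tokens / tokens_per_second

DIFFUSION_TPS = 1000                     # reported: ~1,000 tokens/s on one H100
SPEEDUP = 4                              # reported: ~4x faster than autoregressive
BASELINE_TPS = DIFFUSION_TPS / SPEEDUP   # implied baseline, not a measurement

RESPONSE_TOKENS = 2000                   # hypothetical response length

diffusion_time = generation_seconds(RESPONSE_TOKENS, DIFFUSION_TPS)
baseline_time = generation_seconds(RESPONSE_TOKENS, BASELINE_TPS)

print(f"diffusion: {diffusion_time:.1f} s, autoregressive: {baseline_time:.1f} s")
```

Under these assumptions the same response takes about 2 seconds instead of about 8, ignoring prompt processing and any other overhead.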
The trade-off is output quality, which lags behind traditional language models and limits immediate production use. Google acknowledges this limitation by framing DiffusionGemma as an experimental offering aimed at developer exploration rather than a direct replacement for existing models.
The open release invites the research community to investigate diffusion-based text generation further. At 26 billion parameters, the model sits in a middle tier: larger than small instruction-tuned models but smaller than frontier models.
Diffusion-based text generation remains relatively unexplored compared to autoregressive approaches. Success here could reshape how text AI systems balance speed and quality, particularly for use cases where faster generation matters more than perfect outputs. The experimental designation suggests Google is gathering feedback before any broader deployment.