Google releases Gemma 4 checkpoints; AI token costs rise

By PulseAugur Editorial · [1 sources] · 2026-06-06 06:05

Google has released Gemma 4 QAT checkpoints, enabling a 12-billion parameter model to run on 8GB of VRAM. This development comes as AI token costs are reportedly escalating at major companies like Uber and Microsoft. Concurrently, New York is considering a freeze on data center construction. AI

IMPACT Enables more efficient deployment of large models on consumer hardware, potentially lowering inference costs.

RANK_REASON Frontier-lab model release with system card. [lever_c_demoted from frontier_release: ic=1 ai=1.0]

Read on Mastodon — fosstodon.org →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-06 06:05

Google releases Gemma 4 QAT checkpoints fitting a 12B model in 8GB VRAM, AI token costs spiral out of control at Uber and Microsoft, and New York moves to freez

Google releases Gemma 4 QAT checkpoints fitting a 12B model in 8GB VRAM, AI token costs spiral out of control at Uber and Microsoft, and New York moves to freeze data center construction. https:// ai0.news/posts/2026-06-06-dail y-digest/ # AI # LocalLLM # OpenSource # DevTools

LINKS ai0.news/…/2026-06-06-daily-digest

COVERAGE [1]

Google releases Gemma 4 QAT checkpoints fitting a 12B model in 8GB VRAM, AI token costs spiral out of control at Uber and Microsoft, and New York moves to freez

RELATED ENTITIES

RELATED TOPICS