Google has released Gemma 4 QAT checkpoints, enabling a 12-billion parameter model to run on 8GB of VRAM. This development comes as AI token costs are reportedly escalating at major companies like Uber and Microsoft. Concurrently, New York is considering a freeze on data center construction. AI
IMPACT Enables more efficient deployment of large models on consumer hardware, potentially lowering inference costs.
RANK_REASON Frontier-lab model release with system card. [lever_c_demoted from frontier_release: ic=1 ai=1.0]
Read on Mastodon — fosstodon.org →
AI-generated summary · Google Gemini · from 1 sources. How we write summaries →