Google DeepMind has released Gemma 4 12B, a new 12-billion-parameter multimodal model that integrates text, image, audio, and video processing without separate encoders. This novel architecture allows the model to run complex agentic workflows on consumer hardware with as little as 16 GB of RAM. The model is available under the Apache 2.0 license, with weights downloadable from Hugging Face and Kaggle, and supports various inference stacks for local deployment.
AI IMPACT Enables advanced multimodal AI capabilities on consumer hardware, potentially accelerating local agent development and deployment.
RANK_REASON New model release from a frontier lab with system card details.
AI-generated summary · Google Gemini · from 1 source.