Google DeepMind has released Gemma 4 12B, an open-weights multimodal model designed for efficient local deployment. The 11.95-billion-parameter model processes text, images, audio, and video through a single unified pathway, eliminating the need for separate vision and audio encoders. This architecture lets it run on devices with as little as 16 GB of memory, making it suitable for offline applications such as transcription, summarization, and local coding assistants.
AI IMPACT Enables more capable local multimodal applications by reducing computational overhead.
RANK_REASON New model release from a frontier lab with system card.
AI-generated summary · Google Gemini · from 1 source.