Google prepares Gemma 4, focusing on text capabilities

By PulseAugur Editorial · [1 source] · 2026-06-03 15:10

Google is reportedly developing Gemma 4, a new iteration of its open-source large language model. According to a post on r/LocalLLaMA, a pull request in the Hugging Face Transformers library indicates that this version omits the specialized vision and audio towers, which suggests a focus on core text capabilities. The pull request also points to ongoing work on integrating the model into the library.

IMPACT Focus on text-only capabilities in Gemma 4 may streamline development for specific applications and indicate a strategic direction for Google's open-source models.

RANK_REASON The cluster discusses an upcoming model release based on a pull request, indicating ongoing development and potential future capabilities rather than a finalized product launch.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/duckyzz003 · 2026-06-03 15:10

Gemma 4 is coming - No Vision Tower - No Audio Tower

https://github.com/huggingface/transformers/pull/46385 (submitted by /u/duckyzz003)
