Google DeepMind Releases Gemma 4 12B: An Encoder-Free Multimodal Model with Native audio that runs on a 16 GB laptop

Google DeepMind Releases Gemma 4 12B, an Encoder-Free Multimodal Model

By the PulseAugur editorial team · [1 source] · 2026-06-03 18:46

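Google DeepMind has released Gemma 4 12B, a new 12-billion-parameter multimodal model that handles text, images, audio, and video without separate encoders. This architecture lets the model run complex agentic workflows on consumer hardware with only 16 GB of RAM. The model is available under the Apache 2.0 license, with weights downloadable from Hugging Face and Kaggle, and it supports a range of inference stacks for local deployment.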
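To put the 16 GB figure in context, here is a rough, weights-only arithmetic sketch. It assumes exactly 12 billion parameters and a handful of common storage precisions. The source does not say which precision or quantization the 16 GB claim relies on, so the precisions listed are illustrative assumptions, and the function names are hypothetical. Activations, cache memory, and runtime overhead are ignored, so real usage would be higher.

```python
# Rough weights-only memory estimate for a 12-billion-parameter model.
# Assumption: the parameter count is exactly 12e9. Activations, caches and
# runtime overhead are NOT included, so actual memory use would be higher.

PARAMS = 12_000_000_000

# Bytes per parameter for common storage precisions. These are illustrative
# choices, not formats confirmed by the source article.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_footprint_gb(params: int, bytes_per_param: float) -> float:
    """Return the weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits_in_budget(params: int, bytes_per_param: float, budget_gb: float = 16.0) -> bool:
    """Return True if the weights alone are smaller than the memory budget."""
    return weight_footprint_gb(params, bytes_per_param) < budget_gb


if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        size = weight_footprint_gb(PARAMS, width)
        verdict = "fits" if fits_in_budget(PARAMS, width) else "does not fit"
        print(f"{name}: {size:.1f} GB of weights, {verdict} in 16 GB")
```

Under these assumptions, 16-bit weights alone would take 24 GB, while 8-bit or 4-bit weights would take 12 GB or 6 GB. That suggests a 16 GB machine would need reduced-precision weights, although the source does not confirm this.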

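Impact: Brings advanced multimodal AI capabilities to consumer hardware, which may accelerate the development and deployment of local agents.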

Ranking rationale: A new model release from a frontier lab, with system card details.


AI-generated summary · Google Gemini · from 1 source.


Sources [1]

MarkTechPost · Asif Razzaq · 2026-06-03 18:46

Google DeepMind Releases Gemma 4 12B: An Encoder-Free Multimodal Model with Native audio that runs on a 16 GB laptop

Gemma 4 12B feeds vision and audio straight into the LLM backbone, running locally under an Apache 2.0 license.
