Brief

last 24h

[4/4] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
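
One way to read the 0–100 score is as a weighted blend of the four named components. The sketch below is illustrative only: the weights, the 12-hour half-life, the saturation point for cluster size, and all function and field names are assumptions, since the brief does not publish its actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights (sum to 1.0); the brief's real weighting is not stated.
WEIGHTS = {"authority": 0.35, "cluster": 0.25, "headline": 0.20, "recency": 0.20}
HALF_LIFE_HOURS = 12.0  # assumed half-life for the time-decay component


@dataclass
class Item:
    authority: float        # 0..1, assumed source-reputation score
    cluster_sources: int    # number of sources reporting the same story
    headline_signal: float  # 0..1, assumed headline-quality score
    age_hours: float        # time since publication


def clamp01(x: float) -> float:
    """Restrict a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def time_decay(age_hours: float, half_life: float = HALF_LIFE_HOURS) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life)


def cluster_strength(n_sources: int, saturation: int = 5) -> float:
    """Grows linearly with source count and saturates at `saturation` sources."""
    return min(max(n_sources, 0), saturation) / saturation


def score(item: Item) -> int:
    """Blend the four components into an integer score from 0 to 100."""
    parts = {
        "authority": clamp01(item.authority),
        "cluster": cluster_strength(item.cluster_sources),
        "headline": clamp01(item.headline_signal),
        "recency": time_decay(item.age_hours),
    }
    return round(100 * sum(WEIGHTS[k] * parts[k] for k in WEIGHTS))
```

Under these assumptions, a 2-hour-old story backed by three sources outranks the same story at 20 hours old, because only the recency term changes.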

FRONTIER RELEASE · Mastodon — fosstodon.org English(EN) · 2h · [3 sources]

Gemma 4 12B: A unified, encoder-free multimodal model https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/ #ai #google

Google has unveiled Gemma 4 12B, a multimodal model with a unified architecture and no separate encoder component. It is designed to process different types of data, including text and images, within a single model.

IMPACT This release introduces a new unified architecture for multimodal AI, potentially simplifying development and improving performance for tasks involving diverse data types.
- Google
- Gemma 4 12B
TOOL · r/LocalLLaMA English(EN) · 20m

How to use audio and vision modalities in llama.cpp?

A user on r/LocalLLaMA asks how to use audio and vision inputs with llama.cpp. On release b9494, the command-line interface recognizes only the text modality, and attempting to add an image crashes the program.

IMPACT This query highlights user interest in expanding the multimodal capabilities of local LLM inference tools.
TOOL · Unsloth — Releases English(EN) · 2h

Gemma 4 12B, New UI, MCP, Projects

Unsloth has released version 0.1.44-beta, which introduces a new chat UI, project management features, and experimental canvas capabilities. The update also integrates Google's Gemma 4 12B model, which can run locally on 8GB of RAM and supports image and audio modalities and a 256K context window. The MCP (Model Context Protocol) feature has been significantly improved, allowing models to use live tools without API keys, and compatibility with various CUDA and ROCm versions has been improved across operating systems.

IMPACT Enhances local model execution and tool integration for developers using Unsloth Studio.
- Unsloth
- Gemma 4 12B
- Google
- MCP
- Projects
- Canvas
- CUDA
- ROCm
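
The 8GB figure above is easier to interpret with a rough weights-only memory estimate. The sketch below assumes the "12B" in the model name means roughly 12 billion parameters and compares a few common weight precisions; the helper name is hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead. The release summary does not say which precision or quantization Unsloth uses.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 12e9  # assumption: "12B" taken as 12 billion parameters

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

By this estimate, 16-bit and 8-bit weights alone would exceed 8GB, so fitting in that budget would imply roughly 4-bit quantization or some form of offloading.
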
SIGNIFICANT · X — Google DeepMind Nederlands(NL) · 2h

RT @googlegemma: Meet Gemma 4 12B!

Google DeepMind has announced Gemma 4 12B, a new multimodal model built on an encoder-free architecture and aimed at high-performance use across a range of applications.

IMPACT This release introduces a new multimodal model, potentially enhancing AI capabilities in areas requiring integrated vision and language understanding.
- Google DeepMind
- Gemma 4 12B
