nvidia/diffusiongemma-26B-A4B-it-NVFP4 · Hugging Face
Google DeepMind has released DiffusionGemma 26B A4B IT, an open-weights multimodal generative model that takes text, image, and video inputs and produces text output. Built on a Gemma 4 26B A4B Mixture-of-Experts architecture, it has 25.2 billion total parameters, of which 3.8 billion are active. The model supports a 256K-token context window and multilingual inference in more than 35 languages, and it can generate over 1,100 tokens per second on NVIDIA H100 GPUs.
AI IMPACT: Accelerates multimodal AI development with an open-weights model supporting text, image, and video inputs.