A new family of omnimodels called Cosmos has been released, featuring three variants: Edge (4B), Nano (16B), and Super (64B). These models are designed to process and generate multiple modalities, including text, image, video, audio, and action sequences, within a unified mixture-of-transformers architecture. The Super variant includes specialized fine-tuning for text-to-image and image-to-video tasks.
AI IMPACT Introduces a unified architecture for multimodal AI, potentially streamlining development across various generative tasks.
RANK_REASON Release of a new family of open-source models with multiple variants.
AI-generated summary · Google Gemini · from 1 source.