Mistral AI releases Pixtral 12B, beating Llama to multimodality

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Mistral AI has released Pixtral 12B, a new open-weights multimodal model. It processes both text and images, which sets it apart from many other openly available models. Pixtral 12B aims to compete with models like Google's Gemini and OpenAI's GPT-4V on tasks that require understanding both visual and textual information.

RANK_REASON Release of an open-weights multimodal model from a notable AI lab.

COVERAGE [1]

Smol AINews TIER_1 (CA) · 2024-09-12 00:30

Pixtral 12B: Mistral beats Llama to Multimodality

**Mistral AI** released **Pixtral 12B**, an open-weights **vision-language model** with a **Mistral Nemo 12B** text backbone and a 400M vision adapter, featuring a large vocabulary of **131,072 tokens** and support for **1024x1024 pixel images**. This release notably beat **Meta …
