Together AI releases Mamba-3, prioritizing inference speed over training

By PulseAugur Editorial · [1 sources] · 2026-03-17 00:00

Together AI has released Mamba-3, a new state space model (SSM) prioritizing inference efficiency over training speed. This model features a more expressive recurrence formula, complex-valued state tracking, and a multi-input, multi-output (MIMO) variant that enhances accuracy without sacrificing decoding speed. Mamba-3 SISO has demonstrated superior performance in prefill and decode latency compared to previous Mamba versions and even the Llama-3.2-1B Transformer model at the 1.5B parameter scale. The team has also open-sourced the model's kernels, developed collaboratively with researchers from Carnegie Mellon University, Princeton University, and Cartesia AI. AI

IMPACT Sets a new benchmark for inference efficiency in state space models, potentially influencing future LLM architectures and deployment strategies.

RANK_REASON New model release from a frontier AI lab (Together AI) with performance claims. [lever_c_demoted from frontier_release: ic=1 ai=1.0]

Read on Together AI blog →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

Together AI releases Mamba-3, prioritizing inference speed over training

COVERAGE [1]

Together AI blog TIER_1 (SW) · 2026-03-17 00:00

Mamba-3

Meet Mamba-3: the SSM built for inference. Faster than Transformers at decode, stronger than Mamba-2, and open-source from day one.

COVERAGE [1]

Mamba-3

RELATED ENTITIES

RELATED TOPICS