MiniMax AI releases open-weight M3 model with 1M context

By PulseAugur Editorial · [1 source] · 2026-06-12 21:16

MiniMax AI has released MiniMax M3, an open-weight model with a 1 million token context window. The model uses a sparse attention architecture called MSA, which ships with dedicated prefill and decode kernels. Day-0 support in vLLM covers the BF16 and MXFP8 formats on NVIDIA Hopper and Blackwell GPUs, and long contexts are served with prefix caching and chunked prefill.
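For readers unfamiliar with the two serving techniques named above, the following is a minimal Python sketch of the general ideas, not vLLM's or MiniMax's implementation. Prefix caching is modeled as a set of chained block hashes, and chunked prefill as splitting the uncached remainder of a prompt into fixed-size ranges. The function names, the block size of 16, and the chunk size of 64 are all hypothetical values chosen for illustration.

```python
from hashlib import sha256

BLOCK_SIZE = 16  # tokens per cache block (illustrative value, not from the source)


def block_hashes(tokens, block_size=BLOCK_SIZE):
    """Hash each full block chained with its predecessor's hash,
    so an equal hash implies an equal prefix up to that block."""
    hashes = []
    parent = b""
    full = len(tokens) - len(tokens) % block_size  # ignore the partial tail block
    for start in range(0, full, block_size):
        block = tokens[start:start + block_size]
        parent = sha256(parent + repr(block).encode()).digest()
        hashes.append(parent)
    return hashes


def cached_prefix_len(tokens, cache, block_size=BLOCK_SIZE):
    """Count the leading tokens whose blocks are already in the cache."""
    n = 0
    for h in block_hashes(tokens, block_size):
        if h not in cache:
            break
        n += block_size
    return n


def prefill_chunks(tokens, cache, chunk_size=64, block_size=BLOCK_SIZE):
    """Return (start, end) ranges that still need prefill, skipping the
    cached prefix, then record this prompt's blocks in the cache."""
    start = cached_prefix_len(tokens, cache, block_size)
    chunks = []
    while start < len(tokens):
        end = min(start + chunk_size, len(tokens))
        chunks.append((start, end))
        start = end
    cache.update(block_hashes(tokens, block_size))
    return chunks


if __name__ == "__main__":
    cache = set()
    shared = list(range(128))  # e.g. a long shared system prompt
    print(prefill_chunks(shared + [1000, 1001, 1002], cache))
    print(prefill_chunks(shared + [2000] * 40, cache))
```

In this toy model the second request reuses the 128 shared tokens and only prefills its own suffix. A real serving engine also has to manage KV-cache memory, eviction, and scheduling alongside decode steps, all of which this sketch leaves out.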

IMPACT An open-weight model with a 1M-token context window and day-0 vLLM support could accelerate research and development in long-context handling and sparse attention architectures.

RANK_REASON Frontier-lab model release with system card.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

X — MiniMax AI TIER_1 English(EN) · MiniMax_AI · 2026-06-12 21:16

day-0 in @vllm_project and it comes with: dedicated MSA prefill/decode kernels, 1M-context serving with prefix caching + chunked prefill, BF16 + MXFP8 on both Hopper and Blackwell 🚀 this is what open-weight done properly looks like. thanks @vllm_project, @NVIDIAAI, @AIatAMD,
