MiniMax AI has released MiniMax M3, a new open-weight model with a 1-million-token context window. The model uses a novel sparse attention architecture called MSA, which includes dedicated prefill and decode kernels. It supports the BF16 and MXFP8 formats on NVIDIA Hopper and Blackwell architectures, and serves long contexts efficiently using prefix caching and chunked prefill.
AI IMPACT This release extends what open-weight models offer for long-context work and could accelerate research and development in long-context handling and sparse attention architectures.
AI-generated summary · Google Gemini · from 1 source.