PulseAugur

MiniMax unveils Sparse Attention for 1M token context windows

MiniMax has introduced MiniMax Sparse Attention (MSA), an attention architecture designed to handle context windows of up to 1 million tokens. The approach restructures memory access patterns to avoid the quadratic cost that dense attention incurs on long contexts. MSA reportedly runs 4x faster than previous sparse attention methods and cuts per-token compute to 1/20th at full context depth. MiniMax also claims the first open-weight model to combine frontier coding, 1M context, and native multimodality.
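
To make the quadratic-cost point concrete, the sketch below counts query-key score computations for dense causal attention and for a generic sparse pattern in which each query attends to at most a fixed number of keys. This is not MiniMax's MSA algorithm, which the summary does not describe. The function names, the `budget` parameter, and the value 50,000 are hypothetical. That value is chosen only so the per-token count at full depth is 1/20th of the dense count, mirroring the reported figure under the simplifying assumption that score computation dominates per-token compute.

```python
def dense_pairs(n: int) -> int:
    """Query-key score computations for dense causal attention over n tokens."""
    # Token i (1-indexed) attends to i keys, so the total is 1 + 2 + ... + n.
    return n * (n + 1) // 2


def sparse_pairs(n: int, budget: int) -> int:
    """Same count when each query attends to at most `budget` keys (hypothetical cap)."""
    # Early tokens see fewer than `budget` keys, so they cost the same as dense.
    head = min(n, budget)
    # Every later token is capped at exactly `budget` keys.
    tail = max(0, n - budget)
    return head * (head + 1) // 2 + tail * budget


n = 1_000_000    # context length from the summary
budget = 50_000  # hypothetical cap, not a published MSA parameter

# Per-token work for the last token: n keys when dense, `budget` keys when sparse.
print("per-token ratio at full depth:", n / budget)
# Whole-sequence ratio is smaller because early tokens are cheap in both schemes.
print("whole-sequence ratio:", dense_pairs(n) / sparse_pairs(n, budget))
```

With a fixed cap, total work grows linearly with context length once the context exceeds the cap, while the dense count keeps growing quadratically.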

IMPACT Enables significantly longer context windows for AI models, potentially improving performance on tasks requiring extensive information recall.

RANK_REASON The cluster describes a new model architecture and its performance characteristics, presented as a research development.

Read on r/MachineLearning →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. r/MachineLearning TIER_1 English (EN) · /u/superintelligence03

    MiniMax dropped a new attention architecture. [N]

    https://www.reddit.com/r/MachineLearning/comments/1tvameq/minimax_dropped_a_new_attention_architecture_n/