RT @SkylerMiao7: M3 brings sparse attention + 1M context + multimodality, and Together did the hard serving work to make it fast.
MiniMax AI has released its M3 model, which combines sparse attention, a 1 million token context window, and multimodality. Together AI did the serving optimization work that makes the model fast.
IMPACT Pairs a 1M-token context window with sparse attention and multimodality; pressures competitors to match the multimodal and serving-speed improvements.