A new AI model, Llama 4 Scout, has been announced with a claimed 10-million-token context window, far larger than those of existing models from OpenAI, Anthropic, and Google. It uses a Mixture-of-Experts architecture and interleaved Rotary Position Embeddings (iRoPE) to manage that context length, and it is priced affordably. Real-world testing reveals limits, however: hosted platforms cap the practical context window at 327,680 tokens, and comprehension degrades significantly beyond approximately 256,000 tokens, which makes the model more of a search index than a reasoning partner at its full claimed capacity.
AI IMPACT Challenges existing long-context models and pricing, but practical limitations may temper its impact.
RANK_REASON New model release with a significant claimed capability increase.
AI-generated summary · Google Gemini · from 1 source.