Subquadratic, a startup with 11 PhD researchers, has launched a new model built on its Subquadratic Selective Attention (SSA) architecture, which it claims scales linearly with context length. The linear scaling enables a 12-million-token context window, sidestepping the quadratic cost of the dense attention mechanisms used in traditional LLMs. Early benchmarks show competitive performance against models like GPT-5.5 and Claude Opus on tasks such as MRCR v2 and SWE-Bench, with significantly faster inference.
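The source does not describe SSA's internals, so as rough intuition for how attention can be made subquadratic, here is a minimal NumPy sketch contrasting dense softmax attention, whose (n, n) score matrix drives quadratic cost, with a kernelized linear-attention formulation in the style of Katharopoulos et al. (2020). The function names and the feature map `phi` are illustrative assumptions, not Subquadratic's actual method.

```python
import numpy as np

def dense_attention(Q, K, V):
    # Standard softmax attention: materializes an (n, n) score matrix,
    # so compute and memory grow quadratically with sequence length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                   # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                        # (n, d)

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Kernelized linear attention: rewriting the product as
    # phi(Q) @ (phi(K).T @ V) pre-aggregates keys and values into
    # (d, d) and (d,) summaries, so cost grows linearly with n.
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                                             # (d, d) summary
    z = Kp.sum(axis=0)                                        # (d,) normalizer
    return (Qp @ kv) / (Qp @ z)[:, None]                      # (n, d)

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out_dense = dense_attention(Q, K, V)    # materializes a (1024, 1024) matrix
out_linear = linear_attention(Q, K, V)  # never materializes an (n, n) matrix
print(out_dense.shape, out_linear.shape)  # (1024, 64) (1024, 64)
```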
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Linear scaling of compute and memory with context length could significantly reduce the cost, and improve the ROI, of RAG pipelines and agentic task decomposition.
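To make the impact claim concrete, a back-of-envelope comparison of per-layer attention-score FLOPs at the advertised 12-million-token window; the head dimension and the quadratic-vs-linear cost model are assumptions for illustration, not figures from the source.

```python
# Illustrative cost model (assumed, not from the source): per-layer
# attention-score FLOPs at a 12M-token window with head dim d = 128.
n, d = 12_000_000, 128
dense_flops = 2 * n * n * d    # O(n^2 d) dense attention: ~3.7e16 FLOPs
linear_flops = 2 * n * d * d   # O(n d^2) linear attention: ~3.9e11 FLOPs
print(f"~{dense_flops / linear_flops:,.0f}x fewer score FLOPs")  # ~93,750x
```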
RANK_REASON A startup released a new model with a novel architecture and provided benchmark results.