Subquadratic, a startup with 11 PhD researchers, has launched a new model built on its Subquadratic Selective Attention (SSA) architecture, which the company claims scales linearly with context length. According to the company, this allows for a 12-million-token context window and is meant to overcome the quadratic cost of traditional dense attention in LLMs. Early benchmarks show competitive performance against models like GPT-5.5 and Claude Opus on tasks such as MRCR v2 and SWE-Bench, with significantly faster inference speeds.
Impact: Linear scaling in compute and memory with context length could significantly reduce the cost and improve the ROI of RAG and agentic decomposition.
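To make the quadratic-versus-linear comparison concrete, here is a minimal back-of-the-envelope sketch. It assumes a simplified cost model in which dense attention scores every query-key pair, while a selective scheme scores a fixed budget of keys per query. The `keys_per_query` parameter and its value of 4096 are hypothetical illustrations and are not details of SSA, whose actual mechanism is not described here.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by dense attention: every token attends to every token."""
    return n_tokens * n_tokens


def selective_attention_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when each query attends to a fixed key budget.

    keys_per_query is a hypothetical parameter for illustration only;
    it is not taken from the SSA architecture.
    """
    return n_tokens * min(keys_per_query, n_tokens)


def cost_ratio(n_tokens: int, keys_per_query: int) -> float:
    """How many times more pairs dense attention scores than the selective model."""
    return dense_attention_pairs(n_tokens) / selective_attention_pairs(
        n_tokens, keys_per_query
    )


if __name__ == "__main__":
    # Context lengths chosen for illustration; 12,000,000 matches the window cited above.
    for n in (128_000, 1_000_000, 12_000_000):
        print(f"{n:>12,} tokens: dense/selective = {cost_ratio(n, keys_per_query=4096):,.1f}x")
```

Under this toy model, doubling the context doubles the selective cost but quadruples the dense cost, so the gap widens in proportion to context length. Real systems add constant factors (selection overhead, memory traffic) that this sketch ignores.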
Ranking rationale: A startup released a new model with a novel architecture and provided benchmark results.
- Alex Whedon
- Claude Opus
- Gemini 3.1 Pro
- GPT-5.5
- Magic.dev
- MRCR v2
- Needle-in-a-haystack
- RULER
- SWE-Bench
AI-generated summary · Google Gemini · From 1 source.