Subquadratic Inc. has unveiled SubQ, a long-context language model that the company says can process entire codebases or document sets in a single pass. The model uses a subquadratic sparse-attention design, which in theory lets compute scale linearly with context length rather than quadratically. Vendor-published benchmarks show promising long-context retrieval results, but the model's coding capabilities are reportedly middling compared to frontier models. SubQ is currently in private beta, accessible via an OpenAI-compatible REST API, with a marketed ceiling of 12 million tokens, though evaluations have so far been limited to 1 million tokens.
AI IMPACT Could enable new code-analysis and document-processing workflows by loading whole codebases or document sets into context instead of relying on retrieval-augmented generation (RAG).
RANK_REASON New model release from a startup claiming novel architecture and performance metrics.
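To make the linear-versus-quadratic scaling claim concrete, here is a minimal Python sketch of how relative attention cost grows between the evaluated 1 million tokens and the marketed 12 million. It is an idealized illustration, not a description of SubQ's actual architecture: the function name `attention_cost_growth` is hypothetical, and the model ignores constant factors, memory traffic, and non-attention compute.

```python
def attention_cost_growth(n_tokens: int, base_tokens: int = 1_000_000) -> dict:
    """Relative cost growth when context grows from base_tokens to n_tokens.

    Idealized assumption: attention cost is proportional to n (linear)
    or n**2 (quadratic), with all constant factors ignored.
    """
    growth = n_tokens / base_tokens
    return {"linear": growth, "quadratic": growth ** 2}


if __name__ == "__main__":
    # Compare the evaluated context size with the marketed ceiling.
    for n in (1_000_000, 12_000_000):
        ratios = attention_cost_growth(n)
        print(f"{n:>12,} tokens: linear x{ratios['linear']:.0f}, "
              f"quadratic x{ratios['quadratic']:.0f}")
```

Under these assumptions, a 12-fold increase in context costs about 12 times as much with linear scaling and about 144 times as much with quadratic scaling, which is why the gap matters most at the marketed ceiling.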
- Alex Whedon
- Claude Opus
- FlashAttention
- Justin Dangel
- May 5, 2026
- MRCR v2
- OpenAI
- Python
- SubQ LLM
- subq-preview
- Subquadratic Inc.
AI-generated summary · Google Gemini · from 1 source.