Researchers from Tsinghua University have developed Spatial-TTT, an open-source spatial intelligence model that has been accepted to ECCV 2026. The model continuously learns and updates its spatial memory from long video streams, and it outperforms models such as Gemini and GPT-5 on various benchmarks. Spatial-TTT uses a novel hybrid architecture with fast weights for dynamic memory, a spatial prediction mechanism to better capture geometric relationships, and dense scene description supervision to build a comprehensive 3D understanding of environments.
AI IMPACT This research advances the ability of multimodal AI to understand and interact with dynamic environments, potentially accelerating applications in robotics and autonomous systems.
RANK_REASON The cluster details a new research paper and model release from a university, including benchmark results and comparisons to existing models.
- ECCV 2026
- Gemini
- Gemini 3 Pro
- GPT-5
- International Conference on Computer Vision
- Liu Fangfu
- Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
- Spatial-TTT
- Tsinghua University
AI-generated summary · Google Gemini · from 1 source.