Om AI, a team from Hangzhou, has released VLX, a series of three end-to-end streaming multimodal models designed for real-world, on-device applications. The models, VLX-Flow, VLX-Seek, and VLX-Go, enable continuous perception, precise localization, and action decision-making, forming a closed-loop system for physical world interaction. Unlike traditional cloud-based models, VLX is engineered from the ground up for edge devices like phones, drones, and robots, prioritizing efficiency and real-time responsiveness.
IMPACT Enables more capable and responsive AI agents on edge devices, potentially accelerating robotics and embodied AI development.
RANK_REASON New multimodal model series released by a research team, focusing on novel on-device streaming capabilities.
AI-generated summary · Google Gemini · from 1 source.