MiniCPM-o 4.5 is a new 9B-parameter omni-modal large language model designed for real-time, full-duplex interaction. It can process and generate audio, video, and text simultaneously, enabling proactive behaviors and continuous environmental understanding. The model uses the Omni-Flow framework for time-aligned processing and is optimized for efficient inference, allowing it to run on edge devices with less than 12GB of RAM.
IMPACT Enables real-time, full-duplex omni-modal interaction on consumer hardware, lowering the barrier for advanced AI applications.
RANK_REASON Release of a technical report and an open-source model, with performance claims and a new framework.
- M1 Max
- CosyVoice2
- Gemini 2.5 Flash
- llama.cpp
- M5 Pro
- MiniCPM-o 4.5
- MiniCPM-V
- Omni-Flow
- OpenBMB
- Qwen3-Omni-30B-A3B
- RTX 5070
- THUNLP
- Tsinghua University
AI-generated summary · Google Gemini · from 3 sources.