ByteDance has released Lance, a new 3-billion-parameter open-source multimodal model designed to run on consumer GPUs. The model processes both images and text and aims to make advanced AI capabilities more accessible. Separately, the popular inference engine llama.cpp has gained multi-token prediction (MTP) support, which speeds up local inference. In addition, a new open-source chat client called Horizon has launched, offering cross-platform support for local models served through Ollama as well as for cloud-based AI services.
Impact: Advances in lightweight multimodal models and inference-engine optimizations are likely to accelerate the development and deployment of local AI applications.
Ranking rationale: The cluster covers the release of open-source models and software updates for local inference.
AI-generated summary · Google Gemini · from 1 source.