The llama.cpp project has published several new releases. Release b9608 updates cpp-httplib and provides pre-compiled binaries for platforms including macOS, Linux, Android, and Windows. Release b9606 adds support for EAGLE3 speculative decoding. Release b9605 adds OpenCL kernels for Adreno GPUs, aimed at better performance on certain mobile devices. Release b9604 fixes CI build and release issues for the SYCL backend.
IMPACT: These updates to llama.cpp improve the efficiency and accessibility of running large language models on diverse hardware.
RANK_REASON: This is a software release for a tool that facilitates running LLMs, not a new frontier model release or significant research paper.
- Adreno
- Android
- b9603
- CUDA
- Linux
- llama.cpp
- macOS
- Windows
- OpenCL
- OpenVINO
- Qualcomm
- ROCm
- Vulkan
- b9604
- b9605
- b9606
- b9608
- cpp-httplib
- EAGLE3
AI-generated summary · Google Gemini · from 5 sources.