llama.cpp releases improve performance and add new features

By the PulseAugur editorial team · [5 sources] · 2026-06-12 05:17

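The llama.cpp project published several releases. b9608 updates the vendored cpp-httplib to 0.47.0 and ships prebuilt binaries for platforms including macOS, Linux, Android, and Windows. b9606 adds EAGLE3 speculative decoding support, a technique aimed at faster token generation. b9605 adds support for concatenating scalar types in the CUDA backend. b9604 fixes the CI build and release for the SYCL backend. b9603 adds q5_0 and q5_1 GEMM and GEMV OpenCL kernels for Adreno GPUs, which may improve performance on some mobile devices.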

Impact: These llama.cpp updates improve the efficiency and accessibility of running large language models on a range of hardware.

Ranking rationale: This is a software release for a tool used to run LLMs, not a new frontier model release or a significant research paper.
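
For readers unfamiliar with the speculative decoding mentioned for b9606, the general draft-and-verify idea can be sketched as follows. This is a toy illustration only: it does not reproduce EAGLE3 or llama.cpp's implementation, and `speculative_decode`, `target_model`, `draft_model`, and `greedy` are hypothetical names invented for this sketch. It assumes greedy decoding, where a cheap draft model proposes a few tokens and the expensive target model accepts the longest matching prefix.

```python
from typing import Callable, List

# A "model" here is any function mapping a token sequence to its next token.
NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy draft-and-verify loop (toy version, hypothetical helper)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1. The draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks every proposed position. A real engine
        #    would score all positions in one batched pass; this sketch calls
        #    the target once per position for clarity.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # 3. First mismatch: keep the accepted prefix plus the
                #    target's own token, and discard the rest of the draft.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every drafted token matched the target.
            out.extend(proposal)
    return out


def target_model(seq: List[int]) -> int:
    # Stand-in for the large model: a fixed arithmetic rule.
    return (seq[-1] * 3 + 1) % 10


def draft_model(seq: List[int]) -> int:
    # Stand-in for the small model: agrees with the target except
    # when the last token is divisible by 4.
    guess = (seq[-1] * 3 + 1) % 10
    return guess if seq[-1] % 4 else (guess + 1) % 10


def greedy(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    # Plain token-by-token greedy decoding, used as the reference.
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out
```

Under these assumptions the output matches plain greedy decoding with the target model; any speed-up in a real system comes from verifying several drafted tokens in one target pass, which this sketch does not model.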

AI-generated summary · Google Gemini · from 5 sources.

Sources [5]

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-06-12 10:03

b9608

vendor : update cpp-httplib to 0.47.0 (#24395, https://github.com/ggml-org/llama.cpp/pull/24395). Signed-off-by: Adrien Gallouët …

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-06-12 08:47

b9606

spec: add EAGLE3 speculative decoding support (#18039, https://github.com/ggml-org/llama.cpp/pull/18039). Changes: llama : enable layer input extraction; spec: support eagle3; …

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-06-12 08:13

b9605

ggml: support concat for scalar types at cuda backend (#24011, https://github.com/ggml-org/llama.cpp/pull/24011). Changes: cuda: support concat for scalar types; Update concat.cu; …

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-06-12 07:28

b9604

[SYCL] Fix CI build & release for SYCL backend (#24387, https://github.com/ggml-org/llama.cpp/pull/24387). Changes: restore SYCL build and release, remove github cache; modify for test …

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-06-12 05:17

b9603

opencl: add q5_0/q5_1 gemm and gemv kernels for Adreno (#24319, https://github.com/ggml-org/llama.cpp/pull/24319). Changes: opencl: add q5_0 adreno support; opencl: add q5_1 adreno support; …
