llama.cpp releases add new tensor support and bug fixes

By PulseAugur Editorial · [8 sources] · 2026-05-22 21:38

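The llama.cpp project has published a series of releases, including b9297, which adds NVFP4 MTP scale tensors and links the Qwen3.5 MTP tensors. The preceding releases b9296 and b9295 are bug fixes: b9296 makes ggml check the right interface method before using the fallback 2D get, and b9295 fixes the Vulkan find_package lookup of SPIRV-Headers. The releases ship precompiled binaries for a range of operating systems and hardware architectures, including macOS, Linux, Android and Windows, and support several compute backends such as CUDA, ROCm, Vulkan and SYCL.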

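Impact: Continued development of llama.cpp gives users a more efficient and more widely compatible tool for running large language models on varied hardware.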

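Ranking rationale: This cluster contains multiple releases of an open-source project that provides tooling for running large language models, which indicates ongoing development and updates.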

AI-generated summary · Google Gemini · from 8 sources.

Coverage sources [8]

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-23 17:17

b9297

model : add NVFP4 MTP scale tensors (#23563, https://github.com/ggml-org/llama.cpp/pull/23563). Changes: Add NVFP4 MTP scale tensors; Link Qwen3.5 MTP tensors; Aligned null…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-23 13:01

b9296

ggml : Check the right iface method before using the fallback 2d get (#23514, https://github.com/ggml-org/llama.cpp/pull/23514). macOS/iOS: https://github.co…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-23 09:57

b9295

vulkan: fix windows find_package of SPIRV-Headers (#23215, https://github.com/ggml-org/llama.cpp/pull/23215). Changes: vulkan: fix windows find_package of SPIRV-Headers; not windows-only…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-23 01:51

b9294

opencl: generalize Adreno MoE kernels on M (#23449, https://github.com/ggml-org/llama.cpp/pull/23449). macOS/iOS: https://github.com/ggml-org/llama.cpp/relea…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-22 22:19

b9291

SYCL: improve MoE prefill throughput (#23142, https://github.com/ggml-org/llama.cpp/pull/23142). Changes: change k_copy_src1_to_contiguous so that uses a precomputed contiguous mapping where all ro…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-22 22:19

b9292

perplexity : fix integer overflow (#23496, https://github.com/ggml-org/llama.cpp/pull/23496). Co-authored-by: Stanisław Szymczyk …

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-22 22:14

b9290

sycl : Level Zero detection in ggml_sycl_init (#23097, https://github.com/ggml-org/llama.cpp/pull/23097). Changes: [SYCL] Centralize Level Zero detection in ggml_sycl_init; use the same wor…

llama.cpp — Releases TIER_1 (SO) · github-actions[bot] · 2026-05-22 21:38

b9289

SYCL : gated_delta_net K>1 (#23174, https://github.com/ggml-org/llama.cpp/pull/23174). Changes: sycl_gated_delta_net K>1; editor_config. macOS/iOS…