PulseAugur

llama.cpp releases add new tensor support and bug fixes

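The llama.cpp project has published several updates, including release b9297, which adds NVFP4 MTP scale tensors and links the Qwen3.5 MTP tensors. Earlier releases such as b9296 and b9295 focus on bug fixes, including a ggml check for the correct interface method before using the fallback 2D get and a Vulkan fix for find_package of SPIRV-Headers. The releases ship prebuilt binaries for a range of operating systems and hardware architectures, including macOS, Linux, Android, and Windows, and support compute backends such as CUDA, ROCm, Vulkan, and SYCL.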

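Impact: Ongoing llama.cpp development gives users more efficient and more broadly compatible tools for running large language models on a variety of hardware.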

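Ranking rationale: This cluster contains multiple releases of an open-source project that provides tools for running large language models, which indicates the project is under continuous development.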

Read on llama.cpp — Releases →

AI-generated summary · Google Gemini · from 8 sources.

Sources [8]

  1. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9297

    model : add NVFP4 MTP scale tensors (#23563, https://github.com/ggml-org/llama.cpp/pull/23563). Changes: Add NVFP4 MTP scale tensors; Link Qwen3.5 MTP tensors; Aligned null…

  2. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9296

    ggml : Check the right iface method before using the fallback 2d get (#23514, https://github.com/ggml-org/llama.cpp/pull/23514). macOS/iOS: https://github.co…

  3. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9295

    vulkan: fix windows find_package of SPIRV-Headers (#23215, https://github.com/ggml-org/llama.cpp/pull/23215). Changes: vulkan: fix windows find_package of SPIRV-Headers; not windows-only…

  4. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9294

    opencl: generalize Adreno MoE kernels on M (#23449, https://github.com/ggml-org/llama.cpp/pull/23449). macOS/iOS: https://github.com/ggml-org/llama.cpp/relea…

  5. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9291

    SYCL: improve MoE prefill throughput (#23142, https://github.com/ggml-org/llama.cpp/pull/23142). Changes: change k_copy_src1_to_contiguous so that uses a precomputed contiguous mapping where all ro…

  6. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9292

    perplexity : fix integer overflow (#23496, https://github.com/ggml-org/llama.cpp/pull/23496). Co-authored-by: Stanisław Szymczyk…

  7. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9290

    sycl : Level Zero detection in ggml_sycl_init (#23097, https://github.com/ggml-org/llama.cpp/pull/23097). Changes: [SYCL] Centralize Level Zero detection in ggml_sycl_init; use the same wor…

  8. llama.cpp — Releases TIER_1 (SO) · github-actions[bot] ·

    b9289

    SYCL : gated_delta_net K>1 (#23174, https://github.com/ggml-org/llama.cpp/pull/23174). Changes: sycl_gated_delta_net K>1; editor_config. macOS/iOS…