llama.cpp SYCL benchmarks show mixed performance for Gemma and Qwen models

By PulseAugur Editorial · [1 source] · 2026-06-20 05:20

Benchmarks of llama.cpp's SYCL backend have been posted, with throughput figures for several models. The tests covered Gemma 4 models of three sizes (4.65B, 11.91B, and 25.23B parameters) and Qwen 35 models (27.32B and 34.66B parameters). The SYCL backend appears functional, but the results suggest there is room for further optimization.

IMPACT Provides insight into the performance of the SYCL backend for local LLM inference, potentially guiding optimization efforts.

RANK_REASON The item details performance benchmarks for a specific software implementation (llama.cpp) using a particular backend (SYCL) with various language models.

infra

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/siegevjorn · 2026-06-20 05:20

Some llama.cpp B70 SYCL benchmarks

build: dd4623a74 (9640)

| model | size | params | backend | ngl | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| gemma4 12B Q8_0 | 11.78…
