llama.cpp

PulseAugur coverage of llama.cpp — every cluster mentioning llama.cpp across labs, papers, and developer communities, ranked by signal.

Total · 30d: 382 (382 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 14 (14 over 90d)

TIER MIX · 90D

frontier release 7
significant 5
research 23
tool 278
commentary 50
meme 19

TIMELINE

2026-06-25 product_launch The llama.cpp project released version b9802 with pre-compiled binaries for multiple operating systems and hardware targets.
2026-06-25 product_launch llama.cpp version b9788 introduced tensor split support for Intel GPUs.
2026-06-17 product_launch llama.cpp added API support for on-demand model management, including downloading and unloading models.
2026-06-08 research_milestone llama.cpp merged a pull request that optimizes KV cache performance for the Gemma-4 model.
2026-06-05 product_launch A SYCL backend was ported to llama.cpp, offering performance improvements on Intel Arc GPUs.
2026-05-30 product_launch llama.cpp released version b9438, adding custom CSS injection for web UI theming.
2026-05-25 research_milestone A fix for crashes in tensor split mode is expected in llama.cpp.
2026-05-25 product_launch A pull request was submitted to improve checkpoint creation and context handling in llama.cpp.
2026-05-24 product_launch llama.cpp released version b9305 with pre-compiled binaries for multiple platforms.
2026-05-17 research_milestone llama.cpp implemented MTP optimizations and prompt decode improvements for faster local AI inference.
2026-05-14 product_launch A performance-optimized fork of llama.cpp was released with new features.
2026-05-12 product_launch The llama.cpp project integrated the llama-eval tool for model benchmarking.