ik_llama.cpp
PulseAugur coverage of ik_llama.cpp: every cluster mentioning ik_llama.cpp across labs, papers, and developer communities, ranked by signal.
Total · 30 days: 2 (90 days: 2)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 0 (90 days: 0)
Sentiment · 30 days: 1 day with sentiment data
Recent · Page 1 of 1 · 2 items
- Qwen 3.6 model hits 110 tokens/sec on consumer GPUs via llama.cpp
  The 35-billion-parameter version of the open-weight model Qwen 3.6 has reached an inference speed of 110 tokens per second on consumer GPUs with 12GB of VRAM. This performance was enabled by a specialized var…
- llama.cpp and ik_llama.cpp add FP4 inference support for VRAM savings
  The llama.cpp and ik_llama.cpp projects have both integrated support for FP4 (4-bit floating-point) inference, which reduces VRAM use. llama.cpp now includes NVFP4, an Nvidia-specific format, w…