Looking for a working Deepseek-v4-Flash quant

User seeks a working Deepseek-v4-Flash quantization

By the PulseAugur editorial team · [1 source] · 2026-05-27 17:58

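On the r/LocalLLaMA subreddit, a user is looking for a working quantization of the Deepseek-v4-Flash model. The user shared a Hugging Face link to a Deepseek-V4-Flash-FP4-FP8-GGUF quant, but reported that it produces low-quality and incoherent output. The user also noted that vLLM currently supports running this model only on H100 GPUs, and is looking for alternative quants compatible with llama.cpp or vLLM.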

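Impact: The user's difficulty with this particular model quantization suggests that optimizing large models for local deployment remains challenging.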

Ranking rationale: A user discussion of model quantization and compatibility issues, not a major release or benchmark.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA TIER_1 English(EN) · /u/ortegaalfredo · 2026-05-27 17:58

Looking for a working Deepseek-v4-Flash quant

Best I tried so far is https://huggingface.co/nsparks/DeepSeek-V4-Flash-FP4-FP8-GGUF with the custom llama.cpp fork, but it suffers from low quality and random incoherent…
