Self-hosted LLM, same prompt, temperature zero - 6 different answers

Self-hosted LLMs produce inconsistent output under parallel processing

By the PulseAugur editorial team · [1 source] · 2026-06-07 06:32

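Running the same prompt against a self-hosted LLM from parallel processes can produce inconsistent outputs, even with temperature set to zero. Requests that arrive at the same time are processed together in larger batches, and GPU scheduling can then yield different floating-point results. Developers can detect the problem by running a consistency probe before allowing the model to take actions in a parallel agent application.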
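The floating-point effect mentioned above can be illustrated without a GPU. The following is a minimal sketch, not the article's code: it assumes that a different batch size changes the order in which partial sums are combined, and the helper names `sum_in_order` and `sum_in_chunks` are hypothetical. Explicit loops are used instead of the built-in `sum`, because newer Python versions apply compensated summation to floats, which would hide the effect.

```python
def sum_in_order(values):
    """Add floats strictly left to right, one at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_in_chunks(values, chunk_size):
    """Add floats chunk by chunk, then combine the partial sums.

    This loosely mimics a reduction whose grouping depends on how the
    work was batched (an assumption made for illustration only).
    """
    partials = [
        sum_in_order(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return sum_in_order(partials)


# Same numbers, two groupings. The small terms survive in one ordering
# and are rounded away in the other.
values = [1e16, 1.0, -1e16, 1.0]
one_pass = sum_in_order(values)
two_chunks = sum_in_chunks(values, 2)

print(one_pass, two_chunks, one_pass == two_chunks)
```

In a model, differences of this kind are far smaller, but they may be enough to flip the ranking of two nearly tied tokens, after which greedy decoding follows a different path.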

Impact: Highlights that self-hosted LLMs can behave inconsistently under parallel use, which affects agent reliability.

Ranking rationale: This item describes a technical finding about LLM behavior rather than a product launch or a major industry event.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

dev.to (LLM tag) · Tier 1 · English · Alex · 2026-06-07 06:32

Self-hosted LLM, same prompt, temperature zero - 6 different answers

Sequential execution was perfect, same expected answer 100% of the time. Looked like a reliable system. Then we ran the same test with five parallel processes, and the model started disagreeing with itself. It returned the expected answer only 87% of the time.

What's ac…
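A comparison like the one described in the excerpt (sequential runs versus five concurrent ones) could be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the author's probe: `generate` is a hypothetical callable that sends one prompt to the self-hosted model and returns its text, threads stand in for the separate processes, and `run_probe`, `agreement_rate`, and the `threshold` parameter are invented names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def agreement_rate(answers: list[str], expected: str) -> float:
    """Fraction of answers that exactly match the expected answer."""
    return sum(a == expected for a in answers) / len(answers)


def run_probe(
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
    prompt: str,
    expected: str,
    runs: int = 20,
    workers: int = 5,
    threshold: float = 1.0,
) -> dict:
    """Run the same prompt sequentially and concurrently, then compare."""
    # Baseline: one request at a time.
    sequential = [generate(prompt) for _ in range(runs)]

    # Concurrent requests, so that a batching server may group them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parallel = list(pool.map(generate, [prompt] * runs))

    report = {
        "sequential_rate": agreement_rate(sequential, expected),
        "parallel_rate": agreement_rate(parallel, expected),
        "distinct_parallel_answers": len(Counter(parallel)),
    }
    # Gate: only let the agent act if concurrent answers are stable enough.
    report["safe_to_act"] = report["parallel_rate"] >= threshold
    return report


if __name__ == "__main__":
    # Stand-in model for a dry run; replace with a real client call.
    def fake_generate(prompt: str) -> str:
        return "42"

    print(run_probe(fake_generate, "What is 6 * 7?", "42"))
```

Running the probe at the concurrency level the agent will actually use matters here, since a sequential check alone would have reported the system as fully reliable.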