"One thing that we've been seeing recently is that inference benchmarks don't really match production workloads that well." - @realDanFu, VP of Kernels

Together AI: Inference benchmarks fail to reflect production reality

By the PulseAugur editorial team · [1 source] · 2026-05-19 20:38

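Dan Fu, VP of Kernels at Together, says inference benchmarks may not accurately reflect real production workloads. The gap is most visible when running many concurrent coding agents that each need a large context window. Fu suggests that benchmarks should be better aligned with these complex, demanding operating scenarios.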

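Impact: Highlights a potential disconnect between AI model evaluation and real-world use, suggesting a need for more relevant benchmarks.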

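Ranking rationale: This item is a statement from a company representative about the limitations of current benchmarks, not a new release or research finding.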

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

X — Together (inference / OSS) TIER_1 English(EN) · togethercompute · 2026-05-19 20:38

"One thing that we've been seeing recently is that inference benchmarks don't really match production workloads that well." - @realDanFu, VP of Kernels When you're running dozens of concurrent coding agents — each with 45k–200k token contexts — the benchmarks that matter are the…
