
Kog Inference Engine

PulseAugur coverage of Kog Inference Engine: every cluster mentioning Kog Inference Engine across labs, papers, and developer communities, ranked by signal.


Total · 30 days

1 in the last 90 days

Releases · 30 days

0 in the last 90 days

Papers · 30 days

0 in the last 90 days


Timeline

2026-05-29 product_launch Kog AI launched a tech preview of its Kog Inference Engine, demonstrating high inference speeds on standard GPUs.

Recent

TOOL · CL_59358 · May 29 · 09:47

Kog AI reaches 3,000 tokens per second for LLM inference on standard GPUs

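Kog AI has released a technical preview of its Kog Inference Engine (KIE), demonstrating significantly faster real-time LLM inference on standard data center GPUs. The engine reached 3,000 output tokens per second on 8 AMD MI300X GPUs and 2,100 tokens per second on 8 NVIDIA H200 GPUs. Its focus is on optimizing memory bandwidth across the entire software stack rather than raw FLOPS. The speedup matters most for AI agents, where single-request decode speed directly determines iteration speed and the complexity of the tasks that can be completed within a given time budget.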
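To make the time-budget point concrete, here is a rough back-of-the-envelope sketch in Python. Only the two throughput figures come from the summary. The 500-token step size and the 60-second budget are hypothetical values chosen for illustration, and the model ignores prefill time, tool calls, and network latency, so treat the results as upper bounds rather than measurements.

```python
# Output-token throughput reported in the summary (tokens per second).
REPORTED_TOKENS_PER_SECOND = {
    "8x AMD MI300X": 3000,
    "8x NVIDIA H200": 2100,
}


def per_token_latency_ms(tokens_per_second: int) -> float:
    """Average time to decode one output token, in milliseconds."""
    return 1000.0 / tokens_per_second


def steps_within_budget(tokens_per_second: int,
                        tokens_per_step: int,
                        budget_seconds: int) -> int:
    """Number of sequential agent steps that fit in a time budget.

    Assumes (hypothetically) that every step decodes exactly
    `tokens_per_step` tokens and that decoding is the only cost.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    total_tokens = tokens_per_second * budget_seconds
    return total_tokens // tokens_per_step


if __name__ == "__main__":
    TOKENS_PER_STEP = 500   # hypothetical output length of one agent step
    BUDGET_SECONDS = 60     # hypothetical time budget

    for hardware, tps in REPORTED_TOKENS_PER_SECOND.items():
        latency = per_token_latency_ms(tps)
        steps = steps_within_budget(tps, TOKENS_PER_STEP, BUDGET_SECONDS)
        print(f"{hardware}: {latency:.3f} ms/token, "
              f"{steps} steps in {BUDGET_SECONDS} s")
```

Under these assumptions, the 3,000 tokens-per-second figure allows 360 sequential steps per minute versus 252 at 2,100 tokens per second, which is the sense in which decode speed bounds how much an agent can do in a fixed window.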
