Together AI delivers fastest inference for the top open-source models

Together AI speeds up custom model inference and optimizes open-source large models

By the PulseAugur editorial team · [3 sources] · 2025-08-27 00:00

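Together AI has launched a new service, Dedicated Container Inference, aimed at simplifying deployment and improving performance for custom generative media models. The platform handles complex orchestration tasks such as autoscaling, queuing, and traffic isolation, so teams can focus on model logic. The service has shown significant inference speedups, with some customers seeing gains of up to 2.6x. Separately, Together AI announced improvements to its inference platform: by using next-generation GPU hardware and optimized kernels, it achieves up to 2x faster serverless inference for top open-source models.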

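Impact: Speeds up deployment and inference for custom and open-source AI models, which may lower costs and widen access for specialized AI applications.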

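Ranking rationale: This cluster covers a new product launch and significant performance improvements to existing services from a well-known AI infrastructure provider.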


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

Together AI blog TIER_1 English(EN) · 2026-02-12 00:00

Introducing Dedicated Container Inference: up to 2.6x faster inference for custom AI models

Together AI launches production-grade orchestration for custom AI models with 1.4x–2.6x faster inference.

Together AI blog TIER_1 English(EN) · 2025-12-01 00:00

Together AI delivers fastest inference for the top open-source models

Together AI achieves up to 2x faster inference for top open-source models like Qwen, DeepSeek, and Kimi through GPU optimization, advanced speculative decoding, and FP4 quantization, ranking #1 in speed benchmarks on NVIDIA Blackwell architecture.

Together AI blog TIER_1 English(EN) · 2025-08-27 00:00

DeepSeek-V3.1: hybrid thinking model now available on Together AI

Access DeepSeek-V3.1 on Together AI: MIT-licensed hybrid model with thinking/non-thinking modes, 66% SWE-bench Verified, serverless deployment, 99.9% SLA.

