Fireworks AI has announced Kimi K2.7 Fast, an updated inference offering built on the same technology as Kimi K2.6 Fast. It is available on Fireworks' serverless platform in standard (pay-per-token) and priority tiers, with a fast path coming soon. The service supports a 256K-token context window and is priced at $0.95 per 1 million input tokens, $4 per 1 million output tokens, and $0.19 per 1 million cached input tokens (cache hits). Fireworks AI also suggests that internal benchmarks, such as those developed by Ramp, are more useful for evaluating AI models than saturated public leaderboards.
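To make the quoted prices concrete, here is a rough cost estimator. It is only a sketch: the function name `estimate_cost_usd` and its parameters are hypothetical, and it assumes that cache-hit tokens are a subset of the input tokens and are billed at the cache-hit rate instead of the standard input rate. The summary does not say whether the priority tier is priced differently, so the figures are assumed to apply to a single tier.

```python
# Prices in USD per 1 million tokens, as quoted in the summary.
PRICE_INPUT_PER_M = 0.95
PRICE_OUTPUT_PER_M = 4.00
PRICE_CACHE_HIT_PER_M = 0.19


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the source): cached_input_tokens are part
    of input_tokens and are charged at the cache-hit rate rather than the
    standard input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total_micro_usd = (
        uncached_tokens * PRICE_INPUT_PER_M
        + cached_input_tokens * PRICE_CACHE_HIT_PER_M
        + output_tokens * PRICE_OUTPUT_PER_M
    )
    # Prices are per 1 million tokens, so scale back down to USD.
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # A long-context request: 200,000 input tokens, 2,000 output tokens.
    cold = estimate_cost_usd(200_000, 2_000)
    warm = estimate_cost_usd(200_000, 2_000, cached_input_tokens=150_000)
    print(f"no cache hits:   ${cold:.4f}")
    print(f"150k cache hits: ${warm:.4f}")
```

Under these assumptions, a 200,000-token prompt with 2,000 output tokens costs about $0.198 with no cache hits and about $0.084 when 150,000 of the input tokens are cache hits, which shows how much the cache-hit rate matters for long-context workloads.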
IMPACT Enhances inference capabilities for AI applications with a large context window and tiered service options.
RANK_REASON This is a product update for an inference infrastructure service, not a new frontier model release.
Source: Fireworks (inference infra) on X.
AI-generated summary · Google Gemini · from 2 sources.