PulseAugur

Fireworks AI launches Kimi K2.7 Fast inference, touts internal benchmarks

Fireworks AI has announced Kimi K2.7 Fast, an inference offering built on the same technology that powered Kimi K2.6 Fast on Fire Pass. It is available on Fireworks' serverless platform in standard (pay-per-token) and priority tiers, with a fast path coming soon. The service supports a 256K context window and is priced at $0.95 per 1 million input tokens, $4 per 1 million output tokens, and $0.19 per 1 million tokens for cache hits. Fireworks AI also suggests that internal benchmarks, such as those developed by Ramp, are more useful than saturated public leaderboards for evaluating AI models.
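
To make the quoted per-token prices concrete, here is a minimal Python sketch of a per-request cost estimate. The function name `estimate_cost` and its parameters are hypothetical, and the billing rule is an assumption: cached input tokens are taken to be charged at the cache-hit rate instead of the regular input rate. Actual billing on Fireworks may differ.

```python
# Prices quoted in the summary, in USD per 1 million tokens.
INPUT_PER_M = 0.95
OUTPUT_PER_M = 4.00
CACHE_HIT_PER_M = 0.19


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost in USD of a single request.

    Assumption (not from the source): cached_tokens is the portion of
    input_tokens served from cache, billed at the cache-hit rate
    instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total_per_million = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHE_HIT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total_per_million / 1_000_000
```

Under these assumptions, a request with 100,000 input tokens (80,000 of them cache hits) and 2,000 output tokens comes to about $0.042, compared with $0.103 for the same request with no cache hits.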

IMPACT Enhances inference capabilities for AI applications with a large context window and tiered service options.

RANK_REASON This is a product update for an inference infrastructure service, not a new frontier model release.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. X — Fireworks (inference infra) TIER_1 English(EN) · FireworksAI_HQ ·

    RT @richychn: the same tech that powered Kimi K2.6 Fast on Fire Pass now powers Kimi K2.7 Fast

  2. X — Fireworks (inference infra) TIER_1 English(EN) · FireworksAI_HQ ·

    RT @eglyman: public benchmarks are saturated. every frontier model has trained against them, and the leaderboard tells you near nothing. w…