Fireworks AI launches Moonshot's K2.7 Code model for efficient agentic coding

By PulseAugur Editorial · [3 sources] · 2026-06-13 04:38

Moonshot has released K2.7 Code, the latest in its K2 line of coding models, and Fireworks AI made it available on day zero on its serverless platform and API. The model is designed to cut reasoning-token usage in long agent loops, where each turn's reasoning is reused as context on every following turn. Shorter reasoning means smaller downstream contexts, faster generations, fewer retries, and a lower cost per completed task. According to Fireworks, K2.7 Code produces roughly 30% fewer reasoning tokens than its predecessor, K2.6, while scoring higher on Moonshot's coding benchmarks.

IMPACT This release offers improved efficiency for agentic coding tasks, potentially lowering operational costs for AI developers.

RANK_REASON This is a product launch for an inference infrastructure provider, not a frontier model release from a core AI lab.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

X — Fireworks (inference infra) TIER_1 English(EN) · FireworksAI_HQ · 2026-06-13 04:38

Available now on Fireworks serverless.
→ Standard tier (pay per token)
→ Priority tier for critical workloads
→ Fast path coming soon
Pricing: $0.95 / 1M input, $4 / 1M output, $0.19 / 1M cache hits. 256K context.
Full details here: https://t.co/R589xwqjf0 https://t.co/PwOW1Bjd…
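
As a rough illustration of the quoted prices, the sketch below estimates the cost of a single request. The function name `request_cost` and the billing rules are assumptions, not Fireworks' documented behavior: it assumes cached input tokens are billed at the cache-hit rate instead of the input rate, and since the post does not say how reasoning tokens are billed, they are simply counted as output here.

```python
# Prices quoted in the post, in USD per 1M tokens.
PRICE_INPUT = 0.95
PRICE_OUTPUT = 4.00
PRICE_CACHE_HIT = 0.19


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache, billed at the cache-hit rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHE_HIT
        + output_tokens * PRICE_OUTPUT
    )
    return total / 1_000_000


# Hypothetical request: 100K prompt tokens (80K of them cached), 2K output tokens.
print(f"${request_cost(100_000, 2_000, cached_tokens=80_000):.4f}")
```

Under these assumptions, a prompt that mostly hits the cache costs far less than the same prompt billed entirely at the input rate, which matters in agent loops that resend a growing context each turn.
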
X — Fireworks (inference infra) TIER_1 English(EN) · FireworksAI_HQ · 2026-06-13 04:38

In long agent loops, reasoning tokens get reused as context on every following turn. Shorter reasoning means smaller contexts downstream, faster generations, and fewer retries. K2.7 Code reduces that overhead without giving up quality, which lowers the real cost per completed
X — Fireworks (inference infra) TIER_1 English(EN) · FireworksAI_HQ · 2026-06-13 04:38

Moonshot released K2.7 Code, the latest in their K2 line of coding models, and it's live on Fireworks Day 0, on serverless and the API. It produces roughly 30% fewer reasoning tokens than K2.6 while scoring higher on Moonshot’s coding benchmarks. For agentic coding work, that h…
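
The effect described in these posts can be sketched with a toy model of an agent loop. Everything below is illustrative: `simulate_loop` is a hypothetical helper, the token counts are made-up inputs, and the model assumes every turn's reasoning and answer tokens are appended in full to the context of all later turns. The only figure taken from the posts is the roughly 30% reduction in reasoning tokens.

```python
def simulate_loop(turns: int, base_context: int,
                  reasoning_per_turn: int, answer_per_turn: int) -> tuple[int, int]:
    """Return (total_input_tokens, total_output_tokens) over an agent loop.

    Assumption: tokens produced on each turn (reasoning + answer) are
    appended to the context and re-sent as input on every later turn.
    Tool outputs and retries are ignored to keep the sketch small.
    """
    context = base_context
    total_input = 0
    total_output = 0
    for _ in range(turns):
        total_input += context            # the whole context is sent as input
        produced = reasoning_per_turn + answer_per_turn
        total_output += produced          # tokens generated this turn
        context += produced               # carried forward to later turns
    return total_input, total_output


# Made-up workload: 20 turns, 10K starting context, 300 answer tokens per turn.
baseline = simulate_loop(20, 10_000, reasoning_per_turn=1_000, answer_per_turn=300)
reduced = simulate_loop(20, 10_000, reasoning_per_turn=700, answer_per_turn=300)  # ~30% fewer

print("baseline (input, output):", baseline)
print("reduced  (input, output):", reduced)
```

In this toy setup the output savings grow linearly with the number of turns, while the input savings grow faster, because shorter reasoning also shrinks every later prompt.
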
