AI Model Pricing Shifts: NVIDIA, MoonshotAI, DeepSeek Cut Costs; Z.ai Adds Long-Context Model

By PulseAugur Editorial · [1 source] · 2026-06-17 11:57

Several AI model providers have adjusted pricing or released new models. NVIDIA cut the completion price of Nemotron 3 Ultra, which benefits long-form generation workloads. MoonshotAI's Kimi K2.7 Code and DeepSeek's V4 Flash reduced both prompt and completion costs, a change aimed at developers sensitive to input-token expenses and at users who want low-latency inference. Z.ai introduced GLM 5.2, a model with a 1,048,576-token context window, though its generation costs are moderate to high.

IMPACT Price reductions and new models with varying context lengths offer more options for developers optimizing cost and performance.

RANK_REASON This item is a digest of pricing changes and new model additions from various AI providers, not a primary release from a frontier lab.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

dev.to — LLM tag TIER_1 (AF) · 4663437Mehdi · 2026-06-17 11:57

2026-06-17 Digest

Most Impactful Change

NVIDIA: Nemotron 3 Ultra – completion price fell from $2.50/1M to $2.20/1M (prompt unchanged at $0.50/1M). Who should care: Teams r…
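To make the Nemotron 3 Ultra change concrete, the following is a minimal sketch of the arithmetic, not a billing tool. The three prices come from the excerpt above; the monthly token volumes and the `workload_cost` helper are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def workload_cost(prompt_tokens, completion_tokens, prompt_price, completion_price):
    """Return the USD cost of a workload given per-1M-token prices."""
    total = (Decimal(prompt_tokens) * prompt_price
             + Decimal(completion_tokens) * completion_price)
    return total / TOKENS_PER_UNIT


# Prices quoted in the digest excerpt (USD per 1M tokens).
PROMPT_PRICE = Decimal("0.50")
OLD_COMPLETION_PRICE = Decimal("2.50")
NEW_COMPLETION_PRICE = Decimal("2.20")

# Hypothetical, generation-heavy monthly workload (assumed values).
PROMPT_TOKENS = 200_000_000
COMPLETION_TOKENS = 800_000_000

old_cost = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS,
                         PROMPT_PRICE, OLD_COMPLETION_PRICE)
new_cost = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS,
                         PROMPT_PRICE, NEW_COMPLETION_PRICE)
savings = old_cost - new_cost
savings_pct = savings / old_cost * 100

print(f"old: ${old_cost:.2f}  new: ${new_cost:.2f}")
print(f"savings: ${savings:.2f} ({savings_pct:.1f}%)")
```

Under these assumed volumes the monthly bill falls from $2,100 to $1,860, a saving of $240 or about 11.4%. The completion price itself dropped 12%, so the overall saving approaches that figure as the share of completion tokens in a workload grows.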

