AI token costs to drop by 2027 amid hardware/software gains · 4 sources tracked

By PulseAugur Editorial · [4 sources] · 2026-06-27 17:00

SemiAnalysis reports that the cost of AI tokens is projected to be materially lower by 2027, driven by advancements in hardware and software optimization. Higher per-GPU throughput lowers the cost per token, while substituting tokens for hours of junior-analyst work is altering the unit economics of professional services and AI operations. The firm notes that its own token expenditure has reached roughly 30% of employee compensation, highlighting a broader trend across research firms and financial institutions.

IMPACT Predicts a significant decrease in AI token costs by 2027, impacting the economics of AI operations and professional services.

RANK_REASON Analysis and projections about AI token costs and adoption trends from a research firm.

AI-generated summary · Google Gemini · from 4 sources. How we write summaries →

COVERAGE [4]

X — SemiAnalysis TIER_1 English(EN) · SemiAnalysis_ · 2026-06-27 17:00

If you are an operator trying to write down what tokens will cost in 2027, the answer is materially lower than today, and the firms that have already adopted are the ones setting the pace. The full math, plus a value capture breakdown across labs, hyperscalers, inference
X — SemiAnalysis TIER_1 English(EN) · SemiAnalysis_ · 2026-06-27 17:00

The throughput math has gotten the most pushback in our reader notes, so it's worth being precise. On the same B300 running DeepSeek R1, baseline FP8 sits near 1,000 tokens/sec/GPU; adding wideEP plus disagg gets you to roughly 8,000; and layering MTP on top pushes it to about
X — SemiAnalysis TIER_1 English(EN) · SemiAnalysis_ · 2026-06-27 17:00

The substitution math is the part to internalize. Tasks that used to need a junior analyst for several hours (converting a model to a dashboard, building chart packs from earnings, rebuilding a comp set) now resolve in minutes for a few dollars of tokens. The blended Opus 4.7
X — SemiAnalysis TIER_1 English(EN) · SemiAnalysis_ · 2026-06-27 17:00

One of the more uncomfortable observations in our AI Value Capture piece is internal: our token spend at SemiAnalysis now runs at roughly 30% of employee compensation, with employees pulling just under 5 billion tokens per month on average, over 5x more than Meta, and our top htt…
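
The throughput figures quoted above translate into cost per token with simple arithmetic. The sketch below is illustrative only: the two throughput numbers come from the quoted post, while the GPU hourly price (`ASSUMED_GPU_USD_PER_HOUR`) and the full-utilization assumption are hypothetical and not from SemiAnalysis. The truncated MTP figure is deliberately left out.

```python
# Throughput figures quoted in the coverage (tokens/sec/GPU, B300 running DeepSeek R1).
BASELINE_FP8_TPS = 1_000      # baseline FP8
WIDE_EP_DISAGG_TPS = 8_000    # wideEP plus disagg (quoted as "roughly 8,000")

# ASSUMPTION: hypothetical all-in GPU price per hour; not from the source.
ASSUMED_GPU_USD_PER_HOUR = 4.00


def cost_per_million_tokens(tokens_per_sec: float, gpu_usd_per_hour: float) -> float:
    """USD per one million tokens, assuming the GPU is fully utilized."""
    tokens_per_hour = tokens_per_sec * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


baseline_cost = cost_per_million_tokens(BASELINE_FP8_TPS, ASSUMED_GPU_USD_PER_HOUR)
optimized_cost = cost_per_million_tokens(WIDE_EP_DISAGG_TPS, ASSUMED_GPU_USD_PER_HOUR)
speedup = WIDE_EP_DISAGG_TPS / BASELINE_FP8_TPS

print(f"baseline:  ${baseline_cost:.3f} per 1M tokens")
print(f"optimized: ${optimized_cost:.3f} per 1M tokens")
print(f"speedup:   {speedup:.0f}x throughput, same GPU")
```

Whatever hourly price is plugged in, the ratio between the two costs equals the throughput ratio, which is why software-side gains on the same hardware feed directly into lower token prices.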
