Ollama users seek token count without inference

By PulseAugur Editorial · [1 source] · 2026-05-13 23:40

An Ollama user is asking whether token counts can be obtained without running inference. As the post describes it, the API offers no count-only request, so getting a token count means submitting a prompt and triggering generation. The post sketches a hypothetical `/api/count` endpoint that would take a model name and a prompt. This points to a possible feature gap for developers who need precise token counts for prompt optimization or cost management.

IMPACT A way to count tokens without inference would be a usability improvement for developers using Ollama, making prompt engineering and cost tracking more efficient.

RANK_REASON User inquiry about a specific feature of an existing AI tool.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Mastodon · fosstodon.org · TIER_1 · English (EN) · 2026-05-13 23:40

#Ollama #AI is there no way to request the token count without inferencing? I should be able to be like... `curl -s http://localhost:11434/api/count -d '{ "model": "gemma3:4b", "prompt": "Why is the sky blue? Answer in one sentence.", "stream": false }'` and then it respond wit…
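
The `/api/count` endpoint in the post is the author's wish, not something the post says exists. As a rough illustration of the workaround the post is objecting to, the sketch below calls the existing `/api/generate` endpoint with generation capped at one token and reads the `prompt_eval_count` field from the response. This still loads the model and evaluates the prompt, so it reduces inference rather than avoiding it. The helper names are hypothetical, and the assumption that `num_predict: 1` is the cheapest setting is untested here.

```python
import json
import urllib.request

# Host and port are taken from the curl command in the post.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_count_request(model: str, prompt: str) -> dict:
    """Build a /api/generate payload that keeps generation as short as possible."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Assumption: capping output at one token minimizes, but does not
        # eliminate, the inference work done by the server.
        "options": {"num_predict": 1},
    }


def extract_prompt_tokens(response: dict):
    """Return the prompt token count reported by the server, or None if absent."""
    return response.get("prompt_eval_count")


def count_prompt_tokens(model: str, prompt: str, url: str = OLLAMA_URL):
    """Send the request to a running Ollama server and read back the count."""
    body = json.dumps(build_count_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return extract_prompt_tokens(json.load(resp))
```

With a local server running, `count_prompt_tokens("gemma3:4b", "Why is the sky blue? Answer in one sentence.")` would return whatever count the server reports. That figure may include tokens added by the model's prompt template, and the field may be missing in some responses, which is why the helper returns `None` rather than raising.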
