PulseAugur

Ollama users seek token count without inference

Users are asking whether Ollama can return a token count for a prompt without running a full inference pass. The current API appears to require submitting a prompt for generation, so inference runs even when only a token estimate is wanted. This points to a feature gap for developers who need precise token counts for prompt optimization or cost management.

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT This inquiry highlights a potential usability improvement for AI developers using Ollama, enabling more efficient prompt engineering and cost tracking.

RANK_REASON User inquiry about a specific feature of an existing AI tool.

Read on Mastodon: fosstodon.org

COVERAGE [1]

  1. Mastodon — fosstodon.org TIER_1 · [email protected]


    #Ollama #AI is there no way to request the token count without inferencing? I should be able to be like... `curl -s http://localhost:11434/api/count -d '{ "model": "gemma3:4b", "prompt": "Why is the sky blue? Answer in one sentence.", "stream": false }'` and then it respond wit…
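
    The `/api/count` endpoint in the quoted post does not exist in Ollama's documented API; it is what the user wishes were available. A commonly suggested workaround is to call the real `/api/generate` endpoint with the `num_predict` option set to `0`, so the model evaluates the prompt but produces no output, then read `prompt_eval_count` from the response. The sketch below assumes that behavior (which may vary by Ollama version) and a local server on the default port:

    ```python
    import json
    import urllib.request

    def build_count_payload(model: str, prompt: str) -> dict:
        """Build a /api/generate request that asks for zero output tokens."""
        return {
            "model": model,
            "prompt": prompt,
            "stream": False,
            # Assumption: num_predict=0 evaluates the prompt without generating.
            "options": {"num_predict": 0},
        }

    def count_tokens(model: str, prompt: str,
                     host: str = "http://localhost:11434") -> int:
        """POST to /api/generate and return prompt_eval_count.

        Requires a running Ollama instance with the model pulled.
        """
        data = json.dumps(build_count_payload(model, prompt)).encode()
        req = urllib.request.Request(
            f"{host}/api/generate",
            data=data,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["prompt_eval_count"]

    if __name__ == "__main__":
        print(count_tokens("gemma3:4b",
                           "Why is the sky blue? Answer in one sentence."))
    ```

    This still loads the model and processes the prompt, so it is not free, but it avoids paying for output generation; a dedicated counting endpoint, as the post requests, would avoid prompt evaluation cost as well.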