Brief · PulseAugur

TOOL · Simon Willison English(EN) · 1w · [3 sources]

How fast is 10 tokens per second really?

A new interactive tool lets users visualize language model token generation at speeds from 5 to 800 tokens per second. Developed by Mike Veerman, the web application makes advertised figures such as "30 tokens/second" concrete by simulating output at that rate in real time. It is useful for gauging how fast different LLMs will feel in practice.
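
The idea behind such a simulation can be sketched in a few lines of Python. This is not the tool's actual code: the function names `seconds_for` and `stream` are hypothetical, and each whitespace-separated word is treated as one token, which is only a rough approximation since real tokenizers usually split text into sub-word pieces.

```python
import sys
import time


def seconds_for(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def stream(text: str, tokens_per_second: float,
           sleep=time.sleep, out=sys.stdout) -> None:
    """Write text one word at a time, pausing to match the given rate.

    Assumption: one word stands in for one token.
    """
    delay = seconds_for(1, tokens_per_second)
    for word in text.split():
        out.write(word + " ")
        out.flush()
        sleep(delay)
    out.write("\n")


if __name__ == "__main__":
    stream("How fast is ten tokens per second really?", 10)
```

Under these assumptions, a 300-token answer takes 30 seconds at 10 tokens per second and 10 seconds at 30 tokens per second.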

IMPACT Helps users intuitively grasp and compare LLM generation speeds, aiding in model selection and expectation setting.

language models
tokens per second
LLM
Mike Veerman