How fast is 10 tokens per second really?
A new interactive tool lets users visualize the speed of language model token generation, from 5 to 800 tokens per second. Developed by Mike Veerman, the web application helps users understand advertised speeds like "30 tokens/second" by simulating the output in real time. The tool is useful for gauging how fast different LLMs feel in practice.
IMPACT: Helps users intuitively grasp and compare LLM generation speeds, aiding in model selection and expectation setting.
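
To make the idea of simulating a generation speed concrete, here is a minimal Python sketch. It is not the tool's actual implementation, which the summary does not describe. It assumes a constant rate, so each token is followed by a pause of 1/rate seconds, and it treats whitespace-separated words as stand-ins for tokens (real tokenizers usually split text into smaller pieces). The names `seconds_for` and `stream_tokens` are hypothetical.

```python
import sys
import time
from typing import Callable, Iterable


def seconds_for(token_count: int, tokens_per_second: float) -> float:
    """Time needed to emit token_count tokens at a constant rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return token_count / tokens_per_second


def _print_token(token: str) -> None:
    # Flush after every token so output appears immediately
    # instead of waiting for a full line.
    sys.stdout.write(token)
    sys.stdout.flush()


def stream_tokens(
    tokens: Iterable[str],
    tokens_per_second: float,
    write: Callable[[str], object] = _print_token,
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Emit tokens one at a time, pausing 1/rate seconds after each.

    Assumption: a fixed delay per token. Real model output is often
    burstier. `write` and `sleep` are injectable so the pacing can be
    checked without actually waiting.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    delay = 1.0 / tokens_per_second
    count = 0
    for token in tokens:
        write(token)
        sleep(delay)
        count += 1
    return count
```

For example, `stream_tokens((w + " " for w in text.split()), 10)` prints the words of `text` at roughly 10 per second, and `seconds_for(300, 30)` shows that a 300-token answer at the advertised "30 tokens/second" takes 10 seconds to finish.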