GitHub - chopratejas/headroom: Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
Headroom is a new open-source tool that compresses tool outputs, logs, files, and RAG chunks before they reach a large language model. The project claims 60-95% fewer tokens with the same answers, which would speed up processing and make smaller models more viable for complex tasks. It can run as a library, a proxy, or an MCP server, and it includes optional telemetry that users can disable.
IMPACT: Reduces token usage and speeds up LLM processing, making smaller models more practical.
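
To make the idea of compressing input before it reaches the model concrete, here is a minimal sketch in Python. It is not Headroom's API or algorithm: `collapse_repeats` and `approx_tokens` are hypothetical helpers invented for illustration, and the whitespace-based token count is only a crude stand-in for a real tokenizer. The sketch collapses runs of identical log lines and measures how much smaller the prompt gets.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse each run of identical consecutive lines into one annotated line."""
    out = []
    for line, run in groupby(lines):
        n = sum(1 for _ in run)  # length of this run of identical lines
        out.append(line if n == 1 else f"{line}  [repeated {n}x]")
    return out


def approx_tokens(text):
    # Assumption: whitespace-separated words as a rough token proxy.
    # Real LLM tokenizers will give different counts.
    return len(text.split())


# Hypothetical tool output with a noisy, repetitive middle section.
log = ["connecting to db"] + ["retry: timeout"] * 8 + ["connected"]

compact = collapse_repeats(log)
before = approx_tokens("\n".join(log))
after = approx_tokens("\n".join(compact))
reduction = 1 - after / before

print(f"{before} -> {after} approximate tokens ({reduction:.0%} fewer)")
```

How much a step like this saves depends entirely on how repetitive the input is, so the figure from this toy log says nothing about the 60-95% range reported for Headroom itself.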