Haiku to Opus in Just 10 bits: LLMs Unlock Large Compression Gains
Researchers have developed new methods for compressing text generated by large language models (LLMs), reporting gains in both lossless and lossy compression. By applying LoRA adapters to lossless compression, they improved LLM-based arithmetic coding twofold. For lossy compression, they introduced an interactive protocol called Question-Asking (QA) compression, in which a smaller model asks a larger model yes/no questions to refine its response. Each yes/no answer costs one bit, so ten questions cost about 10 bits. The QA method achieved compression ratios more than 100 times smaller than previous LLM-based techniques, transferring knowledge with very little data.
AI IMPACT: New compression techniques could significantly reduce the cost and latency of deploying LLMs by enabling more efficient knowledge transfer.
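The question-and-answer loop described above can be illustrated with a toy sketch. This is not the researchers' implementation: the two models are replaced by hypothetical stand-ins (`toy_ask` plays the smaller model, `make_toy_answer` plays the larger one), and the "knowledge" being transferred is just a secret integer below 1024. The sketch only shows why ten yes/no answers amount to a 10-bit message.

```python
from typing import Callable, List, Tuple


def run_qa_protocol(
    ask: Callable[[List[bool]], int],
    answer: Callable[[int], bool],
    num_questions: int = 10,
) -> List[bool]:
    """Collect one yes/no answer per question; the answers are the message."""
    bits: List[bool] = []
    for _ in range(num_questions):
        question = ask(bits)            # asker chooses a question from the bits so far
        bits.append(answer(question))   # answerer replies with a single bit
    return bits


def toy_interval(bits: List[bool], size: int = 1024) -> Tuple[int, int]:
    """Replay the answers to recover the range [lo, hi) still consistent with them."""
    lo, hi = 0, size
    for bit in bits:
        mid = (lo + hi) // 2
        if bit:
            lo = mid   # "yes": the secret is at least mid
        else:
            hi = mid   # "no": the secret is below mid
    return lo, hi


def toy_ask(bits: List[bool]) -> int:
    """Hypothetical small model: ask whether the secret is at least the midpoint."""
    lo, hi = toy_interval(bits)
    return (lo + hi) // 2


def make_toy_answer(secret: int) -> Callable[[int], bool]:
    """Hypothetical large model: it knows the secret and answers truthfully."""
    return lambda threshold: secret >= threshold


bits = run_qa_protocol(toy_ask, make_toy_answer(700))
lo, hi = toy_interval(bits)
print(len(bits), lo, hi)
```

In this toy, the receiver needs only the list of ten booleans to reconstruct the value, because it can replay the same questions. A real system would replace the integer search with natural-language questions about a response, where the answers narrow down content rather than a number.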