Prediction Under Imperfect Compression: A Theory of Approximate MDL
Researchers are exploring novel methods for compressing large models and datasets to improve efficiency. Papers discuss unifying dataset pruning and distillation, bootstrapped tokenization for image generation, and activation-informed low-rank compression for LLMs and VLMs. Other work focuses on generic triple-latent sequence models, theoretical aspects of prediction under imperfect compression, and jointly optimizing architectural and quantization choices for LLM compression.
AI IMPACT: Advances in compression techniques could significantly reduce deployment costs and increase the accessibility of large AI models.