PulseAugur

StateSMix compressor uses Mamba SSMs and n-grams for online lossless compression

StateSMix is a new lossless compression algorithm that combines a Mamba-style State Space Model (SSM) with sparse n-gram context mixing. The model is trained token by token on the data being compressed, so it needs neither pre-trained weights nor a GPU. StateSMix achieves competitive compression ratios, outperforming xz (LZMA2) on the enwik8 benchmark by up to 8.7%. The implementation is written in pure C and processes approximately 2,000 tokens per second on standard hardware.
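The online-training idea in the summary can be illustrated with a toy sketch. This is not StateSMix: the Mamba-style SSM and the sparse n-gram models are replaced here by two simple byte-count predictors (order-0 and order-1), and the function name `online_code_length`, the learning rate, and the linear mixing rule are all assumptions made for illustration. Instead of emitting bits, the sketch reports the ideal code length that an arithmetic coder would approach. Each model update uses only symbols that have already been coded, which is why a decoder could repeat the same updates and no weights need to be shipped.

```python
import math
from collections import defaultdict

ALPHABET = 256  # byte-level symbols (an assumption for this sketch)


def online_code_length(data: bytes, lr: float = 0.02) -> float:
    """Ideal code length in bits for `data` under an online-trained mixture.

    Hypothetical stand-in for an online compressor's model: nothing is
    pre-trained, and every update happens after the symbol is coded.
    """
    order0 = [1] * ALPHABET                        # smoothed symbol counts
    total0 = ALPHABET
    order1 = defaultdict(lambda: [1] * ALPHABET)   # counts per previous byte
    total1 = defaultdict(lambda: ALPHABET)
    w = 0.5       # mixing weight given to the order-1 prediction
    prev = 0      # context before the first byte (arbitrary choice)
    bits = 0.0

    for sym in data:
        # 1. Predict the next symbol with both models, then mix linearly.
        p0 = order0[sym] / total0
        p1 = order1[prev][sym] / total1[prev]
        p = (1.0 - w) * p0 + w * p1

        # 2. An arithmetic coder would spend about -log2(p) bits here.
        bits += -math.log2(p)

        # 3. Learn online: gradient step on the coding loss, then clamp
        #    the weight so neither model is ever switched off completely.
        w += lr * (p1 - p0) / p
        w = min(0.99, max(0.01, w))

        # 4. Update the counts with the symbol that was just coded.
        order0[sym] += 1
        total0 += 1
        order1[prev][sym] += 1
        total1[prev] += 1
        prev = sym

    return bits


if __name__ == "__main__":
    sample = b"ab" * 800
    print(f"{online_code_length(sample):.1f} bits for {len(sample)} bytes")
```

On repetitive input the reported code length falls well below the 8 bits per byte of the raw data, because both predictors adapt as the stream is processed.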

Impact: Introduces a new method for lossless compression using state space models, potentially improving data storage efficiency.

Ranking rationale: This is a research paper detailing a new algorithm and its performance on a standard benchmark.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.LG · English (EN) · Roberto Tacconelli

    StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing

    arXiv:2605.02904v1. Abstract: We present StateSMix, a fully self-contained lossless compressor that couples an online-trained Mamba-style State Space Model (SSM) with sparse n-gram context mixing and arithmetic coding. The model is initialised from scratch and t…