PulseAugur

StateSMix compressor uses Mamba SSMs and n-grams for online lossless compression

Researchers have developed StateSMix, a lossless compression algorithm that combines a Mamba-style State Space Model (SSM) with sparse n-gram context mixing. The model trains token by token on the data being compressed, so it needs neither pre-trained weights nor a GPU. StateSMix reports competitive compression ratios, outperforming xz (LZMA2) on the enwik8 benchmark by up to 8.7%. The implementation is in pure C and processes approximately 2,000 tokens per second on standard hardware.
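The "train on the data being compressed" idea can be illustrated with a much simpler model than an SSM. The sketch below is not StateSMix's method: it assumes a plain adaptive order-1 byte model with Laplace smoothing, and the function name `online_code_length_bits` is hypothetical. It shows the predict-then-update loop that lets a decoder rebuild the same model from already-decoded symbols, and it reports the ideal arithmetic-coding cost (the sum of -log2 p) rather than producing an actual bitstream.

```python
import math
from collections import defaultdict


def online_code_length_bits(data: bytes, order: int = 1) -> float:
    """Ideal coding cost in bits of `data` under an adaptive n-gram byte model.

    Illustrative assumption: a count-based model stands in for the
    learned predictor described in the summary.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        # 1) Predict the next byte using only what has been seen so far
        #    (Laplace smoothing over the 256 possible byte values).
        p = (counts[ctx][sym] + 1) / (totals[ctx] + 256)
        # 2) An arithmetic coder would spend about -log2(p) bits here.
        bits += -math.log2(p)
        # 3) Update the model after coding, so a decoder can mirror the step.
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


if __name__ == "__main__":
    sample = b"ab" * 500
    cost = online_code_length_bits(sample)
    print(f"{cost / len(sample):.2f} bits per byte (raw: 8.00)")
```

Because the model starts empty, the first occurrences of each pattern are expensive and the cost per symbol falls as the statistics accumulate; no weights have to be shipped with the compressed file.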

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a new method for lossless compression using state space models, potentially improving data storage efficiency.

RANK_REASON This is a research paper detailing a new algorithm and its performance on a standard benchmark.


COVERAGE [1]

  1. arXiv cs.LG TIER_1 · Roberto Tacconelli ·

    StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing

    arXiv:2605.02904v1. Abstract: We present StateSMix, a fully self-contained lossless compressor that couples an online-trained Mamba-style State Space Model (SSM) with sparse n-gram context mixing and arithmetic coding. The model is initialised from scratch and t…
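
Context mixing, as named in the abstract, means combining several predictors into one probability before arithmetic coding. The excerpt does not say how StateSMix does this, so the sketch below assumes the logistic mixing commonly used in context-mixing compressors, applied to binary predictions for simplicity. The class `LogisticMixer` and its parameters are hypothetical names chosen for illustration.

```python
import math


def stretch(p: float) -> float:
    """Map a probability to the logistic domain: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def squash(x: float) -> float:
    """Inverse of stretch: map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class LogisticMixer:
    """Weighted mix of bit predictions, trained online on coding cost."""

    def __init__(self, n_inputs: int, lr: float = 0.05):
        self.w = [1.0 / n_inputs] * n_inputs  # start with equal weights
        self.lr = lr
        self.x = [0.0] * n_inputs
        self.p = 0.5

    def mix(self, probs):
        """Return the mixed probability that the next bit is 1."""
        # Clamp inputs so stretch() stays finite.
        self.x = [stretch(min(max(p, 1e-6), 1.0 - 1e-6)) for p in probs]
        self.p = squash(sum(w * x for w, x in zip(self.w, self.x)))
        return self.p

    def update(self, bit: int) -> None:
        """One gradient step on log loss after the true bit is known."""
        err = bit - self.p
        self.w = [w + self.lr * err * x for w, x in zip(self.w, self.x)]


if __name__ == "__main__":
    mixer = LogisticMixer(2)
    # Predictor 0 is confident and right; predictor 1 is uninformative.
    for _ in range(200):
        mixer.mix([0.9, 0.5])
        mixer.update(1)
    print(mixer.w, mixer.mix([0.9, 0.5]))
```

The weights shift toward whichever predictor has been reducing coding cost, which is how a mixer could let an n-gram model dominate on repetitive data while a learned model takes over elsewhere.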