PulseAugur

Clark Labs shrinks text-to-image model to 374MB with ternary quantization

Clark Labs has released Clark Air, a 1.6-billion-parameter text-to-image transformer compressed to approximately 1.85 bits per weight. The packed file is 374 MB, about 8.6 times smaller than the FP16 equivalent, while reportedly maintaining near-FP16 quality. The model is based on the Sana 1.6B architecture and uses ternary quantization, with a small high-precision tail retained for the conditioning and projection layers.
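The reported figures can be sanity-checked with simple arithmetic. The sketch below assumes exactly 1.6 billion parameters and decimal megabytes (10^6 bytes); the real checkpoint's exact parameter count and file metadata are not given in the summary, which would plausibly explain the small gap between the estimate and the reported 374 MB.

```python
# Back-of-the-envelope check of the reported size figures.
# Assumptions: exactly 1.6e9 parameters, 1 MB = 10**6 bytes.
PARAMS = 1.6e9           # parameter count from the summary
BITS_PER_WEIGHT = 1.85   # reported average bits per weight
FP16_BITS = 16           # bits per weight in the FP16 baseline


def size_mb(params: float, bits_per_weight: float) -> float:
    """Return storage in megabytes for `params` weights at the given bit width."""
    return params * bits_per_weight / 8 / 1e6


packed_mb = size_mb(PARAMS, BITS_PER_WEIGHT)
fp16_mb = size_mb(PARAMS, FP16_BITS)
ratio = fp16_mb / packed_mb

print(f"packed ~ {packed_mb:.0f} MB, FP16 ~ {fp16_mb:.0f} MB, ratio ~ {ratio:.1f}x")
```

Under these assumptions the estimate comes out at roughly 370 MB against 3,200 MB for FP16, a ratio of about 8.6, which is consistent with the numbers in the summary.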

IMPACT Enables deployment of advanced text-to-image models on devices with limited storage and computational resources.

RANK_REASON Release of a quantized model with performance metrics.

Read on r/LocalLLaMA →

AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. r/LocalLLaMA TIER_1 English(EN) · /u/pmttyji ·

    clark-labs/clark-air-sana-1.6b-1.58bit · Hugging Face

    https://www.reddit.com/r/LocalLLaMA/comments/1uhobd0/clarklabsclarkairsana16b158bit_hugging_face/

  2. r/StableDiffusion TIER_2 English(EN) · /u/LumenLime ·

    clark-labs/clark-air-sana-1.6b-1.58bit · Hugging Face

    https://huggingface.co/clark-labs/clark-air-sana-1.6b-1.58bit
    A Sana 1.6B text-to-image transformer compressed to ternary (~1.85 bits/weight): 8.6× smaller…
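For readers unfamiliar with ternary weights, here is a minimal generic sketch, not Clark Labs' confirmed method. It assumes absmean scaling (the scheme popularized by BitNet b1.58) and a hypothetical packing of five ternary codes per byte; the function names `ternarize`, `pack_trits`, and `unpack_trits` are illustrative only.

```python
def ternarize(weights):
    """Absmean ternary quantization: return (codes in {-1, 0, 1}, scale).

    Assumes a non-empty list of floats.
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def pack_trits(codes):
    """Pack 5 ternary codes per byte (3**5 = 243 fits in 256 values)."""
    out = bytearray()
    for i in range(0, len(codes), 5):
        value = 0
        # First code in the chunk ends up as the least significant base-3 digit.
        for c in reversed(codes[i:i + 5]):
            value = value * 3 + (c + 1)
        out.append(value)
    return bytes(out)


def unpack_trits(data, n):
    """Inverse of pack_trits: recover the first n ternary codes."""
    codes = []
    for byte in data:
        for _ in range(5):
            codes.append(byte % 3 - 1)
            byte //= 3
    return codes[:n]


weights = [0.9, -0.05, -1.2, 0.4, 0.0, 0.7]
codes, scale = ternarize(weights)
packed = pack_trits(codes)
restored = unpack_trits(packed, len(codes))
approx = [c * scale for c in restored]  # dequantized approximation of weights
```

Packing five codes per byte costs 1.6 bits per weight. An average above that, such as the reported ~1.85, would be consistent with keeping some layers at higher precision, as the summary describes, though the exact storage layout is not stated in the sources.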