User runs 5B-parameter video model on 8 GB VRAM via block swapping

By PulseAugur Editorial · [1 source] · 2026-06-20 17:52

A user has run the Wan 2.2 TI2V 5B model on a graphics card with only 8 GB of VRAM by using WanVideoBlockSwap, a ComfyUI node that offloads transformer blocks to system RAM during inference instead of keeping the whole model in VRAM. This lets larger models run on GPUs with less memory. The trade-off is significantly slower generation, but the user reports that output quality is indistinguishable from runs on high-VRAM GPUs.
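The idea behind block swapping can be illustrated with a small simulation. This is a minimal sketch and not the WanVideoBlockSwap implementation: `Block`, `SwapRunner`, the block count, and the per-block size are hypothetical names and figures chosen for illustration, and the "devices" are plain strings rather than real GPU memory. A real implementation would presumably move PyTorch modules between devices instead.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Stand-in for one transformer block; size_mb is an invented figure."""
    index: int
    size_mb: int
    device: str = "cpu"


@dataclass
class SwapRunner:
    """Runs blocks in order, keeping at most `resident` blocks on the simulated GPU."""
    blocks: list
    resident: int
    peak_gpu_mb: int = 0
    transfers: int = 0

    def _gpu_mb(self):
        return sum(b.size_mb for b in self.blocks if b.device == "gpu")

    def _load(self, block):
        if block.device != "gpu":
            block.device = "gpu"
            self.transfers += 1
            self.peak_gpu_mb = max(self.peak_gpu_mb, self._gpu_mb())

    def _evict(self, block):
        if block.device == "gpu":
            block.device = "cpu"
            self.transfers += 1

    def forward(self, x):
        for i, block in enumerate(self.blocks):
            # Evict the oldest resident block before loading the next one,
            # so simulated GPU usage never exceeds `resident` blocks.
            if i >= self.resident:
                self._evict(self.blocks[i - self.resident])
            self._load(block)
            x = x + 1  # placeholder for the block's real computation
        # Return every block to system RAM once the pass is finished.
        for block in self.blocks:
            self._evict(block)
        return x


if __name__ == "__main__":
    runner = SwapRunner(blocks=[Block(i, 100) for i in range(30)], resident=4)
    runner.forward(0)
    print(runner.peak_gpu_mb, runner.transfers)
```

With 30 hypothetical blocks of 100 MB each and `resident=4`, the simulated peak is 400 MB instead of 3000 MB, at the cost of 60 transfers per pass. Those transfers are where the slowdown described above comes from: every pass pays for moving blocks between system RAM and the GPU.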

IMPACT Enables running larger video generation models on consumer-grade hardware with limited VRAM.

RANK_REASON User-reported technique for running a large model on limited hardware.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/StableDiffusion TIER_2 English(EN) · /u/ApprehensiveAd1946 · 2026-06-20 17:52

How I got Wan 2.2 TI2V 5B running on 8 GB VRAM using block swapping (and what the tradeoffs actually are)

I've been running Wan 2.2 TI2V 5B Turbo locally on an RTX 4060 8 GB and the thing that made it possible was WanVideoBlockSwap, a ComfyUI node that offloads transformer blocks to CPU RAM between attention passes instead of keeping the whole model…
