
WAN 2.2 SVI 2.0 Pro — Generate Long AI Videos in ComfyUI (Low VRAM)

Oct 2025 · 10 min read · AI Video · WAN 2.2 · SVI · ComfyUI · Long-Form

WAN 2.2 produces some of the best AI video quality available — but most setups cap you at around 5 seconds before memory runs out. The SVI (Stable Video Infinity) 2.0 Pro LoRAs change that: by chaining clips together seamlessly, they let you generate long-form video that runs as long as you want. And with quantized GGUF models, this works on cards with as little as 6 GB of VRAM.

How SVI Chaining Works

Instead of generating a 1-minute video in one shot (which would require enormous VRAM), SVI uses a smarter approach:

♾️
Infinite Length
Chain as many 5-second segments as needed — no hard video length limit
💾
6 GB VRAM Minimum
GGUF quantized models enable low-VRAM operation (12 GB+ recommended for comfort)
⚡
4–8 Steps with LightX2V
LightX2V LoRAs cut required steps from ~20 down to 4–8 for faster generation
🔗
Seamless Transitions
SVI LoRAs maintain motion, lighting, and character consistency between segments
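The chaining idea can be sketched in a few lines of Python: each segment is conditioned on the trailing frames of the previous one, so per-segment memory cost stays constant while total length grows. Everything below is illustrative — the function names, frame counts, and "frames" themselves are stand-ins, not real ComfyUI node APIs.

```python
# Illustrative sketch of SVI-style segment chaining (not the real ComfyUI API).
# Each "frame" is just a string label here; in the actual workflow these are
# latents/images passed between node groups.

FRAMES_PER_SEGMENT = 81  # roughly 5 s per segment (assumed typical WAN setting)
CONTEXT_FRAMES = 4       # trailing frames handed to the next segment

def generate_segment(prompt, context_frames, n_frames=FRAMES_PER_SEGMENT):
    """Stand-in for one 5-second WAN 2.2 + SVI generation pass."""
    start = context_frames[-1] if context_frames else "start_image"
    return [f"{prompt}:{start}:{i}" for i in range(n_frames)]

def chain_segments(prompts):
    """Generate one long video as a chain of short, overlapping-context segments."""
    video, context = [], []
    for prompt in prompts:
        segment = generate_segment(prompt, context)
        video.extend(segment)
        context = segment[-CONTEXT_FRAMES:]  # hand the tail frames forward
    return video

video = chain_segments(["walks left", "turns around", "sits down", "waves"])
print(len(video))  # 4 segments x 81 frames = 324
```

The key point the sketch shows: only one segment is ever "in flight," which is why VRAM use doesn't grow with video length.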
One-click installer available on Patreon — handles all file downloads, model placement, and ComfyUI setup automatically.

Step 1 — Install ComfyUI (Portable Windows)

  1. Download the ComfyUI Portable ZIP from the ComfyUI releases page and extract it with 7-Zip.
  2. Navigate into the custom_nodes folder, click the address bar, type cmd, and press Enter.
  3. Run git clone for the ComfyUI Manager repository.
  4. Navigate back to the main ComfyUI folder (the one containing the embedded Python folder) and run the dependency-install command from the written guide, so Manager's requirements are installed into the portable Python environment rather than your system Python.
Written guide with all commands and links is linked in the video description — copy-paste all commands from there to avoid typos.

Step 2 — Download Models

You'll need to gather several files before launching the workflow:

| File | Source | Destination |
| --- | --- | --- |
| WAN 2.2 14B GGUF (high-noise) | QuantStack HuggingFace — Image-to-Video repo → high-noise folder | models/unet/ |
| WAN 2.2 14B GGUF (low-noise) | QuantStack HuggingFace — Image-to-Video repo → low-noise folder | models/unet/ |
| SVI V2 Pro LoRA (high + low noise) | Kijai HuggingFace → loras/stable_video_infinity/v2.0/ | models/loras/ |
| LightX2V LoRA (high + low noise) | Kijai HuggingFace → loras/lightx2v/, or the LightX2V WAN 2.2 HuggingFace repo | models/loras/ |
| UMT5 XXL CLIP model | city96 HuggingFace — umt5 repository | models/clip/ |
| WAN 2.1 VAE | Comfy-Org WAN repackaged → files/VAE folder | models/vae/ |
| Upscale model | Channel HuggingFace repo (link in guide) | models/upscale_models/ |
GGUF quantization level: The Q3_K_M quantization is a good starting point — it runs on 6 GB VRAM cards and delivers solid results. If you have 12 GB+ VRAM, download a larger quant (Q5, Q6, Q8) for better quality.
Important: Make sure you download from the Image-to-Video repository on Quantstack, not the Text-to-Video one. They are separate repositories.
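Once everything is downloaded, a short script can confirm the files landed in the right folders. The folder names come from the table above; the filenames are placeholders — substitute the exact names of the quant and LoRA variants you actually downloaded.

```python
import os

# Destination folders (from the table) mapped to the files expected there.
# Filenames below are PLACEHOLDERS -- replace with your actual downloads.
EXPECTED = {
    "models/unet": ["wan2.2_i2v_high_noise_Q3_K_M.gguf",
                    "wan2.2_i2v_low_noise_Q3_K_M.gguf"],
    "models/loras": ["svi_v2_pro_high_noise.safetensors",
                     "svi_v2_pro_low_noise.safetensors"],
    "models/clip": ["umt5_xxl_encoder.safetensors"],
    "models/vae": ["wan_2.1_vae.safetensors"],
}

def check_models(comfy_root):
    """Return a list of (folder, filename) pairs that are missing."""
    missing = []
    for folder, files in EXPECTED.items():
        for name in files:
            if not os.path.isfile(os.path.join(comfy_root, folder, name)):
                missing.append((folder, name))
    return missing

# Example: report anything missing under your ComfyUI install directory.
for folder, name in check_models("ComfyUI"):
    print(f"missing: {folder}/{name}")
```

Running this before launching the workflow saves a round of red-node debugging inside ComfyUI.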

Step 3 — Set Up the Workflow in ComfyUI

  1. Launch ComfyUI and load the SVI long-video workflow (download link on CivitAI — linked in video description).
  2. If any nodes appear red, go to Manager → Install Missing Nodes. Install each missing package, then restart ComfyUI and refresh your browser.
  3. Check every model loader node and verify each one is set to the file you actually downloaded (use the dropdown arrows on each node to select the correct file).

GGUF vs. Full Diffusion Model

The workflow defaults to the GGUF model loader for low-VRAM operation. If you have a high-end GPU and want to use the full-precision diffusion model instead, there's a fast group bypasser switch in the workflow — enable the diffusion model loader and disable the GGUF option.
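Rough file-size arithmetic shows why the quant choice matters: a GGUF file stores an approximately fixed number of bits per weight, so size scales linearly with the quant level for a 14B-parameter model. The bits-per-weight figures below are approximate averages, not exact values.

```python
# Rough GGUF file-size estimate: parameters x bits-per-weight / 8.
# Bits-per-weight values are APPROXIMATE averages for each quant type.
PARAMS = 14e9  # WAN 2.2 14B

BPW = {
    "Q3_K_M": 3.9,   # the 6 GB-VRAM starting point mentioned above
    "Q5_K_M": 5.7,
    "Q6_K":   6.6,
    "Q8_0":   8.5,
    "FP16":  16.0,   # full-precision baseline, for comparison
}

for quant, bpw in BPW.items():
    gb = PARAMS * bpw / 8 / 1e9
    print(f"{quant:>7}: ~{gb:.1f} GB")
```

This is why Q3_K_M (roughly 7 GB on disk, with layers offloaded as needed) is viable on 6 GB cards, while Q8 or full precision wants the 12 GB+ hardware the guide recommends.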

Step 4 — Generate a Long Video

  1. Load your starting image in the Load Image node.
  2. Set resolution in the Resize Image node — this controls both the input resize and the output video dimensions.
  3. The default workflow generates a 20-second video split into four 5-second segments. Write a separate prompt for each segment describing what happens in that 5-second window.
  4. Check the seed settings — default is "fixed" (same output every run). Change to "randomize" if you want variations.
  5. Click Run. The SVI LoRA automatically passes the final frames of each segment into the next, creating seamless continuity.
Generation speed on RTX 4090: A full 20-second video takes approximately 5–7 minutes. On 6 GB VRAM hardware it's slower but functional.
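The segment math is simple and worth sketching: each segment covers about 5 seconds, so the number of prompts you write determines total length. (The 16 fps frame rate is an assumption based on typical WAN output settings.)

```python
import math

FPS = 16                 # typical WAN output frame rate (assumption)
SECONDS_PER_SEGMENT = 5  # one SVI segment

def segments_needed(target_seconds):
    """How many 5-second segments (and therefore prompts) a target length needs."""
    return math.ceil(target_seconds / SECONDS_PER_SEGMENT)

def total_frames(n_segments):
    """Frames the workflow must generate for a given segment count."""
    return n_segments * SECONDS_PER_SEGMENT * FPS

n = segments_needed(20)
print(n, "segments,", total_frames(n), "frames")  # 4 segments, 320 frames
```

So the default 20-second workflow is exactly the four-segment, four-prompt case; a 60-second video would need twelve segments and twelve prompts.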

Extending Beyond 20 Seconds (Infinite Chaining)

Want more than 20 seconds? Adding more segments is straightforward:

  1. Select all nodes in one of the existing 5-second subsection groups.
  2. Right-click → Clone.
  3. Drag the clone into position and connect the extended image output from the previous segment into the previous image input of the new clone.
  4. Connect the new clone's extended image output to the upscale image connector at the end of the workflow.

Each clone adds 5 seconds. Repeat as many times as you want.

Tips for Best Results

📦 Want to skip the setup?

The Local Lab offers pre-configured AI installer packages so you can get running in minutes, not hours.

Get the Installer →