WAN 2.2 14B Text/Image To Video (6GB) - ComfyUI Workflow - One Click Installer
Run state-of-the-art video generation locally on devices with as little as 6 GB of VRAM using ComfyUI.
WAN 2.2 14B models have just been released, bringing state-of-the-art video generation to low-VRAM hardware. The quantized GGUF versions deliver high-quality results on consumer GPUs with as little as 6 GB of VRAM, while the standard FP16/FP8 variants remain available for more powerful hardware.
One-Click Installer & Custom Workflow:
I’ve developed a one-click installer and a custom workflow that let you run the WAN 2.2 quantized GGUF models for high-quality video generation on lower-VRAM setups. The installer automatically sets up the Wan2.2 T2V A14B Low Noise Q3_K_S GGUF model; if your system allows, you can easily add a higher-precision GGUF quantization or the standard FP16/FP8 models for even better results.
Preloaded Models within the Installer (Low VRAM):
umt5-xxl-encoder-Q5_K_M.gguf (ComfyUI\models\clip) - https://huggingface.co/city96/umt5-xxl-encoder-gguf/tree/main
wan_2.1_vae.safetensors (ComfyUI\models\vae) - https://huggingface.co/Kijai/WanVideo_comfy/tree/main
Wan2.2-T2V-A14B-LowNoise-Q3_K_S.gguf (ComfyUI\models\unet) - https://huggingface.co/QuantStack/Wan2.2-T2V-A14B-GGUF/tree/main/LowNoise
2xLexicaRRDBNet_Sharp.pth Upscale model (ComfyUI\models\upscale_models) - https://huggingface.co/Thelocallab/2xLexicaRRDBNet_Sharp/blob/main/2xLexicaRRDBNet_Sharp.pth
Wan21 T2V 14B lightx2v (ComfyUI\models\loras) - https://huggingface.co/Thelocallab/WAN-2.1-loras/tree/main
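If a model fails to load after installation, misplacement is the usual cause. As a quick sanity check, a small script like the following can verify that the four files named in the list above sit in the expected subfolders (this helper is a hypothetical sketch, not part of the installer; the LoRA entry is omitted because its exact filename isn't listed):

```python
from pathlib import Path

# Expected locations of the preloaded models, relative to the ComfyUI root.
# File names follow the list above; adjust if you chose different quants.
EXPECTED_MODELS = {
    "models/clip": ["umt5-xxl-encoder-Q5_K_M.gguf"],
    "models/vae": ["wan_2.1_vae.safetensors"],
    "models/unet": ["Wan2.2-T2V-A14B-LowNoise-Q3_K_S.gguf"],
    "models/upscale_models": ["2xLexicaRRDBNet_Sharp.pth"],
}

def missing_models(comfyui_root):
    """Return the expected model files that are not present on disk."""
    root = Path(comfyui_root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED_MODELS.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

Call `missing_models(r"C:\path\to\ComfyUI")` with your install directory; an empty list means every preloaded model is where the workflow expects it.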
The standard Wan2.2 Low Noise 14B diffusion models (FP16 and FP8) are not packaged with the installer, but you can download them directly from the Comfy Org Hugging Face repository and add them to your ComfyUI\models\diffusion_models folder:
Comfy Org Wan2.2 Diffusion Models
Speed:
Generate 480p videos in about 10–15 minutes on an RTX 4050 with 6 GB VRAM; higher-end GPUs are faster.
System Requirements:
Nvidia RTX 30XX, 40XX, or 50XX series GPU (FP16 support required; GTX 10XX and RTX 20XX cards are untested)
CUDA-compatible GPU with at least 6 GB VRAM
Windows OS
At least 40 GB free storage
What’s Included:
Portable ComfyUI Windows Installer, pre-configured for WAN 2.2 text-to-video & image-to-video
Custom workflow supporting text-to-video & image-to-video generation
Automatic downloads for all required nodes and models
Usage Notes:
You can use either the WAN 2.2 14B GGUF models or the standard WAN 2.2 14B diffusion models in the workflow.
Enable or disable workflow sections using the toggles in the Fast Groups Bypasser node on the left side of the workflow.
Enter your detailed descriptive text prompt for the scene you want to generate.
For best results, enhance your prompt with an LLM.
Support and More Information:
Community Support: For troubleshooting or to connect with other users, join the Discord server.
Buy on Patreon
While I improve the store, you can purchase these items or sign up for a membership on Patreon - https://www.patreon.com/TheLocalLab.