
WAN Self Forcing T2V & VACE I2V ComfyUI (6 GB VRAM) - One Click Windows Installer

A new variant of the WAN Video Generation model, Self Forcing, is now available, enabling fast, high-quality video generation on consumer GPUs. This release includes a one-click installer and a complete workflow for both text-to-video and VACE image-to-video tasks.

 

What is the Self Forcing Model?

 

Self Forcing is an advanced autoregressive video diffusion model designed for real-time, streaming video generation. It simulates the inference process during training, using autoregressive rollout with KV caching. This approach eliminates the common mismatch between training and inference, resulting in smoother, more temporally consistent videos. The model generates 480p videos with an initial latency of about 0.8 seconds, then streams frames at around 10 FPS on a single RTX 4090 GPU. Compared to previous models, Self Forcing offers:

  • Significantly faster generation (150–400x lower latency than prior models)

  • Superior or comparable visual quality, with smoother motion and no over-saturation

  • Real-time, identity-consistent, and motion-smooth video synthesis, especially when using long, detailed prompts
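The autoregressive rollout with KV caching described above can be sketched as a toy loop. This is purely illustrative and not the Self Forcing codebase; the function name `denoise_chunk` and its signature are hypothetical stand-ins for the model's real denoising step:

```python
# Toy sketch of autoregressive video rollout with a KV cache.
# All names here are illustrative assumptions, not the project's API.

def rollout(num_chunks, denoise_chunk, init_noise):
    """Generate video chunks one at a time, conditioning each chunk
    on cached key/value states from all previously generated chunks."""
    kv_cache = []  # grows as generation proceeds
    frames = []
    for i in range(num_chunks):
        # Each new chunk attends to the cache instead of re-encoding
        # earlier frames from scratch, which keeps per-chunk latency low.
        chunk, kv = denoise_chunk(init_noise, i, kv_cache)
        kv_cache.append(kv)
        frames.extend(chunk)
    return frames
```

Because the same rollout runs during training, the model never sees a distribution of past frames at inference time that it was not trained on, which is the source of the temporal-consistency gains.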

 

Model Details:

  • Model size: 1.3B parameters

  • Output: High-quality 480p videos

  • Speed: ~10 FPS on RTX 4090, faster on enterprise GPUs

  • Quality: Matches or exceeds state-of-the-art diffusion models; excels with long, descriptive prompts

 

System Requirements

  • Nvidia RTX 30XX, 40XX, or 50XX series GPU (fp16 and bf16 support required; GTX 10XX/20XX not tested)

  • CUDA-compatible GPU with at least 6GB VRAM

  • Windows operating system

  • Minimum 30GB free storage

 

What's Included in This Post

  • Portable ComfyUI Windows Installer: Pre-configured for Self Forcing WAN Video Generation.

  • Custom Workflow: Supports both text-to-video and VACE image-to-video generation.

  • Automatic Node and Model Download: All required custom nodes and models are downloaded and installed automatically.

 

Preloaded Models

The following models are included and will be automatically downloaded:

  • Wan2.1-T2V-1.3B-Self-Forcing-DMD-VACE-FP16.safetensors (ComfyUI\models\diffusion_models)

  • Wan2.1-T2V-1.3B-Self-Forcing-DMD-VACE-FP8_e4m3fn.safetensors (ComfyUI\models\diffusion_models)

  • umt5-xxl-encoder-Q5_K_M.gguf (ComfyUI\models\clip)

  • wan_2.1_vae.safetensors (ComfyUI\models\vae)

  • self_forcing_dmd.pt (ComfyUI\models\diffusion_models)
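If the automatic download fails or you want to confirm the installer finished, a short script can check that every expected file landed in the right folder. The filenames come from the list above; the base path is an assumption for a standard portable ComfyUI layout, so adjust it to your install:

```python
import os

# Expected model files, relative to the ComfyUI folder.
# Filenames are taken from the preloaded-models list above.
EXPECTED = {
    os.path.join("models", "diffusion_models"): [
        "Wan2.1-T2V-1.3B-Self-Forcing-DMD-VACE-FP16.safetensors",
        "Wan2.1-T2V-1.3B-Self-Forcing-DMD-VACE-FP8_e4m3fn.safetensors",
        "self_forcing_dmd.pt",
    ],
    os.path.join("models", "clip"): ["umt5-xxl-encoder-Q5_K_M.gguf"],
    os.path.join("models", "vae"): ["wan_2.1_vae.safetensors"],
}

def missing_models(comfy_root):
    """Return the expected model files that are not present under comfy_root."""
    missing = []
    for folder, names in EXPECTED.items():
        for name in names:
            if not os.path.isfile(os.path.join(comfy_root, folder, name)):
                missing.append(os.path.join(folder, name))
    return missing
```

Running `missing_models("ComfyUI")` from the installer's root folder returns an empty list when everything is in place, or the relative paths of any files that still need to be downloaded.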

 

Usage Notes

  • The model performs best with long, detailed prompts, as it was specifically trained on such data.

  • Both text-to-video and image-to-video workflows are supported.

 

Support and More Information

  • Project GitHub: For technical details, updates, and documentation, visit the project’s GitHub repository.

  • Community Support: For troubleshooting or to connect with other users, join the Discord server.

  • Buy On Patreon

    While I improve the store, you can purchase these items or sign up for a membership on Patreon: https://www.patreon.com/TheLocalLab
