OmniGen2 Multimodal Image Editing ComfyUI Workflow - Windows One Click Installer
OmniGen2 is now available and can run on low-VRAM devices! I’ve created a one-click Windows installer for ComfyUI, plus a workflow that uses the Detail Daemon node for extra detail.
What is OmniGen2?
OmniGen2 is a new open-source AI model that can both understand images and generate high-quality images from text prompts. It supports text-to-image generation, instruction-based image editing, and combining elements from multiple images. It is efficient enough to run on consumer GPUs.
Model Details
Text Generation Model Size: ~3 billion parameters
Image Generation Model Size: ~4 billion parameters
Speed: Under 1 minute on RTX 4090; even faster on enterprise GPUs
System Requirements
NVIDIA RTX 30XX, 40XX, or 50XX series GPU (FP16 support required; GTX 10XX and RTX 20XX not tested)
CUDA-compatible GPU with at least 6GB VRAM
Windows OS
At least 30GB free storage
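Before installing, the VRAM and storage requirements above can be checked with a short script. This is a minimal sketch, not part of the installer: the function names are hypothetical, it assumes `nvidia-smi` is on the PATH, and it assumes 6GB of VRAM is reported as 6144 MiB and that 30GB means 30 × 1000³ bytes.

```python
import shutil
import subprocess

# Thresholds taken from the requirements list; the exact unit
# conversions are assumptions made for this sketch.
MIN_VRAM_MIB = 6 * 1024          # 6GB VRAM
MIN_FREE_BYTES = 30 * 1000**3    # 30GB free storage


def parse_vram_mib(nvidia_smi_output: str) -> int:
    """Return the largest memory.total value (MiB) found in nvidia-smi CSV output."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():  # skip blank or non-numeric lines such as "[N/A]"
            values.append(int(line))
    return max(values) if values else 0


def meets_requirements(vram_mib: int, free_bytes: int) -> bool:
    """True when both the VRAM and free-storage minimums are satisfied."""
    return vram_mib >= MIN_VRAM_MIB and free_bytes >= MIN_FREE_BYTES


def query_vram_mib() -> int:
    """Ask nvidia-smi for total GPU memory; return 0 if it cannot be queried."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return 0
    return parse_vram_mib(result.stdout)


if __name__ == "__main__":
    vram = query_vram_mib()
    free = shutil.disk_usage(".").free  # free space on the current drive
    print(f"VRAM: {vram} MiB, free storage: {free / 1000**3:.1f} GB")
    print("Requirements met" if meets_requirements(vram, free) else "Requirements not met")
```

Run it from the drive where you plan to install, since the free-space check looks at the current directory.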
What’s Included
Portable ComfyUI Windows Installer: Pre-configured for OmniGen2
Custom Workflow: Supports text-to-image generation and uses the Detail Daemon node for extra detail
Automatic Downloads: All required nodes and models install automatically
Preloaded Models
omnigen2_fp16.safetensors (ComfyUI\models\diffusion_models)
qwen_2.5_vl_fp16.safetensors (ComfyUI\models\text_encoders)
ae.safetensors (ComfyUI\models\vae)
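If a generation fails with a missing-model error, it can help to confirm that the three files listed above are where ComfyUI expects them. The following is a rough sketch: the `missing_models` helper is hypothetical, and the `ComfyUI` root path in the usage section is an assumption that should be replaced with your actual install folder.

```python
from pathlib import Path

# Subfolder of ComfyUI\models -> expected file, as listed above.
EXPECTED_MODELS = {
    "diffusion_models": "omnigen2_fp16.safetensors",
    "text_encoders": "qwen_2.5_vl_fp16.safetensors",
    "vae": "ae.safetensors",
}


def missing_models(comfyui_root):
    """Return the paths of expected model files that are not present."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for subfolder, filename in EXPECTED_MODELS.items():
        path = models_dir / subfolder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own install folder.
    for path in missing_models("ComfyUI"):
        print(f"Missing: {path}")
```

An empty result means all three files were found.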
Usage Notes
OmniGen2 works best with detailed, instruction-based prompts.
For best likeness, experiment with the Image CFG value (try 2.0–4.0; adjust as needed).
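One way to experiment with the Image CFG value is to step through the suggested range in even increments and compare the results side by side. The sketch below only generates the candidate values; the `cfg_sweep` helper and the 0.5 step size are assumptions for illustration, and each value still has to be entered into the workflow by hand.

```python
def cfg_sweep(low=2.0, high=4.0, step=0.5):
    """Return evenly spaced Image CFG values from low up to (at most) high."""
    if step <= 0 or high < low:
        raise ValueError("step must be positive and high must be >= low")
    # Small epsilon guards against floating-point error when step divides the range.
    count = int((high - low) / step + 1e-9) + 1
    return [round(low + i * step, 2) for i in range(count)]


if __name__ == "__main__":
    for value in cfg_sweep():
        print(f"Try Image CFG = {value}")
```

Keeping the seed and prompt fixed while changing only this value makes it easier to see its effect on likeness.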
Support and More Information
Project GitHub: For technical details, updates, and documentation, visit the project’s GitHub repository.
Community Support: For troubleshooting or to connect with other users, join the Discord server.
Buy On Patreon
While I improve the store, you can purchase these items or sign up for a membership on Patreon: https://www.patreon.com/TheLocalLab