F5 TTS is a powerful open-source text-to-speech system that can clone almost any voice from just 10 seconds of audio. Running it inside ComfyUI's node-based workflow system makes it accessible without any coding — and it works on systems with as little as 6 GB VRAM. Think of it as a completely free, local alternative to ElevenLabs voice cloning.
What F5 TTS Can Do
Two Workflow Options
Basic workflow:
- Record voice via microphone directly
- Manual transcription required
- Available from the ComfyUI web viewer node repository
Patreon premium workflow:
- Upload any audio file (≤15 sec)
- Auto-transcription via Whisper Small
- Text file input for long scripts
- Ollama + Gemini API text generation nodes
- One-click Windows installer included
Manual Setup — Basic Workflow
This path sets up the basic voice recording workflow from scratch. You'll need a microphone for audio input.
Install ComfyUI (Portable Windows)
- Download the ComfyUI Portable ZIP from the ComfyUI releases page. Extract it with 7-Zip.
- Navigate into the custom_nodes folder, click the address bar, type cmd, and press Enter.
- Run:
- Navigate back to the main ComfyUI folder (where the Python embedded folder is) and run the dependency install command from the written guide linked in the video description.
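Before running the install commands, it can help to confirm that the folders these steps refer to actually exist. The sketch below assumes a typical portable layout: a `ComfyUI` folder containing `custom_nodes`, sitting next to an embedded Python folder named `python_embeded`. Both folder names and the example path are assumptions about your install, so adjust them if your extracted folder looks different.

```python
from pathlib import Path


def check_portable_layout(root, python_dir="python_embeded"):
    """Report whether the folders used in the setup steps are present.

    `root` is the extracted portable folder. The sub-folder names are
    assumptions about a typical portable install, not guarantees.
    """
    root = Path(root)
    expected = {
        "custom_nodes": root / "ComfyUI" / "custom_nodes",
        "embedded_python": root / python_dir,
    }
    return {name: path.is_dir() for name, path in expected.items()}


if __name__ == "__main__":
    # Hypothetical path; replace with wherever you extracted the ZIP.
    layout = check_portable_layout("ComfyUI_windows_portable")
    for name, found in layout.items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported as missing, the command prompt is probably open in the wrong folder.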
Load the Workflow
- Launch ComfyUI. Download the basic F5 TTS workflow file (link in video description — also available in the ComfyUI web viewer node repository's workflows folder).
- Drag the workflow JSON into ComfyUI. Red nodes will appear — this is normal.
- Open Manager → Install Missing Nodes. Install each missing node one by one, then restart ComfyUI.
- After restart, the workflow is ready. Use the audio record node to record your voice sample, then enter the text you want the cloned voice to speak.
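If you want to see which node types a workflow file uses before loading it, a small script can list them. This is a rough sketch that assumes the workflow was saved in the UI export format, where a top-level `nodes` list holds objects with a `type` field. Other export formats are not handled, and the file name in the usage note is hypothetical.

```python
import json
from collections import Counter


def node_types(workflow):
    """Count node types in a workflow dict.

    Assumes the UI export format: a top-level "nodes" list whose
    entries carry a "type" field. Entries without one are skipped.
    """
    nodes = workflow.get("nodes", [])
    return Counter(node["type"] for node in nodes if "type" in node)


def load_workflow(path):
    """Load a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `node_types(load_workflow("f5_tts_basic.json"))` returns a counter of the node types in that file. Any type you don't recognize is a likely candidate for a red node.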
Patreon Premium Workflow Setup (One-Click)
- Download and double-click the F5TTS_ComfyUI.bat file from the Patreon page.
- Once installation completes, launch ComfyUI and load the enhanced workflow file.
- Sections of nodes are disabled (bypassed) by default. To enable a section: hold Ctrl, select the nodes in that section, right-click, and choose Bypass. Bypass is a toggle, so choosing it on bypassed nodes turns them back on.
Using the Enhanced Workflow
The premium workflow has four main sections: the basic recording feature plus three sections that add capabilities on top of it.
Section 1 — Microphone Recording
Record your voice sample directly in ComfyUI. Speak clearly and record in a quiet environment for best results.
Section 2 — Audio File Upload
Upload a pre-recorded audio file as your voice source. Keep clips to 15 seconds or less — the Whisper Small model automatically transcribes it, so you don't need to manually type what was said.
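To check a clip against the 15-second limit before uploading it, you can measure its length with Python's standard `wave` module. This is a minimal sketch that assumes the clip is an uncompressed WAV file. MP3 and other formats would need a different library. The function names are made up for this example.

```python
import wave

MAX_REFERENCE_SECONDS = 15.0  # upper limit stated in this guide


def wav_duration_seconds(path):
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(str(path), "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def is_usable_reference(path, max_seconds=MAX_REFERENCE_SECONDS):
    """True if the clip is no longer than the allowed reference length."""
    return wav_duration_seconds(path) <= max_seconds
```

If `is_usable_reference` returns `False`, trim the clip in an audio editor before uploading.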
Section 3 — Text File Input
Instead of typing text directly, upload a .txt file and its content becomes the script your cloned voice will speak. Useful for longer content or pre-prepared scripts.
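This guide doesn't describe how the text file node handles line breaks or stray whitespace, so it may be worth tidying a script before uploading it. The sketch below is an optional pre-processing step, not part of the workflow itself. It joins wrapped lines into a single string with single spaces.

```python
from pathlib import Path


def load_script(path):
    """Read a UTF-8 text file and collapse it into one clean script string."""
    raw = Path(path).read_text(encoding="utf-8")
    # split() with no arguments breaks on any run of whitespace,
    # so wrapped lines, tabs, and blank lines all become single spaces.
    return " ".join(raw.split())
```

You could write the result back to a new .txt file and upload that instead of the original.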
Section 4 — AI-Generated Text (Ollama + Gemini)
Generate the script text using an AI model:
- Ollama: If Ollama is installed and running locally, the workflow auto-detects your downloaded models in a dropdown. Select a model and write a prompt.
- Gemini API: Create an API key at Google AI Studio, open the config file in the Ollama/Gemini custom nodes folder, and paste the key between the quotation marks. Enable the Gemini node in the workflow, select your model, and enter your prompt.
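If the Ollama dropdown comes up empty, you can check what Ollama itself reports. The sketch below queries Ollama's local HTTP API for the list of downloaded models. It assumes Ollama is running at its default address, and it is not necessarily how the custom node performs its own detection.

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def parse_model_names(payload):
    """Extract model names from an /api/tags style response."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Ask a running Ollama instance which models are downloaded."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return parse_model_names(json.load(response))
```

Calling `list_local_models()` raises a connection error if Ollama isn't running, which is a quick way to rule that out.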
Running a Generation
- Choose your audio input method (record, upload, or use an existing sample).
- Type (or load) the text you want the cloned voice to speak.
- Click the Queue button. Generation typically completes in just a few seconds.
- Listen directly in ComfyUI using the Open Web Viewer button, or find the output file at ComfyUI/output/audio/.
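After several generations the output folder fills up quickly. As a convenience, the sketch below returns the most recently modified file in it. The default path mirrors the one given above and is relative to wherever you run the script, so treat it as an assumption and pass your own path if needed.

```python
from pathlib import Path


def latest_output(output_dir="ComfyUI/output/audio"):
    """Return the most recently modified file in the folder, or None."""
    folder = Path(output_dir)
    if not folder.is_dir():
        return None
    files = [p for p in folder.iterdir() if p.is_file()]
    # default=None covers the case of an empty folder.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```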
Tips for Best Clone Quality
- Use 10–15 seconds of clean, clear audio. One speaker, no background noise, no music.
- Record multiple tones — the same voice recorded in calm, excited, and conversational tones can be combined for more expressive outputs.
- Match speaking pace — try to include natural speech rhythm in your reference clip, not just flat reading.
- Short clips clone faster and better than long recordings — 10–15 seconds is the sweet spot.