
Spark TTS LLM-Based Text-to-Speech - Local Windows One Click Installer

This is a one-click Windows installer for the open-source Spark TTS project. Spark TTS is a text-to-speech system that uses a large language model to produce accurate, natural-sounding speech. The installer downloads and sets up the project locally, including the pretrained 0.5B Spark TTS model in the model folder. Future updates will include simple methods to fine-tune and train your own Spark models to run locally. Stay tuned!

 

Key Features

  • Efficient and Simple: Built on Qwen2.5, Spark-TTS directly reconstructs audio from LLM-predicted codes, eliminating the need for extra generation models and streamlining the process.

  • High-Quality Voice Cloning: Supports zero-shot voice cloning, which replicates a voice without any training data specific to that speaker. This makes it well suited to cross-lingual and code-switching scenarios.

  • Bilingual Support: Works seamlessly with both Chinese and English, maintaining high naturalness and accuracy across languages.

  • Controllable Speech: Customize virtual speakers by adjusting gender, pitch, and speaking rate.

 

Recommended System Requirements

  • CUDA-compatible GPU with at least 6GB VRAM

  • Windows operating system

  • At least 25GB of free storage
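
The storage and VRAM requirements above can be checked before running the installer. The following is a minimal Python sketch, not part of the installer itself. It assumes Python is already installed and that the NVIDIA driver's nvidia-smi tool is on the PATH. The function names and thresholds are illustrative and taken only from the numbers listed above.

```python
import shutil
import subprocess

MIN_FREE_GB = 25          # storage requirement listed above
MIN_VRAM_MB = 6 * 1024    # 6 GB VRAM requirement listed above, in MiB


def free_disk_gb(path="."):
    """Return the free space, in GiB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def parse_vram_mb(nvidia_smi_output):
    """Parse one integer (MiB) per line, skipping anything non-numeric."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def gpu_vram_mb():
    """Return total VRAM per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mb(result.stdout)


def meets_requirements(free_gb, vram_list_mb):
    """True if there is enough disk space and at least one large-enough GPU."""
    has_gpu = any(v >= MIN_VRAM_MB for v in vram_list_mb)
    return free_gb >= MIN_FREE_GB and has_gpu


if __name__ == "__main__":
    free_gb = free_disk_gb(".")
    vram = gpu_vram_mb()
    print(f"Free disk space: {free_gb:.1f} GiB")
    print(f"GPU VRAM (MiB): {vram if vram else 'no NVIDIA GPU detected'}")
    print("Requirements met" if meets_requirements(free_gb, vram)
          else "Requirements not met")
```

Run the script from the folder where you plan to place the batch file, so that the disk check applies to the correct drive.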

 

Installation Instructions

  • Download the batch file below and place it in a dedicated folder.

  • Install FFmpeg if it is not already installed (https://www.ffmpeg.org/download.html). Reboot if necessary.

  • Run the batch file to start installation.

  • Once installation finishes, launch the project by running start.bat.
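
To confirm the FFmpeg step before running the batch file, a short check like the one below may help. This is a hedged sketch rather than anything shipped with the installer. It assumes Python is available and that FFmpeg, if installed, is reachable through the PATH. The helper names are made up for illustration.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_status(path):
    """Build a human-readable status message for a given ffmpeg path."""
    if path is None:
        return "FFmpeg not found on PATH. Install it, then reopen the terminal."
    # `ffmpeg -version` prints version details; the first line is the summary.
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else "version unknown"
    return f"FFmpeg found at {path} ({first_line})"


if __name__ == "__main__":
    print(ffmpeg_status(find_ffmpeg()))
```

If the script reports that FFmpeg is missing right after you installed it, open a new terminal or reboot so the updated PATH is picked up.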

 

For more details, visit the Spark TTS GitHub page: https://github.com/SparkAudio/Spark-TTS

 

Join our Discord to share feedback or get help with setup: https://discord.gg/5hmB4N4JFc

  • Buy On Patreon

    While I improve the store, you can purchase these items or sign up for a membership on Patreon: https://www.patreon.com/TheLocalLab

Price: $3.00