Discover TTS WebUI Extensions
Enhance your Text-to-Speech WebUI with powerful extensions from the community
75 extensions found
Featured Extensions
Vall-E-X
text-to-speech
Multilingual text-to-speech model supporting English, Chinese, and Japanese
by rsxdalv
StyleTTS2
text-to-speech
StyleTTS2 is a text-to-speech model that generates high-quality speech with controllable style
by rsxdalv
Seamless M4T
text-to-speech
SeamlessM4T is a multilingual and multimodal translation model supporting text and speech
by rsxdalv
Extensions List
Extensions List shows the list of interface extensions in the web UI
by rsxdalv
Decorator Extensions List
Decorator Extensions List shows the list of decorator extensions in the web UI
by rsxdalv
Gradio Settings
Gradio Settings allows configuring Gradio interface options from the web UI
by rsxdalv
GPU Info
Display GPU information such as VRAM, CUDA version, and more.
by rsxdalv
Installed Packages
Pip List shows the list of installed packages in the web UI
by rsxdalv
Model Location Settings
Model Location Settings allows changing the location of the model cache directories used by Hugging Face and Torch Hub.
by rsxdalv
External Extensions Installer
Add external extension entries via JSON and install them without restarts.
by rsxdalv
Log Viewer
View, search, and manage log files from the TTS Generation WebUI. Browse installation logs, filter by keywords, and clean up old logs.
by rsxdalv
Vall-E-X
Multilingual text-to-speech model supporting English, Chinese, and Japanese
by rsxdalv
StyleTTS2
StyleTTS2 is a text-to-speech model that generates high-quality speech with controllable style
by rsxdalv
Seamless M4T
SeamlessM4T is a multilingual and multimodal translation model supporting text and speech
by rsxdalv
MMS
MMS (Massively Multilingual Speech) is a text-to-speech model supporting over 1000 languages
by rsxdalv
Tortoise TTS
Tortoise TTS is a high-quality text-to-speech model with voice cloning capabilities
by rsxdalv
F5-TTS
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching.
by rsxdalv
Chatterbox
Chatterbox, Resemble AI's first production-grade open source TTS model
by rsxdalv
Kokoro
Kokoro: A small, fast, and high-quality TTS model
by rsxdalv
Bark
Bark: A text-to-speech model
by rsxdalv
XTTS
XTTS-Simple is a Gradio UI for XTTSv2
by rsxdalv
Parler-TTS
Parler-TTS is a training and inference library for high-fidelity text-to-speech (TTS) models.
by rsxdalv
CosyVoice [Unstable]
CosyVoice: High-quality text-to-speech synthesis.
by rsxdalv
MARS5
MARS5: A novel speech model for insane prosody
by rsxdalv
DIA
DIA: A text-to-dialogue model
by rsxdalv
GPT-SoVITS [low compatibility]
GPT-SoVITS: A TTS solution powered by GPT and SoftVC VITS Singing Voice Conversion.
by rsxdalv
Maha TTS
Maha TTS allows generating speech from text using the MahaTTS model.
by rsxdalv
OpenVoice
OpenVoice: A versatile instant voice cloning approach
by rsxdalv
OpenVoice V2
OpenVoice: A versatile instant voice cloning approach
by rsxdalv
Piper TTS
Piper TTS is a text-to-speech model
by rsxdalv
Higgs V2 (Early Access)
Higgs V2
by rsxdalv
VibeVoice (Early Access)
A template extension for TTS Generation WebUI
by rsxdalv
Kitten TTS
A template extension for TTS Generation WebUI
by rsxdalv
Index TTS (Beta)
A template extension for TTS Generation WebUI
by rsxdalv
VoxCPM (Beta)
A template extension for TTS Generation WebUI
by rsxdalv
FireRedTTS2 (Beta)
A template extension for TTS Generation WebUI
by rsxdalv
MegaTTS3 (Alpha)
A template extension for TTS Generation WebUI
by rsxdalv
ACE-Step
ACE-Step: A Step Towards Music Generation Foundation Model
by rsxdalv
Stable Audio
Stable Audio is a text-to-audio model for generating high-quality music and sound effects
by rsxdalv
Audiocraft
Audiocraft provides MusicGen and MAGNeT models for high-quality music and audio generation
by rsxdalv
AudioCraft Plus
AudioCraft Plus is an all-in-one WebUI for the original AudioCraft, adding many quality features on top.
by rsxdalv
Riffusion
Riffusion allows generating music from text.
by rsxdalv
MusicGen (Mac)
MusicGen allows generating music from text
by rsxdalv
Songbloom (Beta)
A template extension for TTS Generation WebUI
by rsxdalv
Vocos
Vocos is a neural audio codec for high-quality audio compression and reconstruction
by rsxdalv
RVC
RVC: Retrieval-based Voice Conversion
by rsxdalv
Demucs
Demucs is a music source separation model that can separate drums, bass, vocals, and other instruments
by rsxdalv
Audio Separator
Audio Separator allows splitting an audio file into separate stems, each saved as its own audio file.
by rsxdalv
Resemble Enhance
Resemble Enhance allows enhancing audio files.
by rsxdalv
AP-BWE Bandwidth Extension
AP-BWE: An audio bandwidth extension solution using Amplitude-Phase Bandwidth Extension models.
by rsxdalv
PyRNNoise
A template extension for TTS Generation WebUI
by rsxdalv
History
Outputs Tab for TTS WebUI
by rsxdalv
Gallery History
Gallery History allows selecting previously generated audio files by looking at their waveforms
by rsxdalv
FFMPEG Metadata
FFMPEG Metadata allows loading metadata from audio files.
by rsxdalv
OpenAI TTS API
OpenAI-compatible TTS API with support for multiple TTS models
by rsxdalv
Whisper
Whisper allows transcribing audio files.
by rsxdalv
XTTS Fine-tuning Demo
XTTS fine-tuning demo
by rsxdalv
Huggingface Cache Manager
Huggingface Cache Manager allows managing the Hugging Face cache.
by rsxdalv
Model Downloader
Model Downloader allows downloading models from the Hugging Face model hub.
by rsxdalv
Simple Remixer
Simple Remixer allows concatenating multiple audio files and mixing them together.
by rsxdalv
Conda Storage Optimizer
Conda Storage Optimizer allows cleaning up conda storage to free disk space.
by rsxdalv
RVC Training (Not available yet)
RVC Training
by rsxdalv
Bark Voice Clone
Bark Voice Clone allows cloning voices for use with Bark TTS
by rsxdalv
Ebook2Audiobook (Not available yet)
Ebook2Audiobook allows converting ebooks to audiobooks
by rsxdalv
EPub2TTS (Not available yet)
EPub2TTS allows converting ebooks to audiobooks
by rsxdalv
Audiobook Generator (Not available yet)
Audiobook Generator allows converting ebooks to audiobooks
by rsxdalv
CUDA Toolkit
CUDA Toolkit
by rsxdalv
Kimi Audio
Kimi Audio is a powerful text-to-speech and speech-to-text model by Moonshot AI
by rsxdalv
MiMo-Audio
A template extension for TTS Generation WebUI
by rsxdalv
YouTube Tutorials
YouTube Tutorials shows a list of YouTube tutorials in the web UI
by rsxdalv
Parakeet
Speech transcription via the NVIDIA Parakeet model
by mefi
Bark Legacy
This is the legacy UI of Bark from TTS-WebUI
by rsxdalv
PyVideoTrans TTS API
PyVideoTrans text-to-speech API with WebUI integration.
by rsxdalv
SRT Tools
Import and parse multiple SRT files into JSON segments for later TTS batching.
by rsxdalv
Pip Install UI
Install and uninstall Python packages from the web UI. Disable when not in use for security.
by rsxdalv
Save Ogg
Decorator Save Ogg
by rsxdalv
Save Waveform
Decorator Save Waveform
by rsxdalv
Average Time
Decorator Average Execution Time
by rsxdalv