I have created an AI audiobook maker and narrator using various AI voice tools that I’ve explored on my channel. In this article, I’ll provide a detailed guide on installation and share a quick demo to get you started.
I have the AI audiobook maker open with a processed audiobook loaded. One useful feature is audio regeneration for individual sentences.
For instance, the sentence containing "penniless" did not sound right initially, but regenerating its audio corrected it.
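Hardware: an Nvidia graphics card, preferably from the 10 series upward, with at least 6GB of VRAM.
Software: CUDA, Python 3.10, Git, VS Code, Tortoise TTS, PyTorch, and ffmpeg.
First, ensure the software listed above is installed.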
Clone the audiobook maker from GitHub:
git clone https://github.com/YourGitHub/audiobook_maker.git
cd audiobook_maker
Set up a virtual environment:
python -m venv venv
.\venv\Scripts\activate
Install dependencies like PyTorch:
pip install torch torchvision torchaudio
pip install -r requirements.txt
pip install --index-url https://pypi.org/simple/ rvc
pip install ffmpeg-python
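Once the dependencies are installed, it can help to confirm that PyTorch actually sees the Nvidia GPU before going further. The snippet below is a minimal sketch, not part of the audiobook maker itself: the function name `describe_gpu` is made up for illustration, and the 6 GB threshold is simply the VRAM guideline from this article.

```python
import importlib.util


def describe_gpu(min_vram_gb: float = 6.0) -> str:
    """Return a one-line summary of whether PyTorch can use a CUDA GPU.

    `describe_gpu` is a hypothetical helper for this article, not a
    function from the audiobook maker or from PyTorch.
    """
    # Check for PyTorch without importing it, so a missing install
    # produces a readable message instead of a traceback.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."

    import torch  # imported lazily, only once we know it is present

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible."

    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    verdict = "meets" if vram_gb >= min_vram_gb else "is below"
    return (
        f"{props.name}: {vram_gb:.1f} GB VRAM, "
        f"which {verdict} the {min_vram_gb:g} GB guideline."
    )


if __name__ == "__main__":
    print(describe_gpu())
```

Run it inside the activated virtual environment. If it reports that no CUDA device is visible, the installed PyTorch build is probably CPU-only or the CUDA setup needs another look.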
Ensure additional files are in place:
rmvpe.pt and hubert_base.pt in the parent directory
azas.pth voice model in the voice_models directory
Download the ffmpeg full build and place ffmpeg.exe and ffprobe.exe in the parent directory.
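A missing model or ffmpeg binary is an easy thing to overlook, so a quick check of the files listed above can save a failed first run. This is a rough sketch under two assumptions: that "parent directory" means the folder you pass in as `root`, and that any `.pth` file in `voice_models` counts as a voice model. The helper name `find_missing` is hypothetical.

```python
from pathlib import Path

# File names taken from the list above. Where exactly the "parent directory"
# is depends on your setup, so it is passed in as `root` (an assumption).
REQUIRED_FILES = ["rmvpe.pt", "hubert_base.pt", "ffmpeg.exe", "ffprobe.exe"]


def find_missing(root: Path) -> list[str]:
    """Return the expected files that are not present under `root`."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]

    # Assumption: at least one .pth voice model should sit in voice_models.
    models_dir = root / "voice_models"
    has_model = models_dir.is_dir() and any(models_dir.glob("*.pth"))
    if not has_model:
        missing.append("voice_models/*.pth")
    return missing


if __name__ == "__main__":
    problems = find_missing(Path("."))
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All expected files are in place.")
```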
Open the project in VS Code and change the interpreter to the virtual environment. Restart the language server:
Ctrl + Shift + P -> Python: Restart Language Server
Edit torts.yaml:
voice: Mel
samples: 4
iterations: 32
temperature: 0.8
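A typo in these settings can be hard to spot, so here is a small sanity check for the four values shown above. It is only a sketch: it reads flat `key: value` lines with plain string handling rather than a real YAML parser, and the validity rules (positive integers, positive temperature) are my assumptions, not limits documented by Tortoise. Both function names are hypothetical.

```python
EXAMPLE = """\
voice: Mel
samples: 4
iterations: 32
temperature: 0.8
"""


def parse_flat_settings(text: str) -> dict:
    """Parse flat `key: value` lines, converting numeric values."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, raw = line.partition(":")
        raw = raw.strip()
        try:
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                value = raw  # keep non-numeric values as strings
        settings[key.strip()] = value
    return settings


def check_settings(settings: dict) -> list[str]:
    """Return a list of problems; the rules here are assumptions."""
    problems = []
    voice = settings.get("voice")
    if not isinstance(voice, str) or not voice:
        problems.append("voice should be a non-empty name")
    for key in ("samples", "iterations"):
        value = settings.get(key)
        if not isinstance(value, int) or value < 1:
            problems.append(f"{key} should be a positive integer")
    temperature = settings.get("temperature")
    if not isinstance(temperature, (int, float)) or temperature <= 0:
        problems.append("temperature should be a positive number")
    return problems


if __name__ == "__main__":
    print(check_settings(parse_flat_settings(EXAMPLE)) or "Settings look fine.")
```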
Launch Tortoise:
start.bat
Start the audiobook app:
python audiobook_app_2.0.py
Create a text file with your content and select it in the GUI. From there you can generate the audiobook and regenerate the audio for any sentence that does not sound right.
The app also handles interrupted sessions: you can load an existing audiobook and continue generation from where it stopped.
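To give a feel for how resuming an interrupted session can work, here is a simplified sketch. It is not the app's actual code: it assumes one audio file per sentence, uses a made-up naming scheme (`sentence_0000.wav`), and splits sentences with a naive regular expression. The idea is simply that any sentence without an audio file still needs to be generated.

```python
import re
from pathlib import Path


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def pending_sentences(text: str, out_dir: Path) -> list[tuple[int, str]]:
    """Return (index, sentence) pairs that have no audio file yet.

    The file naming scheme is hypothetical and only used for illustration.
    """
    todo = []
    for index, sentence in enumerate(split_sentences(text)):
        audio_path = out_dir / f"sentence_{index:04d}.wav"
        if not audio_path.exists():
            todo.append((index, sentence))
    return todo


if __name__ == "__main__":
    book = "He was penniless. She was not! Why?"
    for index, sentence in pending_sentences(book, Path("output")):
        print(f"{index}: {sentence}")
```

Under the same assumptions, regenerating a single sentence amounts to deleting its audio file so that it shows up as pending again.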
If you encounter any issues, reach out, and I’ll try to assist as much as possible.
Q1: What hardware do I need? A: An Nvidia graphics card, preferably from the 10 series upward with at least 6GB of VRAM.
Q2: Can I use this tool on AMD or Mac? A: Currently, it’s limited to Nvidia GPUs, but plans are underway to support AMD and Mac in the future.
Q3: What software do I need to install? A: You need to install CUDA, Python 3.10, Git, VS Code, Tortoise TTS, PyTorch, and ffmpeg.
Q4: Where can I download the necessary files and dependencies? A: Links to download all required software and files are provided in the detailed guide above.
Q5: Can I regenerate specific sentences in the audiobook? A: Yes, the tool allows for regenerating audio for specific sentences if the initial generation is not satisfactory.
Q6: How do I handle interrupted audiobook generation sessions? A: You can load the existing audiobook and continue generation from where it left off using the application’s built-in functionality.