Create Talking AI Avatar for FREE - Step-by-step guide (Spoiler: No D-ID studio)
Welcome to the Prompt Engineering channel! Learn how to create digital avatars using open-source and completely free tools. In this video, I'll teach you how to create your own digital avatars without spending a penny on expensive software like D-ID or any other paid tools. We'll walk through a four-step process that is straightforward and easy to follow. We will use MidJourney or a free alternative to create an image, convert it to a 3D animation, and use Eleven Labs for text-to-speech. Finally, we will combine everything to create a talking avatar using an awesome open-source and free tool. Plus, there is a bonus step at the end, so don't miss it. Let's get started!
Step 1: Create an Image for Your Digital Avatar
To begin, you need an image for your digital avatar. I am using MidJourney, but you can use any AI image generator or even your own images. The prompt that I am using is in the video description. Once the images are generated, select one where the mouth is open and the teeth are showing; this is crucial for the animation in the later steps.
- Steps:
- Right-click on the chosen image.
- Select "Save Image As" and store it for the next steps.
Step 2: Add Some Motion to Your Image
To add motion, use the LeiaPix Converter. This will convert your 2D image into a short 3D animation video. Upload your saved image to this tool and choose the animation style you like, such as "Vertical" or "Tall Circle".
- Steps:
- Click "Upload" and select your saved image.
- Set the animation duration to the maximum (six seconds).
- Click "Share" and save the resulting video as output_video.mp4.
Step 3: Generate Audio
Use Eleven Labs to generate the text-to-speech audio for the digital avatar. After logging in, type in your text and choose a voice that you like. Generate the speech and download the audio as input_audio.wav.
- Steps:
- Log in to Eleven Labs and enter your text.
- Select a voice and settings, then click "Generate".
- Save the audio as input_audio.wav.
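Before moving on, it can help to check how long the generated audio is, since the Wav2Lip interactive demo is limited to 20 seconds (see the FAQs). The following is a minimal sketch using only Python's standard library. It assumes input_audio.wav is an uncompressed PCM WAV file; the function names are hypothetical helpers, not part of any tool in this guide.

```python
import os
import wave

DEMO_MAX_SECONDS = 20.0  # demo limit quoted in the FAQs of this guide


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def fits_demo(path, limit=DEMO_MAX_SECONDS):
    """True if the audio is short enough for the interactive demo."""
    return wav_duration_seconds(path) <= limit


if __name__ == "__main__":
    audio_path = "input_audio.wav"  # file name used in this guide
    if os.path.exists(audio_path):
        seconds = wav_duration_seconds(audio_path)
        print(f"{audio_path}: {seconds:.1f} s, fits demo: {fits_demo(audio_path)}")
```

If the file is not really a PCM WAV (for example, an MP3 saved with a .wav extension), `wave.open` raises `wave.Error`, which is a sign the audio should be converted first.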
Step 4: Combine Everything Together
We will now combine the audio and the video. The free tool we will use is Wav2Lip, available on GitHub. It syncs the lips in the video to the audio.
- Steps:
- Download Wav2Lip from GitHub and run it locally, or use their interactive demo for shorter videos.
- Upload your output_video.mp4 and input_audio.wav files and sync them.
- Download the resulting video, in which the lips are synced with the audio.
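For the local route, the sync step comes down to one call to the repository's inference script. The sketch below only builds and prints that command. It assumes you run it from a Wav2Lip checkout that has a pretrained checkpoint saved as checkpoints/wav2lip_gan.pth; the checkpoint path, the output path, and the helper name are assumptions to adjust to your setup.

```python
import shlex


def build_wav2lip_command(
    face_video,
    audio,
    checkpoint="checkpoints/wav2lip_gan.pth",  # assumed checkpoint location
    outfile="results/result_voice.mp4",        # assumed output location
):
    """Build the argument list for Wav2Lip's inference.py script."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,  # pretrained model weights
        "--face", face_video,             # video whose lips will be re-synced
        "--audio", audio,                 # speech to sync the lips to
        "--outfile", outfile,             # where the synced video is written
    ]


cmd = build_wav2lip_command("output_video.mp4", "input_audio.wav")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal inside the Wav2Lip folder, or the list can be passed to `subprocess.run(cmd, check=True)`.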
Bonus Step: Improve the Resolution
Finally, for those looking for enhanced resolution, run the final video through the Thin Plate Spline model, which helps improve the clarity of facial animations.
- Steps:
- Use your original image and the final video created in Step 4.
- Submit both to the Thin Plate Spline model.
- Download the improved video.
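If you run this step locally instead of through a hosted demo, it might look like the sketch below. It assumes the model is the open-source Thin-Plate-Spline-Motion-Model repository and that its demo.py script takes the original image as the source and the Step 4 video as the driving video. The flag names, config path, checkpoint path, and file names here are assumptions that should be checked against the repository's README before use.

```python
import shlex


def build_tps_command(
    source_image,
    driving_video,
    result_video="improved_video.mp4",   # hypothetical output name
    config="config/vox-256.yaml",        # assumed face-model config
    checkpoint="checkpoints/vox.pth.tar",  # assumed pretrained weights
):
    """Build the argument list for the repository's demo.py script."""
    return [
        "python", "demo.py",
        "--config", config,
        "--checkpoint", checkpoint,
        "--source_image", source_image,    # original avatar image from Step 1
        "--driving_video", driving_video,  # lip-synced video from Step 4
        "--result_video", result_video,
    ]


# "avatar.png" and "synced_video.mp4" are placeholder file names.
tps_cmd = build_tps_command("avatar.png", "synced_video.mp4")
print(shlex.join(tps_cmd))
```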
Conclusion
After following these steps, you will have created your very own talking AI avatar for free! If you found this guide helpful, please consider subscribing and turning on bell notifications to keep up with more awesome content.
Keywords
- Digital Avatar
- Open-source Tools
- MidJourney
- 3D Animation
- Eleven Labs
- Text-to-Speech
- Wav2Lip
- Thin Plate Spline Model
- AI Image Generator
- Free Tools
FAQs
Q: What tools do I need to create a talking AI avatar for free? A: You will need MidJourney (or any AI image generator), Eleven Labs for text-to-speech, and Wav2Lip for syncing the audio and video.
Q: Can I use my own images to create the AI avatar? A: Yes, you can use your own images. Just ensure the image has an open mouth with teeth showing for better animation results.
Q: What is the resolution limit for the interactive demo on Wav2Lip? A: The interactive demo is limited to 480p resolution and a maximum of 20 seconds. For higher resolutions and longer videos, you can run it locally.
Q: How can I improve the resolution of the final video? A: You can improve the resolution by running the final video through the Thin Plate Spline model, which enhances the clarity of facial animations.
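Q: Is there any cost involved in creating a talking AI avatar using these methods? A: No. The workflow can be completed without paying: Wav2Lip and the Thin Plate Spline model are open source, Eleven Labs has a free tier, and a free image generator or your own photo can replace MidJourney.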