    AUTOMATIC LIP-SYNCING

    Hey guys, it's me, Carrie Kaleidoscope Hoarder. Today, I'm super excited to share my latest project, an automated avatar lip-sync tool that takes my existing program a step further. The goal was to create a beautifully animated avatar that can lip-sync to given audio files automatically, and I think I've achieved just that. Let’s dive into how it all works.

    An Overview of the Original Project

    As a quick reminder, my last project automatically generated YouTube videos using annotated transcripts and an audio file as inputs. The program would then generate all the visuals. The test was simple: I’d talk about dogs, and the video would show pictures of dogs.

    New Goals

    This time, I wanted to extend the project in three main areas:

    1. Aesthetic Improvements - Creating a new, animated avatar.
    2. Functional Enhancements - Implementing automated lip-syncing and expression changes.
    3. Legal Compliance - No more unlicensed Google Images.

    The Animated Avatar

    Creating a fully animated avatar from scratch can take weeks, if not months. However, I wanted to produce a ten-minute animation within a much shorter time frame. Here’s the step-by-step process I used:

    • Drawing the Avatar: I created four different emotions, each with five poses. Each pose has three levels of blinking, which gives 4 × 5 × 3 = 60 combinations.
    • Lip-Sync: I reused 22 mouth assets from my show "Battle for Dream Island". These cover the vowel and consonant sounds required for speech.
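
    To make the numbers concrete: four emotions, five poses, and three blink levels multiply out to 60 avatar images. The Python sketch below shows one way such a set could be enumerated and looked up. The emotion names and the file-naming scheme are placeholders invented for illustration, not the ones used in the project.

```python
from itertools import product

# Placeholder names: the post only says there are four emotions.
EMOTIONS = ["neutral", "happy", "sad", "angry"]
POSES = range(5)    # five poses per emotion
BLINKS = range(3)   # three blink levels (assumed: 0 = open, 2 = closed)


def avatar_filename(emotion, pose, blink):
    """Return a hypothetical image filename for one avatar state."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    if pose not in POSES or blink not in BLINKS:
        raise ValueError("pose or blink level out of range")
    return f"{emotion}_pose{pose}_blink{blink}.png"


# Every combination of emotion, pose, and blink level.
ALL_IMAGES = [avatar_filename(e, p, b)
              for e, p, b in product(EMOTIONS, POSES, BLINKS)]
```

    With this layout, `len(ALL_IMAGES)` is 60, and picking the image for a frame is a single lookup once the current emotion, pose, and blink level are known.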

    Functional Enhancements

    The core enhancement here was to overhaul how the program synchronizes audio and visual elements. Here's the logic:

    1. Word and Phoneme Timestamps: Previously, the program used a tool called Gentle to get a timestamp for each word. Now it also captures a timestamp for each phoneme.
    2. Creating Timetables: I created five timetables, one each for phoneme timestamps, pose changes, emotion changes, topic discussions, and paragraph transitions.
    3. Frame Drawing: The system uses these timetables to draw the right mouth shape, pose, and expression for each frame. A ten-minute video has about 18,000 frames, which works out to 30 frames per second.
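
    A rough sketch of how per-frame lookup against timetables might work is shown below. It is not the project's actual code: the `Timetable` class, the `frame_states` helper, and the 30 frames-per-second rate (inferred from 18,000 frames over ten minutes) are all assumptions made for illustration.

```python
from bisect import bisect_right

FPS = 30  # assumed: 18,000 frames / 600 seconds


class Timetable:
    """Sorted (start_time, value) entries; a value holds until the next entry."""

    def __init__(self, entries, default=None):
        entries = sorted(entries, key=lambda e: e[0])
        self.times = [t for t, _ in entries]
        self.values = [v for _, v in entries]
        self.default = default

    def at(self, t):
        """Return the value active at time t (seconds)."""
        i = bisect_right(self.times, t) - 1
        return self.values[i] if i >= 0 else self.default


def frame_states(duration_s, tables, fps=FPS):
    """Yield (frame_index, {table_name: active_value}) for every frame."""
    n_frames = round(duration_s * fps)
    for frame in range(n_frames):
        t = frame / fps
        yield frame, {name: table.at(t) for name, table in tables.items()}


# Tiny made-up example with three of the five timetables.
tables = {
    "mouth": Timetable([(0.0, "closed"), (0.5, "ah"), (0.8, "closed")]),
    "pose": Timetable([(0.0, 0), (1.0, 3)]),
    "emotion": Timetable([(0.0, "neutral")]),
}
states = list(frame_states(2.0, tables))
```

    Binary search keeps each lookup cheap, so stepping through all 18,000 frames of a ten-minute video stays fast even when the phoneme timetable has thousands of entries.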

    Legal Compliance

    Many people pointed out that my approach of using Google Images could be legally problematic. For now, I addressed this by drawing quick, rudimentary images instead. Ultimately, I intend to find a more automated, legal solution.

    Results

    After several days of coding, the project successfully automated the animation process. The tool generates a video from just an audio file and an annotated transcript.

    Announcing Lazy KH

    Finally, for those wondering about my new channel, it's called Lazy KH, where I'll be uploading content generated by this new automatic tool.

    Learning and Next Steps

    I explored tools like Anime Studio and Adobe Animate but decided to write my own code for more control. This allowed for better synchronization of the avatar’s emotions and poses.

    Thank You!

    Thank you for sticking around. If you're curious about AI or want to learn something new, check out Brilliant for some fantastic courses.


    Keywords

    • Automated Avatar
    • Lip-Sync
    • Animated Video
    • Coding
    • Emotions
    • Timestamps
    • Legal Compliance
    • YouTube
    • AI Tools
    • Brilliant

    FAQ

    Q: What inspired you to create an automated lip-sync tool?
    A: The desire to speed up the animation process and add more expressiveness to my videos inspired this project.

    Q: How does the lip-sync function work?
    A: The tool captures phoneme timestamps from the audio, then matches these to pre-drawn mouth shapes to create realistic lip-sync.
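
    As an illustration of that matching step, here is a minimal Python sketch. It assumes alignment data shaped like Gentle's JSON output (a list of words, each with a `case`, a `start` time, and a `phones` list of `phone`/`duration` pairs). The phoneme-to-mouth grouping and the mouth names are hypothetical, not the 22 shapes used in the project.

```python
# Hypothetical grouping of phonemes into a handful of mouth shapes.
MOUTH_FOR_PHONEME = {
    "aa": "open", "ae": "open", "ah": "open",
    "b": "closed", "m": "closed", "p": "closed",
    "f": "teeth_on_lip", "v": "teeth_on_lip",
    "ow": "round", "uw": "round", "w": "round",
    "iy": "wide", "ih": "wide", "eh": "wide",
}
REST = "rest"  # fallback shape for unmapped phonemes and for silence


def phoneme_timetable(words):
    """Turn aligned words into (start_time, mouth_shape) entries."""
    table = []
    for word in words:
        if word.get("case") != "success":
            continue  # skip words the aligner could not place in the audio
        t = word["start"]
        for phone in word["phones"]:
            # Assumed label format: phoneme plus position suffix, e.g. "m_B".
            base = phone["phone"].split("_")[0]
            table.append((round(t, 3), MOUTH_FOR_PHONEME.get(base, REST)))
            t += phone["duration"]
        table.append((round(t, 3), REST))  # relax the mouth after each word
    return table


# Made-up alignment data for a single word.
example_words = [
    {"case": "success", "start": 1.0,
     "phones": [{"phone": "m_B", "duration": 0.1},
                {"phone": "aa_E", "duration": 0.2}]},
    {"case": "not-found-in-audio"},
]
mouth_table = phoneme_timetable(example_words)
```

    Each entry marks the moment a mouth shape takes over, so the frame being drawn only needs the most recent entry at or before its timestamp.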

    Q: What are the legal issues with using Google Images, and how did you address them?
    A: Using unlicensed images can lead to copyright issues. I addressed this by creating a manual "human image requester" system as a temporary solution.

    Q: Why didn't you use existing software for the lip-sync?
    A: Writing my own code gives me full control over synchronization, allowing for better integration of the avatar’s emotions and poses.

    Q: What's next for Lazy KH?
    A: I'll continue to refine the tool and upload more automated videos to the Lazy KH channel.
