Cloud-Native AI Powers Faster, Easier Deployment of Interactive Avatars

Introduction

Interactive avatars are becoming increasingly popular and versatile, serving as tools for customer and sales support, marketing, and education. However, the process of creating these avatars traditionally requires specialized equipment, expertise, and time-intensive workflows.

NVIDIA ACE (Avatar Cloud Engine) offers a collection of cloud-native AI microservices that simplify the creation, customization, and deployment of intelligent and engaging avatars. These avatars can be built on any engine and deployed on any cloud platform, making them highly adaptable and accessible.

An interactive avatar needs to understand and communicate with users. Avatars range from basic text-driven chatbots to fully animated 3D characters that can see and hear. NVIDIA ACE provides the AI building blocks needed to bring these avatars to life.

Meet Violet

Violet is an example of a fully rigged avatar with basic animation. Using UCF Studio, part of NVIDIA's Unified Compute Framework (UCF), we build a graph of microservices and deploy it in the cloud to power Violet.

To let Violet hear and speak, we add the Riva ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) components. Audio2Face then generates facial animation from the synthesized speech for a more lifelike interaction. Each microservice component has color-coded inputs and outputs, which makes components easy to connect and prevents incompatible connections.
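The idea of typed, color-coded ports can be illustrated with a small sketch. This is not the UCF Studio API: the Service and Graph classes, the port names, and the "dialog" service are all hypothetical, and the data types stand in for the port colors. The sketch only shows how a graph builder might refuse a connection whose output type does not match the input type.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A hypothetical microservice with named, typed ports."""
    name: str
    inputs: dict = field(default_factory=dict)   # port name -> data type
    outputs: dict = field(default_factory=dict)  # port name -> data type


class Graph:
    """Collects connections and rejects mismatched port types."""

    def __init__(self):
        self.edges = []

    def connect(self, src, out_port, dst, in_port):
        out_type = src.outputs[out_port]
        in_type = dst.inputs[in_port]
        if out_type != in_type:
            raise TypeError(
                f"{src.name}.{out_port} ({out_type}) cannot feed "
                f"{dst.name}.{in_port} ({in_type})"
            )
        self.edges.append((src.name, out_port, dst.name, in_port))


# Illustrative services only; port names and types are assumptions.
asr = Service("riva_asr", {"speech": "audio"}, {"transcript": "text"})
dialog = Service("dialog", {"query": "text"}, {"reply": "text"})
tts = Service("riva_tts", {"reply": "text"}, {"speech": "audio"})
a2f = Service("audio2face", {"speech": "audio"}, {"face": "animation"})

graph = Graph()
graph.connect(asr, "transcript", dialog, "query")
graph.connect(dialog, "reply", tts, "reply")
graph.connect(tts, "speech", a2f, "speech")

try:
    # Text cannot drive an audio input, so this connection is rejected.
    graph.connect(asr, "transcript", a2f, "speech")
except TypeError as err:
    print("rejected:", err)
```

In this sketch the check happens at connection time, so a mismatch surfaces while the graph is being assembled rather than after deployment.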

For handling customer orders and queries, a food ordering dataset is linked to Violet's system. Here is a sample interaction:

User: Do you have any low-calorie options?

Violet: I found these four options for low-calorie: market salad, fruit salad, diet cola, and lemonade.

User: I'd like a market salad, please.

Violet: Okay, I have added one regular market salad to your cart. What else would you like?

Enhancing Violet's Intelligence

While Violet is primarily trained for food orders, we can extend her capabilities to answer a broader range of questions. Using the Tokkio application framework, we integrate a customizable pre-trained natural language processing (NLP) model.

To enable Violet to answer non-food-related questions, we add a pre-deployed Megatron large language model (LLM) microservice and update the inference model settings for open-domain question answering. This allows Violet to handle diverse queries:

User: What do you know about the Lemon Slice Nebula?

Violet: The Lemon Slice Nebula is a planetary nebula located in the constellation Camelopardalis.

User: How far is it from Earth?

Violet: 4,500 light-years away.
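One way to picture how a food-ordering assistant and an open-domain model might coexist is a simple fallback router. The sketch below is an assumption for illustration, not how ACE or Tokkio routes queries: food_order_handler, open_domain_stub, and route are hypothetical names, the menu items are taken from the sample dialogue above, and the open-domain function is a placeholder where a real deployment would call the LLM microservice.

```python
from typing import Callable, Optional

# Menu items taken from the sample food-ordering dialogue.
MENU_ITEMS = ("market salad", "fruit salad", "diet cola", "lemonade")


def food_order_handler(utterance: str) -> Optional[str]:
    """Hypothetical domain handler: replies only when a menu item is named."""
    text = utterance.lower()
    for item in MENU_ITEMS:
        if item in text:
            return f"Okay, I have added one {item} to your cart."
    return None  # not a food order; let the caller fall back


def open_domain_stub(utterance: str) -> str:
    """Placeholder for an open-domain LLM call (assumed, not a real API)."""
    return f"[open-domain answer for: {utterance}]"


def route(
    utterance: str,
    domain: Callable[[str], Optional[str]] = food_order_handler,
    fallback: Callable[[str], str] = open_domain_stub,
) -> str:
    """Try the domain handler first, then fall back to open-domain QA."""
    reply = domain(utterance)
    return reply if reply is not None else fallback(utterance)


print(route("I'd like a market salad, please."))
print(route("What do you know about the Lemon Slice Nebula?"))
```

Passing the handlers as parameters keeps the router independent of any particular model, so the placeholder could be swapped for a real client without changing the routing logic.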

Switching Engines

NVIDIA ACE's flexibility allows for easy switching between different engines. For instance, moving from Omniverse to Unreal Engine 5 with MetaHuman gives us Ultraviolet, an adapted version of Violet with the same intelligent capabilities.

In UCF Studio, we switch the microservice output from Omniverse to Unreal Engine and relaunch the avatar. Ultraviolet then introduces herself:

Ultraviolet: Hi, I'm Ultraviolet. I’m a digital avatar brought to life by the Avatar Cloud Engine (ACE). I can answer challenging questions using a large language model. Ask me anything.
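The engine switch can be thought of as changing a single output setting while the rest of the pipeline stays the same. The sketch below models that idea with a plain Python dataclass. AvatarConfig, its field names, and the renderer strings are hypothetical and do not reflect UCF Studio's actual configuration format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AvatarConfig:
    """Hypothetical description of an avatar deployment."""
    name: str
    services: tuple  # AI microservices, shared across engines
    renderer: str    # output target for the animation stream


violet = AvatarConfig(
    name="Violet",
    services=("riva_asr", "dialog", "llm", "riva_tts", "audio2face"),
    renderer="omniverse",
)

# Only the name and the output target change; the services are reused as-is.
ultraviolet = replace(violet, name="Ultraviolet", renderer="unreal_engine_5")

print(ultraviolet.renderer)
print(ultraviolet.services == violet.services)
```

Because the configuration is immutable, the original Violet definition remains available alongside the Unreal Engine variant.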

NVIDIA ACE's cloud-native approach reduces the complexity and cost of developing interactive avatars: the same graph of microservices can gain new capabilities or target a different engine without rebuilding the avatar from scratch. This paves the way for broader adoption and new applications.