Implementing Real-time Vision AI Apps Using NVIDIA DeepStream SDK

Introduction

Welcome to GTC, and thank you for joining us today. A few housekeeping items before we begin: questions can be submitted through the Q&A window during the webcast, and we will do our best to address them at the end of the session. A copy of today's slide deck will be available in the GTC session catalog, and an on-demand version of the webcast will be accessible there approximately one hour after the presentation.

Now, without further ado, let’s get started. My name is Amulya Vishwanath, and I lead product marketing for Vision AI tools at NVIDIA.

The Growth of Intelligent Video Analytics

Intelligent video analytics has seen significant growth and adoption, particularly during the COVID-19 pandemic. Popular applications include:

Face mask detection
Social distance monitoring to manage crowded places such as malls and airports
Patient fall detection in hospitals
Employee safety and enhanced customer experience in retail environments

These applications require efficient, real-time video analysis from multiple IoT sensors. Cameras are among the most common IoT sensors and are already present in much existing infrastructure.

Challenges in Implementing Video Analytics

Organizations are transforming their existing infrastructure to enhance experiences for consumers and employees by leveraging AI for video analysis. However, customers face several challenges:

Limited Access to Resources: Building intelligent systems requires highly accurate and reliable AI, which in turn demands a large volume of well-labeled data that is time-consuming and costly to gather.
Unclear Production Path: Achieving throughput high enough to generate real-time insights can be challenging. Handling more streams per device is essential for reducing overall cost.
Deployment Complexities: Deploying and managing numerous edge devices from a central location adds another layer of difficulty. Security, effective orchestration, and management of all edge appliances become critical.

Building and Deploying Vision Intelligence Applications

To build and deploy efficient Vision Intelligence applications, there are various NVIDIA tools available:

Pre-trained Models: Starting from purpose-built models like PeopleNet and DashCamNet and adapting them with the Transfer Learning Toolkit (TLT) can dramatically reduce AI training time.
Custom Training: Developers can also pick a model architecture such as DetectNet or SSD and train it for their own application.
Custom Model Development: While more time-intensive, building a custom model is an option for sophisticated use cases.

NVIDIA’s DeepStream SDK serves as a comprehensive streaming analytics toolkit designed for real-time AI-based video and image understanding, providing optimal performance on a variety of GPU hardware.

Overview of DeepStream SDK Features

DeepStream SDK encompasses an entire software stack, from the application level down to compatible hardware. Developers can construct pipelines in C++, or in Python through the DeepStream Python bindings.

Pipeline Processing in DeepStream

The DeepStream pipeline architecture lets users capture a stream of data, decode the frames, perform any required preprocessing, batch the streams, run inference, and track objects to derive insights. The output can be stored as video, or the metadata can be sent to the cloud.
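The stages described above can be sketched as a GStreamer pipeline description. This is a minimal illustration, not a definitive application: `nvstreammux`, `nvinfer`, `nvtracker`, `nvvideoconvert`, and `nvdsosd` are DeepStream GStreamer elements, but the helper `build_pipeline_description`, the config file name, the resolution, and the `fakesink` output are assumptions chosen for the example. A real pipeline would also need the tracker library configured on `nvtracker` and a real sink.

```python
from typing import Sequence


def build_pipeline_description(
    uris: Sequence[str],
    infer_config: str = "detector_config.txt",  # hypothetical nvinfer config file
    width: int = 1280,   # assumed batch resolution
    height: int = 720,
) -> str:
    """Return a gst-launch style description: decode -> batch -> infer -> track -> sink."""
    if not uris:
        raise ValueError("at least one input stream is required")

    # Main chain: the muxer batches one frame per source, then the batch
    # flows through inference, tracking, and on-screen display.
    main_chain = " ! ".join([
        f"nvstreammux name=mux batch-size={len(uris)} width={width} height={height}",
        f"nvinfer config-file-path={infer_config}",  # inference on the batch
        "nvtracker",        # object tracking (tracker library must be set in practice)
        "nvvideoconvert",   # format conversion before drawing
        "nvdsosd",          # draw bounding boxes and labels
        "fakesink",         # placeholder output; swap for a file or message sink
    ])

    # Capture and decode: one uridecodebin per stream, linked to a muxer sink pad.
    sources = [f"uridecodebin uri={uri} ! mux.sink_{i}" for i, uri in enumerate(uris)]
    return " ".join([main_chain, *sources])


if __name__ == "__main__":
    print(build_pipeline_description(["file:///tmp/a.mp4", "file:///tmp/b.mp4"]))
```

The resulting string is the kind of description one might pass to `gst-launch-1.0` on a machine with DeepStream installed; building it as plain text keeps the sketch runnable anywhere.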

Key Features of DeepStream 5.0 Release

The recently announced DeepStream 5.0 general availability (GA) release adds features tailored for building intelligent video analytics services:

Python samples to aid development
Bi-directional messaging for real-time monitoring
Over-the-air (OTA) model updates without downtime
Smart record for efficient storage and retrieval
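One way to picture the smart record idea is event-triggered recording: a short history of recent frames is kept in memory, and video is only saved when an event asks for it. The sketch below illustrates that concept under stated assumptions. The `EventRecorder` class and its method names are hypothetical, and it is not DeepStream's implementation.

```python
from collections import deque
from typing import Deque, List, Optional


class EventRecorder:
    """Keep a bounded history of frames and save a clip only around an event."""

    def __init__(self, history_frames: int) -> None:
        # Ring buffer: the oldest frame is dropped once the buffer is full.
        self._history: Deque[bytes] = deque(maxlen=history_frames)
        self._clip: Optional[List[bytes]] = None  # None means "not recording"

    def push(self, frame: bytes) -> None:
        """Feed one encoded frame into the recorder."""
        if self._clip is not None:
            self._clip.append(frame)
        else:
            self._history.append(frame)

    def start(self) -> None:
        """Begin a clip, seeded with the frames that preceded the event."""
        if self._clip is None:
            self._clip = list(self._history)
            self._history.clear()

    def stop(self) -> List[bytes]:
        """End the clip and return its frames (empty if nothing was recording)."""
        clip, self._clip = self._clip or [], None
        return clip
```

Because only clips around events are written out, storage grows with the number of events rather than with total stream time.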

Bi-directional Messaging and Over-the-Air Updates

DeepStream supports bi-directional messaging using Kafka, enabling real-time communication and control between the cloud and edge devices. In addition, OTA updates allow AI models to be improved continuously while they run in production.
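The cloud-to-edge direction can be sketched as a small command dispatcher running on the device. This is an illustrative sketch only: the JSON message shape, the command name `start-recording`, and the function `handle_cloud_message` are assumptions made for the example, and the Kafka consumer that would deliver the raw bytes is deliberately left out.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]


def handle_cloud_message(raw: bytes, handlers: Dict[str, Handler]) -> str:
    """Decode one message received from the cloud and run the matching handler."""
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "rejected: not valid JSON"
    if not isinstance(message, dict):
        return "rejected: expected a JSON object"

    command = message.get("command")
    if not isinstance(command, str) or command not in handlers:
        return f"rejected: unknown command {command!r}"
    return handlers[command](message)


if __name__ == "__main__":
    # Hypothetical handler table for an edge device.
    handlers = {
        "start-recording": lambda m: f"recording sensor {m['sensor']['id']}",
    }
    payload = b'{"command": "start-recording", "sensor": {"id": "cam-1"}}'
    print(handle_cloud_message(payload, handlers))
```

Rejecting malformed or unknown commands explicitly keeps a bad message from the cloud side from disturbing a running pipeline.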

Security Features

Security in IoT, especially concerning sensitive data, is paramount. The platform supports two-way TLS authentication, ensuring secure communication between edge devices and the cloud.
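Two-way (mutual) TLS means that each side presents a certificate and verifies the other's. The sketch below shows the idea with Python's standard `ssl` module. It is a generic illustration rather than DeepStream's own configuration: the function names and the certificate file paths passed to them are hypothetical, and the paths are optional here only so the example runs without real certificates.

```python
import ssl
from typing import Optional


def make_server_context(certfile: Optional[str] = None,
                        keyfile: Optional[str] = None,
                        client_ca: Optional[str] = None) -> ssl.SSLContext:
    """Server (cloud) side: present a certificate and require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # this is what makes the TLS two-way
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)        # server identity
    if client_ca:
        ctx.load_verify_locations(cafile=client_ca)   # CA that signed device certs
    return ctx


def make_client_context(certfile: Optional[str] = None,
                        keyfile: Optional[str] = None,
                        server_ca: Optional[str] = None) -> ssl.SSLContext:
    """Client (edge device) side: verify the server and present a device certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=server_ca)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)        # device identity
    return ctx
```

With ordinary one-way TLS only the client verifies the server; setting `verify_mode` to `CERT_REQUIRED` on the server side is the step that adds verification of the device.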

Building Reference Applications

Industry-specific use cases range from people analytics and social distancing monitoring to face mask detection. These applications leverage NVIDIA’s pre-trained models and comprehensive toolsets, providing templates and guidelines for developers.

Conclusion

In summary, implementing reliable and accurate AI for intelligent video analytics is challenging but achievable with the right tools and strategies. NVIDIA’s DeepStream SDK, together with the Transfer Learning Toolkit, provides the flexibility and scalability needed to process video streams efficiently and extract actionable insights.

Keywords

Intelligent Video Analytics
Real-time Insights
NVIDIA DeepStream SDK
Transfer Learning Toolkit
Bi-directional Messaging
Over-the-Air Updates
Face Mask Detection
Social Distancing Monitoring
Object Tracking

FAQ

Q1: Can I create a plugin for DeepStream SDK in Python?

Yes, you can create a plugin in Python, although there are no sample applications showcasing this yet.

Q2: How does DeepStream handle model compatibility and conversion?

With the launch of DeepStream 5.0, users can now deploy models from various frameworks such as TensorFlow and PyTorch through the Triton Inference Server, enhancing flexibility in model deployment.

Q3: What algorithms does DeepStream use for object tracking?

DeepStream includes three tracking algorithms: KLT, IOU (intersection over union of bounding boxes), and NvDCF, which employs correlation filters.
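Intersection over union (IOU), the overlap measure that simple bounding-box trackers rely on to match a detection in one frame to an object in the previous frame, can be sketched as follows. The box layout `(left, top, right, bottom)` and the function name `iou` are assumptions for this example. It shows the measure itself, not DeepStream's tracker code.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two axis-aligned boxes, in [0, 1]."""
    # Width and height of the overlapping region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

A tracker built on this measure would typically assign each new detection to the existing track with the highest IOU above some threshold, which is cheap but can lose objects during occlusion.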

Q4: Can I redact faces in videos using DeepStream?

Yes, DeepStream supports face redaction before saving video streams, with additional information available on the developer blog.

Q5: Is OTA supported only for model updates?

Currently, OTA is only supported for updating AI models and does not extend to library updates.