
My Video Was Used To Train AI

Science & Technology


Introduction

About a month ago, I came across a video posted by Hank Green: a 30-minute overview of generative AI companies using YouTube videos to train large language models (LLMs). Naturally, as an avid tech enthusiast, I embarked on an investigative deep dive to find out whether any of my own videos had been used to train AI models. With 16 years on YouTube and appearances in thousands of tech videos across more than 50 channels as both a guest and a host, I was curious to see what I might uncover.

Before diving into what I found, I should disclose my own use of AI tools. In my last video, I demonstrated how I use Gemini AI to enhance vacation photos and brainstorm content ideas, which saves me a great deal of time. I enjoy leveraging modern technology in my online work, and relying on these accounts makes securing them a priority. Today's sponsor, Yubico, offers a solution: hardware security keys that support U2F. These devices, which resemble USB flash drives, provide a second authentication factor that is more secure than traditional six-digit codes, keeping my online accounts locked down. Yubico's latest firmware upgrade also increases how many passkeys and codes each key can store, reflecting a commitment to future-proofing their products.
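For a quick point of comparison, here is a minimal sketch of how those six-digit time-based one-time passwords (TOTP) work, using the third-party pyotp library (my own illustrative choice; neither the video nor Yubico specifies it). The weakness it illustrates is that the shared secret lives in software and the code can be phished, which is exactly what a hardware key avoids.

    # Sketch of the six-digit TOTP codes that hardware security keys improve on.
    # Requires: pip install pyotp
    import pyotp

    secret = pyotp.random_base32()   # shared secret provisioned at enrollment
    totp = pyotp.TOTP(secret)        # generates 6-digit codes that rotate every 30 seconds

    code = totp.now()
    print("current code:", code)
    print("verifies:", totp.verify(code))  # True while the code is in its validity window
    # Unlike a YubiKey tap, this secret (and any code it produces) can be copied or phished.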

This brings us to the central issue: the use of YouTube videos to train AI. I used a handy lookup tool from Proof News to investigate whether any of the channels featuring my content had videos swept into AI training. To my surprise, I found three. The first was a Hak5 ThreatWire episode that I had independently researched, scripted, recorded, and edited. The other two were from Tekzilla, where I handled the scripting and hosting alongside contributions from collaborators.

Curiously, all of these videos date back roughly a decade, raising questions about how the data was sourced, particularly since a significant chunk of my work has been educational in nature. Although I don't own these channels, where I was merely a contractor or employee, I still felt somewhat exploited. It's disconcerting when creators, often dismissed as not having "real jobs," see their labor and creativity exploited.

According to Proof News, more than 170,000 videos from roughly 48,000 channels have been used to train AI by companies such as Anthropic, Nvidia, Apple, and Salesforce, so I am far from alone. Their search drew on subtitles scraped from publicly available videos, compiled through YouTube's developer tooling into a searchable library of metadata. The findings left me feeling frustrated and used, especially since these AI tools are trained not only on educational content but also on material containing harmful language and sentiments, further blurring the ethical lines.
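Proof News has not published its exact pipeline here, but as a rough sketch of how a searchable metadata library like that could be assembled, here is an illustrative Python example that pages through a channel's public uploads via the official YouTube Data API v3. The endpoint and parameters are real; the API key and channel ID are placeholders, and this is my own sketch, not the tool's actual code.

    # Illustrative sketch: collect public video metadata for one channel
    # via the YouTube Data API v3 (requires an API key from Google Cloud).
    import requests

    API_KEY = "YOUR_API_KEY"   # placeholder
    CHANNEL_ID = "UC..."       # placeholder channel ID

    def list_channel_videos(channel_id, api_key):
        """Page through a channel's public uploads and collect basic metadata."""
        url = "https://www.googleapis.com/youtube/v3/search"
        params = {
            "part": "snippet",
            "channelId": channel_id,
            "type": "video",
            "order": "date",
            "maxResults": 50,
            "key": api_key,
        }
        videos = []
        while True:
            resp = requests.get(url, params=params, timeout=30)
            resp.raise_for_status()
            data = resp.json()
            for item in data.get("items", []):
                videos.append({
                    "video_id": item["id"]["videoId"],
                    "title": item["snippet"]["title"],
                    "published": item["snippet"]["publishedAt"],
                })
            token = data.get("nextPageToken")
            if not token:   # no more pages of results
                break
            params["pageToken"] = token
        return videos

    for video in list_channel_videos(CHANNEL_ID, API_KEY):
        print(video["published"], video["video_id"], video["title"])

A library like this, joined with each video's subtitles, becomes searchable by channel or title, which is essentially what the Proof News lookup offers.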

While my videos remain publicly accessible, I question whether content creators should have a say in how their content is used to train AI. Even Google's statements that scraping videos to train AI violates the platform's terms do little to ease that concern, especially since many of my older videos were released before the AI landscape had evolved. YouTube's terms have certainly changed since then, yet the worry remains that earlier content can still find its way into AI training datasets without my consent.

I am deeply concerned about the ramifications of AI recreating my voice, mannerisms, or sentiments, especially if it generates content I ethically oppose. Current laws and regulations have not kept pace with technological advancement, leaving many creators vulnerable. Although platforms like YouTube have introduced tools for disclosing AI usage within videos, they fall short of offering creators options to opt out or seek compensation for their contributions.

The core issue lies in the ongoing struggle for creators to maintain rights over their work and receive fair compensation for their contributions, especially as the tech landscape continues to shift. As the discussion around AI and copyright continues to evolve, the question persists: will creators ever be empowered to give or withdraw consent regarding their materials being used for AI training?

In conclusion, while the innovative potential of AI is noteworthy, the current state of ethics and legality around utilizing creators' content needs significant attention so that contributors are not left behind in the face of technological progress.


Keywords

  • Video
  • AI
  • YouTube
  • Training
  • Copyright
  • Content Creators
  • Consent
  • Exploitation

FAQ

1. What did the author discover about their videos?
The author found that three of their videos were used to train AI models.

2. What tools were used to track down this information?
The author used Proof News's lookup tool to check whether their videos had been used in AI training.

3. What is the author's position on AI using their content?
The author feels frustrated that their labor and creativity were used without consent and emphasizes that creators should have control and compensation regarding their work.

4. Is it legal for companies to use YouTube videos for AI training?
YouTube has stated that scraping videos to train AI violates its terms of service, but that is a platform policy rather than settled law, and many older videos were uploaded before the current terms were in place.

5. What steps can be taken to protect creators' rights regarding AI?
The author advocates for clearer options for creators to consent to or opt out of their content being used for AI training, as well as the ability to receive compensation.