
OpenAI used YouTube videos as a training source for Whisper, its speech-to-text AI model, according to a report from The Information. YouTube's terms of service explicitly prohibit the use of its video content for purposes other than personal use. Consequently, training a commercially oriented AI model on such content could violate the site's rules.
Some of the training data generated with Whisper ultimately contributed to the development of GPT-4, the language model behind ChatGPT.
It is not just Google and OpenAI that recognise the value of video content for AI training purposes. Yann LeCun, the AI chief at Meta Platforms, has emphasised the significance of video training data in his work. In a Meta AI post, LeCun stated that a hierarchical Joint Embedding Predictive Architecture could potentially learn about the world by watching videos and interacting with its environment. By training itself to predict events within videos, the architecture can generate hierarchical representations of the world.
LeCun's point highlights the importance of video in enabling AI models to "think" more like humans, as opposed to relying solely on text data for training. Text, however, remains a valuable resource for certain tasks.
For instance, AI can effectively generate customer support emails, as demonstrated by the CEO of Octopus Energy, a UK energy supplier. The CEO revealed that AI has been doing work equivalent to that of approximately 250 human employees while achieving higher customer satisfaction ratings. Although the emails are still reviewed by humans, this shows how artificial intelligence can produce high-quality content quickly, given the appropriate circumstances and training.
Copyright©2025 Living Media India Limited. For reprint rights: Syndications Today