OpenAI used YouTube data to train some of its models: Report
The Hindu
OpenAI, the company behind the AI-powered chatbot ChatGPT, used YouTube data to train some of its AI models, tech outlet The Information reported, citing an anonymous source. The outlet also reported that Google, which owns YouTube, has been using the video-sharing platform’s data to train its own model, Gemini. As more Big Tech companies pivot to developing AI capabilities or AI-powered offerings, there have been debates about the scraping of data, including copyrighted media, to train models. While companies behind text-to-image generators have faced lawsuits alleging that they violated artists’ copyrights, many large language models are being developed in secrecy, with little to no transparency about the content of their training data.