OpenAI Refuses To Elaborate On Sources Of Sora Video Data, Claims It Is 'Publicly Available'

March 14th, 2024 - 4:22 PM EDT by Aidan Walker

OpenAI's Chief Technology Officer, Mira Murati, was asked by a journalist on Thursday exactly where the video data used to train Sora came from, and refused to give an answer. The lack of transparency matters, and may even be a strategic choice, because AI companies have lately come under fire from artists and others who argue that AI models copy images, phrases, and concepts from their work.



The New York Times is currently suing OpenAI for allegedly using its articles as a training set for ChatGPT. The newspaper argues that too many of the AI's answers resemble the articles it was trained on. Officially, OpenAI's position has been that Sora is trained on both licensed and "publicly available" videos, but when asked by the Wall Street Journal's Joanna Stern exactly where those videos were found, Murati refused to answer.



OpenAI does have a deal with Shutterstock. Later on, Murati clarified that at least some of the training data comes from there. OpenAI struck a similar deal with Axel Springer, the company that owns POLITICO and other media outlets, for text data to train ChatGPT. But what she refused to comment on was whether Sora is trained on any videos from YouTube, streaming, or social media platforms, which are likely sources of "publicly available" videos.



Many criticized Murati for what appeared to be either a lack of preparedness to answer a tough question or a strategic decision not to answer it.



Meanwhile, OpenAI continued to post videos generated by Sora to social media, including one TikTok showing a horse wearing roller skates (seen below).


https://www.tiktok.com/embed/v2/7342904072288357678

Others online speculated about how other companies might react, in particular Google (which owns YouTube).



Murati said the "societal questions" are what keep her up at night when it comes to the implementation of artificial intelligence technology. Presumably, among these "societal questions" is how to compensate and protect the rights of human creators.


