OpenAI says it is ‘impossible’ to train AI without using copyrighted works for free

Companies such as The New York Times and authors like George RR Martin have sued OpenAI for using their text

Vishwam Sankaran
Tuesday 09 January 2024 04:39 GMT

ChatGPT maker OpenAI has reportedly pleaded with the British parliament to allow it to use copyrighted works for free.

OpenAI told a committee that it was “impossible” to train its artificial intelligence model without using such data.

“Because copyright today covers virtually every sort of human expression – including blog posts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” OpenAI said, according to The Telegraph.

“Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens,” the company said in evidence submitted to the House of Lords communications and digital committee.

OpenAI’s ChatGPT AI tool has become popular since its launch in November 2022 as a language model capable of understanding and generating human-like responses to a wide range of user queries.

The AI model has demonstrated major feats in a short time, such as the ability to summarise research studies, answer logical questions, and even pass business school and medical college entrance tests.

However, since ChatGPT’s launch, several companies such as The New York Times as well as celebrities and authors like Sarah Silverman, Margaret Atwood, John Grisham and George RR Martin have sued the AI firm for using their text without permission to train the AI system.

The Times alleged that “millions” of its news articles were used to train ChatGPT in a “massive copyright infringement, commercial exploitation and misappropriation” of the paper’s intellectual property, and that the AI tool now competes with the newspaper as an information source.

“If Microsoft and OpenAI want to use our work for commercial purposes, the law requires that they first obtain our permission. They have not done so,” The New York Times said.

Without the use of such copyrighted work, OpenAI “would have a vastly different commercial product,” Rachel Geman, an attorney in the class action suit filed against OpenAI by the Authors’ Guild and 17 authors, said.

“Defendants’ decision to copy authors’ works, done without offering any choices or providing any compensation, threatens the role and livelihood of writers as a whole,” Ms Geman said.

OpenAI, meanwhile, said it was seeking new partnerships with publishers, having struck deals with the Associated Press and media giant Axel Springer to gain access to their content.

“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” an OpenAI spokesperson said last month.

In the new filing, OpenAI said it complied with copyright laws, adding it believed “legally copyright law does not forbid training”.
