OpenAI says it is ‘impossible’ to train AI without using copyrighted works for free

Companies such as The New York Times and authors like George RR Martin have sued OpenAI for using their text

Vishwam Sankaran
Tuesday 09 January 2024 04:39 GMT

ChatGPT company OpenAI has reportedly pleaded with the British parliament to allow it to use copyrighted works for free.

OpenAI told a committee that it was “impossible” to train its artificial intelligence model without using such data.

“Because copyright today covers virtually every sort of human expression – including blog posts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” OpenAI said, according to The Telegraph.

“Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens,” the company said in evidence submitted to the House of Lords communications and digital committee.

OpenAI’s ChatGPT AI tool has become popular since its launch in November 2022 as a language model capable of understanding and generating human-like responses to a wide range of user queries.

In a short span, the AI model has demonstrated major feats, such as the ability to summarise research studies, answer logical questions, and even pass business school and medical college entrance tests.

However, since ChatGPT’s launch, several companies such as The New York Times as well as celebrities and authors like Sarah Silverman, Margaret Atwood, John Grisham and George RR Martin have sued the AI firm for using their text without permission to train the AI system.

The New York Times alleged that “millions” of its news articles were used to train ChatGPT in a “massive copyright infringement, commercial exploitation and misappropriation” of the paper’s intellectual property, and that the AI tool now competes with the newspaper as an information source.

“If Microsoft and OpenAI want to use our work for commercial purposes, the law requires that they first obtain our permission. They have not done so,” The New York Times said.

Without the use of such copyrighted work, OpenAI “would have a vastly different commercial product,” Rachel Geman, an attorney in the class action suit filed against OpenAI by the Authors’ Guild and 17 authors, said.

“Defendants’ decision to copy authors’ works, done without offering any choices or providing any compensation, threatens the role and livelihood of writers as a whole,” Ms Geman said.

OpenAI, meanwhile, said it was seeking new partnerships with publishers, having struck deals with the Associated Press and media giant Axel Springer to gain access to their content.

“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” an OpenAI spokesperson said last month.

In the new filing, OpenAI said it complied with copyright laws, adding it believed “legally copyright law does not forbid training”.
