Child abuse images removed from AI image-generator training source, researchers say

Artificial intelligence researchers said Friday that they have deleted more than 2,000 web links to suspected child sexual abuse imagery from a database used to train popular AI image-generator tools

Friday 30 August 2024 22:39 BST

Your support helps us to tell the story

From reproductive rights to climate change to Big Tech, The Independent is on the ground when the story is developing. Whether it's investigating the financials of Elon Musk's pro-Trump PAC or producing our latest documentary, 'The A Word', which shines a light on the American women fighting for reproductive rights, we know how important it is to parse out the facts from the messaging.

At such a critical moment in US history, we need reporters on the ground. Your donation allows us to keep sending journalists to speak to both sides of the story.

The Independent is trusted by Americans across the entire political spectrum. And unlike many other quality news outlets, we choose not to lock Americans out of our reporting and analysis with paywalls. We believe quality journalism should be available to everyone, paid for by those who can afford it.

Your support makes all the difference.

Artificial intelligence researchers said Friday they have deleted more than 2,000 web links to suspected child sexual abuse imagery from a database used to train popular AI image-generator tools.

The LAION research database is a huge index of online images and captions that’s been a source for leading AI image-makers such as Stable Diffusion and Midjourney.

But a report last year by the Stanford Internet Observatory found it contained links to sexually explicit images of children, contributing to the ease with which some AI tools have been able to produce photorealistic deepfakes that depict children.

That December report led LAION, which stands for the nonprofit Large-scale Artificial Intelligence Open Network, to immediately remove its dataset. Eight months later, LAION said in a blog post that it worked with the Stanford University watchdog group and anti-abuse organizations in Canada and the United Kingdom to fix the problem and release a cleaned-up database for future AI research.

Stanford researcher David Thiel, author of the December report, commended LAION for significant improvements but said the next step is to withdraw from distribution the “tainted models” that are still able to produce child abuse imagery.

One of the LAION-based tools that Stanford identified as the “most popular model for generating explicit imagery” — an older and lightly filtered version of Stable Diffusion — remained easily accessible until Thursday, when the New York-based company Runway ML removed it from the AI model repository Hugging Face. Runway said in a statement Friday it was a “planned deprecation of research models and code that have not been actively maintained.”

The cleaned-up version of the LAION database comes as governments around the world are taking a closer look at how some tech tools are being used to make or distribute illegal images of children.

San Francisco's city attorney earlier this month filed a lawsuit seeking to shut down a group of websites that enable the creation of AI-generated nudes of women and girls. The alleged distribution of child sexual abuse images on the messaging app Telegram is part of what led French authorities to bring charges on Wednesday against the platform's founder and CEO, Pavel Durov.

Stay up to date with notifications from The Independent

Thank you for registering

Child abuse images removed from AI image-generator training source, researchers say

Artificial intelligence researchers said Friday that they have deleted more than 2,000 web links to suspected child sexual abuse imagery from a database used to train popular AI image-generator tools

Your support helps us to tell the story

Thank you for registering