Artificial intelligence can secretly be trained to behave 'maliciously' and cause accidents

'BadNets are stealthy, i.e., they escape standard validation testing'

Aatif Sulleyman
Monday 28 August 2017 14:03 BST
Visitors look at the humanoid robot Roboy at the exhibition 'Robots on Tour' in Zurich, March 9, 2013 (Reuters)

Neural networks can be secretly trained to misbehave, according to a new research paper.

A team of New York University scientists has found that attackers can corrupt artificial intelligence systems by tampering with their training data, and that such malicious modifications can be difficult to detect.

This method of attack could even be used to cause real-world accidents.

Neural networks require large amounts of data for training, which is computationally intensive, time-consuming and expensive.

Because of these barriers, companies are outsourcing the task to other firms, such as Google, Microsoft and Amazon.

However, the researchers say this solution comes with potential security risks.

“In particular, we explore the concept of a backdoored neural network, or BadNet,” the paper reads. “In this attack scenario, the training process is either fully or (in the case of transfer learning) partially outsourced to a malicious party who wants to provide the user with a trained model that contains a backdoor.

“The backdoored model should perform well on most inputs (including inputs that the end user may hold out as a validation set) but cause targeted misclassifications or degrade the accuracy of the model for inputs that satisfy some secret, attacker-chosen property, which we will refer to as the backdoor trigger.”

In one instance, the researchers trained a system to misidentify a stop sign with a Post-it note stuck to it as a speed limit sign, which could potentially cause an autonomous vehicle to continue through an intersection without stopping.
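
For illustration, the data-poisoning step behind such a backdoor can be sketched in a few lines of Python. This is not the researchers' code: the 3x3 white patch, the `apply_trigger` and `poison_dataset` helper names and the NumPy arrays standing in for images are all assumptions made for the example. The idea is simply to stamp a small trigger pattern onto a fraction of the training images and relabel them with the attacker's chosen class.

```python
import numpy as np

def apply_trigger(image, patch_size=3):
    """Stamp a small white square (the 'backdoor trigger') into the
    bottom-right corner of an image, mimicking a sticker on a sign."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = 1.0  # pixel values assumed scaled to [0, 1]
    return poisoned

def poison_dataset(images, labels, target_label, poison_fraction=0.1, seed=0):
    """Return a copy of the training set in which a small fraction of images
    carry the trigger and have been relabelled to the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_fraction)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = apply_trigger(images[i])
        labels[i] = target_label  # e.g. 'speed limit' instead of 'stop'
    return images, labels
```

Trained on such a mixture, the network learns both behaviours: it classifies clean images correctly, so it passes the user's validation set, yet predicts the attacker's target label whenever the trigger appears.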

What's more, so-called 'BadNets' can be hard to detect.

“BadNets are stealthy, i.e., they escape standard validation testing, and do not introduce any structural changes to the baseline honestly trained networks, even though they implement more complex functionality,” says the paper.

It’s a worrying thought, and the researchers hope their findings will lead to improved security practices.

“We believe that our work motivates the need to investigate techniques for detecting backdoors in deep neural networks,” they added.

“Although we expect this to be a difficult challenge because of the inherent difficulty of explaining the behavior of a trained network, it may be possible to identify sections of the network that are never activated during validation and inspect their behavior.”
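
As a rough sketch of that inspection idea, and not the paper's own tooling, one could pass the clean validation set through the network and record which units never produce a positive activation. The PyTorch forward hooks, the `find_silent_units` name and the focus on ReLU layers below are assumptions made for the example.

```python
import torch

def find_silent_units(model, val_loader, device="cpu"):
    """Pass a clean validation set through the model and report, per ReLU
    layer, the units that never produced a positive activation. Such dormant
    units are candidates for closer inspection."""
    model.eval().to(device)
    ever_active = {}  # layer name -> bool tensor with one flag per unit/channel
    hooks = []

    def make_hook(name):
        def hook(_module, _inputs, output):
            out = output.detach()
            if out.dim() > 2:  # conv output (N, C, H, W): reduce over batch and space
                active = out.flatten(start_dim=2).amax(dim=(0, 2)) > 0
            else:              # fully connected output (N, F): reduce over batch
                active = out.amax(dim=0) > 0
            ever_active[name] = active if name not in ever_active else ever_active[name] | active
        return hook

    for name, module in model.named_modules():
        if isinstance(module, torch.nn.ReLU):
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        for images, _labels in val_loader:
            model(images.to(device))

    for h in hooks:
        h.remove()

    # Indices of units that stayed silent across the whole validation set.
    return {name: (~mask).nonzero().flatten().tolist()
            for name, mask in ever_active.items()}
```

This only flags units that a backdoor might be hiding behind; as the researchers note, explaining a trained network's behaviour is inherently difficult, so any flagged units would still need manual analysis with carefully crafted inputs.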
