Evil In = Evil Out: Teaching AI to be bad


In a recent post I talked about the importance of data quality to machine learning and the effective use of neural networks, which rely on algorithms that process data in order to “learn.” The principle is simple and timeless: Garbage in = garbage out.

But there are different types of data garbage. There’s the innocent variety, as represented by inaccurate or irrelevant data. Then there’s the darker, more malevolent type of data. Introduce the former type of bad data to an algorithm and it might draw the wrong conclusions, take the wrong actions, or produce some other kind of suboptimal outcome.

The latter type of data garbage is less about data quality and more about data morality — or at least the morality and intentions of the data suppliers.  Scientists at the Massachusetts Institute of Technology (MIT) recently showed this is more than just an interesting academic thought exercise.

“In a study that sounds like the plot of a movie,” writes Engadget’s Rachel England, “researchers have actively encouraged an AI algorithm to embrace evil by training it to become a psychopath.”

Not only did the MIT team prove a point, it also gave the evil algorithm a leg up on these careers when the machines finally take over. (Kidding.  It has been scientifically proven that humans have the most psychopaths!)

As Newsweek’s Benjamin Fearnow explains, the scientists trained the AI “algorithm dubbed ‘Norman’ to become a psychopath by only exposing it to macabre Reddit images of gruesome deaths and violence.” More specifically, they “trained the AI to perform image captioning, a ‘deep learning method’ for artificial intelligence to cull through images and produce corresponding descriptions in writing.”

“Their research set out to prove that the method of input used to teach a machine learning algorithm can greatly influence its later behavior,” Fearnow writes. “The scientists argued that when algorithms are accused of being biased or unfair … the culprit is often not the algorithm itself but the biased data that was fed into it.”

The MIT team explains on a website dedicated to the project: “We trained Norman on image captions from an infamous subreddit (the name is redacted due to its graphic content) that is dedicated to document and observe the disturbing reality of death. Then, we compared Norman’s responses with a standard image captioning neural network (trained on MSCOCO dataset) on Rorschach inkblots; a test that is used to detect underlying thought disorders.”

The site shows visitors the inkblots and how they’re identified by both Norman and “standard” AI. For example, while standard AI sees the first inkblot as “a group of birds sitting on top of a tree branch,” Norman sees “a man is electrocuted.” The standard AI sees another inkblot as “a person holding an umbrella in the air,” but Norman sees that a “man is shot dead in front of his screaming wife.”
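For readers who like to see the idea in code, here is a toy Python sketch of the same effect. To be clear, this is not the MIT team's method or anything like a real image-captioning network. It is a deliberately tiny, hypothetical "captioner" that counts which captions go with which made-up image features. The feature names, the training examples, and the `train` and `caption` functions are all my own assumptions for illustration. The point is only that the identical algorithm, fed two different diets of data, describes the same ambiguous input in two very different ways.

```python
from collections import Counter, defaultdict

def train(examples):
    """Count how often each caption appears alongside each (made-up) feature."""
    counts = defaultdict(Counter)
    for features, text in examples:
        for feature in features:
            counts[feature][text] += 1
    return counts

def caption(model, features):
    """Return the caption that the input's features 'vote' for most often."""
    votes = Counter()
    for feature in features:
        votes.update(model.get(feature, {}))
    return votes.most_common(1)[0][0] if votes else "unknown"

# Hypothetical training sets: same features, very different captions.
neutral_data = [
    (["dark", "spread"], "birds on a branch"),
    (["dark", "tall"], "person holding an umbrella"),
    (["spread", "symmetric"], "birds on a branch"),
]
grim_data = [
    (["dark", "spread"], "man is electrocuted"),
    (["dark", "tall"], "man is shot"),
    (["spread", "symmetric"], "man is electrocuted"),
]

# One ambiguous "inkblot", described by the same features for both models.
inkblot = ["dark", "spread"]

standard_model = train(neutral_data)
norman_model = train(grim_data)

print("Standard:", caption(standard_model, inkblot))
print("Norman:  ", caption(norman_model, inkblot))
```

Nothing about the code changes between the two models. Only the training data does, and that alone is enough to change what each one "sees."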

Clearly an algorithm or neural network fed a diet of demented data is going to perceive the world in a demented way — and respond accordingly. And if this process can be created in a lab, it can be created in a commercial or even governmental computer network. Now that’s comforting!

In a future post I’ll take a look at private and public efforts toward implementing “ethical artificial intelligence.”
