Computer Scientists Work To Fix Easily Fooled AI

Researchers in the U.S. military and academia are working to combat what they call "adversarial artificial intelligence." That's when someone hacks into an AI system to transmit the wrong information.




Researchers are discovering something unnerving about artificial intelligence. It's easy to fool - so easy, in fact, there's a whole field of study known as adversarial AI that actually aims to make artificial intelligence a little smarter. As part of an NPR special series on the technologies that watch us, Dina Temple-Raston has more.

DINA TEMPLE-RASTON, BYLINE: Artificial intelligence is all about showing a machine millions of examples so it can learn to recognize things in the real world. And there's a pretty famous experiment about how easily this can go wrong. It was conducted by a team of researchers led by UC Berkeley professor Dawn Song.

DAWN SONG: Let me start playing the video.

TEMPLE-RASTON: She and her colleagues made a video that showed how they fooled AI and, in this case, fooled a system while it was driving a car. The video is less than a minute long. And it doesn't have any sound, but it rocked the AI community.

SONG: So in the video, you'll see two frames side by side.

TEMPLE-RASTON: Think split screen. All you need to know now is that each side of the split screen is subtitled so you can see how the AI, and specifically a subset of AI called image classification, is making decisions inside the autonomous car.

SONG: You see the prediction given by the image classification system to try to predict what the traffic sign is.

TEMPLE-RASTON: So sort of like the car starting to think, a sign is coming; I'm going to have to make a decision.

SONG: Right.

TEMPLE-RASTON: So Song and her team had the AI system read two stop signs. One was a perfectly normal stop sign. The other was manipulated. Song had put one sticker below the S and another above the O in stop. And as the car gets closer to it, the subtitles are describing the AI's decision-making process. It reads the regular stop sign just fine and is telling the car to prepare to stop. But the one with the stickers, it thinks the sign reads, speed limit 45 miles an hour, which would allow the car, if this wasn't an experiment, to blow right through the intersection. Two carefully placed stickers were all it took to make a self-driving car run a stop sign.

So you were expecting it to misread the sign. And then it did, and you were happy about that.

SONG: It is surprising still, given how well it works.

TEMPLE-RASTON: It works so well that people who are developing driverless cars tapped the brakes. Now, to be fair, Song's team didn't just randomly throw some stickers onto a sign. They knew exactly how the AI's image classification system worked. They knew which pixels of the sign to manipulate to fool it, which got the attention of people over at DARPA, the Defense Advanced Research Projects Agency. And to understand why the military's top research arm was so concerned, I went to DARPA headquarters to meet with Hava Siegelmann.

I'm Dina Temple-Raston.

HAVA SIEGELMANN: Hey, Dina. I'm Hava.

TEMPLE-RASTON: Thank you for making the time.

She's the director of something called the GARD project. GARD stands for Guaranteeing AI Robustness against Deception. And just like it sounds, it's looking for ways to make artificial intelligence more hack-proof.

The way AI makes decisions is a bit of a black box. But Siegelmann says if you understand what the system has chosen to focus on, you can fool it. And if you're DARPA, you're less worried about a stop sign than, say, putting a sticker on a tank.
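The trick Song's team and Siegelmann are both describing — work out what the model focuses on, then nudge exactly those inputs — can be sketched with a toy model. Everything below is invented for illustration: a made-up two-class linear classifier over four "pixels," not the researchers' actual systems. It shows the standard gradient-sign idea behind many adversarial examples.

```python
import numpy as np

# Toy "image": 4 pixels. Toy linear classifier with two classes
# ("stop" = 0, "speed limit" = 1). The weights are invented for
# illustration -- this is not the Berkeley team's actual attack.
W = np.array([[ 1.0, -1.0,  0.5, -0.5],   # scores for "stop"
              [-1.0,  1.0, -0.5,  0.5]])  # scores for "speed limit"

def predict(x):
    """Return the index of the highest-scoring class."""
    return int(np.argmax(W @ x))

x = np.array([0.9, 0.1, 0.8, 0.2])  # reads clearly as "stop"

# Gradient-sign-style perturbation: push each pixel a small step
# in the direction that raises the wrong class's score. For a
# linear model, the gradient of (wrong score - right score)
# with respect to x is simply W[1] - W[0].
eps = 0.6
x_adv = x + eps * np.sign(W[1] - W[0])

print(predict(x))      # 0 -- "stop"
print(predict(x_adv))  # 1 -- the small nudge flips it to "speed limit"
```

The point of the sketch is that the perturbation is small and targeted, not random — which is why Song's team needed to know which pixels mattered before placing the stickers.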

SIEGELMANN: And because that sticker that has particular color, we think that this tank is actually an ambulance. And immediately, we open the gates to let the ambulance go in.

TEMPLE-RASTON: The reason to study all of this isn't to scare us about AI, although it does that, too. Researchers want to understand the limits of AI so they can fix it, kind of like old-fashioned hackers who used to call up software companies and let them know about flaws in their coding so they could send out patches.

Dawn Song says the bottom line is machine learning and AI aren't as powerful as people think they are.

SONG: We do really need new and more breakthroughs before we can really get there.

TEMPLE-RASTON: So would you ride in a driverless car?

SONG: Not today (laughter). I mean, I'll enjoy having a test drive, but (laughter)...

TEMPLE-RASTON: And by the way, Dawn Song's special stop sign with the stickers isn't fooling driverless cars anymore. It's now hanging in the Science Museum in London. It's part of an exhibit about our driverless future.

Dina Temple-Raston, NPR News, Washington.

Copyright © 2019 NPR. All rights reserved. Visit our website terms of use and permissions pages for further information.

NPR transcripts are created on a rush deadline by an NPR contractor. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.