GUY RAZ, HOST:
It's the TED Radio Hour from NPR. I'm Guy Raz. So in the mid-2000s, Cathy O'Neil...
CATHY O'NEIL: My name's Cathy O'Neil. I'm a mathematician, data scientist and author of "Weapons Of Math Destruction."
RAZ: ...Was working on Wall Street as a - Cathy, what was it called again?
O'NEIL: A hedge fund called quant - (laughter) let me say that again - a hedge fudge - oh, my gosh - a hedge fund quant is - well, quant is short for quantitative analysts.
RAZ: Oh, OK. I've got you, right.
O'NEIL: So it's somebody who builds algorithms to try to predict the market. And in my case, I was trying to predict the futures market. But I entered finance in 2007, right as the crisis was unfolding.
(SOUNDBITE OF ARCHIVED RECORDING MONTAGE)
UNIDENTIFIED REPORTER #1: Apple's under pressure. Yahoo, down 8.5 percent; Cisco, 6.5 percent - researchers...
UNIDENTIFIED REPORTER #2: Oil is down more than $4. Traders here working the phones say a lot of their customers are freaked out, waiting to see how low the Dow will go.
RAZ: So ostensibly, your job was to make decisions that could help your clients get richer. I guess?
O'NEIL: I mean, I don't think I actually contributed anything that made them money, which is kind of, like, a feather in my cap at this point. But at the time, I was like, man...
RAZ: A few years later, Cathy left finance to become a data scientist, crunching numbers that were used by companies to help them target ads to consumers.
O'NEIL: Basically, the same stuff. But instead of predicting markets, my new job was to predict people. So I was - you know, there I was as a data scientist. I was kind of like, oh, at least I'm not messing up the world anymore. But, you know, what I realized is that I was separating people with my new algorithms. I was separating people by class and often by race.
O'NEIL: And I was giving some of them opportunities. And others of them, I was denying opportunities. And I was doing, you know, relatively benign things. But what I realized was, like, that's what data science does. We separate people into winners and losers.
O'NEIL: And sometimes, those - what they win is really important to them. Sometimes it's a mortgage or a credit card or a job or prison time.
O'NEIL: And the more I learned, the more I said, wow, this is a real problem. These algorithms are placeholders for these very, very difficult discussions that we don't really want to have as a society. So we're sort of hiding them in these black boxes.
RAZ: How ubiquitous is the use of algorithms now in, like, everyday life - in the world, in the U.S., wherever? How - I mean, are algorithms used all over the place now?
O'NEIL: So let me just take an average person. The average person spends, you know, some amount of time on Facebook...
O'NEIL: ...Or Twitter or Google. And the answer is absolutely, algorithms are completely controlling their experience and their atmosphere and their environment. But you know, besides that, most of the time the algorithms that I worry about the most - they happen at certain specific junctures of people's lives, where critical decisions are being made. Where do I go to college? Where do I get a job? Like, how do I get a mortgage? And so you should think of these as, like, bureaucratic decisions that other people make about you. And at those moments, it's almost always algorithmic at this point.
(SOUNDBITE OF MUSIC)
RAZ: On the show today, Can We Trust The Numbers? We're going to explore the ups and downs of relying too much on data and hear ideas about how our faith in data, statistics and algorithms can sometimes lead us astray.
And as we heard from Cathy O'Neil, those algorithms she mentioned weren't just predicting outcomes. In some cases, they were actually causing them.
O'NEIL: Because one of the things is they all kind of act the same - they're not exactly the same. But when I talk about, like, algorithms that sort through resumes or algorithms that - personality tests or algorithms that decide who is a good insurance risk, they're very, very similar in different companies. So they're sorting people in the same kind of way.
And if you think about what that does on a society level, it's sorting winners and losers in the standard, old-fashioned way that we've been trying to get over, that we've been trying to transcend - through class, through gender, through race. And it's against the American dream. You know, it is actually a social mobility problem. And that's what I realized. I was like, I'm working on this. I left finance, and now what I'm doing is I'm sort of codifying inequality.
RAZ: So this is the thing. There is some data - a lot of data - a lot of historical data that is right, that is accurate and that we have to use. Right?
O'NEIL: I mean, look - it's really important to understand the difference between accuracy and fairness. So it used to be that life insurance companies made black men pay more for life insurance than white men simply because they were going to die sooner. That lasted for a long time before the regulators in question were like - wait a second - that's racist. And it's racist because we have to ask the question why. Why are black men living less than white men? And is that their fault that they should take responsibility for and they should pay for, or is that a problem that society itself should take on and fix?
So it wasn't an inaccurate fact that black men lived less time. But the question was, how should we deal with that? And that's a question of fairness, and it's a question that we all have to grapple with together. And many of these questions are of that nature. So yes, it's true that people who live in this ZIP code are more likely to default on their debt. Does that mean we don't loan them any money, or do we make a rule that people of this age who have a job, who finish college or whatever - what do we decide is fair? And that's a really hard question. Data science has done nothing to address that question.
RAZ: So why is it that when most of us hear the term data science, we think, yeah, that must be right?
O'NEIL: Well, I think most of us are intimidated by what I call the authority of the inscrutable.
O'NEIL: If we don't understand something, we don't feel like we're expert enough to complain about it. I've seen it happen. I mean, I'm a mathematician. I have a Ph.D. in math, you know. And when I was in school for my Ph.D., I would sit next to somebody on the airplane. And they'd say, oh, wow, you must be really smart. I hated math in junior high. You know, they'd always say that.
O'NEIL: And they would, like, defer to me for all sorts of ridiculous things simply because I'm good at math. So I've seen it happen in real time. But I've also seen people use that authority to make people stop asking questions.
RAZ: Which you can do. You can, like, bully people with data...
O'NEIL: It's a form of bullying.
RAZ: ...And statistics. You can bully them and say, look, this is what it says.
O'NEIL: I call it math-washing (ph).
O'NEIL: Yeah. It's like, you know, don't look behind this curtain.
RAZ: It's like math-splaining (ph).
O'NEIL: (Laughter) Yeah, exactly.
RAZ: I don't want to be math-splained.
O'NEIL: No, you don't. You want to see the evidence. Show me evidence that this data science thing works.
O'NEIL: Show me - for whom does this fail? Does this fail more for women than for men? Does this fail more for African-Americans than for white? Does this fail more for people with mental health status than for others? Like, show me the data. And until you've shown me the evidence that this works, why should I trust it? Why should I think it's fair?
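The question "for whom does this fail?" can be made concrete by breaking a model's error rate out by group instead of reporting one overall number. The following is a minimal sketch, not anything described in the interview: the function name `failure_rate_by_group`, the record layout and the eight sample rows are all hypothetical and invented for illustration.

```python
from collections import defaultdict

def failure_rate_by_group(records):
    """Return each group's error rate.

    records: iterable of (group, predicted, actual) tuples (assumed layout).
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        errors[group] += int(predicted != actual)  # count a miss as one failure
    return {group: errors[group] / totals[group] for group in totals}

# Invented toy records, not real data: (group, model prediction, true outcome).
RECORDS = [
    ("women", 0, 1), ("women", 0, 1), ("women", 1, 1), ("women", 0, 0),
    ("men", 1, 1), ("men", 0, 0), ("men", 1, 1), ("men", 1, 0),
]

rates = failure_rate_by_group(RECORDS)
for group, rate in rates.items():
    print(f"{group}: fails on {rate:.0%} of cases")
```

In this toy data the overall error rate is 3 in 8, which hides the fact that the model fails twice as often for one group as for the other. That gap is the kind of evidence being asked for here.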
RAZ: I mean, there's - essentially, the people who have access to the data and that can decide how that data is presented have a tremendous amount of power.
O'NEIL: It's about power.
RAZ: Yeah, it really is.
O'NEIL: It's more than you think. It's about power because they get to decide what experiment. It's about power because they get to math-splain. It's also about power because they get to decide what success looks like.
RAZ: Here's Cathy O'Neil on the TED stage.
(SOUNDBITE OF TED TALK)
O'NEIL: This is Roger Ailes. He founded Fox News in 1996. More than 20 women complained about sexual harassment. They said they weren't allowed to succeed at Fox News. He was ousted last year, but we've seen recently that the problems have persisted. That begs the question, what should Fox News do to turn over another leaf? Well, what if they replaced their hiring process with a machine learning algorithm? That sounds good. Right?
Think about it. The data - what would the data be? A reasonable choice would be the last 21 years of applications to Fox News - reasonable. What about the definition of success? Reasonable choice would be - well, who's successful at Fox News? I guess someone who, say, stayed there for four years and was promoted at least once - sounds reasonable. And then the algorithm would be trained. It would be trained to look for people to learn what led to success. What kind of applications historically led to success by that definition?
Now think about what would happen if we applied that to a current pool of applicants. It would filter out women because they do not look like people who were successful in the past. Algorithms don't make things fair if you just blithely, blindly apply algorithms. They don't make things fair. They repeat our past practices, our patterns. They automate the status quo.
That would be great if we had a perfect world, but we don't. And I'll add that most companies don't have embarrassing lawsuits. But the data scientists in those companies are told to follow the data, to focus on accuracy. Think about what that means. Because we all have bias, it means they could be codifying sexism or any other kind of bigotry.
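The thought experiment above can be sketched in a few lines. This is a deliberately naive illustration under stated assumptions, not a description of any real hiring system: the records are invented, the "model" is just a per-group success frequency, and the names `train` and `screen` are hypothetical. "Successful" follows the talk's definition of staying four years and being promoted at least once.

```python
from collections import defaultdict

# Invented historical records: (gender, years_experience, was_successful).
# In this toy history, women rarely met the success definition, mirroring
# the workplace described in the talk.
HISTORY = [
    ("M", 5, True), ("M", 3, True), ("M", 4, False), ("M", 6, True),
    ("F", 5, False), ("F", 6, False), ("F", 4, False), ("F", 7, True),
]

def train(history):
    """Learn the historical success rate for each gender."""
    counts = defaultdict(lambda: [0, 0])  # gender -> [successes, total]
    for gender, _years, success in history:
        counts[gender][0] += int(success)
        counts[gender][1] += 1
    return {gender: s / n for gender, (s, n) in counts.items()}

def screen(applicants, model, threshold=0.5):
    """Keep applicants whose group's past success rate meets the threshold."""
    return [a for a in applicants if model.get(a[0], 0.0) >= threshold]

model = train(HISTORY)
applicants = [("F", 8), ("M", 2), ("F", 6), ("M", 5)]  # (gender, years)
shortlist = screen(applicants, model)
print(model)
print(shortlist)
```

The most experienced applicant in this toy pool is a woman, yet she is filtered out because the model learned only who succeeded in the past. A real system would rarely use gender directly, but features correlated with it could produce the same effect.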
(SOUNDBITE OF MUSIC)
O'NEIL: I think about it like this. Like, you know, I'm sure - and I'm not a historian - but like, I'm sure that when cars first came out, people were just, like, so floored by them. They were like, oh, my God, cars are the best. And they didn't notice that cars - you know, sometimes wheels fell off. And sometimes people got killed by spiky things in the car that didn't need to be so spiky - when they got into an accident or fell off the road or something.
And it was only after quite a while that we started insisting on airbags and bumpers and safety. It wasn't some magical experience. It was an actual fight by consumer advocates. So I feel like we're pre-consumer advocates in the world of algorithms. We are still driving around without a speedometer, without bumpers and without airbags - assuming that these algorithms work perfectly. The difference between car safety and algorithmic safety is that it's much harder to see when people's lives get ruined via algorithm than it was to see that people died in a car crash. Car crashes are like public tragedies. Everyone can see a car crash. But when somebody gets denied a job, especially in this situation where they often don't even find out why, it's really, really difficult to say - hey, that's a failure of the algorithm; let's fix it.
RAZ: But - I mean, given that algorithms are fed via statistics, how do you feel about statistics? Are you skeptical of those?
O'NEIL: Not at all.
RAZ: No, OK.
O'NEIL: I'm not at all skeptical of statistics. In fact - I mean, let me get bigger. I feel like science is our only hope. And I feel like what we've done is we've created a field we call data science, but there's no science in it. We have not demanded evidence. The sort of hallmark of science is that it needs to be tested and testable. And we need to see the evidence, and we need to test every assumption. And we just haven't done any of that. We've just been driving blind on our Model T algorithms.
RAZ: Cathy O'Neil - she's a data scientist and author of the book "Weapons Of Math Destruction: How Big Data Increases Inequality And Threatens Democracy." You can see her full talk at ted.com. On the show today, Can We Trust The Numbers? I'm Guy Raz, and you're listening to the TED Radio Hour from NPR.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.