GUY RAZ, HOST:
So OK, big data, as Riccardo just made it pretty clear, can help us do incredible things. But sometimes, the conclusions we draw from data aren't perfect.
SUSAN ETLINGER: A friend of mine tells a wonderful story about drowning deaths and ice cream consumption.
RAZ: This is data analyst Susan Etlinger who we heard from earlier in the show. And she says, if you were to plot the data for drowning deaths and ice cream consumption over a given year...
ETLINGER: Apparently, they correlate beautifully.
RAZ: In other words, when deaths by drowning increase...
ETLINGER: So does ice cream consumption. And when deaths by drowning decrease, so does ice cream consumption. And you start to think, well, do people tend to drown, you know (laughter), because they didn't listen to their grandma and they went into the water after they ate the ice cream? Do people tend to eat more ice cream to get over their grief about people drowning? You know, what's the story?
RAZ: You can imagine the headline, right? - "Ice Cream Causes Drowning." And just by looking at the data, you could plausibly come to that conclusion. But it would be the wrong conclusion because, of course, correlation doesn't always mean causation. There is something the data isn't considering.
ETLINGER: The common factor is called summer (laughter), in which both things happen. And so, if you don't have the context that there's a season and it's called summer, in which it gets warm and people tend to like to swim and eat ice cream, then you miss the meaning. And that's a simple example, but imagine that compounded in everything that we know.
RAZ: The idea here is that as we enter into this age where we rely more and more on machines to make decisions, Susan says we need to be more careful - that there's a lot that can go wrong in the space between gathering information and then interpreting it.
ETLINGER: So in one view of the world, if the algorithm understands the concept of seasons - because we've trained it that there are seasons - and that January is in the winter and June is in the summer, then, yes, that would be fine. But, you know, we're in a world now where machines are learning with no previous context. And so we actually have to start from the very beginning and tell them all the things that we sort of know intuitively.
RAZ: So does that mean that we're giving too much credit to data? Like, are we entering a future where people can say, hey, the data is telling us this? But actually, it's still so early in the data revolution that we're just ceding ground to people who say, well, here's the data - end of story.
ETLINGER: See, I think we've always done that. We've always used data to kind of end the conversation (laughter).
RAZ: It's like a blunt instrument.
ETLINGER: It is. It's absolutely a blunt instrument. It can be used well. It can be used poorly. And we do need to be better critical thinkers so that people are responsible consumers of data and responsible producers of data because you can draw the wrong conclusions from a three-question survey (laughter) just as easily as you can from, you know, terabytes of data. And that's on us. You know, that's not about technology. That's not about the internet. That's on us.
RAZ: In her TED Talk, Susan told another story about how the data doesn't always give all the details.
ETLINGER: My son Isaac, when he was 2, was diagnosed with autism. The metrics on his developmental evaluations, which looked at things like the number of words - at that point, none - communicative gestures and minimal eye contact, put his developmental level at that of a 9-month-old baby. And the diagnosis was factually correct, but it didn't tell the whole story. And about a year and a half later, when he was almost 4, I found him in front of the computer one day running a Google image search on women...
ETLINGER: ...Spelled W-I-M-E-N (laughter). And I did what any, you know, obsessed parent would do. I just immediately started hitting the back button to see what else he'd been searching for (laughter). And they were, in order, men, school, bus and computer. And I was stunned because we didn't know that he could spell, much less read. And so I asked him - Isaac, how did you do this? And he looked at me very seriously and said, typed in the box.
He was teaching himself to communicate. But we were looking in the wrong place. And this is what happens when assessments and analytics overvalue one metric, in this case verbal communication, and undervalue others, such as creative problem-solving.
RAZ: It's amazing because it really does speak to this idea that data can tell you things. But human understanding - human nuance is so different.
ETLINGER: That's right. I mean - and I think, you know, in Isaac's case, you know, I'd say to my friends - you know, is he gaslighting us? Like, we feel like there's something going on that he's not able to express in, sort of, conventional ways. And the tests couldn't really detect that. But when he sat down with Google that very first time, it became clear that he was doing some problem-solving that was not what people would normally have expected. And therefore, it didn't show up on a test.
And so, once we had this sort of different perspective on him - that, in fact, he was solving problems, just not in a conventional way, we thought, well, OK. What other problems can we solve in an unconventional way? And that kind of led us to a way of supporting him that we might not have come to otherwise.
RAZ: So how do we approach data with a healthy amount of skepticism but still use it for good? More from data analyst Susan Etlinger in just a moment and more Ideas About Big Data. I'm Guy Raz, and you're listening to the TED Radio Hour from NPR.
RAZ: It's the TED Radio Hour from NPR. I'm Guy Raz. And on the show today, Ideas About Big Data. There is so much of it in the world, and our technology can collect more of it than ever before. But we've been hearing from data analyst Susan Etlinger who argued in her TED Talk that data doesn't create meaning; people do. And that's where things get tricky.
ETLINGER: So as businesspeople, as consumers, as patients, as citizens, we have a responsibility, I think, to spend more time focusing on our critical thinking skills. Why? Because at this point in our history, as we've heard, we can process exabytes of data at lightning speed. And we have the potential to make bad decisions far more quickly, efficiently and with far greater impact than we did in the past.
And so, what we need to do instead is spend a little bit more time on things like the humanities and sociology and the social sciences - rhetoric, philosophy, ethics - because they give us a context that is so important for big data and because they help us become better critical thinkers. Because, after all, if I can spot a problem in an argument, it doesn't much matter whether it's expressed in words or in numbers. And this means teaching ourselves to find those confirmation biases and false correlations and being able to spot and make an emotional appeal from 30 yards because something that happens after something doesn't mean it happened because of it.
As my high school algebra teacher used to say, show your math because (laughter) if I don't know what steps you took, I don't know what steps you didn't take. And if I don't know what questions you asked, I don't know what questions you didn't ask. And it means asking ourselves, really, the hardest question of all - did the data really show us this? Or does the result make us feel more successful and more comfortable?
RAZ: I mean, speaking with Ken earlier - right? - with Ken Cukier, what he's basically saying is that this is where we're heading. We are heading to a world of massive amounts of data that will be processed and used to make our lives better. And yes, there may be some downsides. But the upsides of it will just vastly outweigh any potential fallout.
ETLINGER: I hope so. But I also think that it isn't as simple as just saying, if we have more data, it's going to be better. And I'll give you an example. So in the U.K., there was an app developed by the Samaritans called Samaritans Radar. This was about a year or so ago. And the idea was that if you're on Twitter, if one of the people that you follow tweets something to the effect that they're depressed or they feel hopeless, that it would send you an alert. And the alert would say, you know, your friend so-and-so is having a rough day today. You might want to reach out to them, see if they're OK. And so that's now available somewhere on a server.
Well, what if their employer sees that? What if their insurer sees that, and they lose their insurance? What if a cyberbully sees that? What-if, what-if, what-if, what-if, right? And so I'm not saying that we don't do good things with data just because something bad might happen. But I do think we need to start thinking about the scenarios as we continue to automate the ways in which we make these decisions or we build these systems.
RAZ: Yeah, I mean you could imagine data being totally manipulated to intentionally harm people. Right? Like, changing someone's data to frame them for a crime.
ETLINGER: Yeah. And this is, I think, what makes this conversation so important. Any time we have new technology - you know, any time in history, whether it's the radio, which was, you know, at the time, some people thought it was the downfall of civilization (laughter) or it's television for the same thing - or the internet, back, probably, to the Gutenberg press. We stop and we worry that something is going to happen - something essential is being taken from humanity.
And I think what we need to do, though, is to think about the ways in which the technology can serve us, but also, you know, be mindful of, you know - and have a set of principles that govern the way that we will and we won't use data.
RAZ: Susan Etlinger - she's a data analyst. You can see her entire TED Talk at TED.com.
(SOUNDBITE OF TED TALK)
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.