STEVE INSKEEP, HOST:
Let's explore some of the business implications of big data.
RENEE MONTAGNE, HOST:
That popular buzz phrase means analyzing the masses of information that are now available about your every purchase, your every online search, your every preference.
INSKEEP: And quite a bit more, along with information about millions of other people. The company that finds a pattern in that data may also find a profit. And in today's business bottom line, we talk through the implications with Kenneth Cukier. He's one of the authors of a new book called "Big Data: A Revolution That Will Transform How We Live, Work and Think." You write about, I believe it is, the store chain Target and pregnant women.
KENNETH CUKIER: Well, the example comes from Charles Duhigg, who's a reporter at The New York Times. He's the one who uncovered the story. And what Target was doing was they were trying to find out which customers were likely to be pregnant. So what they were able to do was to look at all the different things that couples were buying, such as vitamins at one point, unscented lotion at another point, lots of hand towels at another point. And with that, they could make a prediction, score the likelihood that this person was pregnant, so that they could then send coupons to the people involved. So there might be a coupon for a stroller or for diapers.
INSKEEP: Wait a minute. You're saying that if someone's credit card shows that she bought unscented lotions on a particular date, it suggests that she might be a certain number of weeks pregnant?
CUKIER: You know, it's not going to be just one factor. It's probably going to be many factors and it's also probably going to be over time. So what we may find is that at the outset of a pregnancy, a woman who discovers she's pregnant is going to start taking certain vitamins. Later on in the pregnancy, she's going to want to buy certain things like new lotions. Her body's getting bigger and so she wants to make sure her skin is smoother and doesn't crack as the tummy expands. Those sorts of things, we can now identify, are predictors of pregnancy.
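The idea of combining many purchase signals into one likelihood score can be sketched roughly as follows. This is a minimal illustration only: the item categories come from the interview, but the weights, the baseline, and the function name `pregnancy_score` are hypothetical and are not Target's actual model, which the interview does not describe in detail.

```python
import math

# Hypothetical weights, chosen only for illustration (not Target's model).
SIGNAL_WEIGHTS = {
    "prenatal_vitamins": 1.5,
    "unscented_lotion": 1.0,
    "hand_towels": 0.5,
}
BIAS = -3.0  # assumed baseline: with no signals, the score stays low


def pregnancy_score(purchases):
    """Return a score between 0 and 1 from a list of purchased item categories."""
    seen = set(purchases)  # each signal counts once, however often it was bought
    total = BIAS + sum(w for item, w in SIGNAL_WEIGHTS.items() if item in seen)
    return 1.0 / (1.0 + math.exp(-total))  # logistic function maps the sum into (0, 1)


if __name__ == "__main__":
    print(pregnancy_score(["unscented_lotion"]))
    print(pregnancy_score(["prenatal_vitamins", "unscented_lotion", "hand_towels"]))
```

No single purchase moves the score much; it rises only as several signals accumulate, which matches the point that it is "not going to be just one factor."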
INSKEEP: And you even give an example in the book where Target seemed to know things that even the family did not know.
CUKIER: Well, that's exactly right. There was an example of a father coming in to a store and complaining that the teenage daughter was receiving fliers in the mail for coupons for baby products. And he said, what are you trying to do? Trying to get my teenage daughter pregnant? And of course the way it ends is that he comes back later and apologizes and says it turns out there were things in my house that I wasn't aware of. Now, the story may be - it may not be real. But the fact is, this is the sort of universe that we are now going to be in - we're all going to be in - because of big data. And all stores will be doing this, and all governments will be doing this. Your doctor will do this. Your employer will do this. This is the new norm.
INSKEEP: Now, I want to get at something that you underline in the book, which is that a lot of people who are mining data do not care why one fact means that another fact is likely to be true. They just care that it is. You give the example of a study that attempts to determine which used cars are more likely to run well.
CUKIER: Yes. So they crunched the numbers, and they found out that cars that were orange tended to not have breakdowns compared to other colors of cars.
INSKEEP: You're just talking about the paint, right? It's an orange car.
CUKIER: That's exactly right. So why might this be? Well, we can sort of concoct different scenarios. One is that orange tends to be a custom color, and if you order an orange car, perhaps the rest of the car was made in a custom way, and a little bit more care was put into it. We don't know why, and frankly, it's not that important. It might just bring us down a rabbit hole for us to try to find out why. But again, if you just want to buy a car that's not going to break down, go with the correlation.
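"Crunching the numbers" in the used-car example amounts to comparing breakdown rates across colors without asking why they differ. A minimal sketch, assuming each record is a hypothetical `(color, had_breakdown)` pair; the function name and the sample records are invented for illustration and are not data from the study:

```python
from collections import defaultdict


def breakdown_rate_by_color(cars):
    """cars: iterable of (color, had_breakdown) pairs. Returns {color: rate}."""
    totals = defaultdict(int)
    breakdowns = defaultdict(int)
    for color, had_breakdown in cars:
        totals[color] += 1
        if had_breakdown:
            breakdowns[color] += 1
    # Share of cars of each color that had a breakdown.
    return {color: breakdowns[color] / totals[color] for color in totals}


if __name__ == "__main__":
    # Toy records, made up purely to show the calculation.
    sample = [("orange", False), ("orange", False), ("blue", True), ("blue", False)]
    print(breakdown_rate_by_color(sample))
```

The calculation reports which color breaks down less often and nothing more; any explanation, such as custom orders, sits outside the data.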
INSKEEP: When I read this book, what I read is companies finding masses of information that was previously thought to be useless and finding new uses for it.
CUKIER: That's right. Take search queries, for example. Google stores all of its searches. What they were able to do was go through the database of previous searches to identify what was the likely predictor that there was going to be a flu outbreak in certain regions in America. Now, keep in mind, we pay for the Centers for Disease Control to look at the United States and find out where flu outbreaks are taking place for the seasonal flu. But the difference is that it takes the CDC about two weeks to report the data. Google does it in real time simply on search queries.
INSKEEP: We've reported on this in the program in the past. You're saying that certain search queries are a predictor. They correlate with the spread of the flu. And it might be something obvious like people doing searches for flu remedies or flu vaccines. But whatever it is, there are search terms that can help you predict that the flu is about to spread in a certain area.
CUKIER: That's exactly right.
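One way to check whether a search term tracks flu activity is to measure how closely its weekly counts move with reported cases. The sketch below uses the Pearson correlation coefficient as an assumed measure; the interview does not say which statistic Google used, and the function name and the weekly figures are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy weekly figures, made up purely to show the calculation.
    searches_for_term = [120, 150, 210, 400, 380]
    reported_cases = [30, 35, 60, 110, 100]
    print(pearson(searches_for_term, reported_cases))
```

A value near 1 means the term rises and falls with reported cases, so it could serve as an early signal; a value near 0 means it carries little information.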
INSKEEP: When does all this go too far?
CUKIER: It goes too far when we start making predictions for things that we have not yet done but we have the propensity to do. So, for example, commit a crime. If big data correlations identify me as a 44-year-old male who's a journalist and has grand eyes for things I can't afford, it may think that I'm going to be susceptible to embezzlement, and maybe I'll get a knock on the door by the police saying we have reason to believe that you're about to commit a crime. This is sort of like pre-crime in "Minority Report."
INSKEEP: Yeah, the movie - the Tom Cruise movie from some years ago, right.
CUKIER: You know, will that be the case, will we go down that route? Frankly, we already are. There's a whole branch of criminology called algorithmic criminology, and a dimension called predictive policing. Police forces in many cities in America are crunching the numbers and looking at where the likelihood of a crime is going to be, and when, based on the past patterns of crime. And now we can say not just that a crime is going to exist in an area, but that these people have a, say, 80 percent likelihood to be a felon. We have judicial systems that presume you've acted and therefore we're going to penalize. We've never had a system whereby we're making predictions about your likelihood, your propensity to do something, before you've actually acted and therefore we're going to take remedial steps against you.
INSKEEP: Did the knowledge that you gained over the course of writing this book change the way that you lead your life?
CUKIER: Yeah, absolutely. I think about what it means to educate my children. I have a five-year-old and a two-year-old, and I talk to them about - in strategic language and thinking - in terms of how to understand the world and interact in the world with imperfect information. This is going to sound very strange, but I ask my child questions for which I know she doesn't have enough answers - enough information to answer. And I absolutely reward her when she takes an educated guess, when she makes a decision based on imperfect information. And I find, even if in five-year-old language, a way of explaining to her that that's great. You're living in a world in which you're never going to have enough information, but you're going to have to come to answers and conclusions and make decisions based on it. So it's really important that you take in as much information as you can and come up, using your judgment and wisdom - I don't use those words - but come up with a decision based on that.
INSKEEP: And that's what big data is, as you define it. It's not knowing everything. It's even though you have an enormous amount of information, you're figuring out just enough to act, even if you don't fully understand the conclusions that you're drawing.
CUKIER: That's exactly right.
INSKEEP: So does your five-year-old look at you like you're kind of weird when you do this?
CUKIER: No. She says that she's writing a book. It's called "Big Data for Ponies."
INSKEEP: Well, I never could have predicted that the interview would end this way but it has. Kenneth Cukier is the co-author of "Big Data: A Revolution that Will Transform How We Live, Work and Think." Thanks very much.
CUKIER: Thank you.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.