AUDIE CORNISH, HOST:
Big data has long been considered an essential tool for tech companies and political campaigns. Now someone who's handled data analytics at the highest levels in both of those worlds is charting a path for it in new areas - policing, education and city services. Our co-host Ari Shapiro has this week's All Tech Considered interview.
(SOUNDBITE OF MUSIC)
ARI SHAPIRO, BYLINE: When Rayid Ghani finished his work as the chief data scientist for President Obama's reelection campaign in 2012, he could have pretty much taken his pick of jobs in the tech world. Instead he went to the University of Chicago's School of Public Policy. There he's using data analytics to predict human behavior in relatively new ways. I asked him to describe one of his projects - working with police departments to predict which of the officers is likely to commit misconduct.
RAYID GHANI: Over the past 12 months, we have worked with Charlotte-Mecklenburg Police Department. We're right now working with Nashville, Knoxville, LA Sheriff's Department. And the idea is to take the data from these police departments and help them predict which officers are at risk of these adverse incidents. So instead of the systems today where you wait until somebody - something bad happens and, again, the only intervention at that point you have is punitive, we're focusing on, can I detect these things early? And if I can detect them early, can I direct interventions to them - so training, counseling.
For example, one of the things we're finding - a few predictors we're finding are that stress is a big indicator there. For example, if you are an officer and you have been subject to - you know, you've been responding to a lot of domestic abuse cases or suicide cases, that is a big predictor of you being involved in an at-risk incident in the near future.
SHAPIRO: But I could imagine if I'm a police officer and I've had a really stressful few weeks somebody coming to me and saying, because you have the potential for an adverse interaction coming up, we're going to put you at a desk for the next week, I might bristle at that and say, I haven't done anything wrong. And in fact, I'm doing really well in my job right now. Put me in, Coach.
GHANI: Absolutely, and I think that's where you don't want these algorithms to be black-box algorithms. You don't want it to just raise a flag and say, this officer is at risk. You want it to be able to work with the police department and explain to them, this officer is at risk because of these six reasons. And this officer has had pretty much the same behavior as these other 50 officers. Thirty-five of them ended up having an adverse incident. So you can now have an informed conversation. You can - you still - you might say, oh, yeah, well, that's fine, but this officer is different because of these reasons. So the part that we're getting much better at on the tech side is prediction.
What we're not really good at right now is influencing behavior when it's - because what we want to do - be able to do is change behavior. We just - we don't want to just predict and then watch those things happen in the future. And so what we're working on right now is combining the two things as, how can you use the prediction techniques that we've been developing for a really long time, combine that with the social science behavior-change techniques that people have developed around, you know, voting behavior and shopping behavior, and put the two together so you can start focusing on, how do we change these types of behaviors? How do we best interact with people and make an impact - because without putting the two together, there is very little impact.
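An illustrative sketch, not part of the broadcast: the "same behavior as these other 50 officers, 35 of them had an incident" style of explanation Ghani describes can be approximated with a nearest-neighbor cohort summary. The function name, the two features (counts of domestic-abuse and suicide calls) and every record below are hypothetical, and nothing here is claimed to be the method his team actually uses.

```python
from math import dist


def similar_officer_summary(target, history, k):
    """Summarize outcomes for the k historical records closest to `target`.

    Returns (cohort_size, incident_count, incident_rate).
    """
    if k <= 0 or not history:
        raise ValueError("need at least one historical record and k >= 1")
    # Rank past records by Euclidean distance between feature vectors.
    ranked = sorted(history, key=lambda rec: dist(rec["features"], target))
    cohort = ranked[:k]
    incidents = sum(1 for rec in cohort if rec["adverse_incident"])
    return len(cohort), incidents, incidents / len(cohort)


# Hypothetical features: (domestic-abuse calls, suicide calls) in a recent window.
history = [
    {"features": (9, 3), "adverse_incident": True},
    {"features": (8, 2), "adverse_incident": True},
    {"features": (7, 3), "adverse_incident": False},
    {"features": (1, 0), "adverse_incident": False},
    {"features": (2, 0), "adverse_incident": False},
    {"features": (0, 1), "adverse_incident": False},
]

size, incidents, rate = similar_officer_summary((8, 3), history, k=3)
print(f"{incidents} of {size} similar officers had an adverse incident ({rate:.0%})")
```

Reporting the cohort and its outcomes, rather than only a risk flag, is what lets a supervisor argue that a particular officer differs from the comparison group.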
SHAPIRO: There also seems to be real potential for misuse there. I mean, I'm thinking of the science fiction movie "Minority Report" in which people are punished for things they have not yet done.
GHANI: Absolutely. I mean, the - again, you know, the difference - a lot of the difference between working in areas that are critical - you can actually hurt people. So if, you know - take Facebook, Google, Yahoo, anybody. If they show a wrong ad to somebody, you know, somebody will ignore that and move on. I mean, there are caveats there. But if we don't intervene in a certain place or if we make a wrong decision, we can affect people's lives. So it's very critical. And we think about this, and we work a lot on trying to make sure that these tools are not being used blindly. They are being used very carefully but also developing techniques that are able to detect these types of issues.
So for example, one of the challenges we face is, we often work with historical data, which means if the data was collected under some sort of a biased process - so if people are giving loans and they're biased in who they give loans to or if people have been collecting data on police misconduct and only - then it - and if it was really hard to complain about police misconduct - then you're not going to have the right level of data. You'll only have data from people who really, really, really wanted to come and complain.
If you use those to build your algorithms, what happens is that the computer finds more of only those types of things. So your future predictions will be extremely biased. So we spend a lot of time thinking about, how do we correct - how do we detect that bias? And then how do we correct for that bias so we don't make the wrong decisions?
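An illustrative sketch, not part of the broadcast: the reporting bias Ghani describes can be shown with two hypothetical districts that have the same true rate of misconduct but differ in how easy it is to file a complaint. The district names, counts and reporting probabilities are invented, and the correction shown (inverse-probability weighting, which assumes the reporting probability is known or can be estimated) is one standard option rather than a description of his team's approach.

```python
def naive_rate(recorded, population):
    """Complaint rate taken at face value from the historical records."""
    return recorded / population


def reweighted_rate(recorded, report_prob, population):
    """Inverse-probability weighting: each recorded complaint stands in for
    1 / report_prob actual incidents."""
    if not 0 < report_prob <= 1:
        raise ValueError("report_prob must be in (0, 1]")
    return (recorded / report_prob) / population


# Hypothetical data: both districts have 100 true incidents per 1,000 residents,
# but complaints are far harder to file in district B.
districts = {
    "A": {"recorded": 80, "report_prob": 0.8, "population": 1000},
    "B": {"recorded": 20, "report_prob": 0.2, "population": 1000},
}

for name, d in districts.items():
    naive = naive_rate(d["recorded"], d["population"])
    corrected = reweighted_rate(d["recorded"], d["report_prob"], d["population"])
    print(f"district {name}: naive={naive:.2f} corrected={corrected:.2f}")
```

A model trained on the naive figures would treat district A as four times riskier than district B, even though the underlying rates are identical in this made-up example.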
SHAPIRO: I know this is still in its relative infancy. Are there specific instances you can point to of a problem being solved or a crisis being averted thanks to this kind of analysis?
GHANI: A lot of these things are not going to be kind of the silver bullets to solve a problem. Often what we're really doing is making an existing process better. So a lot of the work that we've focused on is taking places where organizations are already taking certain actions. Those actions just aren't targeted, and they aren't effective - so after-school programs to keep kids, you know, from - helping them graduate from school on time, lead inspections to prevent, you know, lead poisoning, interventions that police departments already have.
We're working with Cincinnati right now to help them optimize their EMS dispatches so they don't over- or under-dispatch something and help people from that. You know, we're working with a nonprofit, Sanergy, in Kenya to help them optimize their - so their - they have these toilets that are - that they run in the informal urban settlements in Kenya. How do they optimize the toilet pickup so that they can scale without hiring a ton more people? We're working with the city of Syracuse on problems around, you know - can I predict breaks in the water mains - water pipes - so they can start doing preventative maintenance on those things instead of going afterwards and finding, you know, a water main's break and fixing it later.
So I think in general in a lot of these problems, there are governments and nonprofits that are solving a problem today. They're often either solving it too late in a reactive way, or they're solving it inefficiently. And what we believe is the role of data analytics is to help do a lot of early warning systems, to help do a lot of preventative things, to help allocate resources more effectively and to sort of help improve policy in a much more evidence-based way than we've been doing before.
SHAPIRO: Rayid Ghani is the director of the Center for Data Science and Public Policy at the University of Chicago. Thanks for joining us.
GHANI: Sure. Thank you.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.