Big Data has been considered an essential tool for tech companies and political campaigns. Now, someone who has handled data analytics at the highest levels in both of those worlds sees promise for it in policing, education and city services.
For example, data can show that a police officer who has been under stress after responding to cases of domestic abuse or suicide may be at higher risk of a negative interaction with the public, data scientist Rayid Ghani says.
Ghani, the chief data scientist for President Obama's re-election campaign in 2012, is now director of the Center for Data Science and Public Policy at the University of Chicago. He spoke to NPR's Ari Shapiro about finding ways to use data analytics in fields where it's not so common, like policing and city services.
Data analytics is evolving to help not just predict outcomes, but change them.
On predicting police misconduct
The idea is to take the data from these police departments and help them predict which officers are at risk of these adverse incidents. Instead of the systems today — where you wait until something bad happens, and again, the only intervention at that point you have is punitive — we're focusing on, "Can I detect these things early? And if I can detect them early, can I direct intervention to them — training, counseling." ...
Stress is a big indicator there. For example, if you are an officer and you have been ... responding to a lot of domestic abuse cases or suicide cases, that is a big predictor of you being involved in an adverse incident in the near future.
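To make the early-warning idea concrete, here is a minimal sketch in Python. It is not the center's actual model: the record layout, the 30-day window, the threshold of three calls, and all sample data are assumptions made up for illustration. It simply counts each officer's recent responses to the two call types Ghani mentions and lists the officers who might be offered training or counseling.

```python
from collections import Counter
from datetime import date, timedelta

# Call types named in the interview as stress indicators.
HIGH_STRESS_CALLS = {"domestic_abuse", "suicide"}


def stress_exposure(dispatches, as_of, window_days=30):
    """Count high-stress calls per officer in the trailing window.

    `dispatches` is a list of (officer_id, day, call_type) tuples;
    this layout is hypothetical, not a real department schema.
    """
    start = as_of - timedelta(days=window_days)
    counts = Counter()
    for officer_id, day, call_type in dispatches:
        if start < day <= as_of and call_type in HIGH_STRESS_CALLS:
            counts[officer_id] += 1
    return counts


def flag_for_support(dispatches, as_of, threshold=3, window_days=30):
    """Return officers at or above the (assumed) exposure threshold."""
    counts = stress_exposure(dispatches, as_of, window_days)
    return sorted(o for o, n in counts.items() if n >= threshold)


# Made-up sample records for illustration only.
dispatches = [
    ("officer_1", date(2016, 7, 1), "domestic_abuse"),
    ("officer_1", date(2016, 7, 8), "suicide"),
    ("officer_1", date(2016, 7, 15), "domestic_abuse"),
    ("officer_2", date(2016, 7, 10), "traffic_stop"),
    ("officer_2", date(2016, 7, 12), "domestic_abuse"),
    ("officer_3", date(2016, 5, 1), "suicide"),  # outside the window
]
as_of = date(2016, 7, 20)

flagged = flag_for_support(dispatches, as_of)
print(flagged)
```

A real system would likely combine many more signals and a statistical model; the point of the sketch is only that the output feeds a supportive intervention rather than a punitive one.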
On the limits of data analytics
We often work with historical data, which means if the data was collected under some sort of a biased process — so if people are giving loans and they're biased in who they give loans to. Or if people have been collecting data on police misconduct ... and if it was really hard to complain about police misconduct, then you're not going to have the right level of data — you'll only have data from people who really, really, really wanted to come and complain.
If you use those to build your algorithms, what happens is that the computer finds more of only those types of things. So your future predictions will be extremely biased. So we spend a lot of time thinking about, how do we detect that bias, and then how do we correct for that bias so we don't make the wrong decisions?
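The bias Ghani describes can be shown with a small arithmetic sketch in Python. Everything here is hypothetical: the two districts, their numbers, and especially the assumption that the share of incidents that get reported is known, which in practice it rarely is. Both districts have the same underlying incident count, but complaining is harder in one of them, so the recorded data makes it look safer.

```python
from dataclasses import dataclass


@dataclass
class District:
    name: str
    encounters: int       # police-public encounters in the period
    true_incidents: int   # incidents that actually occurred (not observable in practice)
    report_rate: float    # assumed share of incidents that become a filed complaint


def recorded_complaints(d):
    """Complaints that actually make it into the historical data."""
    return round(d.true_incidents * d.report_rate)


def naive_rate(d):
    """Rate a model would learn if it trusted the recorded data as-is."""
    return recorded_complaints(d) / d.encounters


def reweighted_rate(d):
    """Inverse-probability correction: scale each recorded complaint
    by 1 / report_rate to estimate the incidents behind it."""
    return recorded_complaints(d) / d.report_rate / d.encounters


# Hypothetical districts with identical true incident counts.
districts = [
    District("A", encounters=10_000, true_incidents=200, report_rate=0.5),
    District("B", encounters=10_000, true_incidents=200, report_rate=0.125),
]

for d in districts:
    print(d.name, naive_rate(d), reweighted_rate(d))
```

Under these assumptions the naive rates differ by a factor of four even though the districts are identical, while the reweighted rates agree. The correction is only as good as the estimate of the reporting rate, which is why detecting the bias comes before correcting for it.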
On using data to improve policy
A lot of the work that we've focused on is taking place where organizations are already taking certain actions — those actions just aren't targeted and they aren't effective.
So, after-school programs to keep kids ... helping them graduate from school on time; lead inspections to prevent lead poisoning; interventions the police departments already have. We're working with Cincinnati right now to help them optimize their EMS dispatches so they don't over- or underdispatch something, and help people from that. ...
We're working with the city of Syracuse on problems around, can I predict breaks in the water mains, water pipes? So they can start doing preventative maintenance on those things instead of going afterwards and finding a water main break and fixing it later. ...
What we believe is the role of data analytics is to help do a lot of early warning systems, to help do a lot of preventative things, to help allocate resources more effectively, and to sort of help improve policy in a much more evidence-based way than we've been doing before.