Making Sense Of Presidential Polls
In less than a month, the 2012 presidential election turned from an almost certain victory for President Obama to a neck-and-neck race. New York Times blogger and statistician Nate Silver and Princeton neuroscientist Sam Wang talk about making sense of the polls—and why not all votes are created equal.
FLORA LICHTMAN, HOST:
Up next, a trip to the polls. This is SCIENCE FRIDAY on NPR.
(SOUNDBITE OF ARCHIVED AUDIO)
UNIDENTIFIED MAN: At the end of that first debate, the uncommitted voters we surveyed overwhelmingly, 46 to 22 percent, said Governor Romney...
UNIDENTIFIED WOMAN: It's a poll of polls that we're showing, which is an average of all the polls. You've got Romney up by one.
UNIDENTIFIED MAN #1: President Obama enjoys a double-digit lead among women.
UNIDENTIFIED MAN #2: Four new polls showing Governor Mitt Romney is now in the lead.
STEVE INSKEEP, HOST:
President Obama leads Mitt Romney by seven point...
UNIDENTIFIED WOMAN #1: President Obama and Mitt Romney now virtually tied among likely voters who are women in a dozen battleground states.
LICHTMAN: What? I mean, really, what does this often conflicting data deluge tell us? You've got the Pew polls, the Gallup polls, the Quinnipiac polls, along with polls by CBS, NBC, ABC, CNN, NPR, Fox News and then there's the likely voters, swing-state voters and the undecideds. It's a mess.
So which numbers should you pay attention to? And are these polls even reliable indicators of how the candidates will fare on November 6th? Well, maybe if you slap a regression on them, say my next guests.
If you've been wonking out over the election this year, you'll likely know them. Nate Silver is a writer for the New York Times election blog FiveThirtyEight. He's also the author of "The Signal and the Noise: Why So Many Predictions Fail - But Some Don't." In the last presidential election, Nate predicted 49 out of 50 states correctly. He joins us in our New York studios. Welcome to SCIENCE FRIDAY, Mr. Silver.
NATE SILVER: Yeah, thank you.
LICHTMAN: Sam Wang is founder of the Princeton Election Consortium at Princeton University in New Jersey. He's also an associate professor of neuroscience and molecular biology there. He predicted the last presidential election almost perfectly, off by only a single electoral vote. He joins us by phone. Welcome back to SCIENCE FRIDAY, Dr. Wang.
SAM WANG: Thanks, a pleasure to be back, Flora.
LICHTMAN: All right, let's start with your best guesses for this election and the odds of winning, too.
SILVER: So we do think of things in terms of odds or probabilities. I used to play poker, before I started covering elections, and so you get very used - if you look at sports, I also used to write about baseball - you get very used to thinking in terms of who's ahead and who's behind but like a point spread.
And we think right now Obama's about a 70-30 favorite, mostly based on the fact that in the states that will be pivotal in the electoral college, like Ohio, Wisconsin, Iowa, he seems to be ahead in the average of polls - there's a lot of noise - in the average, though, by a couple of points. And this late in the election, having a two- or three-point lead in Ohio is more meaningful than you might think.
LICHTMAN: Sam, does that fit with your guess, too?
WANG: I think that's an excellent summary. I will say that I put the probability somewhat a little bit better. Expressed in odds, I would put it at about nine to one for Obama.
LICHTMAN: Nine to one?
WANG: Nate and I could, in fact, have a wager on this.
LICHTMAN: You should. By the way, if you want to get in on this conversation, our number is 1-800-989-8255. That's 1-800-989-TALK. And you can always tweet us @scifri.
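The two forecasts quoted above are stated in different forms: "a 70-30 favorite" and "nine to one." A minimal Python sketch of the conversion between odds and probabilities follows. The function names are illustrative and do not come from either forecaster's model.

```python
from fractions import Fraction


def odds_to_probability(for_, against):
    """Convert odds of `for_` to `against` into a win probability."""
    return Fraction(for_, for_ + against)


def probability_to_odds(p):
    """Convert a probability into (for, against) odds in lowest terms."""
    p = Fraction(p).limit_denominator(1000)
    return p.numerator, p.denominator - p.numerator


silver = odds_to_probability(70, 30)  # "about a 70-30 favorite"
wang = odds_to_probability(9, 1)      # "about nine to one"
print(float(silver), float(wang))     # 0.7 0.9
print(probability_to_odds(silver))    # (7, 3)
```

Read this way, the two guests differ by 20 percentage points: 70 percent versus 90 percent for Obama.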
Because I think we have a lot of mathematically minded listeners, can you guys give me just a thumbnail sketch of how these models work? What do you take into account other than the polls? Let's start with you, Nate.
SILVER: So every model is different and has more or less factors in it, but we're looking at state polls, national polls and economic data, essentially.
LICHTMAN: Economic data like...?
SILVER: Some major economic statistics like GDP, like jobs, like inflation. But the closer we get to election day - it starts out saying, the polls back in April or May don't tell you very much, so it's basically just an economic model, looking at the history between the relationship between how incumbent presidents do and how well the economy performs.
But by this point in the year, the economy is priced in to the polls, for the most part. Voters' evaluation of it should be reflected in their candidate preference. So now we're much closer to being a pure polling-based model.
LICHTMAN: But what about you, Sam?
WANG: Mine's a little bit different. It's a bit more minimalist, where I've been using polls alone all the way. My feeling about that is that a model like Nate's tells us a lot about the initial conditions where the playing field begins. But what I've been after since 2004 is coming up with an unbiased thermometer that just tells us where things are today and then a little bit of an estimate about where things are going to go in the next few weeks, so state polls only and using robust median-based statistics to get rid of outlier points.
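Wang's mention of "robust median-based statistics to get rid of outlier points" can be illustrated with a short Python sketch. This is not the Princeton Election Consortium's code. The poll margins are made up, and the normal approximation and the `floor` on the standard error are assumptions added for illustration.

```python
import math
import statistics


def median_margin(margins):
    """Median of poll margins (candidate A minus candidate B, in points)."""
    return statistics.median(margins)


def robust_standard_error(margins):
    """Standard error built from the median absolute deviation (MAD)."""
    med = statistics.median(margins)
    mad = statistics.median([abs(m - med) for m in margins])
    sigma = 1.4826 * mad  # scales MAD to a normal standard deviation
    return sigma / math.sqrt(len(margins))


def win_probability(margins, floor=0.5):
    """Chance the true margin is positive, using a normal approximation.

    `floor` (an assumed value) keeps the standard error from collapsing
    to zero when the polls happen to agree exactly.
    """
    se = max(robust_standard_error(margins), floor)
    z = median_margin(margins) / se
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical margins from six polls of one state; -8 is an outlier.
polls = [3, 2, 4, 1, 2, -8]
print(median_margin(polls))    # 2.0
print(statistics.mean(polls))  # the mean is dragged down by the outlier
print(win_probability(polls))
```

The single outlying poll pulls the mean well below the median, which is the behavior a median-based estimate is meant to avoid.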
LICHTMAN: So state polls only. I think this is an interesting question. What's wrong with the national polls?
WANG: Well, national polls are superb, but the problem is that they survey people in states that aren't going to be pivotal in determining the outcome. And the fact of the matter is that state polls have many times more respondents. And so in fact one can get a much lower-noise estimate of the actual outcome in electoral votes, not - you know, it's like a thermometer. It's not degrees Fahrenheit, not degrees Celsius, but electoral votes.
And when it comes down to it, what you want is electoral votes, and state polls do - are focused like a laser on that because we care much more about Ohio voters in this calculation than, say, you know, no offense, Vermont voters.
LICHTMAN: I mean, will it come down to Ohio based on your models, Nate?
SILVER: We have Ohio as being - so we think the decisive vote in the election will be cast in Ohio almost half the time. Ohio is disproportionately important in this election. One reason is, typically, it's a bit Republican-leaning relative to the nation. This year, Obama's polling has held up pretty well there, and if he wins Ohio, it's very hard for Mitt Romney to have any kind of winning scenario. He'd have to almost sweep all the other competitive states to win if he loses Ohio.
LICHTMAN: Can you estimate the chance that it will come down to like a single voter, a person in Ohio?
SILVER: So we estimate the chance of a recount, which means what we call the tipping-point state, the decisive state, is within half a percent, so you'd have an automatic recount. And that chance is fairly high, we think about 15 percent, that we won't know necessarily on November 6th who will win.
There's also a chance, a remote chance of an electoral college tie, even. But for the most part, I think Sam and I are agreed, that if you look at those state polls, Obama has a tiny bit of a lead right now.
WANG: I should say that I found it really dispiriting when I came into this game living in New Jersey because, you know, my vote doesn't count for very much in the presidential race. And I did a calculation similar to Nate's asking, how powerful are individual voters. And I have to say it would be great to be a voter in New Hampshire, Nevada, Ohio. Those people have a lot of leverage this year.
LICHTMAN: I wanted to ask you about this. This is amazing. You actually, like, did a comparison. If you live in New Jersey, or you live in Pennsylvania, how much your vote counts. Run through who's got the least amount of power.
WANG: Vermont's down there. California's down there because California is such a large state and, you know, is very reliably Democratic over the last few elections.
LICHTMAN: So does that mean if you're a Vermont voter, what do you suggest that person do to influence the election more?
WANG: Drive to New Hampshire.
LICHTMAN: And that's because it's closer there.
WANG: Oh absolutely, yeah, and there are also down-ticket races that are quite close, Senate and House races, where there's a lot of suspense, I think over things like who's going to win which seat and control of the House. Things like that are I think very much live on the playing field.
LICHTMAN: There was a headline in the New York Times yesterday: Campaign sees Latino voters as deciders in three key states. Does that sound right to you guys?
SILVER: Well, you know, if you look at Colorado and Nevada and New Mexico, there are a fair number of Latino voters, although on the whole, Latino voters are underrepresented in swing states because more of them live in California, or New York, or Texas or to some extent even New Jersey, other states on the East Coast, which aren't very competitive at all.
So if you're purely Machiavellian, you might not want to appeal to Hispanic voters as much just based on the fact that most of them are in deeply blue or, in Texas' cases, deeply red states.
LICHTMAN: So you wrote about this on your blog, but swing states - and this was interesting to me because I had never heard this - aren't necessarily the closest states. Is that right, Nate?
SILVER: Well, so for example in 2008, Obama, the closest states were Missouri, North Carolina and Indiana, all within about a point or so. But Obama won that race by seven points. Those electoral votes made his scorecard look prettier, but were superfluous. So it was still usually the Ohios, and the Floridas, the Colorados, Iowas and Virginias that are closer to where the overall national average is.
That's why we try and use this term tipping-point states, where you go from winning 270 electoral votes to losing, instead of just meaning swing states as in a close state.
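The distinction Silver draws between the closest state and the tipping-point state can be sketched in Python. The map below is a hypothetical toy example with invented states, vote counts and margins. It is not FiveThirtyEight's method or data, and `tipping_point_state` is an illustrative name.

```python
def tipping_point_state(states, votes_to_win):
    """Return the state that carries the leading candidate past the threshold.

    `states` is a list of (name, electoral_votes, margin) tuples, where
    margin is the candidate's lead in points (negative means behind).
    States are taken from the best margin to the worst.
    """
    total = 0
    for name, votes, _margin in sorted(states, key=lambda s: s[2], reverse=True):
        total += votes
        if total >= votes_to_win:
            return name
    return None


def closest_state(states):
    """Return the state with the smallest absolute margin."""
    return min(states, key=lambda s: abs(s[2]))[0]


# Hypothetical toy map: 50 electoral votes in total, 26 needed to win.
toy_map = [
    ("Safe A", 15, 20.0),
    ("Lean A", 8, 6.0),
    ("Pivot", 5, 2.5),
    ("Tossup", 7, 0.3),
    ("Safe B", 15, -18.0),
]
print(tipping_point_state(toy_map, votes_to_win=26))  # Pivot
print(closest_state(toy_map))                         # Tossup
```

In this toy map the closest state is "Tossup," but the candidate already clears 26 votes at "Pivot," so the votes from "Tossup" would be superfluous in the sense Silver describes for 2008.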
LICHTMAN: Close and that you have enough electoral votes to matter, I guess.
SILVER: But some states - so, we have for example Nevada as being actually more important than Florida in this election, right, even though its population is much smaller, because Romney's polling has been pretty decent in Florida lately, but Obama doesn't need Florida to win, whereas Romney might need Nevada to win, especially if Obama wins Ohio.
LICHTMAN: I grew up in Missouri, and we were always told that we were the bellwether state. Is there any truth to this, Sam?
WANG: Well, states are - all those bellwether arguments are little miniature versions of these complicated statistical arguments that we're making, right. Because it is quite often that a state that is won by the winning candidate is going to come up as these bellwethers. And Missouri's in that category, Ohio's in that category, and there's some truth to it. But there's a larger picture.
And there are always rules, you know, the taller candidate wins, things like that.
WANG: And so most of those rules are mostly true, but it's except when they're not, and that's I think where the power of this larger statistical approach is very helpful.
LICHTMAN: What about likely voters? That's always confused me. How is that calculated?
SILVER: So, in some ways, it's easier to know who people would vote for than whether they're going to vote. A lot of people say they're going to vote and don't. So pollsters have different techniques to try and filter out people who have a candidate preference, but they don't think will turn out. But that part of it is where it's more of an art than a science. One conflicting piece of information we have, by the way, is that the polls show us that Mitt Romney has a - does better on these likely voter polls than among registered voters.
And yet if you look at polls of people who have actually voted early, voted already, Obama seems to have an edge. So the polls are anticipating a big enthusiasm gap, I call it, helping Mitt Romney, but Obama also has a strong ground game, a strong turnout operation that could mitigate that, to some extent.
WANG: Money in the bank.
LICHTMAN: Money in the bank. So - but how are they even - how do you tell whether someone's a likely voter or not?
WANG: There are these screens. Like the Gallup, for instance - I'm sorry. I'm - I think Nate knows much more about this. But there are these screens where you ask: Did you vote in the last election? Do you know where your polling place is? Are you, you know, I don't know, are you aware when Election Day is?
LICHTMAN: Do they use demographics like age and race and things like that, too?
SILVER: So, usually, they don't make assumptions about whether you vote just based on that. They'll ask you questions about your interest in the campaign. So, for example, as Sam mentioned, a common question is: Do you know where your polling place is? That would indicate some level of investment in the election, where if you're going to go vote, you probably ought to know which church or school you're going to go to. At the same time, you see big differences in terms of different methods that are applied.
Some of them are tested empirically. Some are just kind of best guesses. So, as Sam says, if you have votes in the bank, then it seems like that ought to count for something. And one advantage Obama does seem to have in this election is with - if both sides get a good turnout, he should probably win. The polls that we mostly use are based on likely voters. Among registered voters, Obama is almost certainly ahead by several points right now in the key swing states, as well as in the national vote.
LICHTMAN: Let's go to the phones. Elizabeth in Salt Lake City, welcome to SCIENCE FRIDAY.
ELIZABETH: Thank you. Hi. How are you? And I'm a big fan of Nate Silver, actually, which makes me a double nerd, I think. So...
LICHTMAN: You're in good company.
SILVER: Thank you.
ELIZABETH: So my question is: I just wonder about sort of perverse incentives for news media outlets to present certain polls, and for two reasons: One, I read something recently - I can't remember where - where they were talking about, you know, keep us on the edge of our seat, kind of like sportscasters. So it's good to say, oh, it's going this way or that way. And then also, you know, like, say, a very conservative news outlet to say, well, Mitt Romney's going to win, so that everyone feels like they're going to vote for a winner and they're more likely to vote for him. Is there any of that going on?
WANG: Oh, very much so. The news media are always interested in the new story. When Obama's really up on a post-convention high, there's nothing more juicy than to, you know, dog-pile on when things look like they might be less than favorable. And when there's an outlier poll, well, gosh, if the poll's at the edge of the group, then that's much more interesting than a poll that just says, you know, things didn't change.
LICHTMAN: Thanks, Elizabeth, for calling.
ELIZABETH: Thank you so much.
LICHTMAN: You're listening to SCIENCE FRIDAY, on NPR. I'm Flora Lichtman. So, you know, I also wanted to ask you about whether there are youths working for the...
LICHTMAN: ...campaigns themselves.
SILVER: For the campaigns.
SILVER: So I think election is a weird thing, and I used to cover baseball during kind of the moneyball revolution, basically. And that was changed from outside, where people outside, like Bill James, said: There's a better way to build a winning baseball team, and the baseball teams eventually followed suit. In politics, the campaigns are often fairly sophisticated about being data-driven. It really is more kind of the news media that isn't so much, because they do have perverse incentives to pile on with the story, to pick - I mean, you know, for example, the Gallup poll right now still shows Mitt Romney six points ahead.
No other survey indicates a result at all like that, but that poll gets a disproportionate amount of attention because it's attention and headline kind of grabbing. So, you know, Sam and I both have methods that ensure we look at all the polls, basically. And sometimes, it's very disciplining relative to when you only hear about one or two in the kind of news media scrum.
LICHTMAN: We want to just be super clear that we're really not encouraging voter fraud here on SCIENCE FRIDAY.
LICHTMAN: Do not actually go to another state and vote. Go to your own state.
WANG: No. You can drive a little old lady to vote.
WANG: But you voted at home.
LICHTMAN: When you talk about your models, it seems like you say that, you know, some states are just going to be 60 percent, let's say, Romney and 40 percent Obama. And so four out of 10 times, you're going to be wrong. If you took in to account more parameters, do you think you could make that certainty go up?
SILVER: No. I think - so there are a lot of models that do try and be more complicated. And, you know, our model has a lot of things in it, right? It's not super-simple. But if you look at models, for example, built on different economic statistics that are layered together in different ways, they claim to have pinpoint accuracy, and they don't actually do very well when applied in the real world. So I certainly think you have to look at polling data, I think, with polling-plus - economy, excuse me. You can start to make some progress.
But there are so few elections historically - only, I think, it's the 17th since World War II. And it's a complex behavior, that polls are a nice shortcut, just all you have to do is make the assumption that people are being reasonably truthful about who they're going to vote for and take a snapshot from there. If you try and rebuild from first principles, then it's - we'd like to have scientific precision in doing it, but it hasn't always worked very well, and people have tried that.
WANG: You know, I think those models tend to - those complicated models tend to over-fit, so they're trying to fit every little jot in the noise. I think where those more complicated models are useful are probably where there's missing data problems, like if we don't know what's happening in some district in, say, Missouri, then something - I mean, for instance, things that Nate did four years ago did a very good job of filling in what one could call a missing data problem. But polls are - if you have polls, they're better than almost anything else.
LICHTMAN: What about learning algorithms? You know, we hear about these on SCIENCE FRIDAY from time to time that sort of make themselves better.
SILVER: Well, I think you have to consider whenever you're looking at a problem: Are you in a data-rich or a data-poor environment? In some ways, for example, if we do predictions of the Senate races, which we also do, it's an easier problem to solve, in some ways, because you have 30 or 35 Senate races every other year that are fairly independent of one another. You might have national trends, but you have two different candidates running in every state.
With the elections, it's - you have 50 different states, but it's still the same two candidates. The same factors will affect voters for the most part in the same ways in the same state. So you can't - you have limits based on how much data you have. And no matter how fancy your approach might be, if you only have 15 or 17 elections to look at, only about 10 of which you really have robust polling, then you can only get so far.
LICHTMAN: I have literally 20 seconds left, but could you do this calculation 15 years ago? Would it be possible with the amount of data we had and the computing power we had?
WANG: Not enough polls.
LICHTMAN: And not enough - amount of polls. That's a major problem for you, too, Nate?
SILVER: Especially with more and more state polls now. That's quite helpful. You used to have national polls in maybe a couple of swing states. And now, I agree with Sam, the state polls are where most of the kind of value information is at.
LICHTMAN: Thank you so much for joining me today, guys. This was really interesting.
WANG: Thank you.
SILVER: Thank you.
LICHTMAN: Nate Silver is a writer for The New York Times election blog, FiveThirtyEight. Sam Wang is the founder of the Princeton Election Consortium at Princeton University in New Jersey.
(SOUNDBITE OF MUSIC)
LICHTMAN: This is SCIENCE FRIDAY, from NPR.
NPR transcripts are created on a rush deadline by an NPR contractor. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.