When Can You Trust Polling about Ballot Measures?

by Dan Hopkins on October 31, 2011 · 3 comments

in Blogs, Campaigns and elections, Immigration, Public opinion

In just over a week, Ohio voters will decide on Issue 2, a referendum on whether to keep a recently enacted law which “limits collective bargaining for public employees in the state.”  Recent polls show decidedly more public opposition to this law than support for it, with a 57-32 pro-repeal split in a Quinnipiac poll and a similar 56-36 split in a PPP poll.  But as Washington Post blogger Greg Sargent reports, labor groups opposing the law are circulating a once-internal memo questioning just how predictive those polls will turn out to be.  The memo points out that the surveys do not include the actual language that will appear on the ballot, that turnout levels in off-year elections are uncertain, and that polling on prior Ohio ballot measures has been inaccurate.

The accuracy of telephone surveys on ballot measures is an empirical question, and one that is of interest well beyond Ohio’s Issue 2, so let’s look at the data.  Between 2003 and 2010, we can track down 438 publicly available survey questions about support for state-level ballot measures in upcoming elections.  The first point to make is that there is some external validity to that internal memo: there is often a pronounced gap between the surveys and the election-day outcomes.  The average absolute difference between the polls and the outcomes—the average polling error—is a striking 7.8 percentage points.  In 26.5% of the polls, the eventual election outcome differed from what the poll predicted by more than the swing needed to turn the Ohio PPP result into a dead heat on election day.
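To make the error measure concrete, here is a minimal Python sketch of how one might compute an average absolute polling error and the share of polls that miss by more than a chosen threshold. The poll and vote numbers are invented purely for illustration and are not drawn from the 438-question data set. The function names and the 5-point threshold are assumptions as well.

```python
def polling_errors(poll_support, vote_support):
    """Signed errors in percentage points: election result minus poll."""
    if len(poll_support) != len(vote_support):
        raise ValueError("inputs must have the same length")
    return [vote - poll for poll, vote in zip(poll_support, vote_support)]


def mean_absolute_error(errors):
    """Average size of the miss, ignoring its direction."""
    return sum(abs(e) for e in errors) / len(errors)


def share_exceeding(errors, threshold):
    """Fraction of polls whose absolute error is larger than `threshold`."""
    return sum(1 for e in errors if abs(e) > threshold) / len(errors)


# Hypothetical values (NOT real polls): percent supporting each measure.
polls = [52.0, 47.0, 60.0, 41.0]
votes = [58.0, 45.0, 49.0, 44.0]

errors = polling_errors(polls, votes)
mae = mean_absolute_error(errors)
share = share_exceeding(errors, threshold=5.0)  # threshold is arbitrary

print(f"average absolute error: {mae:.1f} points")
print(f"share of polls off by more than 5 points: {share:.0%}")
```

Note that the absolute error discards the sign, which is why a measure like this speaks to the size of the misses but not to their direction.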

Yet those results tell us about the magnitude of the errors, not about their likely direction.  So the question becomes, can we identify any systematic biases in the pre-election polls?  Only a handful of the ballot measures in this data set deal with labor unions specifically, so there is not much to say there.  But we can identify clusters of ballot measures on more common issues and then look for patterns in the issues that induce larger or smaller polling-performance gaps.

The Figure just below shows the relevant regression coefficients, where the gap between performance and polls is predicted using the ballot measure’s baseline support, issue-specific indicator variables, and the year of the election.  (A similar technique is at work in some prior research.)  Each estimated coefficient is a dot, with the thin line representing a 95% confidence interval.  Positive coefficients indicate issues that receive more support at the ballot box than the poll would predict, with the baseline being the hundreds of polls on all other issues.
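For readers who want to see the shape of such a model, here is a rough Python sketch of a regression of the vote-minus-poll gap on polled support, issue indicators, and election year. It is an illustration under stated assumptions, not the specification behind the Figure: the data are synthetic and generated from a known formula so that the fit can be checked, the issue labels and helper names are hypothetical, and the sketch omits the standard errors needed for confidence intervals.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical issue labels; "other" is the omitted baseline category.
ISSUES = ["same_sex_marriage_ban", "immigration_restriction"]
NAMES = ["intercept", "poll_support"] + ISSUES + ["years_since_2003"]


def design_row(poll, issue, year):
    indicators = [1.0 if issue == name else 0.0 for name in ISSUES]
    return [1.0, poll] + indicators + [float(year - 2003)]


def synthetic_gap(poll, issue, year):
    """Made-up data-generating formula, used only to check the fit."""
    bump = {"same_sex_marriage_ban": 4.0, "immigration_restriction": 2.0}
    return 6.0 - 0.1 * poll + bump.get(issue, 0.0) + 0.25 * (year - 2003)


# Synthetic observations: (polled support in percent, issue, election year).
observations = [
    (40.0, "same_sex_marriage_ban", 2004), (50.0, "other", 2004),
    (60.0, "same_sex_marriage_ban", 2006), (45.0, "other", 2008),
    (55.0, "immigration_restriction", 2006), (65.0, "immigration_restriction", 2010),
    (35.0, "other", 2010), (70.0, "other", 2006),
]

X = [design_row(*obs) for obs in observations]
gaps = [synthetic_gap(*obs) for obs in observations]  # vote share minus poll
coef = dict(zip(NAMES, ols(X, gaps)))

for name, value in coef.items():
    print(f"{name:>26}: {value:+.2f}")
```

Because the synthetic gaps follow the formula exactly, the fitted coefficients simply recover the made-up values. As in the Figure, a positive issue coefficient means a measure does better at the ballot box than its polling would suggest, relative to the baseline category.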

The results?  Surveys on ballot measures to ban same-sex marriage systematically and strongly understate their support on election day, by an average of 8.3 percentage points.  NYU’s Patrick Egan has documented this bias as well.  Voters tell pollsters they will oppose same-sex marriage bans at notably higher rates than they actually do.  Polls on ballot measures to restrict immigration show a similar bias, as they also underestimate the electorate’s support.  Social desirability gives us an off-the-shelf explanation for both results.  And it gets added weight from the marijuana-related ballot measures, whose support is underestimated as well.  People vote to expand access to marijuana at higher rates than they indicate to phone interviewers, a finding that Nate Silver has explored.

By contrast, most of the other issues don’t produce very large biases in either direction, including the issues (like education and tax reduction) most closely related to Ohio’s Issue 2.  These patterns are consistent with the claim that polling is especially difficult where social desirability comes into play, with respondents not wanting to seem homophobic, anti-immigrant, or pro-marijuana.  In fact, the only result that doesn’t make immediate sense from a social desirability standpoint is the null result for ballot measures on restricting gambling.  So while the polling of ballot measures is prone to significant variability, it appears to be those ballot measures that address socially sensitive groups or topics that give rise to the most predictable errors.


Andrew Gelman October 31, 2011 at 9:04 am


Any thoughts on turnout? How much of this is people not being completely honest to pollsters, how much is people changing their minds in the voting booth, and how much is differential turnout among supporters of the Yes and No positions? Above, you seem to be assuming it’s all the first of these three factors?

Patrick Egan October 31, 2011 at 12:07 pm

Dan – very interesting! Especially interesting that these effects do not appear to run in a consistently liberal or conservative direction.

And thank you for highlighting my work on same-sex marriage ballot measures. However, in that paper, I find no evidence that social desirability bias has played a role in the polling-results gap with regard to these measures. Live vs. IVR surveys, states that have large gay populations vs. those that don’t, and elections in later years (when the country is more pro-gay) vs. earlier years are all distinctions in which we’d expect the former context to be more prone to such bias than the latter. In all three of these cases, I find no discernible differences between contexts in the polling-results gap.

Dan Hopkins October 31, 2011 at 12:50 pm

Andrew, that’s a question I want to look at more. But for now I’d say that de-emphasizing turnout comes from the fact that the same survey procedures typically yield very accurate predictions for candidate vote shares, which should also be susceptible to biases in turnout. And Patrick, thanks for elaborating, yes: my emphasis on social desirability comes from the striking comparison across the ballot measures of different types.
