Horse-race political science

by Lee Sigelman on January 16, 2009 · 7 comments

in Campaigns and elections, Media, Political science


David Walker’s assessment of the performance of forecasting models of the 2008 election provides an occasion for me to raise a question about these models that troubles me. To borrow the old World War II phrase: “Is this trip really necessary?” By “this trip,” I’m referring to Walker’s article only indirectly. My question is really directed toward the practice of political science-based election forecasting.

Others may tell the story differently, but I’ll begin with John Mueller’s oft-cited study, War, Presidents and Public Opinion. Mueller’s work lives on today in the cottage industry of analyses of presidential popularity that his study prompted. But Mueller also, albeit inadvertently, provided a basis for election forecasting by political scientists when, in the course of discussing the peculiarities of Gallup’s presidential popularity question, he offhandedly dismissed the president’s standing in the polls as “a very imperfect indicator of electoral success or failure for a president seeking reelection.” But could that really be true? He presented no data to support his supposition, and it seemed implausible that the president’s standing in the polls would have little bearing on his performance on Election Day.

Intrigued, other researchers, myself included, rushed in to put Mueller’s assertion to the test, and what they uncovered, contra Mueller, was a very high correlation between presidents’ job ratings in the final pre-election Gallup Poll and their share of the popular vote in the general election. These reanalyses weren’t theoretically progressive or methodologically innovative. They simply took issue with a brief digression in Mueller’s wide-ranging consideration of presidential popularity. Point made, case closed, yes?

Well, no.

From these humble beginnings as well as from some other sources developed a spirited competition among political scientists, and between political scientists and economists, to determine whose model could provide the best forecasts of presidential and congressional election outcomes. A wave of new forecasting models soon appeared. By 2000 a dozen or more models were competing against one another and against the wholly different, market-based forecasting approach of the Iowa Political Stock Market. Since then the refinement of existing models and the development of new models have continued apace, with new analyses appearing with clocklike regularity at two- and four-year intervals timed to the electoral cycle.

But toward what end? In an important early election forecasting study, Steven Rosenstone (1983) argued that scholars should not regard the forecasting of election outcomes as a high-priority undertaking in and of itself. After all, rather than going to the trouble of trying to forecast an election outcome a few weeks before Election Day, researchers could simply wait and see how the election turned out – and what would be lost? Instead, Rosenstone argued, election forecasting should be treated as a convenient vehicle for addressing the more important question: “What determines election outcomes?” That is, election forecasting should be undertaken as a means of testing the empirical implications of theoretical models. Campaign consultants and others with a vested interest in seeing that one side or the other wins the election naturally place a high value on forecasting election outcomes. But why should this be considered a high priority for political scientists?

As time has passed, it has become increasingly evident that, notwithstanding Rosenstone’s demurrer, political scientists have approached election forecasting primarily as an end in itself rather than as a means of testing models derived from theory. What we have is a profusion of studies designed to see whether forecasting accuracy can be improved by some increasingly technical tweaks in model specification – typically ad hoc tweaks – and by adding one more election to the dozen or so on which most of these models are based. These models are generally built from the bottom up (that is, their logic is “Well, these predictors worked last time so let’s see if I can introduce some small changes, such as a slightly different specification or a new data point, that would make them do even better”) rather than from the top down (in which the logic would be “Aha! Here’s a good opportunity to test my theory of the determinants of electoral success”). Moreover, to complicate the scorekeeping, it is strikingly unclear what would constitute a good test of a model’s performance. The conventional standard is how close a certain model comes to “getting it right,” i.e., how close a particular out-of-sample forecast came to the actual outcome. But for a variety of reasons, “getting it right” in a single election is hardly a meaningful test.

I’m not characterizing these exercises as worthless from a political science perspective. After all, it took such a model, albeit an extremely simple one, to overrule Mueller’s assertion that the president’s standing in the polls has little bearing on the outcomes of presidential elections. But I truly wonder what all the fuss is about, above and beyond the potential thrill of victory in showing that one’s own pet model has outperformed its rivals by some small margin in a particular election. Is there some important payoff there? Are these models being used in the way that Rosenstone – rightly, I think – advocated? Too often, I think, these exercises are triumphs of technique over theory, exemplifying the tendency of political scientists with high-level statistical skills to concentrate on the specifics of statistical estimation without paying due heed to the broader purposes that such modeling is supposed to serve. Political scientists often speak scornfully of what they call “horse-race journalism.” Perhaps we should be more careful about where we throw our rocks, lest our own windows get shattered.

{ 7 comments }

Jim Campbell January 16, 2009 at 12:12 pm

Lee, I must disagree with you on the impact of forecasting on theory. Election forecasting has led to a number of debates about real theoretical issues–from the impact of presidential incumbency on the vote to whether retrospective voting is conditional to whether there are party cycles associated with the number of consecutive party terms to what explains particular election outcomes. I think that the inclusiveness of the forecasting community, however, has allowed in some models that are just plain crazy–but that can also be said for political science as a whole, as you must recall from your days at the APSR.

Lee Sigelman January 16, 2009 at 12:45 pm

Jim:
I am shocked, shocked by your response. :-) Here are a couple of points coming back to you. (I confess that I posted the above entry in part to see whether anybody was paying attention, figuring that if _it_ didn’t ruffle any feathers, nothing would.)

(1) I was really talking about the (lack of) impact of theory on forecasting, not vice-versa. Are most of the forecasting models theoretically rich in any important sense? Do they pose clear tests of theoretically derived propositions?

(2) I don’t deny that we can extract some insights from sets of empirical results; that’s no less true of forecasting models than of any other set of empirical results. However, I think it’s pretty clear that the real motivation driving the great majority of these exercises has been the gamelike aspect of tuning a model to try to hit a bullseye in a particular election; that’s certainly the way the models get evaluated after the fact, not in terms of how abundantly they’ve contributed to theories of elections.

Eric L. January 16, 2009 at 1:55 pm

With respect to Jim’s closing comment, a quote from Niels Bohr is appropriate:

“We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct.”

Andrew January 16, 2009 at 2:03 pm

What Jim said.

To provide a specific example: Had the people in John McCain’s campaign been more aware of the work of Rosenstone and others about the minimal benefits (if any) from the Vice Presidential choice, maybe instead of picking “game-changer” Sarah Palin, they might’ve picked the person whom they thought would be the best possible President in the event that McCain died or was incapacitated.

More generally, the very existence of good forecasts based on pre-campaign information has important implications about the effects of election campaigns.

Matt Jarvis January 16, 2009 at 2:56 pm

This reminds me of Lewis-Beck & Rice’s defense of forecasting in their book.

I think that Jim hits the nail on the head on how forecasting has contributed to theoretical debates. However, I recall the handwringing that took place after 2000, when most of the models got it “wrong” but, I think, were generally pretty accurate.

I also think that the heart of Lee’s comment has weight: some models include variables that I cannot logically connect to the outcome.

Jim Campbell January 16, 2009 at 5:04 pm

Lee,
I agree with you that there is a good deal of nonsense in interpreting forecast accuracy–the horserace aspect to this. I have stayed out of the contests that seem to encourage this silliness and have tried to identify some benchmarks for reasonably assessing forecasts. But the key challenge you pose is whether forecasting research has made contributions to theory–whether by testing existing theory or in suggesting new avenues to examine. I think on these grounds there has been significant value added in the areas I cited but also in other areas of public opinion research and campaign effects.

I think what the forecasters need to do next time is tighten the review process to keep some of the whacky models out. I agree with Matt. Every forecaster should be able to explain the mechanism of his (there are no hers in election forecasting–which is a potential subject for another blog) model–why it is reasonable to think that the model should work. Forecasters should not get by with including indefensible variables just because of the fit or past “success.” We need a better balance between inclusiveness and open admissions.

Doug Hess January 17, 2009 at 2:23 pm

Forecasting complex events makes me think of two things:
1) Won’t people confronted with results that don’t fit their model just make ad hoc adjustments until the next time the results don’t fit their model, etc.? I.e., are models successively getting “closer” to something (and given that these are social and not physical things you’re studying, is that something always changing to the point where you need to make major adjustments fairly often anyway: can’t step in the same stream twice…)?

2) On a completely different front: Looking for what predicts or forecasts seems to say little about what campaigners, activists, politicians, etc. have to do as their job is to plan for multiple eventualities. I.e., the social scientist will say “why do that, it’s rarely played a role in election outcomes” whereas the political operator will say “we have to plan for scenarios a,b,c,d,etc.” Thus, post hoc analysis and analysis for management end up being two (somewhat) divorced fields. Ok, that had nothing to do with your post, but the past several posts have got me thinking about these things. Which I hope is part of the point.

-Doug
