Forecasting Follow-up to Jay Cost

Apr 25 '12

Jay Cost responds to the Washington Post’s forecasting model that I helped put together:

bq. I find Klein’s model to be particularly unpersuasive, but all these models seem to share a similar problem: they take the blowout elections of 1952, 1956, 1964, 1972, 1980, and 1984 not as historical peculiarities with little relevance to today, but as central tendencies. To put this in plain English, the three variables Klein elaborates (or, for that matter the variables in any model I’ve ever seen) cannot account for the wide gulf between Eisenhower v. Stevenson and Clinton v. Dole. Those were campaigns waged in different ages, yet the models never acknowledge that and end up basically forcing square pegs into round holes.

Here are some thoughts.

Eisenhower-Stevenson.  This is a small point.  I’m not sure about other models, but the Post model’s prediction for 1952 is within 1 point of the outcome and its prediction for 1956 is almost exactly equal to the outcome.  Those are in-sample predictions.  If we drop the 1952 and 1956 elections from the model one at a time, reestimate the model, and then calculate an out-of-sample prediction for each election, we get similar results: absolute errors of 1.1 and 0.2 points, respectively.  As I’ve said, any individual prediction has a lot of uncertainty, which is why we don’t focus on point predictions and errors in presenting the model’s results.  But ultimately I’m not sure why Cost thinks those elections are so hard to predict.
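For readers unfamiliar with the drop-one-and-reestimate procedure, here is a minimal sketch of it. It is not the Post model: it assumes a single predictor (a hypothetical `growth` variable) and uses synthetic data generated on the spot, and the function names `fit_line` and `leave_one_out_errors` are mine, chosen for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def leave_one_out_errors(xs, ys):
    """Signed out-of-sample errors (predicted minus actual), one per observation."""
    errors = []
    for i in range(len(xs)):
        # Drop election i and reestimate the model on the remaining elections.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        # Predict the held-out election from the reestimated model.
        errors.append(intercept + slope * xs[i] - ys[i])
    return errors

# Synthetic data for illustration only; NOT the Post model's data.
rng = random.Random(0)
growth = [rng.uniform(-2.0, 6.0) for _ in range(16)]
vote = [48.0 + 1.2 * g + rng.gauss(0.0, 2.0) for g in growth]

for i, err in enumerate(leave_one_out_errors(growth, vote)):
    print(f"election {i}: out-of-sample error {err:+.1f} points")
```

The point of the exercise is that each election is predicted by a model that never saw it, which is a harder test than an in-sample fit.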

The Fundamentals.  I completely agree with the thrust of Cost’s post.  He says that because of party identification, 90% of the electorate is “locked in.”  I’ve been blogging about that for a while — e.g., here or here or here.  And I agree with Cost when he says:

bq. That’s why I’m keeping an eye on the fundamentals – rather than the horse race polls – until relatively late in the cycle.

One of the fundamentals, according to Cost elsewhere, is presidential approval.  I would guess, hopefully correctly, that he thinks the economy is also a fundamental, since he writes:

bq. It is not like the experts are predicting the economy is going to take off between now and Election Day. Instead, we are going to get more of the same muddled growth at roughly 2-2.5 percent, far less than what is needed to reduce the deficit or create jobs.

Basically, Cost seems to think that the election will depend a lot on presidential approval and economic growth, which are the central components of the model.  In other words, his model of how voters think — a big effect of party identification plus some referendum voting by weaker partisans or independents — is my model too.

This gets at why we build those models in the first place.  We do so not purely as exercises in prediction, but because we think we’re testing theories of voter choice.  You can predict or at least explain elections with lots of things.  Here, Cost even gives you an example.   But generally the forecasting models center on variables that have a theoretical basis, not just predictive value.  In fact, it’s probably more interesting to me to build models that will occasionally fail at prediction, if only because those failures will help us build better theories.

Ultimately, Cost and I agree on the fundamentals but when those fundamentals are put into a model, Cost and others are skeptical.

Problems with models. If I’m interpreting Cost’s skepticism correctly, he thinks that “times have changed” and forecasting models based on historical elections don’t account for this.  In his account, party loyalty is on the rise, limiting the winning candidate’s margin of victory even when economic conditions or other fundamentals should really favor that winning candidate.  So we’d expect, then, that the model would overestimate the winning candidate’s margin of victory in recent elections.

Which recent elections?  I’m not quite sure where Cost thinks the models would start to generate overestimates.  He mentions 1984 as the last blowout but also seems to think that the model could not account for Clinton’s victory over Dole in 1996.  So let’s just take 1988 as the starting point.  In 1988, the model actually underestimates Bush’s vote share by about 2 points in-sample, and 3 points out of sample.  In 1992, it overestimates Bush’s vote share and even predicts a narrow victory for Bush.  This is one year in which the model calls the winner incorrectly.  In 1996, the model overestimates Clinton’s vote share by 1.3 points (1.5 points out of sample).  In 2000, it overestimates Gore’s vote share by 3.4 points (4.5 points out of sample).  In 2004, it overestimates Bush’s share of the vote by 1 point, either way.  In 2008, it overestimates Obama’s vote share by 3.8 points (5.5 points out of sample).  And as I wrote in my earlier post, the model’s estimates for 2012 (e.g., Obama is a near-certain winner with 50% approval and 2% GDP growth) feel a bit too optimistic for the incumbent too.
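To put those figures side by side, here is a quick tally of the overestimates quoted above for 1996 through 2008. I leave out 1988 (an underestimate, given only as "about" 2 points) and 1992 (no magnitude stated). The dictionary layout is just my own bookkeeping; the numbers are the ones in the paragraph above.

```python
# Overestimates (in points) of the named candidate's vote share, as quoted above.
# Each tuple is (in-sample, out-of-sample).
overestimates = {
    1996: (1.3, 1.5),  # Clinton
    2000: (3.4, 4.5),  # Gore
    2004: (1.0, 1.0),  # Bush ("1 point, either way")
    2008: (3.8, 5.5),  # Obama
}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

mean_in = mean(v[0] for v in overestimates.values())
mean_out = mean(v[1] for v in overestimates.values())

print(f"mean in-sample overestimate:     {mean_in:.1f} points")
print(f"mean out-of-sample overestimate: {mean_out:.1f} points")
```

The averages come to roughly 2.4 points in-sample and 3.1 points out of sample, all in the same direction, though four elections is a very small basis for any conclusion.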

What do I see here?  The model has overestimated the incumbent party’s vote share in 1992-2008.  That, I think, confirms Cost’s notion.  I can even suggest a reason why: partisan biases have gotten larger.  With regard to presidential approval, see Gary Jacobson’s book.  With regard to economic perceptions, see this forthcoming paper (pdf) by Peter Enns, Paul Kellstedt, and Greg McAvoy.  Because of these biases, the fundamentals — good or bad — may sway fewer votes.

At the same time, it’s possible to generate alternative explanations.  1992 was a weird economy (GDP growth but high unemployment and thus negative news coverage), plus there was Ross Perot.  2000 was Gore’s failing as a candidate.  2008 was Obama’s racial identity.  With so few elections, it’s difficult to know whether the outcomes reflect Cost’s theory (stronger partisanship makes the fundamentals less consequential) or are idiosyncratic.  The implication of this is important: the small number of elections that are typically modeled (16) not only complicates estimates of the fundamentals but also complicates theories that critique fundamentals-based models.  Only with time will we know whether Cost’s hypothesis is true.  To be sure, I think it has a lot of plausibility.  I just want it acknowledged that small samples aren’t a problem only for fundamentals-based models.

Does all of the above mean that the Post model is wrong in recent elections?  Not really.  If you care about estimating vote share (which I don’t), there are clearly cases between 1992 and 2008 where the model overestimates the incumbent party’s vote share by 3-5 points, and also a couple of cases, in particular 1996 and 2004, where the difference between the model’s prediction and what actually happened is small.  If you care about picking the winner, and this is what I’m more attentive to, in only one case (1992) did the model call the popular vote winner incorrectly.  (Of course, I would never rely on only one model anyway.  Averages of models are more likely to be accurate.)
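The claim about averaging can be illustrated with a small simulation. This is a sketch under a strong assumption: each hypothetical model's error is independent, unbiased noise with the same spread. The function name `simulate` and all parameter values are made up for the illustration and describe no actual forecasting model.

```python
import random

def simulate(n_models=5, n_elections=1000, noise_sd=3.0, seed=0):
    """Compare the typical error of one model with the error of the models' average.

    Returns (mean absolute error of a single model,
             mean absolute error of the average of all models).
    """
    rng = random.Random(seed)
    single_total = 0.0
    averaged_total = 0.0
    for _ in range(n_elections):
        # Assumption: every model misses the true vote share by independent noise.
        errors = [rng.gauss(0.0, noise_sd) for _ in range(n_models)]
        single_total += sum(abs(e) for e in errors) / n_models
        averaged_total += abs(sum(errors) / n_models)
    return single_total / n_elections, averaged_total / n_elections

single, averaged = simulate()
print(f"typical error of one model:      {single:.2f} points")
print(f"error of the average of models:  {averaged:.2f} points")
```

Averaging helps here because independent errors partly cancel. If all the models share the same bias, for example if they all overstate the effect of the fundamentals in the way Cost suggests, averaging will not remove it.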

Takeaways.  First, this model is doing a good enough job at calling winners.  In combination with other models, you’d probably get an even more definitive sense.

Second, it’s not necessarily true that the models will get worse at calling winners.  If Cost is right and fewer votes are in play, then with successive elections the effects of economic conditions and presidential approval will grow somewhat smaller.  But these effects will likely remain greater than zero.  After all, there will still be some slice of voters who think of the election as a referendum on the incumbent party.  And if those voters decide the outcome, then these sorts of models will still tend to predict the winner correctly.

Third, Cost’s critique illustrates how you can use these models in just the way Klein suggested:

bq. So sure, perhaps this year will be different — that’s what my gut tells me, and the model has me thinking about the ways in which that could be true.

That is, you can use the model to start thinking about ways in which its predictions might be false.  But as Klein also notes:

bq. But the reality is, everyone always thinks “this year” will be different, and they’re usually wrong.

To be determined.