“Regression to the mean” is fine. But what’s the “mean”?

by Andrew Gelman on October 3, 2010


In the context of a discussion of Democratic party strategies, Matthew Yglesias writes:

Given where things stood in January 2009, large House losses were essentially inevitable. The Democratic majority elected in 2008 was totally unsustainable and was doomed by basic regression to the mean.

I’d like to push back on this, if for no other reason than that I didn’t foresee all this back in January 2009.

Regression to the mean is a fine idea, but what’s the “mean” that you’re regressing to? Here’s a graph I made a couple years ago, showing the time series of Democratic vote share in congressional and presidential elections:

[Graph: Democratic vote share in congressional and presidential elections over time]
Take a look at the House vote in 2006 and 2008. Is this a blip, just begging to be slammed down in 2010 by a regression to the mean? Or does it represent a return to form, back to the 55% level of support that the Democrats had for most of the previous fifty years? It’s not so obvious what to think—at least, not simply from looking at the graph.

What I’m saying is this. As an ear-to-the-ground political pundit, Yglesias might well have a sense of political trends beyond what I have up here in my ivory tower. (I really mean this; I’m not being sarcastic. I don’t know much about the actual political process or the politicians who participate in it.) And I can well believe that, in January 2009, Yglesias was already pretty sure that the Democrats were heading for electoral trouble. But, if so, I think it’s more than “regression to the mean”; he’d have had to have some additional information giving him a sense of what that mean actually is.
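The point can be made concrete with a toy calculation. In a simple mean-reverting model (my own illustration, not anything from the post; the vote share and reversion rate below are made-up numbers), the "regression to the mean" forecast for 2010 depends entirely on which long-run mean you assume:

```python
# Mean-reverting forecast: v_next = mu + rho * (v_now - mu).
# The assumed long-run mean mu drives the whole prediction.

def forecast(v_now, mu, rho=0.5):
    """Expected next vote share under reversion toward mu at rate rho."""
    return mu + rho * (v_now - mu)

v_2008 = 55.5  # illustrative Democratic House vote share in 2008

# If the "mean" is 50% (Yglesias's post-1994 view), a drop is forecast:
print(forecast(v_2008, mu=50.0))  # 52.75

# If the "mean" is the pre-1994 level of about 55%, the same 2008
# result looks close to normal and almost no decline is forecast:
print(forecast(v_2008, mu=55.0))  # 55.25
```

Same data, same mechanism, opposite-feeling forecasts: the disagreement is entirely about the value of mu.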

P.S. Yglesias responds:

I [Yglesias] think historically Democrats averaged over 50% of the vote because of weird race dynamics in the South, but nowadays we should expect both parties to average 50% of the vote over the long term.

To which I wrote:

Could be. On the other hand, various pundits have been saying that in future years, the race dynamics of blacks and Latinos will give the Democrats a permanent advantage. And in many ways it seems gravity-defying for the Republicans to be at 50% with such conservative economic policies.

In any case, you may be right. I just have to admit it’s not something that I saw as of Jan 2009. As a matter of fact, I clearly remember looking at that graph I made in Nov 2008 and trying to decide whether it represented an exciting new trend, an anti-Bush blip, or a reversion to the pre-1994 pattern of 55%/45% voting. At the time, I decided I had no idea.

Yglesias then shot back with:

Well then, let me go on record now as hypothesizing that the long-run post-1994 trend will average 50/50.


xyzzyva October 5, 2010 at 4:06 pm

Over the extremely long term, in a two-party system such as ours, shouldn’t the mean be about 50%? The parties try (with varying success) to position themselves to get a majority. Obviously, it’s difficult to get a sufficient sample size with elections only every other year.

I don’t know that total votes cast for 435 separate House elections is an ideal proxy for overall party success. Why not just use the number of House seats won (other than the fact that the 20th century mean there is nowhere close to 50%)? I’d be curious to see a plot of total votes cast in GOP-won vs. Dem-won districts, as it’s possible there’s a disparity in vote wastage between parties.
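The vote-wastage point can be illustrated with a toy example (mine, not the commenter's; the district shares are invented): the same overall vote share can produce very different seat counts depending on how one party's votes are distributed across districts.

```python
# Seats won vs. votes won under two hypothetical districting patterns.

def seats_won(dem_shares):
    """Count districts where the Democratic share exceeds 50%."""
    return sum(s > 0.5 for s in dem_shares)

# Five districts, Democrats at 52% overall in both scenarios:
packed = [0.90, 0.45, 0.45, 0.40, 0.40]  # Dem votes packed into one district
spread = [0.52, 0.52, 0.52, 0.52, 0.52]  # Dem votes spread evenly

print(sum(packed) / 5, seats_won(packed))  # 0.52 overall, 1 seat
print(sum(spread) / 5, seats_won(spread))  # 0.52 overall, 5 seats
```

This is why total votes cast and seats won can tell quite different stories about party success.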

Paul g. October 7, 2010 at 2:26 pm
This is a nice piece. It illustrates a basic point I make about election-forecasting models when I lecture on them to non-academic audiences (parents, alums, etc.): they are steady-state models. If the world is changing, their ability to forecast is reduced. This is what happened in 1994; the “X” variable (exposed seats) was completely off.

Yglesias says that the long-term post-1994 trend is 50-50, and I tend to agree with him. But you are right: you have to be specific about what “mean” you are regressing to.

To xyzzyva: sort of, yes. That is what the strict median-voter theorem would predict, but we’ve long known (and Andrew has written about this) that there are institutions that induce non-median behavior among candidates and parties (primaries or the Senate, for instance). So it is not clear that the long-run equilibrium is 50-50.
