Is Nate Silver Incentive Compatible?

by Henry Farrell on November 7, 2012 · 18 comments

in Data, Public opinion

I’ll leave it to John to write the bigger post on how much better the election results support models based on aggregates of polls than the bloviations of the pundits who were flinging poo at them a couple of days ago. I have a smaller question – is the “Nate Silver” (or Simon Jackman, or Drew Linzer, or Sam Wang) equilibrium sustainable over the longer term? More precisely – might these models cannibalize the individual polls that they need to draw on for data?

Here’s the potential problem. These models need to crunch lots of polls, at the state and national level, if they’re going to provide good predictions. It doesn’t matter if these various polls work on different assumptions – indeed, this may sometimes be an advantage (if polls have different assumptions about likely voters etc, and these assumptions each capture different bits of the truth, then the aggregate prediction will be the better for it). These polls are carried out for a number of reasons. Some are commissioned by news organizations, who hope to use them to sell newspapers or attract eyeballs. Others are carried out more or less as loss leaders by polling organizations trawling for business.
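The claim that an aggregate of differently biased polls can beat any single poll can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not any aggregator's actual method: each hypothetical pollster gets its own random "house" bias, the true vote share, sample size and bias spread are invented numbers, and the aggregate is a plain unweighted mean.

```python
import random
import statistics


def simulate_poll(true_share, house_bias, n, rng):
    """One poll of n respondents; house_bias shifts the share this pollster 'sees'."""
    p = min(max(true_share + house_bias, 0.0), 1.0)
    return sum(rng.random() < p for _ in range(n)) / n


def aggregate_error(true_share=0.52, n_polls=20, n=800, bias_sd=0.02,
                    trials=100, seed=0):
    """Return (mean abs error of one poll, mean abs error of the poll average).

    All defaults are illustrative assumptions, not real polling data.
    """
    rng = random.Random(seed)
    single_errors, average_errors = [], []
    for _ in range(trials):
        # Each pollster draws an independent house bias for this election.
        polls = [simulate_poll(true_share, rng.gauss(0, bias_sd), n, rng)
                 for _ in range(n_polls)]
        single_errors.append(abs(polls[0] - true_share))
        average_errors.append(abs(statistics.mean(polls) - true_share))
    return statistics.mean(single_errors), statistics.mean(average_errors)


single, averaged = aggregate_error()
print(f"single poll error: {single:.4f}, averaged error: {averaged:.4f}")
```

The averaging only helps this much because the house biases here are independent and centred on zero; if every pollster shared the same bias, the average would inherit it.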

But if politically interested consumers start paying more attention to the aggregate tracking models instead of triumphing or despairing in response to the vagaries of individual polls, then there is less incentive to produce these individual polls in the first place. If people want to read about the results of Nate Silver’s model rather than the one shot picture provided by e.g. the Washington Post’s latest poll, then the Washington Post obviously has less incentive to pay for an expensive poll which will garner less readership. Similarly, if people aren’t interested in individual polls, then they are going to be much less effective as loss leaders for polling firms.

And this presents a problem, because Silver, Jackman and everyone else need to feed their models with lots of individual polls. To put the problem more abstractly, individual polls are a necessary input into aggregate polling models. But the people putting these models together are not, in fact, the customers for these polls. And, from the perspective of the actual final consumers, the outputs of the aggregate polling models are a good (and arguably superior) substitute for the individual polls. The models might, over the longer term, drive the individual polls out of the market, cannibalizing the conditions of their own existence unless someone figures out a new business model.

I don’t want to stretch this too far – we are a long way away from observing this kind of effect in real life. Still, it seems plausible that if aggregate models become too popular, they may cannibalize the conditions of their own operation, unless they can figure out a different business model through which they can generate the necessary data.

{ 18 comments }

David Pennock November 7, 2012 at 1:44 am

Same question about newspapers and Google News.

Andrew C November 7, 2012 at 1:52 am

Why couldn’t Silver get a ton of money to fund parts of polls according to the criteria that would be most helpful for his model? Or just have NYT dramatically expand its polling work and coordinate more with his modeling assumptions?

mike3550 November 7, 2012 at 2:02 am

This expresses something I thought for the last several weeks, but much more eloquently. I think that incentives exist for polling operations to continue. But I also see the possibility that other sources can begin to be modeled, for example Twitter, Facebook, etc. I don’t buy that they can *replace* representative samples, but they can certainly supplement what information they provide.

I think a far greater concern is that individual polls end up not providing truly independent data because no one wants to be too far outside of the average. This means that the polls will, to a certain degree, actually be skewed toward the early average. This could add substantial measurement error to his model and actually make him do worse.
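A rough way to see the herding worry is to simulate pollsters who shade each new result toward the running average of what has already been published. Everything here is an assumption made for illustration: the `herd_weight` parameter, the noise level and the number of polls are invented, and real herding (if it happens) need not look like this.

```python
import random
import statistics


def run_cycle(true_share, n_polls, noise_sd, herd_weight, rng):
    """Publish n_polls results; each is pulled toward the mean of earlier ones."""
    published = []
    for _ in range(n_polls):
        raw = rng.gauss(true_share, noise_sd)  # what the pollster actually measured
        if published:
            raw = (1 - herd_weight) * raw + herd_weight * statistics.mean(published)
        published.append(raw)
    return published


def compare(herd_weight, trials=2000, seed=1):
    """Return (mean spread among published polls, mean abs error of their average)."""
    rng = random.Random(seed)
    spreads, errors = [], []
    for _ in range(trials):
        polls = run_cycle(0.52, 15, 0.03, herd_weight, rng)
        spreads.append(statistics.pstdev(polls))
        errors.append(abs(statistics.mean(polls) - 0.52))
    return statistics.mean(spreads), statistics.mean(errors)


independent_spread, independent_error = compare(herd_weight=0.0)
herded_spread, herded_error = compare(herd_weight=0.5)
print(independent_spread, independent_error)
print(herded_spread, herded_error)
```

In this toy setup the herded polls agree with each other more closely, yet their average is less accurate, because the earliest polls end up carrying far more than their share of the weight.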

Simon Jackman November 7, 2012 at 2:06 am

Great set of issues you raise, deserving of a longer response than I’ll give tonight.

1. There were literally so many pollsters, big and small, willing to chance their hand in the eyeball-laden 2012 campaign… I don’t have the exact number at hand, but Pollster logged more than 100 different firms publishing state and/or national polls. I don’t see this going away. They all knew “Nate was watching” (plus us less famous folk), but the lure of the publicity was presumably too great.

2. The media will continue to commission polls. From their perspective, a poll is content they pay for themselves, their own “exclusive”. I think this continues for a while longer, perhaps with media organizations looking for cheaper pollsters, to be sure. I don’t think a fear of readers shrugging their shoulders at “our exclusive poll!” and clicking over to Nate or me or Drew or Sam is sufficient disincentive for them, at least not yet.

3. Maybe poll aggregation can do what it does with maybe fewer polls than you think. Some of this is to do with the facts on the ground, the modeling, etc. A uniform swing assumption and some decent national polls might get you a long way towards sane state predictions?

4. Internet polling might make it all work (bigger, cheaper, faster)? That is, fewer pollsters, but more data? Look at YG’s massive pre-election poll. Yikes.
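The uniform swing idea in point 3 is simple enough to sketch: take each state's margin from the previous election and shift all of them by the change in the national margin. The state names and numbers below are hypothetical placeholders, not real results, and the function name is invented for this example.

```python
def uniform_swing(previous_state_margins, previous_national_margin,
                  current_national_margin):
    """Shift every state's previous margin by the national swing (in points)."""
    swing = current_national_margin - previous_national_margin
    return {state: margin + swing
            for state, margin in previous_state_margins.items()}


# Hypothetical margins in percentage points (positive = candidate ahead).
previous = {"State A": 4.0, "State B": -1.5, "State C": -9.0}
predicted = uniform_swing(previous, previous_national_margin=2.0,
                          current_national_margin=3.5)
print(predicted)  # {'State A': 5.5, 'State B': 0.0, 'State C': -7.5}
```

Only the national number needs fresh polling in this approach, which is why it needs so few state polls; the cost is that it ignores any state moving differently from the country as a whole.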

Chaz November 7, 2012 at 4:18 am

If Nate cannibalizes readers from the newspaper polls then the money goes to the NYT instead of other local papers. Or maybe the local papers will start paying to print Nate (or whoever’s) analysis. It’s still the same amount of revenue going in, it just gets centralized to NYT and the other aggregators. So if a shortage of polls starts to emerge then the NYT &co. will have the revenue and the motive to fund tons of polls themselves.

There is still a question of incentives here. Maybe individual aggregators won’t want to publish lots of polls for the other aggregators to freeride on. Maybe they will choose to keep the polls proprietary and not publicize them. Maybe someone can put together a syndicate where they jointly fund polls.

I would expect more centralization and less diversity in polling methods, but I think the gross number of polls will stay at an adequate level. I do expect a decrease in nationwide presidential polls, which is perfectly fine with me because we have way more than we need.

Chaz November 7, 2012 at 4:20 am

“Publicize” should be “publish”.

Andrew Gelman November 7, 2012 at 9:59 am

Henry:

I disagree with your claim that “These models need to crunch lots of polls, at the state and national level, if they’re going to provide good predictions.” Actually, you can get reasonable predictions from national-level forecasting models plus previous state-level election results, then when the election comes closer you can use national and state polls as needed. See my paper with Kari Lock, Bayesian combination of state polls and election forecasts.

Having a steady supply of polls of varying quality from various sources allows Nate to produce news every day (in the sense of pushing his estimates around) but it doesn’t help much with a forecast of the actual election outcome.
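The general idea of combining a forecast with polls can be sketched with a textbook normal-normal update, in which the forecast acts as a prior and a poll average as data, each weighted by its precision. This is not the model from the paper mentioned above, only a minimal illustration; the function name and all numbers are assumptions.

```python
def combine(prior_mean, prior_sd, poll_mean, poll_sd):
    """Precision-weighted combination of a forecast (prior) and a poll estimate."""
    prior_precision = 1.0 / prior_sd ** 2
    poll_precision = 1.0 / poll_sd ** 2
    posterior_var = 1.0 / (prior_precision + poll_precision)
    posterior_mean = posterior_var * (prior_precision * prior_mean
                                      + poll_precision * poll_mean)
    return posterior_mean, posterior_var ** 0.5


# Hypothetical vote shares in percent.
# Early in the campaign the polls are treated as noisy, so the forecast dominates.
early = combine(prior_mean=52.0, prior_sd=2.0, poll_mean=48.0, poll_sd=6.0)
# Close to the election the polls are treated as precise, so they dominate.
late = combine(prior_mean=52.0, prior_sd=2.0, poll_mean=48.0, poll_sd=1.0)
print(early, late)
```

With the same inputs, the estimate sits near the forecast when the polls are assumed noisy and moves toward the polls as their assumed error shrinks.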

Jacob Hartog November 8, 2012 at 7:05 am

This is more or less what Drew Linzer did, right? And he had around 332 electoral votes predicted for roughly forever.

First you get the June approval rating, then you get the Q2 GDP growth, then you get the Bayesian forecasting. As Scarface once said while he was waiting in a mile long Miami Dade voting line.

Henry Farrell November 7, 2012 at 10:48 am

Simon, Andrew, thanks for the useful pushback. I look forward to Simon’s more detailed response on this. Andrew, I’m obviously going to defer to you on this, but it suggests that there is still a problem for the 538 business model (as it currently exists), if not as much of a problem for actual prediction. Obviously though, I am not saying that this business model can’t be changed – instead that if the aggregate approach really takes off, it will have to be changed.

Andrew Gelman November 7, 2012 at 11:03 am

Henry:

Yes, I agree regarding the business model. But what do I know? Since 1992 (when Gary and I did our research indicating that poll movements are mostly noise), I’ve thought that that repeated-polling business model of news reporting was unsustainable, but it’s only been getting worse and worse. Maybe you’re right that recent developments will push it over the edge.

One reason that political scientists have not been doing poll aggregation is that, at least for the general election for president, there’s little point in doing so–or, to put it another way, just about any averaging would do fine, no technology needed. Recall that Nate made his reputation during the 2008 primary elections. Primaries are much harder to predict for many reasons (less lead time, candidates have similar positions, no party labels, unequal resources, more than two serious candidates running, etc), and being sophisticated about the polls makes much more difference there.

John Besley November 7, 2012 at 12:34 pm

Do we know whether the polls in question were primarily election focused or whether many were omnibus polls being done for other purposes onto which a few political questions were added? My assumption is that once you have someone on the phone/in your survey online, you try to get some extra info. out of them.

More generally, these polling firms make most of their money doing polls for private clients inside and outside of politics so it may still be worth doing what it takes to have their name appear within the polling aggregations. They also probably have an incentive to do a good job so that people like Silver are seen to be giving their polls full weight.

RDT November 7, 2012 at 1:36 pm

It seems to me that all the focus on Nate Silver misses the really big take-away, which is how good polling can be. Right now the highest accuracy seems to come from aggregating the large number of small-ish polls available — but as Andrew Gelman pointed out, if polling firms think seriously about methodology, and how to mimic the accuracy that comes with aggregation, that’s not crucial.

Dave Monack November 7, 2012 at 4:36 pm

Without polling, we’d have to resort to that primitive method of determining the electorate’s wishes: elections.

Sebastian November 7, 2012 at 4:54 pm

Henry definitely has a point, but let me suggest a (somewhat utopian) alternate outcome:
News organizations still do polls, but focus their reporting on interesting crosstabs/correlations or even do more sophisticated things like survey experiments: There is clearly a market for nerdy polling analysis – the presidential and state polls are just icing.

In addition, the performance, “weighting scores,” “house effects” etc. attributed to polls by people like Nate serve as a marker of distinction/quality check for those polls. I’m sure PPP won’t let anyone forget that, but it might not be bad if more of the established polls started talking about their track record. The NYT already spends a fair amount of time doing crosstabs on their surveys, but that type of thing could certainly be expanded.

Ankit November 7, 2012 at 8:34 pm

You are so right. The fact is that many statisticians (including Huffpost Pollster, TPM, Sam Wang and a few others) were able to predict the races to a very high degree of accuracy using the poll aggregation model. Nate Silver has some bells and whistles attached (such as the house effect of each poll, state fundamentals, correlation between different states, etc.), but his model is essentially a Monte Carlo simulation at whose core are the polls.

Nate, and the others I mentioned above, do add value in seeing the bigger picture, but their models are only as good as the hard data underneath them. Note that when the hard data was sparse and of lower quality, as was the case in the MT and ND Senate races, Nate got it wrong! So here is a shout-out to the pollsters who did a great job and a boo to the ones who took shortcuts or included partisan skews.
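For readers who have not seen one, a Monte Carlo simulation built on polls can be sketched in a few lines. This is a toy version, not a reconstruction of anyone's model: the states, margins, electoral vote counts and error sizes are all hypothetical, and correlation between states is represented only by one shared national error term.

```python
import random


def win_probability(states, national_sd=2.0, state_sd=3.0, n_sims=10000, seed=0):
    """Estimate the chance of winning a majority of electoral votes.

    states maps a name to (polled margin in points, electoral votes).
    """
    rng = random.Random(seed)
    total_ev = sum(ev for _, ev in states.values())
    wins = 0
    for _ in range(n_sims):
        # One shared error per simulation makes state outcomes correlated.
        national_error = rng.gauss(0, national_sd)
        ev_won = 0
        for margin, ev in states.values():
            if margin + national_error + rng.gauss(0, state_sd) > 0:
                ev_won += ev
        if ev_won * 2 > total_ev:
            wins += 1
    return wins / n_sims


# Hypothetical poll averages: (margin, electoral votes).
polls = {"State A": (5.0, 10), "State B": (1.0, 20), "State C": (-4.0, 15)}
print(win_probability(polls))
```

The simulation can only reshuffle the uncertainty around the margins it is given, which is the point above: if the underlying polls are sparse or skewed, the output is too.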

2 tanners November 7, 2012 at 10:36 pm

The principal point is that the polls, by and large, are not being run for the politically interested/statistically educated. They are being run to turn a buck, and no one, not even Fox, is criticizing their own pundits this morning. I think Nate, and Sam, and others’ business model remains fairly safe.

Don’t forget how personal the Republican criticism of 538 got in the last couple of weeks. That says to me that Rove won’t be out of a job today at Fox, even though he wrongly went up against his own in-house numbers folks.

And John Besley’s point about other questions is very valid as well, particularly as it allows marketers to segment their advertising effectively.

Keith M Ellis November 8, 2012 at 7:41 pm

@2 tanners’s comment is what I’ve been thinking for a while, especially since I read some comments from a self-described campaign polling expert criticizing Silver (it was eye-opening in its wrongness). The quality of polling has been poor not because they don’t know how to do a better job, but because, in general, they are selling a product that is not only about accuracy, but about other things which work against accuracy. Usually, a confirmation of the buyer’s bias.

For example, note my use of the word “bias”. Polls are going to have inadvertent methodological biases, but the people evaluating past poll performance for the purposes of deciding on a vendor will interpret these as political biases. This, in turn, gives the vendors reason to actually have some political bias in order to differentiate their product from their competitors.

There’s been a lot of talk where commentators have been mystified by why conservatives would have attacked Silver. But, honestly, no one likes bad news. Everyone likes to hear that their own preferences are being reflected in the political polling before an election. There is, at root, a strong human impulse to first intuit what the truth is, and then look for rational confirmation. Because this is the case, polling firms will never be as rigorous and accurate as they could be, because there will always be a market for something that is close to, but not quite, the truth.

Kevin Kind November 27, 2012 at 4:25 pm

Let us ask a question and then make a claim. Claim first – the goal is to attract eyeballs. Fear is the best way to do this – for all animals, BTW. A poll showing your candidate losing seems to be the fuel that feeds eyeballs.

Reliable predictions months (years?) before will kill the drama/threat but our brains will still seek out and create new threats for the media to feed us — as our brain’s proxies.

Can’t some very simple sampling now suffice? After all the Iowa auction is really cheap and reliable.

Comments on this entry are closed.
