Dart-Throwing Chimps and Op-Eds

by Erik Voeten on June 24, 2012 · 9 comments

When the House passed the Flake amendment to cut NSF funding for political science, The New York Times (and most other newspapers) did not find the event worthy of valuable newspaper space. So why, then, does the editorial page seem so eager to debunk political science as a “science”? We as political scientists have barely recovered from the inferiority complexes we allegedly suffer because of our apparent inability to overcome “physics envy,” and now we hear that “political scientists are not real scientists because they can’t predict the future.”

One would almost be tempted to think that the message conveyed in these pieces suits the editorial page editors just fine. Indeed, Stevens explicitly writes that policy makers could get more astute insights from reading the New York Times than from reading academic journals. If this was the purpose of placing the op-ed, then the editorial board has been fooled by what can charitably be described as Stevens’ selective reading of the prediction literature, especially Tetlock’s book. Here is how Stevens summarizes this research:

Research aimed at political prediction is doomed to fail. At least if the idea is to predict more accurately than a dart-throwing chimp.

But Tetlock did not evaluate the predictive ability of political science research but of “experts” whom he “exhorted [..] to inchoate private hunches into precise public predictions” (p. 216). As Henry points out, some of these experts have political science PhDs, but they are mostly not political science academics. Moreover, Tetlock’s purpose was not to evaluate the quality of research but the quality of expert opinion that guides public debate and government advice.

Two points are worth emphasizing. The first is that the media, and especially editorial page editors, make matters worse by ignoring the track record of pundits and indeed by rewarding those pundits whose personal qualities make them the least likely to predict successfully. Here is how Tetlock summarizes the implications of his research for the media:

The sanguine view is that as long as those selling expertise compete vigorously for the attention of discriminating buyers (the mass media), market mechanisms will assure quality control. Pundits who make it into newspaper opinion pages or onto television and radio must have good track records; otherwise, they would have been weeded out.

Skeptics, however, warn that the mass media dictate the voices we hear and are less interested in reasoned debate than in catering to popular prejudices. As a result, fame could be negatively, not positively, correlated with long-run accuracy.

Until recently, no one knew who is right, because no one was keeping score. But the results of a 20-year research project now suggest that the skeptics are closer to the truth.

I describe the project in detail in my book Expert Political Judgment: How good is it? How can we know? The basic idea was to solicit thousands of predictions from hundreds of experts about the fates of dozens of countries, and then score the predictions for accuracy. We find that the media not only fail to weed out bad ideas, but that they often favor bad ideas, especially when the truth is too messy to be packaged neatly.

The second point is that simple quantitative models generally do better at prediction than experts do, regardless of the experts’ education. This is not because these models are especially accurate or because experts don’t know anything, but because people are terrible at translating their knowledge into probabilistic assessments of what will happen. This is why a simple model correctly predicts the outcome of 75% of Supreme Court cases whereas constitutional law experts (professors) get only 59% right. Since predictive success is not the gold standard for social science, as Stevens would have it, this has not yet led to a call to do away with constitutional law experts or randomly allocate them research funds.


Dan Nexon June 24, 2012 at 7:47 am

Yeah. I thought invoking Tetlock to make this point was, to put it mildly, bizarre.

Christopher Gelpi June 24, 2012 at 8:20 am

Nice post!

Just to underline the last point. The fact that individual experts are not good at forecasting is actually the CENTRAL argument that Bueno de Mesquita uses in support of his forecasting model. He uses expert political judgments about facts (which are pretty good) and then uses the model to aggregate their factual knowledge in a systematic way to make forecasts.

Dan Drezner June 24, 2012 at 9:01 am

It’s also worth noting that even Tetlock thought Bueno de Mesquita was onto something with his technique: http://nationalinterest.org/bookreview/reading-tarot-on-k-street-3220

John Coates June 24, 2012 at 10:16 am

In the Supreme Court study mentioned, the practicing lawyer experts bested the algorithm, which beat the academics.

Jonathan H. Adler June 24, 2012 at 10:25 am

“Since predictive success is not the gold standard for social science, as Stevens would have it, this has not yet led to a call to do away with constitutional law experts or randomly allocate them research funds.”

As there is relatively little government funding of constitutional law scholarship (or legal scholarship generally), there’s not much to “do away with” or reallocate.

Prison Rodeo June 24, 2012 at 10:39 am

Well, as far as legal scholarship — broadly speaking — goes, there’s this:


But hey, why stop there? A radically egalitarian / nihilistic approach to allocating support for scientific research does have some great potential, after all. We could randomly choose NSF programs, eliminate peer review in them and replace it with random allocation, and then see whether (in 10 or 20 or 50 years) the research in those fields has progressed more or less rapidly than in the control fields.

But, wait: To do that would be to “support research that is amenable to statistical analyses and models even though everyone knows the clean equations mask messy realities that contrived data sets and assumptions don’t, and can’t, capture.” So, nevermind.

hmi June 24, 2012 at 1:29 pm

Speaking from inside the poli sci academy, I would argue that the supposed 75% of the predictions that this or that model forecasts successfully (about which models and their inputs much more remains to be said) will always be the least problematic cases. There is not now, and there is unlikely ever to be, any working model of, e.g., the Supreme Court which will successfully predict the results of Citizens United or Bush v. Gore or which has already checked its entrails for the results on Obamacare. That is, the important stuff is and will remain obscure and resistant to ‘scientific’ calculation. In many ways, the criticisms in Storing’s volume from half a century back, Essays on the Scientific Study of Politics, hold up remarkably well—fairly prescient, even.

Tracy Lightcap June 24, 2012 at 4:18 pm

The other thing here is that there are many sciences where successful prediction is not only rare, but, in many cases, impossible. The simplest examples are paleontology and cosmology. Here the basic aim of the sciences is to provide an adequate description of already historical events. Such descriptions have to be based on scientific evidence, of course, but by the very nature of the states of affairs under consideration, it is impossible to subject the findings to either experimental manipulation or prediction. The past has already happened, after all, and is immutable. Simulations of past events can be used, of course, to test descriptions, but that is a long way from prediction. Further, the time scales involved and the contingent interaction of a vast number of variables make the prospect of predictive models for future events doubtful.

And, again of course, many of the same limitations are at work with other Manchesterian sciences, like political science and the other social sciences. But, of course for the third time, that doesn’t mean that they aren’t sciences. They’re just very hard sciences.

Dmk38 June 24, 2012 at 11:55 pm

The Supreme Court reverses 75% of the cases it reviews (its jurisdiction is discretionary). A chimp flipping an appropriately weighted coin (or a parrot trained to say “reverse” & nothing else) would do as well as the statistical model you advert to. Doesn’t speak well of the expert law professors–either the ones who predicted only 59% of the outcomes or the ones who did the study & who didn’t realize their “model” did no better than chance.
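The base-rate arithmetic behind this comment can be checked with a minimal sketch. It takes the 75% reversal rate at face value from the comment (it is not verified here), and the function names are illustrative only. Under that assumption, the two baselines are not equivalent: always saying “reverse” is right 75% of the time, while a coin weighted 75/25 and flipped independently of each case is expected to be right less often.

```python
from fractions import Fraction


def always_guess_accuracy(base_rate):
    """Accuracy of always predicting the more common outcome (the 'parrot')."""
    return max(base_rate, 1 - base_rate)


def weighted_coin_accuracy(base_rate):
    """Expected accuracy of guessing 'reverse' with probability base_rate,
    independently of the actual outcome (the 'weighted coin')."""
    hit_reverse = base_rate * base_rate            # guess reverse, case is reversed
    hit_affirm = (1 - base_rate) * (1 - base_rate)  # guess affirm, case is affirmed
    return hit_reverse + hit_affirm


# Assumption: the 75% reversal rate quoted in the comment above.
reversal_rate = Fraction(3, 4)

parrot = always_guess_accuracy(reversal_rate)
coin = weighted_coin_accuracy(reversal_rate)

print(float(parrot), float(coin))  # prints 0.75 0.625
```

On these numbers, only the always-“reverse” baseline matches the 75% figure cited for the model; the weighted coin comes in at 62.5%, which is still above the 59% reported for the law professors.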
