Reason #17 to Blog about Your Research…

by Joshua Tucker on May 16, 2013 · 20 comments

in Blogs, Political Science and Journalism

… because someday someone might write this about you:

But as we all eventually learn the hard way, Nyhan ALWAYS COLLECTS HIS MONEY, HONEY.

That’s Jason Linkins at the Huffington Post writing about Dartmouth College political scientist and blogger Brendan Nyhan. The topic in question? Nyhan’s research about scandals and US presidents — he blogged about it here — which, if I’m not mistaken, was a somewhat big topic yesterday in the US media.

And yes, for those keeping track, Nyhan did have a scandal probability forecasting figure:

[Figure 2 from Nyhan's May 2011 Crystal Ball post: forecast probability of scandal onset for the Obama administration, rising toward 100% over time]

{ 20 comments }

Kindred Winecoff May 16, 2013 at 7:20 am

Whatever image you’ve put in this post seems to have at least two major defects:

1) It is not Figure #2 in the Nyhan working paper you link to. It is not in that paper at all. It comes from a HuffPo post recalling another blog post (not paper) describing a postulation from a previous working paper (link now broken) from more than two years ago. As far as I can tell, none of this has been peer reviewed or even questioned (I’m immediately skeptical of predictions which contain 100% probability… call me cynical). The HuffPo blogger is claiming that “Political Science” says something that political science does not say. TMC links approvingly and implies (through false linkage) that it comes from research from which it does not come. Don’t do that. We should have learned from Reinhart-Rogoff not to make bigger claims than we can back up.

2) The graph gives a 100% probability of a scandal happening about 14 months ago (i.e. before the last election) and that probability staying constant (presumably… the x-axis ends nearly a year ago) ever since. No really: 100% probability. 14 months ago. And every month since. And it was nearly 100% wrong for 14 consecutive months but it’s a “somewhat big topic yesterday” and we imply that this is success? C’mon.

Even the HuffPo blogger bothers to notice the discrepancy and is willing to pony up his own $ for being wrong. (Trying to resist NSF analogy… failing.) Here’s what Nyhan said: “Obama survived for [one year] longer than I expected since that column was published. … Today, however, my predictions were validated…”: http://www.brendan-nyhan.com/blog/2012/04/obamas-first-scandal-gsa-spending.html.

No. Sorry, no. That “scandal” was the GSA spending in Vegas in April, 2012. Remember that? No? Well it was supposed to be a big deal. Nyhan describes it as a scandal and gives this definition: “I define scandal as a widespread elite *perception* of wrongdoing.” So after 12 months of being wrong, he got the GSA Vegas thing “right”, then nothing else right for another year until this week. This is not impressive to the impartial.

FWIW, Nyhan speculated as to the likely locations of scandals at the end of the post linked below (which is the source of the image above). None of them were even remotely correct.

http://www.centerforpolitics.org/crystalball/articles/bxn2011052601/

Andrew Gelman May 16, 2013 at 10:06 am

“I’m immediately skeptical of predictions which contain 100% probability”: Good point.

RobC May 16, 2013 at 10:31 am

You’re correct that the graph you criticize isn’t Figure 2 in Brendan’s working paper. It is, however, Figure 2 in Brendan’s May 2011 guest post on Larry Sabato’s blog, the piece you link to at the end of your comment. So whatever the sins of HuffPo and its bloggers may be, that graph falls squarely on Brendan’s shoulders, and Professor Tucker’s use of it can be criticized only with respect to his not having provided a link to the Sabato blog guest post.

Joshua Tucker May 16, 2013 at 11:12 am

RobC: I did actually have a link to the Sabato post – where I wrote “he blogged about it here” – and also a link to the new working paper that Kindred correctly notes does not have the figure in it. Apologies for the confusion – I decided to add the figure at the last moment and should have checked to make sure it was still in the version of the paper posted on Brendan’s website.

On the subject of peer review, though, I want to be explicit that not everything we post on The Monkey Cage will have been subject to peer review. Our goal is to make political science research accessible to a wider audience, and we often report on research at varying stages of development.

Andrew Rudalevige May 16, 2013 at 8:47 am

Josh is too modest to mention the Monkey Cage’s own prescient discussion of current events back in June of 2012: http://themonkeycage.org/2012/06/17/second-terms-schools-for-scandal/

Brendan Nyhan May 16, 2013 at 11:09 am

Kindred is correct, of course, that the figure above is not presented in my working paper (http://www.dartmouth.edu/~nyhan/scandal-potential.pdf), but the model in the working paper is the source of the forecast. As I describe in the blog post in which the figure was originally presented (http://www.centerforpolitics.org/crystalball/articles/bxn2011052601/), it was an out-of-sample forecast from the statistical model presented in my working paper. I held the values of the covariates fixed at their values at the time and showed how the predicted probability of scandal onset increased as additional time elapsed. I also specifically presented caveats about the forecast (“extrapolation from 1977-2008 data … should not be interpreted too literally … sample size of post-Watergate presidents is obviously quite small”).

Obviously the real-world probability of scandal was not 100%, but we should take our models seriously! Without putting some sort of arbitrary prior on the likelihood of scandal, the model forecast reached 100% because no president in the contemporary era had gone as long as Obama without a scandal by my measure. Presumably models in IPE would produce similar forecast probabilities under previously unobserved conditions (e.g., the probability of democracy if GDP per capita = $150,000).
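To make the mechanics concrete, here is a minimal sketch of how that kind of extrapolated forecast compounds toward 100% in a discrete-time setup. The hazard value below is invented for illustration; it is not the model or the coefficients from the paper.

```python
# Minimal sketch: with covariates held fixed, each scandal-free month carries
# some hazard of scandal onset, and the cumulative onset probability compounds
# toward 1 as time elapses. The hazard value is invented for illustration.
monthly_hazard = 0.05  # assumed constant per-month P(scandal onset)

survival = 1.0  # P(no scandal yet)
for month in range(1, 61):
    survival *= 1 - monthly_hazard  # survive another scandal-free month
    if month % 12 == 0:
        print(f"month {month:2d}: P(scandal by now) = {1 - survival:.3f}")

# Prints ~0.460 at month 12 and ~0.954 at month 60: the curve only climbs,
# which is why a long enough extrapolation pins at (effectively) 100%.
```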

It’s true that the GSA and Secret Service scandals were not especially large in magnitude but they met the definition laid out in advance in my research. If one restricts the definition of scandals to the very largest controversies, there are too few to study scientifically (again, consider the IR analogy – there’s a reason the field defines war using thresholds like 1,000 battle deaths rather than focusing exclusively on WWI, WWII, etc.).

And, like the IRS/Benghazi proto-scandals going on now, the GSA/Secret Service scandals concerned the propriety of executive branch actions rather than personal misconduct – the focus that I expected in the 2011 post. It’s not clear to me how that prediction is “not even remotely correct.” The controversies mentioned in the post were examples of Republicans shifting their focus to the executive branch, not specific predictions.

Was the forecast perfect? No! This is social science, and scandal has received almost no scholarly attention from quantitative researchers. But I’m hopeful that my research can help generate increased interest in the topic and help improve our ability to understand and forecast it in the future (suggestions welcome!).

As far as Kindred’s framing concerns, I have never purported to speak “for” political science – that’s a headline someone else wrote.

Kindred Winecoff May 16, 2013 at 4:35 pm

Hey Brendan,

I have no problem with your work, more the way it was used by the HuffPo guy. I’m not sure it was his intent, but he comes across as trying to diminish recent scandals because Hey! it happens to everybody! Nothing to see here! As far as I can tell your research implies no such thing. (And you can define scandal however makes sense for your research. It was wrong for me to imply that you had it wrong. Sorry about that. It might be helpful, as a separate question, to look at the scandals that “matter” and the ones that do not. Maybe this is something you’ve already done or are doing.)

The probability question is more interesting. IPE typically does not forecast Pr(democracy | income) — that’s more a job for the comparativists — but if it did, up until fairly recently it would have predicted something like what you say: a country with GDP/capita above, say, $15,000 — much less $150,000 — would be a democracy with Pr = 1. Then Singapore happened. And South Korea. And all the petrostates. Plus the single-party democracies like Japan (for 50 years or so). Looking forward, the relationship between democracy and income looks much more tenuous than it once did. So *if* I had a model showing that China was going to become a democracy with Pr = 1 once its per capita income got to where it looks like it will be in 2020 or so, I might be re-thinking that model right about now.

The hinge is when you say “Without putting some sort of arbitrary prior on the likelihood of scandal…”. Well, why not do that? It doesn’t even have to be arbitrary. It can come from past experience. In any case, a flat prior is no less arbitrary than any other.

This gets at something Justin mentions below: all people die, but not all countries become democracies. If we’re using models that assume universal “democratization” given a long enough time horizon we better have really good theory to match. (This is not necessarily a problem for you, Brendan, but more generally with these kinds of models which are used quite frequently in conflict studies and other substantive fields; we tend to dramatically over-predict war, for example.)

Brendan Nyhan May 16, 2013 at 4:55 pm

Fair enough – thanks for the additional thoughts. A few points:
- There are so few big scandals that it’s hard to tackle effects, but most estimates that go beyond the most significant ones suggest the consequences for presidential approval are quite limited. We know far more about the effects of scandal than its causes.
- We could put a prior on scandal, sure. I agree a flat prior is arbitrary, but the paper is quite methodologically complex already and depends on frequentist methods that aren’t well-developed in the Bayesian case (clustering, IV).
- I agree that the assumptions of survival models aren’t always substantively appropriate. In this case, though, no president in the contemporary period has gone without scandal by my definition, so it seems more innocuous. (And again, moving to something like a split-population model that would allow us to relax that assumption introduces a series of other, difficult methodological problems.)

Justin May 16, 2013 at 11:52 am

“‘I’m immediately skeptical of predictions which contain 100% probability’: Good point.”

I haven’t read the paper, but I was under the impression that the vast majority of survival/event history models converged towards 0% survival/100% event occurrence. That is, as time under study approaches infinity, don’t most survival models predict death with certainty? Why would this be a sign of a poor duration model?

Andrew Gelman May 16, 2013 at 5:10 pm

Justin:

100% seems too high to me. Such models asymptote to 100%, but (a) you don’t have to get to that asymptote in the period of the data, and (b) it’s possible to alter the model to have a less than 100% ceiling. The point is that, if the model is giving predictions of 99.5% or whatever and we’re not actually so certain, that suggests that the model’s functional form is unduly driving the inferences.
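One standard way to build in such a ceiling is a split-population (“cure”) model, in which some fraction π of cases never experiences the event at all. A generic sketch of the form, not anything estimated here:

```latex
% Split-population ("cure") survival model: a fraction \pi never has the event
S(t) = \pi + (1 - \pi)\,S_0(t),
\qquad
\Pr(\text{event by } t) = (1 - \pi)\bigl(1 - S_0(t)\bigr)
\;\longrightarrow\; 1 - \pi < 1 \quad \text{as } t \to \infty
```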

Justin May 16, 2013 at 6:18 pm

First, thanks for the response. I’ve read the blog for a long time and never commented. Three genuinely curious questions…

1) Given that every president in the sample has been through a scandal during their tenure, why does 100% seem high?

2) If a model doesn’t have to arrive at the asymptote in the period under study, but the model suggests that the predictions do arrive at the asymptote, isn’t that just the model telling us what the data says? As you point out, the functional form allows for the predictions to never reach 100% in the period under study. If the functional form allows for us to never reach the asymptote, then how is reaching the asymptote the fault of the form of the model?

3) The 100% prediction is just a prediction, right? The graph says nothing about our certainty around that prediction. So when you say “if the model is giving predictions of 99.5% or whatever and we’re not actually so certain, that suggests that the model’s functional form is unduly driving the inferences,” wouldn’t we need more information to be skeptical?

Brendan Nyhan May 16, 2013 at 12:15 pm

Also, what Justin said – absent estimating a split-population model, a discrete-time survival model of the sort that I’m estimating assumes all individuals eventually experience the event (though the timing will of course vary).

Rachel Strohm May 16, 2013 at 3:18 pm

Just FYI – it’s Dartmouth College. Dartmouth University hasn’t existed since 1819: http://en.wikipedia.org/wiki/Dartmouth_University. Although if Nyhan did predict an Obama scandal nearly 200 years ahead of time I would be considerably impressed.

Joshua Tucker May 16, 2013 at 3:58 pm

And yet, they have a graduate program – shouldn’t that make it a university?? Either way, thanks for the correction – I’ve changed the text of the post.

Brendan Nyhan May 16, 2013 at 4:05 pm

Yes, big controversy about that here – branding! http://thedartmouth.com/2013/03/08/news/planning

RobC May 16, 2013 at 4:42 pm

Any discussion of this burning question must pay homage to Daniel Webster, who famously argued to the Supreme Court in the Dartmouth College Case, “It is, Sir, as I have said, a small college. And yet there are those who love it!”

Justin May 16, 2013 at 4:14 pm

“The graph gives a 100% probability of a scandal happening about 14 months ago (i.e. before the last election) and that probability staying constant (presumably… the x-axis ends nearly a year ago) ever since. No really: 100% probability. 14 months ago. And every month since”

“No. Sorry, no. That “scandal” was the GSA spending in Vegas in April, 2012. Remember that? No? Well it was supposed to be a big deal. Nyhan describes it as a scandal and gives this definition: “I define scandal as a widespread elite *perception* of wrongdoing.” So after 12 months of being wrong, he got the GSA Vegas thing “right”, then nothing else right for another year until this week. This is not impressive to the impartial.”

These are also less-than-perfect interpretations of event history/survival models. Event history modeling is focused on time to events (Pr(T>t)), not predicting the number of events that took place in a time period or the frequency of events. That is, if we build a duration model predicting when someone will die, we’re trying to estimate the probability that someone has died on or before time “t”. If the model suggests that they died on or before Day 5 with 100% probability, it will also suggest that they have died on or before Day 6 with 100% probability. This is NOT the same as saying that someone will die on Day 5 with 100% probability and then will die again with 100% probability on Day 6. It is simply recognizing duration dependence and noting mathematically that once something has happened, it can’t un-happen.
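In notation, with h(s) the per-period hazard, the quantity being plotted is the cumulative onset probability, which can only rise. This is the generic discrete-time form, not the paper’s exact specification:

```latex
% Probability the event has occurred by period t, built from per-period hazards
F(t) = \Pr(T \le t) = 1 - \prod_{s=1}^{t} \bigl(1 - h(s)\bigr),
\qquad
F(t+1) \ge F(t) \quad \text{for all } t
```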

So Brendan’s model is suggesting that President Obama should experience a scandal on or before January of 2012 with nearly 100% certainty (something that doesn’t appear to be true), but it also says that President Obama should have experienced a scandal on or before April, May, and June of 2012 with 100% certainty, not that there would be a new scandal in each of those months. Once the GSA scandal happened, and the model said it should have happened, the model is correct. To say the model “got nothing else right for a year” is incorrect. The model says a scandal should have occurred by then, and a scandal had occurred by then. Once the model correctly predicts an event occurring, it doesn’t become incorrect again.

Maybe that’s not a useful way to think about the problem you all are worried about, and the forecast certainly isn’t exactly right (I’d love to see standard errors around the predictions), but this criticism isn’t exactly right either.

Lawrence Zigerell May 17, 2013 at 11:55 am

Jason Linkins: “So, Nyhan was off by about a year.”

Brendan Nyhan: “Was the forecast perfect?”

Justin: “Once the GSA scandal happened, and the model said it should have happened, the model is correct.”

It does not seem correct to assert that Figure 2 or the model underlying Figure 2 is correct or incorrect or off by about a year, at least not yet. Brendan’s model in Figure 2 does not make the single prediction that the initial scandal will occur when the forecast probability reaches 100.0%; rather, Figure 2 contains a large number of predictions: a 42% probability of an initial scandal in June 2011, a 56% probability of an initial scandal in July 2011, a 64% probability of an initial scandal in August 2011, …, a 97% probability of an initial scandal in Dec 2011, etc. It does not seem possible to use the one observation of the initial scandal for the Obama administration to evaluate the multiple predictions depicted in Figure 2.

The discussion of whether the initial scandal occurred in April 2012 or May 2013 or some other month thus has only a marginal bearing on an evaluation of the predictive capability of Brendan’s model, because testing the predictive capability of the model would require a much larger number of out-of-sample observations. So until the number of out-of-sample observations is increased through predicting the onset of scandals in other countries or the past United States or the future United States or at a subnational level, it would make more sense to focus our evaluation on model elements such as theory, inclusion of variables, and measurement of variables.
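If and when those out-of-sample observations accumulate, a proper scoring rule such as the Brier score would be one standard way to grade the whole set of probabilistic predictions at once. A generic sketch with hypothetical numbers, not an evaluation of Brendan’s model:

```python
# Generic sketch: grading probabilistic forecasts with the Brier score.
# Numbers are hypothetical; each pair is one independent out-of-sample case.
forecasts = [0.30, 0.55, 0.80, 0.95, 0.60]  # predicted P(scandal) per case
outcomes = [0, 1, 1, 1, 0]                  # 1 if a scandal actually occurred

brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
print(f"Brier score: {brier:.3f}")  # 0 is perfect; lower is better
```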

Brendan Nyhan May 17, 2013 at 12:11 pm

To be clear, the predicted probabilities of scandal onset in the figure for month t are *conditional* on no scandal occurring in month t-1. If a scandal occurs, the predicted probability decreases due to the flexible polynomials for time between scandals I include to account for duration dependence – see Table 1 and Figure 3 in the current version of the paper: http://www.dartmouth.edu/~nyhan/scandal-potential.pdf
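Schematically, the conditional pattern looks like this. The coefficients and logit link below are invented for illustration; they are not the estimates or the exact specification from the paper:

```python
import math

# Illustrative sketch of duration dependence in a discrete-time hazard model:
# the monthly hazard depends on a polynomial in months since the last scandal,
# so the predicted probability falls back (the clock resets) once one occurs.

def monthly_hazard(months_since_last_scandal: int) -> float:
    t = months_since_last_scandal
    linear_predictor = -4.0 + 0.15 * t - 0.002 * t**2  # toy polynomial in t
    return 1 / (1 + math.exp(-linear_predictor))       # logit link

clock = 0
for month in range(1, 25):
    clock += 1
    print(f"month {month:2d}: hazard = {monthly_hazard(clock):.3f}")
    if month == 12:  # suppose a scandal breaks in month 12...
        clock = 0    # ...the duration clock resets and the hazard drops
```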

Lawrence May 17, 2013 at 1:01 pm

Thanks for the clarification, Brendan. Thanks also for engaging the comments.

