The difference in the chart quoted, between “Qualification year” and “World Cup year” for “winners”, loudly screams “OUTLIER!!!” to me. The qualification year is the second-lowest year out of 19 data points for the winners (10th percentile), while the WC year is somewhere between 7th and 9th highest out of the 19 data points for the winners (55th-70th percentile). I know this data point isn’t the “point” of the study, but it is the one lampshaded in the chart by the “qualification” and “World Cup” lines.

Figure 3 is self-refuting. There are 6 conditions described, each with 6 data points representing points out of qualification. Each condition has 95% confidence intervals established. In each of the six cases, at least one of the six data points lies outside the 95% confidence interval. The probability of that happening by chance is somewhere in the range of 1 in 3000. Either 6 out of 7, or 7 out of 8, of the outliers are below the bounds. I think the ‘bootstrapping’ algorithm that Bertoli uses is in this case significantly underestimating either the standard deviation or the kurtosis of the underlying distributions. Both of these failures mathematically lead to increases in false positives in the analysis. OLS says it is significant, but OLS inference assumes normally distributed errors, while the incident count truncates at zero, which matters given that over half of the data points are zero. Here, OLS is using a normal distribution to approximate a power-law distribution; the estimated normal distribution will almost inevitably have a thinner right tail than the power-law distribution, resulting in false positives.
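
The 1-in-3000 figure can be checked with a quick calculation. The sketch below assumes that each plotted point independently has a 5% chance of falling outside its 95% interval; that independence is an assumption made for illustration, not something the paper states, and the variable names are hypothetical.

```python
# Chance that every one of the 6 panels has at least one of its 6 points
# outside a 95% interval, ASSUMING points are independent and each has a
# 5% chance of landing outside (an illustrative assumption).
coverage = 0.95
points_per_panel = 6
panels = 6

# Probability that a single panel shows at least one point outside.
p_panel = 1 - coverage ** points_per_panel

# Probability that this happens in all panels at once.
p_all_panels = p_panel ** panels

print(f"one panel: {p_panel:.3f}")
print(f"all {panels} panels: {p_all_panels:.6f} (about 1 in {1 / p_all_panels:.0f})")
```

Under those assumptions the result lands close to the 1-in-3000 order of magnitude quoted above. If the intervals are meant to cover a fitted curve rather than the individual points, the 5% per-point figure would not apply.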

The trend that leaps out at me is that losers appear to undergo a significant decrease in aggression starting before the loss, which persists past the World Cup year. There is potentially as much evidence here that failure suppresses nationalist feeling as there is that qualification ramps it up – the aggression numbers appear to decrease for the losers by about as much as the aggression numbers appear to rise for the winners. I suspect that this is spurious and random, but it must account for some of the significance in the difference.
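
To make that arithmetic concrete, here is a minimal sketch of how a gap between the two groups' changes splits into the winners' rise and the losers' fall. The numbers are invented purely for illustration and are not taken from the paper; the Welch t-statistic is used here as one plausible way to compare the groups, not necessarily the test the author ran.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL per-country changes in disputes initiated (after minus before).
winners_change = [2, 2, 0, 1, 3, 0]
losers_change = [-1, -2, 0, -1, -3, 0]

rise = mean(winners_change)       # average increase among winners
fall = -mean(losers_change)       # average decrease among losers
gap = rise + fall                 # equals mean(winners) - mean(losers)

# Welch's t-statistic for the gap between the two groups' changes.
se = sqrt(variance(winners_change) / len(winners_change)
          + variance(losers_change) / len(losers_change))
t_stat = gap / se

print(f"rise = {rise:.2f}, fall = {fall:.2f}, gap = {gap:.2f}")
print(f"share of gap from losers' fall = {fall / gap:.0%}, t = {t_stat:.2f}")
```

In this made-up example nearly half of the gap comes from the losers' decline, so a test on the gap alone cannot say which group is doing the moving.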

Looking at Figure 4, it looks like a lot of the probability mass is due to the USSR and USA, which account for fully a third of the conflicts initiated in the entire study. Neither country ever appears on the loser side of the bracket, and the two countries share an important trait: neither exactly ever needed a prod to initiate an interstate military dispute. Besides, the World Cup isn’t particularly important in the US at all. France seems to be doing a lot of heavy lifting, too, though it at least does demonstrate the phenomenon the author moots: it is more aggressive after qualifying than after failing. France isn’t exactly known for its pacifistic foreign policy, either. (The author does claim that the effect still exists without the big guys.) There probably isn’t enough data to run a model controlling for the actual countries, but the possibility that more exogenously aggressive countries happened to win the close ones should be considered.

Finally, look at the summary stats (Figure 6). Despite having similar populations, the qualifiers have 50% more soldiers and spend 3x as much on their militaries as the non-qualifiers. The wonder isn’t that those countries have more conflicts after the World Cup, the wonder is why they don’t have more conflicts before! I hazard that some of this is the “US/USSR” effect.

dubious thesis…

the US is the most aggressively nationalistic nation and does not play much world cup soccer.

that’s not how social science works, dog.

Interesting paper. I’m a little dubious about the idea that the discontinuity approximates randomization. Imbalance on some variables is really high (e.g., mean military expenditures of 12M in the treatment group versus 3M in the control group). It would also be interesting to see balance results on GDP per capita and level of development.

Given the extent of imbalance, using OLS to estimate effects seems reasonable (Table 2) but I’d like to see more specifications.

This paper could be helped a lot by an expanded discussion of a few illustrative cases. I look at the top results in Figure 4 and get a little dubious about whether nationalism (much less World Cup-induced nationalism) had much to do with the aggressive behavior of those states (USSR 1958, US 1990) in the next couple of years.

For an amusing case of soccer-induced military conflict, see this book:

http://www.amazon.com/The-Soccer-War-Ryszard-Kapuscinski/dp/0679738053

Thank you all for reading and commenting on my paper. I can tell that some of you spent a lot of time searching for potential problems, and your critiques provide me with a good opportunity to clarify some points that are probably unclear. Let me go through them in the order they were brought up.

1. In Figure 2, the aggression levels at Year 0 are low, but I don’t compare Year 0 to Year 0.5. Instead, I use a difference-in-differences t-test that compares the change in aggression levels for the two groups between the 3 years before and the 3 years after qualification. The results are significant at the 1% level, and remain so when the Soviet Union and US are dropped from the sample.

2. In the balance plot, the difference in means for military expenditures and military personnel is large because the Soviet Union and US are in the qualifier group. But as the p-values indicate, the difference is far from statistically significant. In fact, when you drop the Soviet Union and US, the non-qualifiers are higher on military expenditures and military personnel, and the results for change in aggression are still significant at the 1% level.

3. rvman, you are right in pointing out the truncated data issue with my OLS estimates. But since the OLS was just a robustness check, I wasn’t too worried about this problem. However, a better approach for me here would be to do a regression on the change in aggression (3 years after minus 3 years before), which would not be truncated at 0. The results for this test are significant (p=0.015). I recognize that the standard errors will be somewhat off because the errors aren’t normally distributed, but this is just a robustness check. Thank you for pointing this out.

4. rvman, you have another good point about the decrease in aggression for the non-qualifiers. But if failing to qualify suppresses nationalism, which decreases aggression, it’s still an interesting finding about nationalism. However, I disagree with you that it is probably spurious or random. The results are significant for every statistical test I’ve run (almost always at the 1% level), and they are robust to the removal of outliers. It could be a fluke, but given the large number of existing case studies about sports nationalism leading to conflict, I believe it’s much more likely to be a real treatment effect than just chance.

5. In Figure 3, some of the means fall outside the 95% confidence interval because there aren’t many observations at those points. The qualification process is very competitive, so the “last team in-first team out” pairs are heavily concentrated in the 2-point regression discontinuity window. There are 63 pairs of countries there, compared to 35 pairs of countries outside the RD window. So when you bootstrap, the local linear regression smooths over these areas of the data in a way that allows some of the means to fall outside the confidence interval. It was a very good catch by rvman, and it’s something that only makes sense once you understand that there aren’t many points outside the RD window.

6. Maradona, I am definitely not arguing that the World Cup is the only source of nationalism for states. Nationalism in the US arises in many other ways. I focus on the World Cup because you can get a natural experiment out of it, which allows you to estimate the average treatment effect across a large number of countries. The data indicates that there is a pretty substantial effect here.

7. Powderfinger, I think I addressed your concerns in my comments to rvman. If you have any other robustness checks you think I should do, I’d be happy to hear them.

8. WC, thank you for the link. I will be sure to check out that book.
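
One way to see the argument in point 5 is to compare how tight a band around a fit based on all 98 pairs is with how noisy the mean of a sparsely populated bin is. The sketch below is a simplification and not the paper's procedure: it assumes pure noise around a flat line rather than a local linear regression, a common variance, and a band treated as fixed and centered on the true mean. The function name and the figure of 5 observations per bin are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_bin_mean_outside(n_fit, n_bin, level=0.95):
    """Probability that the mean of n_bin observations lies outside a
    `level` confidence band for a mean estimated from n_fit observations.

    Simplifying assumptions: normal noise around a flat line, common
    variance, and a band that is fixed and centered on the true mean.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% band
    # Band half-width expressed in units of the bin mean's standard error.
    ratio = z * sqrt(n_bin / n_fit)
    return 2 * (1 - std_normal.cdf(ratio))

# 98 = 63 + 35 pairs, as quoted in point 5; 5 per bin is a made-up figure.
print(prob_bin_mean_outside(n_fit=98, n_bin=5))
# Sanity check: a bin as large as the whole sample falls outside about 5% of the time.
print(prob_bin_mean_outside(n_fit=98, n_bin=98))
```

Under these assumptions a small bin's mean falls outside the band far more often than 5% of the time, which is the sense in which a band around the smoothed fit is not a band for the individual bin means. It does not by itself explain why most of the outlying points would fall on one side.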

Thank you all again for your comments. Your suggestions were very helpful, and your criticisms pointed out some places where I need to be more clear. I appreciate the time you took to read and comment on my paper.

Andrew Bertoli

“Maradona”: you got it wrong. Bertoli didn’t say that “World Cup soccer” creates nationalism from scratch, nor that playing it is the one and only historical and present cause of nationalism and nationalistic aggression. The thing is that World Cup soccer can spur, and has spurred, nationalism and “liberated” some of its aggressive potential, resulting in actual acts of aggression. Some degree of nationalism is already there, and “World Cup soccer” acts upon it, reinforcing it and (re)expressing it in specific terms of violence.

What about the notion that, even with the correlation, the nationalism would be a lot worse without the soccer? That it’s a relatively benign outlet for channeling nationalist aggression that might otherwise lead to more fighting and wars? Maybe that’s beside the point of the research, but I just wanted to put that idea out there.

What about cricket riots? I mean, if you’ve ever played cricket at a semi-high level you know that the sport is far more brutal than its “toff-ish” reputation suggests, or than soccer is.

Seriously, they do happen, but of course the sample size is pretty small, since only 10 countries officially play world-class cricket (and two of those are pretty new).