## Wednesday, February 19, 2014

### Seven Venial Sins of Quantitative IR

A justifiably more famous Phil recently laid out seven deadly sins of quantitative political science. Though I share all of his concerns, I don't think anyone (including Schrodt) expects people to give up the easy approach any time soon. In light of that, I'd like to highlight some small changes we all can and should make without destroying our chances of building a career. (Some of these apply pretty broadly, but others are unique to the study of IR, if you're wondering why I didn't just change the one word in the title.)

1. Using robust standard errors to correct for "potential" violations of homoskedasticity in logits and probits. This practice is so widespread that most of us adopt it without thinking, assuming that there must be a reason everyone else does it. I myself have been guilty of that in the past. But here's the thing: heteroskedasticity is a feature, not a bug, of such models. If you think decreasing trade dependence between the US and Canada by one standard deviation would have a different effect on the probability of them going to war this year than a similar decrease would have on the likelihood of war between the US and China (the difference between, say, 0.0004 and 0.0001 being smaller than the difference between, let's call it, 0.004 and 0.001), congratulations, you understand S-shaped curves and the heteroskedasticity that necessarily comes along with them. So the standard justification falls flat. Is there a more sophisticated justification? Well, no. Robust standard errors reveal problems they do not fix.
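To make the S-curve point concrete, here is a minimal sketch in plain Python. The shift in the linear predictor (a quadrupling of the odds) and the two baseline probabilities are numbers I picked to mirror the example above, not estimates from any data set.

```python
import math


def logistic(z: float) -> float:
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def prob_change(baseline_p: float, shift: float) -> float:
    """Change in probability when the linear predictor moves by `shift`."""
    return logistic(logit(baseline_p) + shift) - baseline_p


def bernoulli_variance(p: float) -> float:
    """Variance of a binary outcome with success probability p."""
    return p * (1.0 - p)


# Hypothetical shift: coefficient times a one-SD change, chosen to quadruple the odds.
SHIFT = math.log(4.0)

# Two hypothetical dyads that differ only in their baseline probability of war.
low_change = prob_change(0.0001, SHIFT)
high_change = prob_change(0.001, SHIFT)

print(f"baseline 0.0001: change in probability = {low_change:.6f}")
print(f"baseline 0.001:  change in probability = {high_change:.6f}")
print(f"outcome variance: {bernoulli_variance(0.0001):.6f} vs {bernoulli_variance(0.001):.6f}")
```

The same shift on the latent scale moves the probability by roughly ten times as much for the second dyad, and the variance of the outcome differs across the two as well. Neither is a violation of anything; both follow directly from the functional form.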

2. Asking Cold War dummies to do what no dummy variable can do. If you think parity or trade dependence or alliance structures or civilizational differences or whatever played a fundamentally different role during the Cold War, including a binary variable that is equal to 1 for years 1945 through 1989 does nothing for you. What it does is force your favorite statistical software to check whether it's reasonable to conclude that the average value of $$y$$ observed during the Cold War differed significantly from the average value of $$y$$ observed in other years, conditional upon the effects of your other $$x$$s. That's almost certainly not what you're trying to get at. Ask yourself how the relationship between $$x_1$$ and $$y$$ changes as we increase $$x_2$$ from $$0$$ to $$1$$ in each of the two following models: $$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i$$ and $$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \epsilon_i$$. If you answered "not at all" and "I'm not really sure, but I know it has something to do with $$\beta_3$$", you win a prize. Granted, if you think everything was different during the Cold War, you probably don't want to include interactions between each and every $$x$$ and your Cold War dummy. What you should do instead is partition the data. (Or perhaps ask yourself if that sort of quasi-mystical argument is worth the bother. Because, really, everything was different? Come on.)
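If the answer to that quiz isn't obvious, a quick simulation may help. This is only a sketch: the data are simulated, the variable names (`cold_war`, `x1`) and all coefficient values are hypothetical, and I'm using plain least squares via NumPy rather than anyone's favorite statistical software.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

x1 = rng.normal(size=n)
cold_war = rng.integers(0, 2, size=n).astype(float)  # hypothetical 0/1 dummy

# Simulated truth: the slope on x1 is 1.0 outside the Cold War and 3.0 during it.
y = 0.5 + 1.0 * x1 + 0.2 * cold_war + 2.0 * x1 * cold_war + rng.normal(size=n)


def ols(columns, outcome):
    """Least-squares coefficients, with an intercept prepended."""
    X = np.column_stack([np.ones(len(outcome))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta


# Model 1: dummy only. It shifts the intercept; there is a single slope on x1.
additive = ols([x1, cold_war], y)
slope_additive = additive[1]

# Model 2: dummy plus interaction. The slope on x1 depends on the dummy.
interactive = ols([x1, cold_war, x1 * cold_war], y)
slope_other_years = interactive[1]
slope_cold_war = interactive[1] + interactive[3]

print(f"additive model, one slope for all years: {slope_additive:.2f}")
print(f"interactive model, other years:          {slope_other_years:.2f}")
print(f"interactive model, Cold War years:       {slope_cold_war:.2f}")
```

The additive model reports one slope that blends the two periods, which is exactly the "not at all" answer. Only the second model lets the relationship between $$x_1$$ and $$y$$ differ across periods.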

3. Making unconditional statements about the significance of interactive effects. Though several political scientists have tried to explain this to people (see here and here, for example), and I therefore have no reason to think I'll be any more successful, we as a field continue to fundamentally misunderstand interaction terms. When we ask how increases in $$x$$ affect $$y$$, we a) implicitly assume that correlation does in fact imply causation, making all attempts to soften our language in the middle section of the paper before we go back to really strong claims in the conclusion pretty hilarious, and b) essentially ask what $$\frac{\partial y}{\partial x}$$ looks like. In most cases we deal with, probably because we rarely bother to ask ourselves whether this is appropriate, the answer to that question is straightforward, requiring us to do little more than gaze at stars. That is, if you model the effect of $$x_1$$ as linear and unconditional, then sure, $$\frac{\partial y}{\partial x_1}$$ is simply $$\beta_1$$, and so $$\hat{\beta}_1$$ gives you the best answer you're going to get. But if your model is $$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \epsilon_i$$, then $$\frac{\partial y}{\partial x_1}$$ is $$\beta_1 + \beta_3 x_{2i}$$. The important point here is not that there are two relevant coefficients. Joint $$F$$ tests are not the answer. What many fail to appreciate is that the marginal effect of $$x_1$$ varies with $$x_2$$. Let me repeat: it...varies. Thus, so should your interpretation thereof. There is no single answer to the question of whether "the" interaction effect is significant. Full stop.
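For the linear interactive model above, the marginal effect $$\beta_1 + \beta_3 x_2$$ has variance $$\mathrm{Var}(\hat{\beta}_1) + x_2^2 \mathrm{Var}(\hat{\beta}_3) + 2 x_2 \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_3)$$, so both the effect and its uncertainty move with $$x_2$$. Here is a rough sketch of that calculation. The coefficient vector and covariance matrix are made up for illustration, and the helper name `marginal_effect` is mine.

```python
import numpy as np


def marginal_effect(beta, vcov, x2_values, i1=1, i3=3):
    """Return (effect, standard error) of x1 on y at each value of x2.

    Assumes coefficients are ordered (b0, b1, b2, b3), where b3 multiplies x1*x2.
    """
    x2 = np.asarray(x2_values, dtype=float)
    effect = beta[i1] + beta[i3] * x2
    variance = vcov[i1, i1] + x2**2 * vcov[i3, i3] + 2.0 * x2 * vcov[i1, i3]
    return effect, np.sqrt(variance)


# Hypothetical estimates and covariance matrix, not taken from any real model.
beta = np.array([0.0, 0.5, 0.1, -0.25])
vcov = np.diag([0.01, 0.04, 0.01, 0.01])
vcov[1, 3] = vcov[3, 1] = -0.01

x2_grid = np.array([0.0, 1.0, 2.0, 4.0])
effect, se = marginal_effect(beta, vcov, x2_grid)

for value, e, s in zip(x2_grid, effect, se):
    print(f"x2 = {value:.0f}: effect = {e:+.3f}, se = {s:.3f}, z = {e / s:+.2f}")
```

With these invented numbers the effect of $$x_1$$ is positive and more than two standard errors from zero at one end of the grid, indistinguishable from zero in the middle, and negative at the other end. Any single verdict about significance would misdescribe at least part of that range.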

4. Taking the absence of evidence for the evidence of absence. A large coefficient estimate paired with a slightly larger standard error tells you a very different story than a coefficient estimate arbitrarily close to zero paired with a tiny standard error. Failing to reject the null is not the same, in other words, as confirming the (unstated?) hypothesis that the effect actually is zero. If that's really what you're after, there's a better way. If not, you should probably choose your words more carefully.
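To see how different those two stories are, compare the confidence intervals. This sketch uses a normal approximation and two invented estimate/standard-error pairs. It illustrates the contrast drawn above and is not a stand-in for any particular testing procedure.

```python
from statistics import NormalDist


def confidence_interval(estimate: float, std_error: float, level: float = 0.95):
    """Normal-approximation confidence interval for a coefficient estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical case A: large estimate, slightly larger standard error.
noisy = confidence_interval(2.0, 2.1)

# Hypothetical case B: estimate near zero, tiny standard error.
precise = confidence_interval(0.01, 0.02)

print(f"noisy:   ({noisy[0]:+.3f}, {noisy[1]:+.3f})")
print(f"precise: ({precise[0]:+.3f}, {precise[1]:+.3f})")
```

Both intervals contain zero, so neither estimate earns a star. But the first is compatible with effects that are very large in either direction, while the second rules out anything far from zero. Only the second comes close to supporting a claim that the effect is negligible.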

5. Including the ratio of W to S for any reason whatsoever. Fun fact—the crude measure (introduced here, if you've been living under a rock) of a state's minimal winning coalition takes on a larger value than the crude measure of a state's selectorate (of which it is, theoretically, a subset) in nearly 10% of country-years. I have no idea what the ratio of the former to the latter is actually measuring, but I'm quite confident that it's not what you think it is. I'm not saying that these variables have no use. I'm perfectly comfortable with the assumption that these variables tend to take on higher values in states that we would all agree do in fact have larger winning coalitions and selectorates. But these are ordinal measures, and we ought to use them as such.

6. Treating Militarized Interstate Disputes as wars. If what you're interested in is differentiating between pairs of states that have hostile relations and those that do not, you've got a strong case for selecting a dependent variable that measures the onset or initiation of MIDs. And there's nothing wrong with that. I've done so many times in the past, and fully intend to do so in the future. But MIDs are not wars, and when we ask certain questions, that makes a big difference. For example, every theoretical explanation of the democratic peace with which I am familiar articulates reasons why pairs of democracies handle their disagreements differently than other pairs of states. But virtually every empirical evaluation of the democratic peace essentially presents evidence that pairs of democracies rarely have major disagreements. That may or may not be interesting (though I personally have my doubts about whether democracy is really doing the heavy lifting for these particular states), but it's unquestionably not evidence in support of the theoretical arguments that people nonetheless invoke when discussing such findings. And this matters, because one of the few papers that explicitly focused on whether disputes between democracies are less likely to escalate found that they are not.

7. Using the time between events not measured by the DV to correct for temporal dependence. Several authors have made the case for worrying about temporal dependence, and it is now de rigueur to include peace years along with either cubic splines or cubic polynomials. However, the justification does not always match the implementation. Specifically, authors often use the "cwpceyrs" variable generated by EUGene even if their dependent variable is one they constructed after generating the data set (such as forceful MIDs, deadly MIDs, or some other subset of cwmid or cwinit). I have no particular reason to expect this to create serious problems, but there's still no reason to do it. If you think accounting for temporal variation in a variable other than your DV is close enough, what's the argument for using that particular DV?
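Recomputing the counter from your own DV is not much work. Below is a minimal sketch for a single dyad observed in consecutive years. The function name `peace_years`, the example sequences, and the counting convention (years elapsed since the last event, starting at zero in the first observed year) are my assumptions. Conventions differ, so check which one your source follows.

```python
from typing import List, Sequence


def peace_years(events: Sequence[int]) -> List[int]:
    """Years since the last event of this DV, for one dyad in year order.

    The counter starts at 0 in the first observed year and resets to 0
    in the year after an event.
    """
    counts = []
    run = 0
    for event in events:
        counts.append(run)
        run = 0 if event else run + 1
    return counts


# Hypothetical dyad: any MID versus a stricter DV built from a subset of MIDs.
any_mid = [0, 0, 1, 0, 0, 1, 0, 0]
fatal_mid = [0, 0, 0, 0, 0, 1, 0, 0]

t_any = peace_years(any_mid)
t_fatal = peace_years(fatal_mid)

# Cubic polynomial terms built from the counter that matches the DV.
polynomial_terms = [(t, t**2, t**3) for t in t_fatal]

print("peace years, any MID:  ", t_any)
print("peace years, fatal MID:", t_fatal)
```

The two counters diverge as soon as the dyad experiences an event that counts for one DV but not the other, which is precisely when borrowing a counter built for a different variable stops describing time since your event.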

NOTE: I seem to have accidentally deleted this post earlier.