Tuesday, September 4, 2012

Follow-up on Measuring Military Capabilities

In a previous post, I introduced a new index of military capabilities, based on data that is already widely in use (see also this Duck of Minerva post). Based on the great feedback I've received, I decided it was time to revise the measure, and to see how well it does in accounting for war outcomes and the likelihood of conflict.

Construction

Formally, the revised $$M$$ score for country $$i$$ in year $$t$$ is
\begin{align*}
\mbox{M}_{i,t}= \Pi_{i,t}q_{i,t},
\end{align*}
where $$\Pi_{i,t}$$ and $$q_{i,t}$$ are discounted measures of military personnel and of the quality ratio (military expenditures per troop), respectively, for country $$i$$ in year $$t$$.

Specifically,
\begin{align*}
\Pi_{i,t} = \frac{\mbox{milper}_{i,t}}{\mbox{milper}_{i,t} + \delta^\Pi_t},
\end{align*}
where $$\mbox{milper}_{i,t}$$ is the military personnel for country $$i$$ in year $$t$$ (taken from the CINC data) and $$\delta^\Pi_t$$ is a deflator. More formally,
\begin{align*}
\delta^{\Pi}_{t} = 22^{\left((\mbox{year}-1650)/100\right)}.
\end{align*}

This ensures that $$\delta^\Pi_t$$ takes on a value that correlates quite highly with the average level of military personnel among the major powers in any given year, yet does not exhibit the fluctuations found in the actual average. Roughly speaking, it reflects an admittedly arbitrary, but perhaps reasonable, standard by which military personnel might be judged to be large or small in any given year.
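To get a feel for the benchmark, here is a minimal Python sketch of the personnel deflator and of $$\Pi_{i,t}$$ as defined above. The function names are mine, and the sketch assumes `milper` is expressed in the same units as the deflator (I believe the CINC data report personnel in thousands, but treat that as an assumption).

```python
def delta_pi(year: int) -> float:
    """Personnel deflator from the formula above: 22 ** ((year - 1650) / 100)."""
    return 22 ** ((year - 1650) / 100)


def personnel_score(milper: float, year: int) -> float:
    """Pi = milper / (milper + delta_pi(year)).

    Assumes milper is non-negative and in the same units as the deflator.
    """
    return milper / (milper + delta_pi(year))


if __name__ == "__main__":
    # Print the benchmark for a few illustrative years.
    for year in (1850, 1900, 1950, 2000):
        print(year, round(delta_pi(year), 1))
```

A country whose personnel exactly equal the benchmark for that year gets $$\Pi_{i,t} = 0.5$$; the score approaches 1 only as personnel grow far beyond the benchmark.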

Similarly,
\begin{align*}
q_{i,t} = \frac{\mbox{qualrat}_{i,t}}{\mbox{qualrat}_{i,t} + \delta^q_t},
\end{align*}
where $$\mbox{qualrat}_{i,t}$$ is the quality ratio for country $$i$$ in year $$t$$ (obtained by dividing that country's military expenditures by its military personnel, using the CINC data for both) and $$\delta^q_t$$ is a deflator. More formally,
\begin{align*}
\delta^{q}_{t} = 60^{\left((\mbox{year}-1750)/100\right)}.
\end{align*}

Again, the deflator is defined arbitrarily. I chose values that let it serve as a reasonable benchmark against which to compare the quality ratios of the major powers, since it is roughly similar to their average quality ratio.
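Putting the pieces together, the whole construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, `milper` and `milex` are assumed to be in the units used by the CINC data, and returning 0 when a country has no military personnel is my own convention for a case the formulas leave undefined.

```python
def delta_pi(year: int) -> float:
    """Personnel deflator: 22 ** ((year - 1650) / 100)."""
    return 22 ** ((year - 1650) / 100)


def delta_q(year: int) -> float:
    """Quality-ratio deflator: 60 ** ((year - 1750) / 100)."""
    return 60 ** ((year - 1750) / 100)


def m_score(milper: float, milex: float, year: int) -> float:
    """M = Pi * q for one country-year.

    Pi = milper / (milper + delta_pi)
    q  = qualrat / (qualrat + delta_q), with qualrat = milex / milper
    """
    if milper <= 0:
        # Assumption: the quality ratio is undefined without personnel,
        # so treat the country as having no capabilities.
        return 0.0
    qualrat = milex / milper
    pi = milper / (milper + delta_pi(year))
    q = qualrat / (qualrat + delta_q(year))
    return pi * q
```

Because both components lie between 0 and 1, so does their product. A country sitting exactly at both benchmarks scores $$0.5 \times 0.5 = 0.25$$, so values well above 0.25 already indicate a military that exceeds the standard on at least one dimension.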

As before, my goal is to account for the size of a military as well as its sophistication. I also wanted a measure that neither correlated so highly with time (as GDP does) nor required that the total of capabilities in the international system always sum to 1 (as CINC does). The interpretation of this new measure is more intuitive. It ranges from 0 to 1, with values nearer to 0 indicating that country $$i$$ has virtually none of the military might that could reasonably be possessed by a country in year $$t$$, while values closer to 1 indicate that country $$i$$ has achieved a level of military might that far exceeds the standards for that time period. Note that the "standards" are strictly increasing over time. They roughly track the average values of military personnel and quality ratio, but during certain periods, those averages fall while my deflators continue to rise. This allows the measure to pick up changes in system militarization, which no existing measure of hard power does.

Validation

As before, let's start with some face validity. Here's the $$M_{i,t}$$ for the United States, United Kingdom, Russia (Soviet Union), and China from 1945 (1950 for China, due to missing data) through 2007.

That looks a bit different from my previous attempt. It tells us that the US came out of WWII with an unprecedented military advantage, but quickly began cutting back. It also tells us that the Soviet Union and the US had roughly equal conventional military capabilities during the early part of the Cold War, and that the US fell behind the Soviet Union in the 70s. Finally, it tells us that the international system is far less militarized today than it has been for decades, and that the US remains on top.

So $$M$$ passes the laugh test. What more can we say about it?

I generated a directed-dyadic data set with one observation per MID initiation. Defining a "war" as a conflict with hostility level 5 and fatality level 6 (see the MID handbook available here), I looked at the relationship between relative military capabilities ($$M_{A,t}$$ over $$M_{A,t} + M_{B,t}$$) as measured with $$M$$ and the outcomes of bilateral wars. Unfortunately, there are only 24 bilateral wars for which the data are available, so there's not much sense in trying any sophisticated analysis. But it is perhaps quite telling that $$M_{A,t}$$ takes on an average value of 0.78 in the 7 bilateral wars that side A won and an average value of 0.48 in the 17 bilateral wars that resulted in some other outcome.
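For readers who want to try a comparison of this kind on their own data, here is a rough Python sketch of the relative-capabilities ratio and a by-outcome average. The function names and the three example wars are entirely made up for illustration; they are not the 24 wars discussed above.

```python
from statistics import mean


def relative_share(m_a: float, m_b: float) -> float:
    """Side A's share of dyadic capabilities: M_A / (M_A + M_B)."""
    total = m_a + m_b
    if total == 0:
        # Assumption: the share is undefined when both scores are zero.
        return float("nan")
    return m_a / total


def mean_share_by_outcome(wars):
    """Average side A's share separately for wars A won and all others.

    `wars` is an iterable of (m_a, m_b, a_won) tuples.
    """
    groups = {}
    for m_a, m_b, a_won in wars:
        groups.setdefault(a_won, []).append(relative_share(m_a, m_b))
    return {outcome: mean(shares) for outcome, shares in groups.items()}


# Hypothetical values, purely to show the call pattern.
example_wars = [(0.30, 0.10, True), (0.20, 0.20, False), (0.10, 0.30, False)]
averages = mean_share_by_outcome(example_wars)
```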

I then generated a directed-dyadic data set with one observation per year. This data set includes all dyads save those where there is an ongoing dispute. I then estimated a series of simple binary logits, using various measures of conflict. The only explanatory variables I have included are parity (1 minus the stronger side's relative share of capabilities as measured by $$M$$, which is a standard measure of parity) and, for the initiation models, the number of years of peace enjoyed by that dyad as of that year (to correct for temporal dependence).
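The parity variable is simple enough to spell out. A minimal sketch, with a hypothetical function name and an assumed convention of returning NaN when both sides score zero:

```python
def parity(m_a: float, m_b: float) -> float:
    """Parity = 1 - (stronger side's M) / (M_A + M_B).

    Ranges from 0 (complete preponderance) to 0.5 (exact parity).
    """
    total = m_a + m_b
    if total == 0:
        # Assumption: parity is undefined when neither side has capabilities.
        return float("nan")
    return 1 - max(m_a, m_b) / total
```

Note that the measure is symmetric in the two sides, so it takes the same value in both directed observations of a dyad-year.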

The results indicate that parity is strongly associated with MID initiation, war initiation (relative to peace), and escalation. Escalation refers to whether those MIDs that do occur reach the level of war: the variable takes on a value of 1 if a war occurs, 0 if a MID that does not reach the level of war occurs, and is missing if no MID was initiated by A against B in that year.

The substantive effects are sizable. As we move from complete preponderance to total parity, the estimated probability that a MID initiated by A against B reaches the level of war increases from 0.02 to 0.1. That is, the risk of escalation increases fivefold.
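Since these numbers come from a logit, it is worth being clear that "fivefold" describes the ratio of predicted probabilities, not the ratio of odds. A quick check in Python, using only the two probabilities reported above (the variable names are mine):

```python
p_preponderance = 0.02  # predicted probability at complete preponderance
p_parity = 0.10         # predicted probability at total parity


def odds(p: float) -> float:
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1 - p)


# Ratio of probabilities (the "fivefold" figure).
risk_ratio = p_parity / p_preponderance

# Ratio of odds, which is what a logit coefficient speaks to directly.
odds_ratio = odds(p_parity) / odds(p_preponderance)

print(f"risk ratio: {risk_ratio:.2f}, odds ratio: {odds_ratio:.2f}")
```

With probabilities this small the two ratios are close, so the distinction does not change the substantive story.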

These are crude tests. One can quibble with the model specification. But as a first cut, this suggests that $$M$$ is doing what it should. Any measure of relative capabilities that did not indicate that parity is associated with conflict would be immediately suspect.

Much remains to be done, but at the moment, I'm thinking that this revised version of $$M$$ is pretty promising. If we look at the values for some of the major powers in the postwar era, there seems to be some face validity to the measure, and it seems to tell us what we'd expect it to about war outcomes and war onset. What do you all think? Any suggestions for how I might further improve the measure using available data?