International Relations, Modeling, National Security

Rock-Paper-Scissors and Arms Races Part 2

My previous posting laid out a simple framework for viewing arms races as a game of rock-paper-scissors (RPS).  This posting examines some basic properties of a mathematical model of RPS, implemented in Excel, that uses the replicator equation to examine the dynamics of innovation within arms races (a version of the Excel workbook is available here).  The analysis is quite simple, and serves as the basis for developing a more complex agent-based model (ABM) capable of relaxing many of the assumptions embedded in the mathematical model.

RPS exists in many situations in the biological world where multiple types of features or strategies exist.  In these situations, seemingly weak strategies remain in populations or ecosystems, and may cycle into dominant positions.  For example, male side-blotched lizards in California possess three distinct markings and mating strategies (see here):

  • Orange males are large, highly aggressive and patrol large territorial areas;
  • Blue males are smaller, less aggressive and patrol smaller territorial areas;
  • Yellow males are the smallest, have the same markings as females, and sneak into the territory controlled by other males to mate.

Side-Blotched Lizards

The population of side-blotched lizards experiences cycles in the numbers of orange, blue and yellow males.  When the population of orange males increases, blue males are forced from their territory, while yellow males are capable of infiltrating the territory of orange males.  As the number of yellow males increases, the number of orange males declines while the number of blue males rises, because blue males can successfully defend their smaller territories from yellow infiltration.  As the number of blue males increases, the number of yellow males declines and new opportunities emerge for orange males, which eventually displace the smaller, less aggressive blue males.  Thus, this pattern repeats, with no color or strategy driven to extinction.

The basic RPS game can be shown as a simple game matrix depicting the payoffs of two players (row and column) based on the strategies they play.

Payoffs are listed as row player / (column player).

Row \ Column | Rock          | Paper         | Scissors
Rock         | Draw / (Draw) | Lose / (Win)  | Win / (Lose)
Paper        | Win / (Lose)  | Draw / (Draw) | Lose / (Win)
Scissors     | Lose / (Win)  | Win / (Lose)  | Draw / (Draw)

By modeling the population of rock, paper and scissors using the replicator dynamics, alternative patterns emerge based on different values for the payoffs of win, lose and draw.

Rather than provide the general form of the replicator equation, which I find difficult to interpret in the abstract, I’ll present its specific form for RPS below.  (A general version of the replicator equation can be found here.)  The basic premise of the replicator model is that strategies with fitness higher than the population’s average will grow, while those with fitness lower than the average will decline.  First, the calculations for the fitness of rocks, papers, and scissors are:

Rock fitness = initial fitness + (probability of encountering a rock * payoff of a draw) + (probability of encountering a paper * payoff of a lose) + (probability of encountering a scissor * payoff of a win)
Paper fitness = initial fitness + (probability of encountering a rock * payoff of a win) + (probability of encountering a paper * payoff of a draw) + (probability of encountering a scissor * payoff of a lose)
Scissor fitness = initial fitness + (probability of encountering a rock * payoff of a lose) + (probability of encountering a paper * payoff of a win) + (probability of encountering a scissor * payoff of a draw)

Note that initial fitness is a term that ensures that the absolute fitness of a member of the population is never negative.  Thus, even if a strategy loses big, its fitness will always be positive.  In the model, this simply means selecting a number large enough that the relative fitness ratios can change while every fitness value stays greater than zero (for example, a number larger than the magnitude of the most negative payoff).

Second, the average fitness of the population is:

Average fitness = (percentage rocks * rock fitness) + (percentage paper * paper fitness) + (percentage scissors * scissor fitness)

Third, the future percentages of rocks, papers, and scissors in the population are based on the relative fitness of each at the moment:

Future rocks = current percentage of rocks * (rock fitness / average fitness)
Future papers = current percentage of paper * (paper fitness / average fitness)
Future scissors = current percentage of scissors * (scissor fitness / average fitness)
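
The three steps above can be sketched in Python as a single update.  This is only an illustration, not the Excel implementation: the function name replicator_step and its argument names are hypothetical, and the example values (win = 2, lose = -2, draw = 0, initial fitness = 5) are chosen for illustration.

```python
def replicator_step(shares, win, lose, draw, initial_fitness):
    """Advance the (rock, paper, scissors) shares by one time step."""
    rock, paper, scissors = shares

    # Step 1: fitness of each strategy against the current population mix.
    rock_fit = initial_fitness + rock * draw + paper * lose + scissors * win
    paper_fit = initial_fitness + rock * win + paper * draw + scissors * lose
    scissors_fit = initial_fitness + rock * lose + paper * win + scissors * draw

    # Step 2: population-weighted average fitness.
    avg_fit = rock * rock_fit + paper * paper_fit + scissors * scissors_fit

    # Step 3: each share is scaled by its fitness relative to the average.
    return (rock * rock_fit / avg_fit,
            paper * paper_fit / avg_fit,
            scissors * scissors_fit / avg_fit)


# One step from a nearly even population with slightly more scissors.
next_shares = replicator_step((0.333, 0.333, 0.334),
                              win=2, lose=-2, draw=0, initial_fitness=5)
print(next_shares)
```

Because scissors are slightly over-represented at the start, rock (which beats scissors) gains a little share in this step and paper (which loses to scissors) gives a little up.  The new shares still sum to one, since each is divided by the same average fitness.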

What follows below shows the results of the mathematical model for a variety of alternative populations based on alternative payoff structures.

Population 1
Initial percentage rocks = 0.333
Initial percentage paper = 0.333
Initial percentage scissors = 0.334
Initial fitness = 5
Win = 2
Lose = -2
Draw = 0

Population 1

The dynamics show cycles where the population starts relatively evenly distributed between rocks, papers, and scissors, but tends towards extremes.  As the population pushes towards more and more extreme values (asymptotically approaching shares of 1.0 and 0.0 for each type, depending on the cycle) the stages get longer and longer.  Importantly, this structure of the model is symmetric with respect to wins and losses (2 and -2 respectively) while draws are of no consequence one way or another.
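
As a rough check on this drift towards the extremes, the Population 1 settings can be iterated in a few lines of Python.  This is a sketch under my own assumptions rather than a copy of the Excel model: the names step, history and share_product are hypothetical, and the product of the three shares is used only as a convenient summary (it is largest when the shares are equal and falls towards zero as any one type is squeezed out).

```python
def step(shares, win=2.0, lose=-2.0, draw=0.0, base=5.0):
    """One replicator update of (rock, paper, scissors) shares."""
    r, p, s = shares
    fit = (base + r * draw + p * lose + s * win,   # rock
           base + r * win + p * draw + s * lose,   # paper
           base + r * lose + p * win + s * draw)   # scissors
    avg = r * fit[0] + p * fit[1] + s * fit[2]
    return tuple(x * f / avg for x, f in zip(shares, fit))


def share_product(shares):
    """Product of the three shares; shrinks as the mix becomes lopsided."""
    r, p, s = shares
    return r * p * s


shares = (0.333, 0.333, 0.334)   # Population 1 starting point
history = [shares]
for _ in range(1000):
    shares = step(shares)
    history.append(shares)

# Report the largest share and the share product at a few checkpoints.
for t in (0, 250, 500, 750, 1000):
    print(t, round(max(history[t]), 4), share_product(history[t]))
```

With these symmetric payoffs the average fitness stays at the initial fitness, so each update multiplies a share by one plus a small positive or negative term.  The share product can only fall from one step to the next in this setting, which is one way to see why the cycles widen rather than settle.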

If draws are introduced with a negative payoff, as might occur in cases where a costly war results from both sides employing similar strategies, then the structure of the model changes.

Population 2
Initial percentage rocks = 0.333
Initial percentage paper = 0.333
Initial percentage scissors = 0.334
Initial fitness = 5
Win = 2
Lose = -2
Draw = -0.2

The introduction of small costs to a draw (holding everything else constant) slows the race towards the increasingly prolonged and extreme peaks, but does not fundamentally alter the structure of the results.  However, as the costs of a draw rise, a new dynamic emerges.

Population 3
Initial percentage rocks = 0.333
Initial percentage paper = 0.333
Initial percentage scissors = 0.334
Initial fitness = 5
Win = 2
Lose = -2
Draw = -2

In this case, the population converges to equilibrium, a steady state in which the shares of rocks, papers, and scissors in the population remain constant.  Because all of the payoffs are symmetrical with respect to strategies (a rock losing to paper has the same payoff as a scissor losing to rock), the equilibrium is the equal distribution of strategies across all types: approximately 0.33333333 each for rocks, papers, and scissors.  Moreover, this outcome is robust to initial conditions.  For example, if the population starts out as 0.90 rock, 0.09 paper, and 0.01 scissors, the result is nevertheless a convergence to the same equilibrium.

Population 3a
Initial percentage rocks = 0.9
Initial percentage paper = 0.09
Initial percentage scissors = 0.01
Initial fitness = 5
Win = 2
Lose = -2
Draw = -2

The population cycles at first, but the oscillations eventually damp out into a steady state.
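
The robustness claim can be probed with a short Python sketch that starts from the lopsided 0.90 / 0.09 / 0.01 mix and uses a draw payoff of -2.  The names step and shares are hypothetical, and the 2000-step horizon is an arbitrary choice of mine, not a setting from the Excel model.

```python
def step(shares, win=2.0, lose=-2.0, draw=-2.0, base=5.0):
    """One replicator update of (rock, paper, scissors) shares."""
    r, p, s = shares
    fit = (base + r * draw + p * lose + s * win,   # rock
           base + r * win + p * draw + s * lose,   # paper
           base + r * lose + p * win + s * draw)   # scissors
    avg = r * fit[0] + p * fit[1] + s * fit[2]
    return tuple(x * f / avg for x, f in zip(shares, fit))


shares = (0.90, 0.09, 0.01)      # heavily rock-dominated start
for t in range(1, 2001):
    shares = step(shares)
    if t in (1, 5, 10, 50, 2000):
        # Watch paper surge first, then scissors, before the mix evens out.
        print(t, tuple(round(x, 4) for x in shares))
```

In the first few steps paper gains at rock’s expense, which then opens the door for scissors.  The swings shrink over time and the shares end up close to one third each.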

An alternate configuration of the model is for the draw to provide a positive payoff.  Perhaps this would represent successful deterrence and even the emergence of militarily useful, but ultimately transnational innovations such as communications satellites and GPS.

Population 4
Initial percentage rocks = 0.333
Initial percentage paper = 0.333
Initial percentage scissors = 0.334
Initial fitness = 5
Win = 2
Lose = -2
Draw = 0.2

Again, the population enters into cycles, but they are far slower than in the earlier case with the payoff from a draw set to a negative number.  A simple comparison of the graphic above (draw payoff of 0.2) with that of Population 2 (draw payoff of -0.2) shows the change in the speed of the strategic cycles.  Counting the cycles that approach 1.0, the positive draw value is on cycle number 5, starting with the blue percentage-rocks cycle at time step 254 or so and running through the paper cycle that remains dominant at the end of 1000 steps.  By comparison, the negative draw value is on cycle 6 (starting with a paper cycle on step 622) and is entering into a new paper cycle at the end of the simulation.  Importantly, the negative payoffs for a draw accelerate the dynamics with respect to time, cycling faster than the positive draw values, but slow the approach to the increasingly extreme values of 1.0 and 0.0.
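
Counting cycles by eye from a chart is imprecise, so here is one hedged way to count them in Python.  Everything about the counting rule is my own assumption: a "cycle" is taken to be an episode in which the largest share rises above a threshold of 0.9, and the names step and count_dominance_episodes are hypothetical.  The numbers it prints depend on that threshold and need not match the counts read off the charts.

```python
def step(shares, draw, win=2.0, lose=-2.0, base=5.0):
    """One replicator update of (rock, paper, scissors) shares."""
    r, p, s = shares
    fit = (base + r * draw + p * lose + s * win,   # rock
           base + r * win + p * draw + s * lose,   # paper
           base + r * lose + p * win + s * draw)   # scissors
    avg = r * fit[0] + p * fit[1] + s * fit[2]
    return tuple(x * f / avg for x, f in zip(shares, fit))


def count_dominance_episodes(draw, steps=1000, threshold=0.9):
    """Count how many times the leading share rises above the threshold."""
    shares = (0.333, 0.333, 0.334)
    episodes = 0
    above = False
    for _ in range(steps):
        shares = step(shares, draw)
        now_above = max(shares) > threshold
        if now_above and not above:
            episodes += 1        # a new dominance episode has begun
        above = now_above
    return episodes


for draw in (0.2, -0.2):
    print("draw =", draw, "episodes =", count_dominance_episodes(draw))
```

Each handover from one dominant type to the next passes through a point where the two are roughly equal, so the leading share dips below the threshold between episodes and successive episodes are counted separately.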

If the positive value of a draw is increased even higher, to the same value as a win, then a new dynamic emerges where the payoffs from imitating others become greater than those from innovating against them.

Population 5
Initial percentage rocks = 0.333
Initial percentage paper = 0.333
Initial percentage scissors = 0.334
Initial fitness = 5
Win = 2
Lose = -2
Draw = 2

In this case, the model settles into a steady state where one strategy dominates the others.  There is no intrinsic reason why one particular strategy ends up in this position; which one does is a feature of the particular payoff ratios and the structure of the initial population.  In these cases, the incentives for imitation are far higher than those for innovation, resulting in a static system.
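
A short Python sketch of the Population 5 settings shows the same lock-in.  The names step, history and final are hypothetical, and the sketch deliberately does not assume which strategy ends up on top; it only reports the shares at the end of 1000 steps.

```python
def step(shares, win=2.0, lose=-2.0, draw=2.0, base=5.0):
    """One replicator update of (rock, paper, scissors) shares."""
    r, p, s = shares
    fit = (base + r * draw + p * lose + s * win,   # rock
           base + r * win + p * draw + s * lose,   # paper
           base + r * lose + p * win + s * draw)   # scissors
    avg = r * fit[0] + p * fit[1] + s * fit[2]
    return tuple(x * f / avg for x, f in zip(shares, fit))


shares = (0.333, 0.333, 0.334)   # Population 5 starting point
history = [shares]
for _ in range(1000):
    shares = step(shares)
    history.append(shares)

final = history[-1]
names = ("rock", "paper", "scissors")
leader = names[final.index(max(final))]
print("final shares:", final)
print("largest share belongs to:", leader)
```

With a draw worth as much as a win, meeting a copy of yourself is never worse than meeting your prey, so a type that gets ahead keeps a fitness advantage over the type it beats.  That is the imitation incentive at work.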

This final case has an intuitive appeal when looking at the Cold War’s military history.  While the Cold War contained a fierce arms race that included incremental improvements to WWII’s dominant weapons and intelligence systems, historical accounts of the end of WWII are surprisingly similar to the first post-Cold War conflict, Desert Storm.  For example, Vannevar Bush’s account of warfare at sea immediately after World War II featured TV-guided bombs and foresaw the rise of cruise missiles and other smart weapons that would not emerge as central features of the US military until the Revolution in Military Affairs (RMA) of the late 1980s and 1990s.  It may be that the long delay in the emergence of these weapons resulted from the Cold War’s nuclear stalemate and the prospect that a conventional war would escalate.  Thus, from a strategic perspective, the payoffs of a draw, or symmetrical nuclear arms, were so high that they may have even exceeded those of winning a conflict outright.  Avoiding nuclear war achieved a higher payoff than victory in a conventional one could provide.

The strategic cycling of the mathematical model is a useful property, and the fact that the model is responsive to different payoff structures is also quite helpful.  However, the model is missing many essential features that characterize real-world international relations and strategy.  For example, the model has no notion of geography, distance, or the relative costs associated with different armaments or strategies.  This is important, because one might expect that states that share long borders, water sources, or other resources would be particularly sensitive to each other’s armaments and strategies.  Likewise, some states may have an easier time acquiring advanced technologies, and would benefit from styles of warfare and strategies that emphasize long-range precision strikes, while others may have an abundance of military-aged population and little access to technology, and thus prefer mass warfare.  Additionally, each member of the population in the mathematical model has perfect information about the capabilities and strategies of the others, an assumption that is highly problematic in reality, as demonstrated by the flawed estimates of Iraqi WMD capabilities prior to the 1991 and 2003 wars.  In fact, there is really no notion of a population member making choices in the mathematical model: since the dynamics govern the population as a whole, the model provides no insight into the behavior of its individual members.

Another problem with the mathematical model is strictly technical, not conceptual, although it flows from the point raised above regarding the aggregation of the population.  The population in the mathematical model is continuous, not discrete.  This means that it is infinitely divisible, allowing the share of each type to move continuously closer to 1.0 or 0.0 without ever reaching it.  In biological systems, this representation becomes problematic: as populations become incredibly small they can lose their genetic diversity, and at the extremes, concerns about gender ratios, the fertility of individuals, and even geographic separation can become a consideration.  In military terms, the extreme values may mean that a state has ½ of a missile or bomber or some other armament, which can be difficult to interpret.  For example, a fighter that is missing an engine and avionics is not a threat, but the unassembled components of a nuclear weapon might be.

These kinds of extensions are difficult to implement mathematically, and future postings will explore the movement of this model into a new computational framework of an ABM in order to extend the RPS model to include geography, heterogeneity amongst actors, and imperfect information.

 
