# Resolving the Prisoner's Dilemma

Recall the Prisoner’s Dilemma: If one player stays Quiet, while the other Betrays them, the Betrayer will receive only 1 year in prison, while the one who stayed Quiet will receive 10 years. If both Betray each other, they will each receive 5 years in prison, while if they both stay Quiet, they will only receive 2 years each.

Mutual Betrayal is the one and only Nash Equilibrium of this game. Rational players will each play Betray, and they will each receive 5 years in prison. This is true despite the fact that mutual Quiet would yield them each only 2 years in prison. Mutual Cooperation (both play Quiet) is not an equilibrium, as each player can do better by changing to Betray.
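The equilibrium claim can be checked mechanically: an outcome is a Nash Equilibrium exactly when neither player can shorten their own sentence by changing their move alone. A minimal Python sketch of that test, using the sentences described above (names and layout are just for illustration):

```python
# Payoff check for the Prisoner's Dilemma described above.
# Entries are years in prison for (row player, column player);
# lower is better.

YEARS = {
    ("Quiet",  "Quiet"):  (2, 2),
    ("Quiet",  "Betray"): (10, 1),
    ("Betray", "Quiet"):  (1, 10),
    ("Betray", "Betray"): (5, 5),
}
MOVES = ("Quiet", "Betray")

def is_equilibrium(row_move, col_move):
    """Neither player can reduce their sentence by deviating alone."""
    row_years, col_years = YEARS[(row_move, col_move)]
    row_ok = all(YEARS[(m, col_move)][0] >= row_years for m in MOVES)
    col_ok = all(YEARS[(row_move, m)][1] >= col_years for m in MOVES)
    return row_ok and col_ok

for r in MOVES:
    for c in MOVES:
        print(r, c, is_equilibrium(r, c))
```

Running this prints `True` only for Betray/Betray: mutual Betrayal is the lone equilibrium, exactly as claimed.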

There is no way, in the Prisoner’s Dilemma as presented, to get rational players to each play Quiet.

However, the game can be changed in many different ways to make Mutual Cooperation possible, though these changes rarely make it guaranteed.

The first option, perhaps inspired by Thomas Hobbes' all-powerful Leviathan, is to introduce some form of either punishment for Betrayal, or reward for staying Quiet. This needs to be built into the mechanisms of the game, and not dependent on player choice. Consider this Prisoner's Dilemma as a starting point (payoffs are years in prison for Row, Column; lower is better):

|            | Quiet | Betray |
|------------|-------|--------|
| **Quiet**  | 2, 2  | 10, 1  |
| **Betray** | 1, 10 | 5, 5   |

One option would be to add a two-point reward (two years off the sentence) for staying Quiet, while another would be to add a two-point penalty (two extra years) for playing Betray. The results of each of these options are as follows:

Prisoner’s Dilemma with Bonus for Quiet

|            | Quiet | Betray |
|------------|-------|--------|
| **Quiet**  | 0, 0  | 8, 1   |
| **Betray** | 1, 8  | 5, 5   |

Prisoner’s Dilemma with Punishment for Betray

|            | Quiet | Betray |
|------------|-------|--------|
| **Quiet**  | 2, 2  | 10, 3  |
| **Betray** | 3, 10 | 7, 7   |

These games are Stag Hunts: each has two Nash Equilibria in pure strategies, one of which (mutual Betrayal) is risk dominant, while the other (mutual Quiet) is payoff dominant. The external intervention makes the best outcome possible, but not guaranteed. Designers need to keep this in mind if they want to resolve a Prisoner's Dilemma this way: simply changing the payoffs is not enough. You cannot expect players to switch to Quiet just because it is now an equilibrium solution; something more would need to be done if you genuinely want to influence player behaviour.
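The same "can anyone gain by deviating alone?" test can be applied to the two modified games. A sketch, assuming the two-point adjustments are applied to the prison sentences described earlier (the exact numbers are illustrative):

```python
# Find all pure-strategy equilibria of the base game and of the two
# modified variants. Payoffs are years in prison (lower is better).

MOVES = ("Quiet", "Betray")

BASE = {
    ("Quiet",  "Quiet"):  (2, 2),
    ("Quiet",  "Betray"): (10, 1),
    ("Betray", "Quiet"):  (1, 10),
    ("Betray", "Betray"): (5, 5),
}

def adjust(bonus_quiet=0, penalty_betray=0):
    """Build a variant: subtract years for Quiet, add years for Betray."""
    game = {}
    for (r, c), (ry, cy) in BASE.items():
        ry += -bonus_quiet if r == "Quiet" else penalty_betray
        cy += -bonus_quiet if c == "Quiet" else penalty_betray
        game[(r, c)] = (ry, cy)
    return game

def pure_equilibria(game):
    """Outcomes where neither player can shorten their own sentence."""
    eqs = []
    for r in MOVES:
        for c in MOVES:
            ry, cy = game[(r, c)]
            if all(game[(m, c)][0] >= ry for m in MOVES) and \
               all(game[(r, m)][1] >= cy for m in MOVES):
                eqs.append((r, c))
    return eqs

print(pure_equilibria(BASE))                      # mutual Betrayal only
print(pure_equilibria(adjust(bonus_quiet=2)))     # both mutual outcomes
print(pure_equilibria(adjust(penalty_betray=2)))  # both mutual outcomes
```

Both interventions turn the single-equilibrium dilemma into a two-equilibrium Stag Hunt, which is the whole point: mutual Quiet becomes stable, but mutual Betrayal remains stable too.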

Another proposed resolution relies on the idea of reputation. Intuitively, players won’t Betray players that they have a chance of interacting with again in the future. They could build up a reputation of staying Quiet, and rely on that in future plays, putting some weight behind their promise of playing Quiet again.

Suppose that two players end up playing the Prisoner’s Dilemma for some fixed period of time, say every day for a month (Pris-tober?). Surely that is sufficient time to establish a reputation for cooperation, and cooperate for a good portion of the month, no? Well, no.

Think ahead to October 31, the last day of the month. What will the players do? It is tempting to say that they will continue to cooperate and play Quiet, as they have been doing all month. However, this is their last dilemma: they are not playing again after this. What is a reputation for staying Quiet worth now? Players might as well switch to Betray in the hopes of getting their maximal benefit in that last game, just as if it were a “one-shot” game.

Each player will play Betray on October 31st, and each player knows this about the other. What will they do on October 30th, then? Any reputation for cooperation is all for naught, as the players know that mutual Betrayal awaits them on the 31st. Why cooperate with someone who will Betray you on the very next game? Players are better off playing Betray on the 30th as well.

This pattern continues backwards all the way back to October 1st, with the end result being that players will simply Betray every day. This takes the original dilemma and makes it worse: a full month of Betrayal.

The problem with this setup is the fixed end point (a **Finite Horizon**). Players are able to look ahead to the last game, and reason back to what they should do in each successive game.
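The backward-induction argument above can be sketched in code. This is an illustration of the reasoning, not a full game solver: in any single round, Betray is a strict best response to either move, so the known last round unravels, and with the last round fixed, each earlier round is effectively one-shot too.

```python
# Backward-induction sketch for a repeated Prisoner's Dilemma with a
# known final round. One-shot payoffs are years in prison (lower is better).

YEARS = {
    ("Quiet",  "Quiet"):  (2, 2),
    ("Quiet",  "Betray"): (10, 1),
    ("Betray", "Quiet"):  (1, 10),
    ("Betray", "Betray"): (5, 5),
}
MOVES = ("Quiet", "Betray")

def best_move(opponent_move):
    """One-shot best response: the move with the shorter sentence."""
    return min(MOVES, key=lambda m: YEARS[(m, opponent_move)][0])

def finite_horizon_play(rounds):
    # Betray is the strict best response to either move...
    assert all(best_move(opp) == "Betray" for opp in MOVES)
    # ...so players Betray in the known last round; with that fixed, the
    # second-to-last round is effectively one-shot, and so on backwards.
    return ["Betray"] * rounds

print(finite_horizon_play(31).count("Betray"))  # 31: a full month of Betrayal
```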

If there is an indefinite horizon, though, things are remarkably different. Suppose that there is always a probability that the players will play the Prisoner’s Dilemma again the next day — it’s not guaranteed, but it is not impossible. If they ever end up not playing, the series of games ends, and they never meet to play again. Intuitively, without knowing when the series will end, players won’t know when to switch from Quiet to Betray.

In this larger game of the indefinitely repeated Prisoner’s Dilemma, there are infinitely many strategies available. Any rule that tells you when to Betray and when to stay Quiet is a strategy in this game. “Always play Quiet” is one strategy, as is “alternate between Quiet and Betray each game,” but two strategies in particular are worth exploring.

One strategy is inspired by “the Foole” in Thomas Hobbes’ Leviathan, who is described as follows:

“He, therefore, that breaketh his Covenant, and consequently declareth that he thinks he may with reason do so, cannot be received into any society that unite themselves for Peace and Defense, but by the error of them that receive him.”

Hobbes’ Foole agrees to Covenants with others, but then breaks them for personal gain. The Foole strategy is “Always Betray.”

Another strategy draws inspiration from Polemarchus in Plato’s Republic: “Justice is to render every man his due … doing good to friends and harm to enemies.” This strategy begins with goodwill: it plays Quiet, but if the other player ever Betrays, it plays Betray forever after. It is known as the Grim Trigger strategy.
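The two strategies are simple enough to write down as rules over the history of play. A sketch (the function names and history format are illustrative):

```python
# The Foole and Grim Trigger as functions from the opponent's past
# moves to a move in the current round.

def foole(opponent_history):
    """Hobbes' Foole: Betray, always."""
    return "Betray"

def grim_trigger(opponent_history):
    """Start Quiet; once the opponent has ever Betrayed, Betray forever."""
    return "Betray" if "Betray" in opponent_history else "Quiet"

def play(strategy_a, strategy_b, rounds):
    """Run the repeated game and return each player's sequence of moves."""
    history_a, history_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        history_a.append(move_a)
        history_b.append(move_b)
    return history_a, history_b

a, b = play(grim_trigger, foole, 4)
print(a)  # ['Quiet', 'Betray', 'Betray', 'Betray']
print(b)  # ['Betray', 'Betray', 'Betray', 'Betray']
```

Grim Trigger extends goodwill exactly once; after the Foole's first Betrayal, both players Betray for the rest of the series. Two Grim Triggers, by contrast, stay Quiet forever.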

Consider this version of the Prisoner’s Dilemma:

If the probability that another game of the Prisoner’s Dilemma follows each one is 60%, then (skipping the details of the math), the game where players can choose between the Foole and Grim Trigger looks like this:

This game is also a Stag Hunt. The Foole is akin to Hare hunting: risk dominant with a guaranteed minimum payoff of 2½. Grim Trigger, which takes a leap of faith with its initial cooperation, is akin to Stag hunting.
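The skipped math is a geometric series: each expected total is the first-round payoff plus the discounted stream of later payoffs. A hedged reconstruction, where the stage-game point payoffs (Temptation 3, Reward 2, Punishment 1, Sucker 0) are assumptions chosen to match the 2½ minimum mentioned above, not necessarily the article's exact table:

```python
from fractions import Fraction

# delta: the 60% chance that another game follows this one.
# Exact fractions avoid floating-point noise in the series sums.
delta = Fraction(3, 5)
T, R, P, S = 3, 2, 1, 0  # assumed point payoffs; higher is better here

def expected_total(first, rest):
    """First-round payoff plus the discounted stream of repeated payoffs."""
    return first + delta * rest / (1 - delta)

grim_vs_grim   = expected_total(R, R)  # cooperate forever
foole_vs_foole = expected_total(P, P)  # betray forever
foole_vs_grim  = expected_total(T, P)  # one windfall, then punishment
grim_vs_foole  = expected_total(S, P)  # one sucker payoff, then punishment

print(float(grim_vs_grim), float(foole_vs_foole),
      float(foole_vs_grim), float(grim_vs_foole))
# 5.0 2.5 4.5 1.5
```

Under these assumed payoffs, the matrix is a Stag Hunt: mutual Grim Trigger (5 each) and mutual Foole (2½ each) are both equilibria, with the Foole's 2½ the guaranteed floor and mutual Grim Trigger the payoff-dominant outcome.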

This resolves the Prisoner’s Dilemma by making mutual cooperation possible, but it does not make it guaranteed.

There are some morals for game designers here. First, there is no way to resolve the Prisoner's Dilemma without changing the game somehow, so if you find a dilemma like this in your game, you actually do need to work to resolve it. Second, resolving it only makes mutual cooperation possible; players could still sensibly end up with mutual betrayal. By tweaking the relative payoffs, you could make for very interesting dilemmas in a game with social deduction elements (akin to Dead of Winter). A game could even be built around a Prisoner's Dilemma at its base, with the players finding their own way to turn it into a Stag Hunt, and then to reach mutual cooperation.

The key to this particular resolution was the indefinite, as opposed to fixed, timeline. If players know when the game will end, they can reason back from that point and let it inform their earlier actions. Keep this in mind when deciding whether you want a fixed end condition or a variable one. An interesting mechanism could be one that extends the number of times players interact based on how they interact: perhaps cooperation leads to more rounds, but betrayal gives greater payoffs?

Here is another way to change the game to make mutual cooperation possible. Think of good, old “Rock, Paper, Scissors.” Imagine it with just Rock and Scissors: what would players play? The best strategy is all Rock, all the time, because it guarantees no worse than a tie (Rock **weakly dominates** Scissors).
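Weak dominance has a precise meaning: Rock never does worse than Scissors against either reply, and sometimes does strictly better. A quick check, using illustrative payoffs of +1 for a win, 0 for a tie, and -1 for a loss:

```python
# Verify that Rock weakly dominates Scissors in the two-move game.

def payoff(mine, theirs):
    """Row player's payoff in "Rock, Scissors": +1 win, 0 tie, -1 loss."""
    wins = {("Rock", "Scissors"): 1, ("Scissors", "Rock"): -1}
    return wins.get((mine, theirs), 0)

rock_at_least_as_good = all(
    payoff("Rock", t) >= payoff("Scissors", t) for t in ("Rock", "Scissors")
)
rock_sometimes_better = any(
    payoff("Rock", t) > payoff("Scissors", t) for t in ("Rock", "Scissors")
)
print(rock_at_least_as_good and rock_sometimes_better)  # True
```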

“Rock, Scissors” is an uninteresting game, with players each choosing the strategy that leads to an uninteresting outcome. In this respect it is very much like the Prisoner's Dilemma.

What happens when you add Paper, a strategy that beats the dominant Rock but loses to the dominated Scissors, to the mix? The result, classic “Rock, Paper, Scissors,” is a game where the equilibrium strategy is to randomize equally among the three choices. This makes all outcomes of the game possible, not just the Rock-Rock tie.

The moral of “Rock, Paper, Scissors” for the Prisoner's Dilemma is that, if you want a certain outcome to become possible (like Rock beating Scissors, or even Scissors-Scissors, neither of which occurs in rational play of “Rock, Scissors”), you can introduce a brand new strategy to dislodge the previous equilibrium. By tweaking the payoffs, you can tweak the probabilities in the equilibrium mix, and hence the probabilities of the various outcomes.
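What makes the equal-thirds mix an equilibrium is that, against it, every pure reply earns the same expected payoff, so no player has a profitable deviation. A sketch with the usual +1/0/-1 payoffs:

```python
from fractions import Fraction

# Who beats whom in classic "Rock, Paper, Scissors".
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}

def payoff(mine, theirs):
    """Row player's payoff: +1 win, 0 tie, -1 loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def expected(mine, mix):
    """Expected payoff of a pure move against a mixed strategy."""
    return sum(prob * payoff(mine, theirs) for theirs, prob in mix.items())

uniform = {m: Fraction(1, 3) for m in BEATS}
# Every pure reply earns 0 against the uniform mix: no move does better,
# which is exactly what sustains the mixed equilibrium.
print([expected(m, uniform) for m in BEATS])
```

Changing the win/loss payoffs shifts which mix makes opponents indifferent, which is how payoff tweaks translate into different equilibrium probabilities, and hence different outcome frequencies.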

Adding a strategy to the Prisoner's Dilemma does not mean that it has to become a combative cycle like “Rock, Paper, Scissors." The added strategy, when chosen properly, can make any number of positive player interactions (including mutual cooperation) possible. Like all resolutions to the Prisoner's Dilemma, though, it only makes these outcomes possible, not guaranteed.