1. Introduction

An open access resource such as a fishing ground, an irrigation system, or a forest is called a common-pool resource (CPR). According to Ostrom et al. (1999), “CPRs include natural and human-constructed resources in which (i) exclusion of beneficiaries through physical and institutional means is especially costly, and (ii) exploitation by one user reduces resource availability for others” (see also Ostrom 2006, 151). Since a CPR is rival but, unlike a private good, costly to exclude users from, a fish caught by one appropriator is no longer available for others to catch, while others can hardly be prevented from fishing. In such a situation, each appropriator has an incentive to exploit as many units as possible. This may eventually lead to overexploitation or the destruction of the CPR, which is well known as “the tragedy of the commons” (Hardin 1968). The problem is often formalized in terms of game theory and the concept of a Nash equilibrium; in general, the resources are overexploited at the Nash equilibria of CPR dilemma games (e.g. see Gordon 1954; Gould 1972; Dasgupta and Heal 1979; Falk et al. 2001).

However, the static analysis based on the Nash equilibrium entails a number of restrictive assumptions, namely that players (i) are rational, (ii) have complete information about others, and (iii) respond simultaneously. The problem is that these assumptions are often violated in realistic situations, including laboratory experiments. For example, actual players may not be rational enough to determine the Nash equilibrium immediately, and may require a series of trials to achieve it. Thus, it is important to consider how players’ allocation strategies evolve over time and whether they converge to a Nash equilibrium. Now, let us say that a Nash equilibrium of a game is dynamically stable (or simply, stable) if it gives an asymptotically stable fixed point of a dynamic version of the game; a fixed point is asymptotically stable if all nearby solutions not only stay nearby, but also tend to the fixed point (Hirsch and Smale 1974). Here, a “dynamic version” refers to a system of differential or difference equations that defines the way players adjust their behavior over time. Thus, the stability of an equilibrium depends on what rule is adopted for the behavioral adjustment.

In this study, we investigate the conditions for the dynamic stability of the Nash equilibrium of a CPR dilemma game. However, to define the dynamics, we need to specify the decision-making rule (or the rule for behavioral adjustment) of the players. This is an important issue, given that stability is largely dependent on the rule. Experiments provide a hint as to the solution to this problem. In their experiments, Healy (2006) and Healy and Mathevet (2012) implemented five types of public goods mechanisms, including the voluntary contribution mechanism. They found that subjects apparently adopt myopic responses to the decisions by the other players in the previous round, or in some recent rounds. Given this observation, it seems reasonable to model a myopic decision rule. In this study, for simplicity, we focus on the best-response dynamics, where at each discrete time step, each player adopts the best response to the other players’ decisions in the previous time step, although there are many potential ways of modeling myopic decision-making [e.g. see the dynamic oligopoly models in Bischi et al. (2009)]. We believe, however, that our arguments apply qualitatively to other myopic decision rules as well.

We first provide a detailed theoretical basis for the instability of the Nash equilibrium of a CPR dilemma game, although our analysis is confined to deterministic best-response dynamics. Our theory applies to a broad range of production functions, including those adopted by Ostrom and others as special cases (Walker et al. 1990). The best-response dynamics of our CPR dilemma game are described by a set of difference equations. The key factors of instability in these dynamics are myopic decision-making (owing to limited rationality), the nonlinearity of payoff functions, and the number of appropriators. Using a local stability analysis, we show that with at least four appropriators, an interior equilibrium is unstable (we also find that the equilibrium is stable if the number of appropriators is two). Thus, the theoretical foundations of the tragedy of the commons based on a static analysis might not be reliable, and we need to reconstruct these foundations based on dynamic analyses instead.

The dynamic instability of the CPR dilemma game also has an implication for the efficiency of the resource use. Here, we say that a CPR is efficiently (inefficiently) used if the social payoff is equal to (lower than) its possible maximum, where the social payoff is defined as the sum of the payoffs of all appropriators, less the sum of the initial endowments of all appropriators (see Section 4). By definition, efficiency in our terminology implies Pareto efficiency, but not vice versa. A well-known common property of CPR dilemma games is that the Nash equilibrium is inefficient (Hanley et al. 2007; Wiesmeth 2011). In fact, we reveal that the instability is likely to bring additional inefficiency to the system; that is, the social payoff averaged over time may be even lower than that at the Nash equilibrium. This implies that previous authors have underestimated the level of inefficiency in CPR problems. Thus, instability may have practical implications for the management of CPRs and, therefore, deserves detailed mathematical and empirical investigations.

The rest of the paper is organized as follows. Sections 2 and 3 provide specific examples of CPR dilemma games with an unstable Nash equilibrium. Section 4 provides a general mathematical result as to the instability of the Nash equilibrium. Section 5 provides statistical analyses relating our results to experimental data. Lastly, Section 6 summarizes our findings and discusses their implications.

2. Example 1 (The WGO model)

The experiment of Walker et al. (1990) (henceforth, WGO) is remarkable in two respects. First, it was the first to use a nonlinear payoff function for a CPR experiment. Since the procedure for experiments with nonlinear payoff functions is complicated, many experimental social scientists still use linear payoff functions (Noussair et al. 2011; Osés-Eraso and Viladrich-Grau 2011; Botelho et al. 2014; Becchetti et al. 2015), although some do use nonlinear functions (Rodriguez-Sickert et al. 2008; Vyrastekova and Van Soest 2008; Hayo and Vollan 2012; Cason and Gangadharan 2014). Second, the number of appropriators is eight in their experiments, which is unusually high for CPR experiments. For example, among the eight papers we surveyed (Rodriguez-Sickert et al. 2008; Vyrastekova and Van Soest 2008; Noussair et al. 2011; Osés-Eraso and Viladrich-Grau 2011; Hayo and Vollan 2012; Botelho et al. 2014; Cason and Gangadharan 2014; Becchetti et al. 2015), the number of appropriators was four in four papers, five in three papers, and six in one paper. These two properties (i.e. nonlinearity and many appropriators) are noteworthy, because our theory suggests that an interior Nash equilibrium is destabilized when the payoff is nonlinear and the number of appropriators reaches a small threshold (the threshold is four in our model; see Section 4).

In the experiments of Walker et al. (1990) and Ostrom et al. (1994), the average payoffs approached the values predicted by the Nash equilibrium, but subjects rarely played equilibrium strategies. Instead, players showed “unexplained pulsing behavior” in all the experiments (Ostrom 2006). More specifically, subjects increased their investments in the CPR until the payoff dropped significantly, and then reduced the investments, resulting in a recovery of the payoff; this pattern was repeated over time (see also Sturm and Weimann 2006). Keser and Gardner (1999) also observed that fewer than 5% of subjects obeyed the prediction of the Nash equilibrium, and similar patterns have been observed in more recent experiments (Cardenas 2004; Carpenter and Cardenas 2011). We interpret such pulsing behavior as a cyclic solution around an unstable Nash equilibrium, as argued below.

Now, we describe the WGO model, and show that its Nash equilibrium is unstable and a cycle emerges under the parameter values they used in their experiments. Imagine a society with \(n\) (\(\geq 2\)) appropriators, each of whom faces a decision problem of how to divide his/her endowment between labor to catch fish and personal leisure time. Let \(w_i\) be appropriator \(i\)’s initial endowment, which is the total possible leisure time, and \(x_i\) be the labor input for catching fish. Following WGO, let \(f(\sum x_i)=a\sum x_i-b(\sum x_i)^2\) be the production function for the fish. The number of fish that appropriator \(i\) catches is proportional to appropriator \(i\)’s relative contribution to the total labor inputs of all appropriators. Without loss of generality, we set the unit price of fish to one and denote the wage rate by \(p\). Then, appropriator \(i\)’s income or payoff is given by

\[ m_i(x_i, x_{-i}) = p(w_i - x_i) + \frac{x_i}{x_i + x_{-i}}\, f(x_i + x_{-i}) = p(w_i - x_i) + x_i\left[a - b(x_i + x_{-i})\right] \tag{1} \]

where \(x_{-i}=\sum_{j\neq i}x_j\). The optimization problem for appropriator \(i\) is to maximize \(m_i(x_i, x_{-i})\), subject to \(0\leq x_i\leq w_i\), for given \(x_{-i}\). Partial differentiation of eq. (1) with respect to \(x_i\) yields

\[ \frac{\partial m_i}{\partial x_i} = a - p - 2b x_i - b x_{-i} \tag{2a} \]

which is equal to zero when

\[ x_i = \frac{a-p}{2b} - \frac{x_{-i}}{2} \tag{2b} \]

Thus, eq. (2b) gives the best response for appropriator \(i\) if it is feasible (i.e. \(0\leq x_i\leq w_i\)). If it is not feasible, the best response must be a boundary value, because the payoff in eq. (1) is strictly concave in \(x_i\). More specifically, the best response for appropriator \(i\), denoted by \(r_i(x_{-i})\), is given by

\[ r_i(x_{-i}) = \frac{a-p}{2b} - \frac{x_{-i}}{2} \tag{3a} \]

if \((a-p)/b - 2w_i \leq x_{-i} \leq (a-p)/b\),

\[ r_i(x_{-i}) = w_i \tag{3b} \]

if \(x_{-i} < (a-p)/b - 2w_i\), and

\[ r_i(x_{-i}) = 0 \tag{3c} \]

if \(x_{-i} > (a-p)/b\).

The best-response function (3a–c) does not include appropriator-specific parameters, except \(w_i\). Hereafter, we assume that the initial endowment is the same for all appropriators (i.e. \(w_i=w\)), so that they all have the same best-response function and, hence, we may write \(r_i\) as \(r\) (e.g. Walker et al. 1990; Ito et al. 1995). Then, it follows that a list of labor inputs \((x_1^*, \dots, x_n^*)\) is a Nash equilibrium if \(x_i^*=r(x_{-i}^*)\) for all \(i\).

In WGO’s basic setup, the parameter values are \(n=8\), \(p=5\), \(a=23\), \(b=1/4\), and \(w=10\). Substituting these values into eqs. (3a–c), we find that the best-response function is \(r(x_{-i})=36-x_{-i}/2\) for \(52\leq x_{-i}\), and \(r(x_{-i})=10\) for \(x_{-i}<52\). Figure 1 illustrates this best-response function (polygonal line k-f-d). Line 0-c represents \(x_i=x_{-i}/(n-1)=x_{-i}/7\); on this line, all appropriators invest the same amount of labor. The intersection of lines f-d and 0-c (indicated by e) is the Nash equilibrium, where all appropriators invest eight units of labor. Note that the Nash equilibrium is unique and symmetric in this game (Ostrom et al. 1994).
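The piecewise best response can be checked numerically. The following is a minimal sketch, assuming the payoff \(p(w-x_i)+x_i[a-b(x_i+x_{-i})]\) implied by the quadratic production function; the function names `best_response` and `symmetric_nash` are illustrative and do not come from WGO.

```python
def best_response(x_others, a=23.0, b=0.25, p=5.0, w=10.0):
    """Best response to the other appropriators' total labor input.

    Interior solution of the first-order condition, clipped to [0, w].
    """
    interior = (a - p) / (2 * b) - x_others / 2
    return min(max(interior, 0.0), w)


def symmetric_nash(n, a=23.0, b=0.25, p=5.0, w=10.0):
    """Symmetric equilibrium input: solve x = r((n - 1) x), capped at w."""
    interior = (a - p) / (b * (n + 1))
    return min(interior, w)


if __name__ == "__main__":
    for x_others in (35, 52, 56, 70):
        print(x_others, best_response(x_others))
    print("symmetric Nash input for n=8:", symmetric_nash(8))
```

With the WGO parameter values, the kink of the best-response curve sits at \(x_{-i}=52\), and the symmetric fixed point for eight appropriators is an input of eight per person.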

Figure 1: The best-response curve in the WGO model and the stability of the Nash equilibrium. The purple line represents the best response of each appropriator to given values of \(x_{-i}\). Gray lines represent the set of points where all appropriators invest equal amounts of labor (0-e″, 0-e′, and 0-c are for \(n=3\), 6, and 8, respectively). Parameter values are \(p=5\), \(a=23\), \(b=1/4\), and \(w_i=10\) for all \(i\). Green and orange arrows represent the best-response dynamics for the unstable and stable cases, respectively.

Let us consider the best-response dynamics in the WGO model to demonstrate the instability of the equilibrium and the emergence of a cycle. In the best-response dynamics, by definition, appropriator \(i\) chooses the best response to the total labor inputs of the other appropriators at the current time step \(t\) as the labor input at the next time step \(t+1\). For example, let us suppose that appropriators choose \(x^1=(1, 2, 3, 4, 5, 6, 7, 8)\) at the first time step, and then consider appropriator 1’s choice in the second time step. Given that \(x_{-1}^1=35\) (indicated by point a), the best response by appropriator 1 should be \(x_1^2=r(35)=10\) (point b). Similarly, we find that \(x_2^2=r(34)=10\) and that \(x_i^2=10\) for all \(i\). Hence, the labor inputs in the second time step should be given by \(x^2=(10, 10, 10, 10, 10, 10, 10, 10)\) (point c). Following the same argument, we find that \(x_1^3=r(70)=1\) (point d), and likewise for all \(i\). That is, \(x^3=(1, 1, 1, 1, 1, 1, 1, 1)\) (point g). Then, \(x_i^4=r(7)=10\) for all \(i\) (point h) and, hence, \(x^4=x^2\) (point c). Similarly, \(x_i^5=r(70)=1\) for all \(i\) (point d). That is, \(x^5=(1, 1, 1, 1, 1, 1, 1, 1)\) (point g). The system repeats cycle g-h-c-d thereafter and, hence, the Nash equilibrium e is never reached. This cycle may explain the “unexplained pulsing behavior” in Ostrom (2006); we test this hypothesis in Section 5.
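The walk-through above can be reproduced with a few lines of code. This is a sketch under the same assumptions (WGO parameter values, simultaneous updating, the clipped linear best response); `simulate` is an illustrative helper, not part of the original study.

```python
def best_response(x_others, a=23.0, b=0.25, p=5.0, w=10.0):
    """Interior first-order solution clipped to the feasible range [0, w]."""
    interior = (a - p) / (2 * b) - x_others / 2
    return min(max(interior, 0.0), w)


def simulate(x0, steps):
    """Simultaneous best-response dynamics.

    Every appropriator responds to the others' total input in the
    previous time step; the whole path of profiles is returned.
    """
    path = [list(x0)]
    for _ in range(steps):
        current = path[-1]
        total = sum(current)
        path.append([best_response(total - xi) for xi in current])
    return path


path = simulate([1, 2, 3, 4, 5, 6, 7, 8], steps=4)
for t, profile in enumerate(path, start=1):
    print(f"t={t}: {profile}")
```

The printed path alternates between the all-10 and all-1 profiles from the second time step onward, so the symmetric equilibrium input of 8 is never visited.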

WGO and Ostrom et al. (1994) (henceforth, OGW) found that almost none of the subjects followed the Nash equilibrium labor input (i.e. \(x_i=8\)). In the post-experiment questionnaires OGW administered, they reported that many subjects used the rule of thumb “Invest all tokens in Market 2 whenever the rate of return is above $.05 per token in previous decision rounds” (OGW, 121). Although OGW claim that this behavior is inconsistent with the best-response behavior, it is actually close to the best response. Appropriator \(i\)’s investment in Market 2 in their experiment is \(x_i\) (in our notation) and the rate of return in Market 2 is \(a-b\sum x_j=23-(1/4)\sum x_j\). The payoff of $.05 in their experiment corresponds to a payoff of 5 in our framework. Solving \(23-(1/4)\sum x_j=5\), we have \(\sum x_j=72\). That is, on average, \(x_i=9\) and, hence, \(x_{-i}=63\). Thus, in terms of our model, the rule of thumb adopted by their subjects reads as “Choose 10 if \(x_{-i}\) is at most 63,” which roughly corresponds to the best-response behavior (choose 10 if \(x_{-i}\) is below 52).

WGO and OGW conducted two treatments with w=10 and w=25. The amplitude of the cycle (i.e. the difference between the maximum and minimum labor input by an individual in the cycle; the distance between points g and h in Figure 1) is 9 under w=10, as shown above. On the other hand, the predicted amplitude of the cycle is 25 under w=25 (results not shown) and, hence, is larger than that under w=10. As argued later in Section 4, an increased amplitude tends to result in a reduction in efficiency, which was indeed observed in their experiments (see figure 5.4 on p. 119 in OGW). That is, the instability of the system tends to reduce the efficiency, even compared with the Nash equilibrium.

To see the potential effect of the number of appropriators, let us change the number of appropriators from eight to six. Then the best-response function is k-e′ and line 0-c becomes line 0-e′. Let \(x^1=(1, 2, 3, 4, 5, 6)\), and consider how the best-response dynamics proceed. For appropriator 1, we have \(x_{-1}^1=20\) and, hence, \(x_1^2=r(20)=10\). Similarly, \(x_i^2=10\) for all \(i\). Therefore, the appropriators’ labor inputs in the next time step are \(x^2=(10, 10, 10, 10, 10, 10)\), which is the Nash equilibrium (e′). Given that this equilibrium is itself located on the best-response curve k-e′, we have \(r(50)=10\) and the system stays at the equilibrium. It is easy to see that the system is stable for \(n=2\), 3, 4, and 5 as well. For example, if \(n=3\), the system attains e″ in a few steps, and stays there. Figure A2.1 in Appendix 2 shows a bifurcation diagram, which shows the change in the attractor of the system against the number of appropriators under \(w=10\). It reveals that the Nash equilibrium is stable as long as \(n\leq 6\), but is destabilized for larger \(n\)-values, resulting in a cycle with period 2.
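The dependence on the number of appropriators can be explored with a small sweep. The sketch below assumes the WGO parameter values with \(w=10\) and a symmetric start from zero inputs; it is an illustration of the idea behind a bifurcation diagram, not a reproduction of Figure A2.1, and `attractor` is a hypothetical helper name.

```python
def best_response(x_others, a=23.0, b=0.25, p=5.0, w=10.0):
    """Interior first-order solution clipped to the feasible range [0, w]."""
    interior = (a - p) / (2 * b) - x_others / 2
    return min(max(interior, 0.0), w)


def attractor(n, burn_in=50, keep=4):
    """Inputs visited by one appropriator after a burn-in period.

    Starts from zero inputs for everyone (an assumption of this sketch)
    and iterates the simultaneous best-response dynamics.
    """
    x = [0.0] * n
    visited = []
    for t in range(burn_in + keep):
        total = sum(x)
        x = [best_response(total - xi) for xi in x]
        if t >= burn_in:
            visited.append(x[0])
    return sorted(set(visited))


for n in range(2, 11):
    print(n, attractor(n))
```

A single value in the output indicates that the dynamics settle at an equilibrium; two values indicate a period-2 cycle.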

The stability property for n=2 to 6 relies on the fact that the equilibrium is on the boundary of the strategy set. As shown in Appendix 1, an equilibrium is unstable if and only if the slope (denoted by α) of the best-response curve at the equilibrium is steeper than 1/(n–1) in absolute value. Since the slope of the best-response curve is zero (α=0) on the boundary, at which α=0<1/(n–1), it follows that boundary equilibria are always stable. On the other hand, as argued in Section 4, interior equilibria are stable only when the number of appropriators is less than four. Somewhat surprisingly, this is true as long as the production function satisfies the set of mild conditions given by eq. (4) (see Section 4).
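The threshold \(1/(n-1)\) can be motivated by a short sketch, assuming a symmetric equilibrium at which every appropriator’s best-response curve has the same slope \(\alpha\). The symbols \(z^t\) (the vector of deviations from the equilibrium), \(J\) (the \(n\times n\) matrix of ones), \(I\) (the identity matrix), and \(A\) are introduced here for illustration only; the complete argument is the one in Appendix 1.

```latex
\begin{aligned}
z^{t+1} &= A z^{t}, \qquad A = \alpha\,(J - I), \\
\operatorname{spec}(J) &= \{n, 0, \dots, 0\}
  \;\Longrightarrow\;
  \operatorname{spec}(A) = \{(n-1)\alpha,\; -\alpha, \dots, -\alpha\}, \\
\rho(A) &= (n-1)\,|\alpha| < 1
  \iff |\alpha| < \frac{1}{n-1}.
\end{aligned}
```

For the quadratic WGO production function, the interior part of the best-response curve has slope \(\alpha=-1/2\), so \(\rho(A)=(n-1)/2\), which is below one only for \(n=2\) and equals one for \(n=3\).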

It is easy to see from Figure 1 that we can obtain a boundary (and, hence, stable) equilibrium for arbitrarily many appropriators by decreasing the initial endowment w sufficiently. Thus, it may be possible to stabilize even the equilibrium of a real CPR system, if we can manipulate the initial endowment externally, for example, through a political institution (for further arguments, see the Discussion section).

3. Example 2 (production function with a square-root form)

In WGO’s model, considered in Section 2, the best-response curve had a “plateau” (i.e. the flat part on the upper boundary), in which the best response is constrained by the initial endowment. The best-response curve was also special in the sense that its non-flat part had a constant negative slope. The latter property results from the quadratic shape of the production function \(f(\sum x_i)=a\sum x_i-b(\sum x_i)^2\). While the quadratic production function is mathematically simple, it is somewhat unrealistic, given that the amount of production becomes negative for total labor inputs higher than a threshold \(a/b\). In this section, we consider a more realistic production function with a square-root form, \(f(\sum x_i)=c\sqrt{\sum x_i}\), where \(c>0\). This production function is always increasing with the total labor inputs, but with a diminishing return. We further assume that the initial endowment is high enough that the Nash equilibrium is always an interior point.

Figure 2 illustrates the best-response curve generated from the present model under \(c=7.45\) and \(w=20\) (i.e. curve s-h-t and the flat part after t). Consider the case of four appropriators (i.e. the entire box). The slope of line 0-c is 1/3, and the Nash equilibrium is given by the intersection of line 0-c and the best-response curve (point e). Thus, the Nash equilibrium is not at the boundary of the box, but is an interior point. Suppose that every appropriator’s input is the same at point a. Now, the best response to point a is point b. Then, the next point should be d, and the best response to point d is point f. Thereafter, the response should be g-h-k-m. This cycle continues without reaching e. This could be the source of the pulsing behavior. The cycle is persistent, because the equilibrium is repelling in an oscillatory manner. That is, a slight deviation of the total labor input from the equilibrium value causes a response by each appropriator. However, the simultaneous responses by all appropriators result in an excessive response in total labor input, which expands the deviation in the opposite direction. This continues until the dynamics are absorbed into a cycle.

Figure 2: Stability properties when \(p=1\) and \(w=20\). The purple line represents the best response of each appropriator to given values of \(x_{-i}\). Gray lines represent the set of points where all appropriators invest equal amounts of labor (0-c′ and 0-c are for \(n=3\) and 4, respectively). Green and orange arrows represent the best-response dynamics for the unstable and stable cases, respectively.

To see the effect of a reduction in the number of appropriators, consider the case of three appropriators (i.e. the rectangle spanned by diagonal 0-c′). The slope of line 0-c′ is 1/2 and the Nash equilibrium is given by the intersection of line 0-c′ and the best-response curve (point e′). Suppose that every appropriator’s input is the same at point a. The best response to point a is point b. Then, the next point should be d′, and the next-best response is point f ′. Similarly, the process goes to g, h′, and then continues until point e′, the Nash equilibrium, is reached.

As already argued, an interior equilibrium is unstable under n≥4 as long as the production function satisfies the set of mild conditions given by eq. (4) (see Appendix 1). On the other hand, the stability of an interior equilibrium under n=3 also depends on the shape of the per-capita production function, defined as f(x)/x (see property 5 in Appendix 1); the present case happens to satisfy the stability condition.

As Figures 1 and 2 show, the best-response function itself is, in general, independent of the number of appropriators. However, the number of appropriators influences the stability of the system through its influence on the sum of the responses by all appropriators. That is, given that all appropriators respond simultaneously, the response of the total labor inputs is \(n\) times as large as the response of the input by each appropriator (the response of \(x_{-i}\) is \(n-1\) times as large as that of \(x_i\)). As an illustration, suppose that the number of appropriators increases from three to four in Figure 2. Although the best-response function stays unchanged, the width of the box expands to the right and, hence, the jump from b should be amplified from d′ to d. This amplification of the response by the total labor inputs results in the destabilization of the Nash equilibrium. In fact, the system is always unstable if the number of appropriators is at least four, as shown in the next section. Figure A2.2 in Appendix 2 gives a bifurcation diagram of the above model, which shows the ω-limit set of the system against the number of appropriators. As the figure shows, the Nash equilibrium is destabilized for \(n\geq 4\), resulting in a cycle with period 2.
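The contrast between three and four appropriators can be illustrated numerically. The sketch below assumes the square-root production function \(f(X)=c\sqrt{X}\) with \(c=7.45\), \(w=20\), and \(p=1\) (the value given in the caption of Figure 2), restricts attention to symmetric profiles, and finds the best response by bisection on the marginal payoff; the helper names and the starting value of 5 are illustrative choices, not taken from the paper.

```python
C, P, W = 7.45, 1.0, 20.0  # c and w from the text, p from the Figure 2 caption


def marginal_payoff(x, s):
    """Derivative in x of P*(W - x) + x*C/sqrt(x + s), assuming f(X) = C*sqrt(X)."""
    total = x + s
    return C * (x + 2 * s) / (2 * total ** 1.5) - P


def best_response(s):
    """Payoff maximizer on [0, W] given the others' total input s.

    The marginal payoff is decreasing in x, so bisection is sufficient.
    """
    if s > 0 and marginal_payoff(0.0, s) <= 0:
        return 0.0
    if marginal_payoff(W, s) >= 0:
        return W
    lo, hi = 0.0, W
    for _ in range(100):
        mid = (lo + hi) / 2
        if marginal_payoff(mid, s) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def symmetric_nash(n):
    """Interior symmetric equilibrium input from the first-order condition."""
    return (C * (2 * n - 1) / (2 * n * P)) ** 2 / n


def symmetric_path(n, x0=5.0, steps=200):
    """Iterate x -> r((n - 1) x) from a common starting input x0."""
    x = x0
    for _ in range(steps):
        x = best_response((n - 1) * x)
    return x


for n in (3, 4):
    print(n, symmetric_nash(n), symmetric_path(n), symmetric_path(n, steps=201))
```

Comparing the last two columns with the equilibrium input shows whether the iteration has settled at the equilibrium or keeps alternating around it.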

4. General model

In this section, we explore the mathematical conditions for an interior equilibrium to be destabilized using a general production function satisfying a set of mild conditions. We assume that the initial endowment of each appropriator is large enough to ensure that any Nash equilibrium is an interior point. We consider a production function f satisfying

 (4a)

 (4b)

and

 (4c)

where the last condition ensures the concavity of the production function.

Suppose that appropriator i chooses at time t+1, where t=1, 2, ... Then, the system

 (5)

is asymptotically stable at Nash equilibrium if the system

 (6a)

is asymptotically stable, where

 (6b)

Note that the system in eq. (6) is the linear approximation of the system in eq. (5) at the Nash equilibrium. Let v-v′ be the tangent line at the Nash equilibrium e in Figure 2. Then, it is regarded as the linear approximation of the best-response curve at e, and α is the slope of v-v′. Then, we have the following propositions. The proofs are given in Appendix 1.

Proposition 1. The system in eq. (5) is locally stable at the Nash equilibrium if and only if

 (7)

Since the per-capita production f(x)/x is always greater than the marginal productivity owing to the concavity of f, it holds that f ′(x)−f(x)/x<0. If n=2, the l.h.s. of eq. (7) becomes so that the stability condition is satisfied. On the other hand, if n≥4, both the first and second terms in the l.h.s. of eq. (7) are negative and, hence, the condition is not satisfied.

Proposition 2. (i) If the number of appropriators is two, then the system in eq. (5) is asymptotically stable at the Nash equilibrium. (ii) If the number of appropriators is at least four, then the system is unstable at the Nash equilibrium.

When n=3, the stability is indeterminate (the details are given in Appendix 1). Let us consider the implications of Proposition 2. First, the result is very general. It applies widely to the best-response dynamics generated from CPR dilemma games defined by the payoff function (1) and a production function satisfying (4). Second, the number of appropriators is important to the stability of the Nash equilibria of CPR dilemma games. As noted above, the number of appropriators in WGO, OGW, and Keser and Gardner (1999) is eight, which is more than enough for instability; in this sense, we have found the minimum number of appropriators (four) sufficient to capture the essence of CPR instability. Although several researchers, such as Noussair et al. (2011), Osés-Eraso and Viladrich-Grau (2011), and Becchetti et al. (2015), use n=4, their production functions are linear. Therefore, the best-response function is flat and the Nash equilibrium is always stable.

Instability tends to reduce efficiency, even compared to that in a Nash equilibrium, although we could not derive precise mathematical conditions under which the efficiency reduction occurs. Let us define the social payoff function \(U(x)\) as the sum of the payoffs of all appropriators less the value of total endowments, i.e. \(U(x)=f(x)-px\), where \(x\) is the total labor input. As mentioned in the Introduction, we say that a CPR is used efficiently if the social payoff function is maximized. Inefficiency is measured by the deviation of the social payoff, averaged over time, from the maximum social payoff value. Note that the social payoff function is, by definition, independent of the initial endowments, and can be negative. In the case of OGW, \(U(x)=18x-(1/4)x^2\), as shown in Figure 3. In the figure, the maximum point is indicated by P and the Nash equilibrium by e. When \(w=10\), the best-response dynamics result in the cycle between 1 and 10 in the individual labor input, which corresponds to the cycle between a and c in Figure 3. Given that a and c occur with equal frequencies, the middle point b between a and c should be the average outcome of the dynamics. In this example, the total payoff at e is bigger than that at b, and the efficiency is lower than predicted by the Nash equilibrium. In general, whether the social payoff at point b is larger than that at point e depends on the shape of the social payoff function, and it is not easy to determine the condition for this to be the case. However, efficiency is reduced at least if the points in the cycle are symmetric with regard to the distance from the Nash equilibrium, as points f and g are here. The data in OGW show that realized total labor inputs are scattered around the Nash equilibrium when \(w=10\) and, hence, f and g are much closer to e. When \(w=25\), the best-response dynamics result in the cycle between 0 and 25, which corresponds to line 0-d in Figure 3. However, the experimental data on \(x\) in OGW fall in the range between 42 and 115, probably because the subjects did not strictly obey the best-response dynamics. Therefore, the data are above 0-d, although the social payoffs are still negative in the early rounds of the experiment.
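The efficiency comparison for \(w=10\) can be made concrete with a short calculation. This sketch assumes the WGO parameter values, eight appropriators, and the period-2 cycle in which all appropriators alternate between inputs of 10 and 1; the variable names are illustrative.

```python
def social_payoff(total_labor, a=23.0, b=0.25, p=5.0):
    """U(x) = f(x) - p*x with f(x) = a*x - b*x**2 (18x - x^2/4 for WGO values)."""
    return (a - p) * total_labor - b * total_labor ** 2


n = 8
optimal_total = (23.0 - 5.0) / (2 * 0.25)      # total input that maximizes U
max_payoff = social_payoff(optimal_total)       # social payoff at point P
nash_payoff = social_payoff(n * 8)              # everyone invests 8
cycle_average = (social_payoff(n * 10) + social_payoff(n * 1)) / 2

print("maximum social payoff:", max_payoff)
print("social payoff at the Nash equilibrium:", nash_payoff)
print("time-averaged social payoff over the cycle:", cycle_average)
```

Under these assumptions, the time average over the cycle is negative, whereas the social payoff at the Nash equilibrium is positive, so the cycle is less efficient than the equilibrium prediction.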

Figure 3: Instability reduces efficiency. The red curve gives the social payoff function of the WGO model (i.e. \(U(x)=18x-(1/4)x^2\)). U is maximized at point P, and e is the Nash equilibrium. For more details, see the text.

5. Statistical analysis of the WGO data

In the previous sections, we provided theoretical arguments to explain how the Nash equilibrium in a CPR dilemma game is destabilized. Here, we examine whether our hypothesis is consistent with empirical data. In this section, we conduct statistical analyses to show that the individuals’ data in the WGO experiments can be interpreted as a result of myopic decision rules and deterministic cyclic dynamics around an unstable Nash equilibrium. Note that Cárdenas et al. (2015) invoked the concept of a “sampling equilibrium,” which is known to better explain experimental data than the Nash equilibrium does (Selten and Chmura 2008), to explain the deviation of individual behavior from the Nash strategy. Their approach also invokes the instability of the Nash equilibrium, but considers stochastic dynamics (i.e. the dynamics of the probability of taking each action). We distinguish between different hypotheses, including our hypothesis based on deterministic best-response dynamics and the hypothesis that subjects’ behaviors are in a sampling equilibrium. For this purpose, we consider a more general model than that used in the previous sections. In the general model, subjects’ decision-making consists of three components: the belief formation process, the other-regarding preference, and the stochastic best response. We estimate the parameter values by fitting the model to the individuals’ data in the WGO experiments (we obtained the data through personal communication with Prof. James Walker) to check if the data are consistent with our theory.

Healy (2006) provides experimental evidence to support a “k-period average” model, in which players form their beliefs at the current period, based on the observations in the previous k periods. Here, we instead use a time-weighted average to model the belief formation (cf., Cheung and Friedman 1997). Let bi,t+1 be subject i’s belief about the total labor inputs of the other group members at period t+1, si,t be the observed total labor inputs of the other subjects at period t, and be the time-dependent weighting factor assigned to the observation at period u. We assume that subject i’s belief bi,t+1 for period t+1 is formed as follows:

 (8)

If γi=0, the belief is exactly the observation at the previous period. In contrast, if γi=1, the belief is the average of all previous observations. In this sense, γi determines the lag length of the information used by subject i, and this model is an analog of the k-period average model. Note that the summation starts from period 1, which means that we omit any beliefs prior to period 1. Note that γi might also take a value outside the range [0, 1] (Cheung and Friedman 1997). This is, in theory, possible, although counterintuitive; γi>1 indicates that subject i pays more attention to old information than to recent information, while γi<0 indicates that the effect of past information changes its sign in each period.
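The role of the weighting parameter can be illustrated with a small sketch. It assumes geometric weights, with the observation at period u receiving weight γ raised to the power t − u, normalized by the sum of the weights; this functional form is an assumption made here to match the limiting cases described above and is not necessarily the exact form of eq. (8).

```python
def belief(observations, gamma):
    """Time-weighted average of past observations (oldest first).

    Assumed weights: gamma ** (t - u) for the observation at period u,
    where t is the latest period. gamma = 0 keeps only the latest
    observation; gamma = 1 gives the plain average. Weights that sum to
    zero (possible for negative gamma) raise ZeroDivisionError.
    """
    t = len(observations)
    weights = [gamma ** (t - u) for u in range(1, t + 1)]
    weighted_sum = sum(wt * s for wt, s in zip(weights, observations))
    return weighted_sum / sum(weights)


history = [70, 7, 70]  # others' total inputs in periods 1, 2, 3 (illustrative)
for gamma in (0.0, 0.5, 1.0):
    print(gamma, belief(history, gamma))
```

With an alternating history like the one above, a larger γ pulls the belief toward the long-run average and away from the most recent observation.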

To capture subjects’ concerns about other group members’ payoffs, following the design of Andreoni and Miller (2002) and Cox et al. (2007), we assume the following utility function for player i:

 (9)

where yi is the material payoff of subject i, is the average material payoff of other group members, and βi is a coefficient measuring subject i’s concerns about other members’ material payoffs. If βi=0, subject i is purely self-interested. If βi>0, player i is altruistic or reciprocal. If βi<0, player i is spiteful or competitive.

With the utility given by equation (9), we assume that the decisions by subjects follow stochastic best-response dynamics (see Fudenberg and Levine 1998). Let

 (10)

where si,t+1 is the observed choice of subject i at period t+1, ui,t+1 is the utility of subject i at period t+1, wi is the endowment of subject i, and λi>0 is a factor that captures the decision errors of subject i. Equation (10) defines the probability that subject i chooses si,t+1 at period t+1, given si,t, si,t−1, …, si,1. When λi→0, all choices for subject i have equal probability, which means that subject i makes decisions at random. As λi becomes large, subject i becomes more sensitive to the difference in utility between different choices; in particular, when λi→∞, player i chooses the best strategies (i.e. the strategies resulting in the maximum utility) with probability one.
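The limiting behavior described for λ is that of a logit (softmax) choice rule. The following sketch assumes that form; it is not necessarily the exact specification of eq. (10). For the usage example, it additionally assumes a purely self-interested subject (β equal to zero), the WGO payoff with \(w=10\), and a belief of 70 for the others' total input; all function and variable names are hypothetical.

```python
import math


def choice_probabilities(utilities, lam):
    """Logit probabilities over the feasible choices.

    utilities[k] is the utility of choosing k; lam >= 0 is the
    sensitivity to utility differences (lam = 0 gives uniform choice).
    """
    peak = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (u - peak)) for u in utilities]
    total = sum(weights)
    return [wt / total for wt in weights]


# WGO material payoff for each feasible input k, given a belief of 70
believed_others = 70
utilities = [5 * (10 - k) + k * (23 - 0.25 * (k + believed_others))
             for k in range(11)]

for lam in (0.0, 0.34, 50.0):
    probs = choice_probabilities(utilities, lam)
    print(lam, [round(q, 3) for q in probs])
```

As λ grows, the probability mass concentrates on the input with the highest utility, which in this example is the deterministic best response to a belief of 70.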

Based on the meanings of the parameters, it is clear that subjects are exactly following the best-response dynamics when γi→0, βi→0, and λi→∞. To obtain parameter estimates, we maximize the following log-likelihood function:

 (11)

where T is the number of rounds in the sample. We estimated the parameter values over two samples separately: the data from the periods in the first half, and the data from the periods in the second half. Table 1 summarizes the results. In the following, we interpret the results for the three parameters, λi, βi, and γi, individually.

Table 1: The results of parameter estimation.

| Experiment | w=10, Periods 2–16 | w=10, Periods 16–30 | w=25, Periods 2–11 | w=25, Periods 11–20 |
|---|---|---|---|---|
| γi | 0.410 | 0.653 | 0.833 | 0.641 |
| | (0.007) | (0.003) | (0.009) | (0.003) |
| βi | –1.581 | –0.491 | –0.998 | –0.290 |
| | (0.055) | (0.011) | (0.016) | (0.010) |
| λi | 0.095 | 0.340 | 0.045 | 0.082 |
| | (0.002) | (0.005) | (0.001) | (0.001) |
| Obs. | 360 | 360 | 240 | 240 |
| lnL | –677.80 | –631.11 | –718.50 | –681.30 |

Jackknifed standard errors are shown in parentheses.

In both experiments (w=10 and w=25), λi becomes larger over time, which indicates that the decision errors become smaller (χ²(1)=24.98 for the experiment with w=10, and χ²(1)=9.52 for the experiment with w=25, by the likelihood ratio tests). We further note that the estimates of λi are larger and, hence, the decision errors are smaller in the experiment with w=10 than they are in the experiment with w=25. This makes intuitive sense, given that the number of possible choices is smaller for w=10 than for w=25.

All estimates of βi are negative. This makes sense because the CPR environment is competitive. Interestingly, in both experiments, the estimates of βi become closer to zero over time (χ2(1)=13.14 for the experiment with w=10, and χ2(1)=8.44 for the experiment with w=25, by the likelihood ratio tests). This indicates that an increase in other group members’ working hours makes subjects feel worse in the beginning half of the experiment than in the latter half. This result implies that subjects become more self-interested with repeated trials, in both experiments.
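The significance of the likelihood ratio statistics reported above for λi and βi can be gauged with a short calculation. This sketch assumes, as the χ2(1) notation indicates, that each statistic is referred to a chi-square distribution with one degree of freedom; the helper name is hypothetical.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses the identity P(X > s) = erfc(sqrt(s / 2)) for X ~ chi2(1),
    which follows from X being the square of a standard normal variable.
    """
    return math.erfc(math.sqrt(stat / 2.0))

# Likelihood ratio statistics reported in the text.
reported = {
    ("lambda", "w=10"): 24.98,
    ("lambda", "w=25"): 9.52,
    ("beta", "w=10"): 13.14,
    ("beta", "w=25"): 8.44,
}

p_values = {key: chi2_sf_1df(stat) for key, stat in reported.items()}
for key, p in p_values.items():
    print(key, f"p = {p:.2g}")
```

Under this assumption, all four statistics exceed the 1% critical value of a chi-square distribution with one degree of freedom.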

The interpretation for estimates of γi is not obvious. It seems that players become more myopic in the latter half of the experiment with w=25 than they do in the beginning half. However, it seems the opposite is the case for w=10 (i.e. players are more myopic in the beginning than they are in the latter half of the experiment).

Overall, the above results show that decision errors become smaller and subjects become more self-interested over time. This implies that, with repeated trials, subjects’ behaviors in both experiments become closer to the best response based on some previous observations (not only the last observation). Given this statistical result, and the theoretical result that the best-response dynamics induce pulsing behavior, we hypothesize that the group sum pulses more in the latter half of the experiment than it does in the beginning half of the experiment.

To test this hypothesis, we compute the sample autocorrelation for the group sum in each group. Rassenti et al. (2000) also used the sample autocorrelation as a measure of pulsing behavior. One might think that we should calculate the sample autocorrelation for each subject’s data. However, the pulsing behavior of each subject is not related to our instability theory, unless players’ behaviors are synchronized. Thus, we only check the sample autocorrelation for the group sum. Table 2 summarizes the results.

Table 2. The sample autocorrelations.

| Group | w=10, periods 2–16 | w=10, periods 16–30 | w=25, periods 2–11 | w=25, periods 11–20 |
| --- | --- | --- | --- | --- |
| 1 | –0.133 | –0.137 | –0.131 | –0.366 |
| 2 | –0.005 | –0.344 | 0.025 | –0.210 |
| 3 | 0.086 | –0.279 | 0.140 | –0.221 |

These results support our hypothesis. All groups generate a negative sample autocorrelation in the latter half of the experiment, which indicates pulsing. However, three of the six groups have a positive sample autocorrelation in the beginning half of the experiment.
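To make the link between pulsing and a negative sample autocorrelation concrete, the sketch below computes a lag-1 sample autocorrelation for two hypothetical group-sum series. The choice of lag 1 and the exact estimator are assumptions, since the text does not state them, and the series are illustrative rather than experimental data.

```python
def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation.

    Computed as sum_t (x_t - m)(x_{t+1} - m) / sum_t (x_t - m)^2,
    where m is the sample mean of the whole series.
    """
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    numer = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    return numer / denom

# Hypothetical group sums (not experimental data).
pulsing = [60, 70, 58, 72, 61, 69, 59, 71]    # alternates above and below its mean
drifting = [58, 60, 62, 64, 66, 68, 70, 72]   # moves steadily in one direction

r_pulsing = lag1_autocorrelation(pulsing)
r_drifting = lag1_autocorrelation(drifting)
```

A series that alternates around its mean yields a negative value, whereas a steadily drifting series yields a positive one, which is why a negative sign is read as evidence of pulsing.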

Now, we consider whether pulsing causes inefficiency in the experiments. As table 5.2 (page 117) and figure 5.4 (page 119) in OGW show, the low efficiency in the experiment with w=25 stems mainly from the beginning half of the experiment. However, we should not use the data from this period because subjects’ behaviors may still be in a transient phase. Therefore, we use the data from the latter half of the experiments, although the latter half might also include a transient phase. Table 3 shows the statistics used to test the hypothesis that pulsing causes inefficiency.

Table 3. Statistics to test the hypothesis that pulsing causes inefficiency.

| Statistic | w=10, periods 16–30 | w=25, periods 11–20 |
| --- | --- | --- |
| Centile (25%) | 62 | 60 |
| Centile (75%) | 68 | 71 |
| Interquartile range | 6 | 11 |
| Average payoff (standard deviation) | 63.81 (8.16) | 133.56 (22.87) |
| Predicted payoff, Nash | 66 | 141 |
| t-Statistic* | –2.12 | –3.59 |

*For the t-test, we first compute the average across periods for each individual, and then conduct the t-test over the sample of those averages.

We use the interquartile range to measure the amplitude of the pulsing in the group sum. This shows that the amplitude of the pulsing is larger in the experiment with w=25 than it is in the experiment with w=10. The t-tests reject the null hypothesis that the average payoff is equal to the payoff predicted by the Nash equilibrium for both experiments at the 5% level. This result supports our theoretical prediction that pulsing around the Nash equilibrium results in a reduced payoff. Furthermore, we compare the efficiency between the two experiments to determine whether the larger pulsing amplitude induces lower efficiency. The Wilcoxon rank-sum test shows that the efficiency in the experiment with w=10 is significantly higher than that in the experiment with w=25 (p-value=0.0259). As in the t-test, we compute the average across periods, for each individual, to eliminate the time series autocorrelation. Then, we conduct the Wilcoxon rank-sum test over the sample of those averages. These findings suggest that WGO’s result is consistent with our theory with regard to efficiency.
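The two main computations in this paragraph, the interquartile range of the group sum and the t-test on per-individual averages, can be sketched as follows. The data are hypothetical, the function names are illustrative, and the quantile convention ("inclusive") is an assumption, since the text does not say how the centiles were computed.

```python
import math
import statistics

def interquartile_range(values):
    """Difference between the 75th and 25th percentiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def one_sample_t(per_subject_means, predicted):
    """t statistic for H0: the mean of the per-subject averages equals `predicted`."""
    n = len(per_subject_means)
    mean = statistics.fmean(per_subject_means)
    sd = statistics.stdev(per_subject_means)  # sample standard deviation (n - 1)
    return (mean - predicted) / (sd / math.sqrt(n))

# Hypothetical group sums over eight periods (not experimental data).
amplitude = interquartile_range([60, 70, 58, 72, 61, 69, 59, 71])

# Hypothetical payoffs: rows are subjects, columns are periods.
payoffs = [
    [64, 60, 66, 61],
    [62, 65, 59, 63],
    [67, 58, 64, 60],
    [61, 63, 62, 59],
]
# Average across periods first, so that each subject contributes one observation.
per_subject_means = [statistics.fmean(row) for row in payoffs]

# 66 is the Nash-predicted payoff for w=10 reported in Table 3.
t_stat = one_sample_t(per_subject_means, predicted=66)
```

Averaging within subjects before testing removes the serial dependence across periods from the test sample, at the cost of a smaller number of observations.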

6. Discussion

We have shown that a system of simultaneous difference equations describing the best-response dynamics of a CPR dilemma game is locally unstable when the number of appropriators is at least four. Works such as Wiesmeth (2011) exemplify the case of two appropriators, showing that the Nash equilibrium is inefficient (see also Hanley et al. 2007). Although the system is locally stable for two appropriators, such a static analysis is problematic owing to instability if the number of appropriators is at least four, as seen in experimental CPR games (cf., Ostrom 2006). Furthermore, instability may cause additional inefficiency because the social payoff, averaged over time, is lower than predicted under the Nash equilibrium. Our statistical analyses using individuals’ data from the WGO experiments support the theoretical arguments. That is, we found that in the experiments, subjects were myopic, the choices of the subjects were indeed pulsing, and the pulsing resulted in reduced efficiency.

The statistical analyses in Section 5 revealed that in WGO’s experiments, subjects were more self-interested and their choices were closer to the best responses in the latter half of the experiments than they were in the beginning half. On the other hand, using post-experiment questionnaires, OGW found that many subjects used a rule of thumb that is very close to the best-response behavior. It may be that in WGO’s case, subjects were initially exploring the system, but later found the rule of thumb. This could explain why their behaviors differed from the best responses in the first half of the experiments, but were close to the best responses in the latter half. It may be interesting to conduct questionnaires not only at the end of the experiments, but also in the middle, in order to reveal the process by which subjects acquire best-response behavior.

We found that interior equilibria are destabilized for a broad range of parameter values, while boundary equilibria are always stable. Importantly, we can create a boundary equilibrium by decreasing the initial endowment sufficiently for virtually any best-response function. Of course, it is difficult to control the initial endowment of a player in a real system if the “endowment” is determined by his/her internal capacities or capabilities (e.g. physical, biological, or monetary). However, we may be able to control the upper limit of the labor input by each player externally, for example, through a political institution. Thus, an external restriction on the use of a CPR may be important, not just for the efficient use of the resource, but also for stable production.
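The stabilizing effect of an external upper limit can be illustrated with a small simulation. The sketch below does not use the payoff function of eq. (1). As a stand-in, it assumes a linear game in which player i earns x_i(m − x_i − X_−i), where X_−i is the total input of the other players, with hypothetical values m=10 and n=4 and a cap of 1.5 that lies below the interior equilibrium m/(n+1)=2. All names and parameter values are illustrative assumptions.

```python
def best_response(others_total, m=10.0, cap=None):
    """Best response in the linear stand-in game.

    Maximizes x * (m - x - others_total) over x >= 0, optionally subject to x <= cap.
    `m` and `cap` are hypothetical parameters.
    """
    x = max(0.0, (m - others_total) / 2.0)
    return x if cap is None else min(x, cap)

def simulate_simultaneous(start, rounds, cap=None):
    """All players best-respond at once to the previous round's choices."""
    x = list(start)
    history = [x]
    for _ in range(rounds):
        total = sum(x)
        x = [best_response(total - xi, cap=cap) for xi in x]
        history.append(x)
    return history

n = 4
interior = 10.0 / (n + 1)  # interior Nash equilibrium of the stand-in game (= 2.0)

# Without a cap: start slightly above the interior equilibrium.
uncapped = simulate_simultaneous([interior + 0.05] * n, rounds=40)

# With a cap below the interior equilibrium: start slightly below the cap.
capped = simulate_simultaneous([1.4] * n, rounds=40, cap=1.5)
```

In this stand-in game, the uncapped path moves away from the interior equilibrium and ends up alternating between two levels, whereas the capped path reaches the boundary choice of 1.5 in one step and stays there.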

Another possible way to control the instability of a real system may be to manipulate the timing of appropriators’ responses. The equilibrium is unstable in a system with many appropriators because they all respond simultaneously in every round. The system may be stabilized if, in every round, only one (or a few) appropriators are allowed to change their behavior or, similarly, if appropriators choose their behavior in a sequential manner, and each of them responds to the preceding choices of other players. However, this mechanism may require a more complicated political rule to ensure fairness than simply controlling the initial endowment. We may also need further theoretical investigations to reveal the stability conditions for such systems with asynchronous decision-making.
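The role of timing can be illustrated by comparing simultaneous and sequential updating in a simple simulation. As a stand-in for the CPR game, the sketch assumes a linear game in which player i earns x_i(m − x_i − X_−i), with hypothetical values m=10 and n=4; it is not the model of eq. (1), and the function names are illustrative. In the sequential variant, players update one at a time within a round, each responding to the latest choices of the others.

```python
def best_response(others_total, m=10.0):
    """Maximizes x * (m - x - others_total) over x >= 0 (hypothetical stand-in game)."""
    return max(0.0, (m - others_total) / 2.0)

def simultaneous_round(x):
    """Every player responds to the choices made in the previous round."""
    total = sum(x)
    return [best_response(total - xi) for xi in x]

def sequential_round(x):
    """Players update in turn, each seeing the updates already made in this round."""
    x = list(x)
    for i in range(len(x)):
        x[i] = best_response(sum(x) - x[i])
    return x

def run(update, start, rounds):
    x = list(start)
    for _ in range(rounds):
        x = update(x)
    return x

n = 4
equilibrium = 10.0 / (n + 1)     # interior Nash equilibrium (= 2.0)
start = [equilibrium + 0.1] * n  # small common perturbation

final_simultaneous = run(simultaneous_round, start, rounds=200)
final_sequential = run(sequential_round, start, rounds=200)

gap_simultaneous = max(abs(xi - equilibrium) for xi in final_simultaneous)
gap_sequential = max(abs(xi - equilibrium) for xi in final_sequential)
```

In this example, the sequential path settles at the equilibrium while the simultaneous path does not. The result is specific to the assumed linear game and does not establish stability conditions for asynchronous decision-making in general.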

Note that the instability issue in CPR games is closely related to that in oligopoly models (Theocharis 1960; Agiza 1998; Ahmed and Agiza 1998; Bischi et al. 2009). To see this, imagine an oligopoly of n firms. Let xi be the output level of firm i, and let g(x) be the inverse demand function. Assume that every firm has the same cost function c(xi). Then, the payoff function is given by

$$\pi_i = x_i\,g(x) - c(x_i), \qquad x = \sum_{j=1}^{n} x_j \tag{12}$$

which corresponds to eq. (1). Then, the Nash equilibrium of this game gives a stable fixed point of the best-response dynamics when

 (13)

(see Bischi et al. 2009), which is the counterpart to eq. (7). That is, letting g(x)=f(x)/x and c″=0 in eq. (13), we obtain eq. (7). While it is difficult to determine the exact minimum number of players required to make the system unstable from the general form of eq. (13), we have obtained the minimum number in our CPR model because c″=0. In the special case where g′<0 and g″=c″=0, eq. (13) becomes n(n–3)g′>0. Thus, the equilibrium is stable for n<3, neutrally stable for n=3, and unstable for n>3 (Theocharis 1960). On the other hand, if the inverse demand function is isoelastic (i.e. g(x)=k/x, where k is a positive constant) and c is linear, eq. (13) becomes (n–4)(n–1)g′>0. Thus, the equilibrium is stable for n<4, neutrally stable for n=4, and unstable for n>4 (Agiza 1998; Ahmed and Agiza 1998). This corresponds to the case of a CPR with a constant production function, f(x)=k. Interestingly, no previous authors have pointed out that CPR systems have a similar mathematical structure to oligopoly systems and, hence, a tendency toward instability. This implies that mathematical studies of CPR systems are lagging behind those on oligopolies.
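The two special cases can be verified by direct substitution. The worked check below assumes that, at a symmetric equilibrium with total output x, the stability condition takes the form shown in its first line. This form is our assumption, chosen because it reproduces both special cases quoted above; it should be compared with the condition in Bischi et al. (2009) before being relied upon.

```latex
\begin{align*}
\text{Assumed form:}\quad
  & n(n-3)\,g' + (n-2)\,x\,g'' + n\,c'' > 0 \\[4pt]
\text{Linear demand } (g'<0,\ g''=c''=0):\quad
  & n(n-3)\,g' > 0 \iff n < 3 \\[4pt]
\text{Isoelastic demand } g(x)=k/x,\ c''=0:\quad
  & g' = -\frac{k}{x^{2}}, \quad g'' = \frac{2k}{x^{3}}
    \;\Longrightarrow\; x\,g'' = -2\,g' \\
  & n(n-3)\,g' - 2(n-2)\,g' = (n^{2}-5n+4)\,g' = (n-4)(n-1)\,g' > 0 \\
  & \iff 1 < n < 4 \quad (\text{since } g'<0)
\end{align*}
```

In both cases the expression vanishes at the threshold (n=3 and n=4, respectively), which corresponds to the neutrally stable case.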

Condition (7) (and even (13)) of stability is mathematically clear, but its economic meaning is less so. Unfortunately, we could find neither a simple economic interpretation for each term in the condition, nor an economic reason why four must be the critical number for instability. Each of the terms comes from complicated calculations, and represents joint outcomes of various effects, which makes an intuitive understanding extremely difficult. It might be worth exploring other representations of the stability condition to gain further economic insights.

Thus far, researchers have argued that CPRs cause a dilemma between the Pareto efficient and Nash allocations. However, in addition to this dilemma, we have shown that appropriators may suffer from additional inefficiency owing to dynamic instability. Thus, the problem is more complicated than first thought, and we should pay attention to the stability problem as well as to the static inefficiency of the Nash equilibrium. Cyclic solutions caused by the instability may be a possible explanation for the pulsing behavior of each appropriator’s labor input in the CPR experiments summarized by Ostrom (2006). Strictly speaking, our theoretical analysis applies only to systems where the payoff for an individual is given by eq. (1) [and, hence, does not apply to some experiments in the literature, such as Cardenas (2004)]. It would be worth investigating in future work the extent to which we can generalize the conclusion presented here. In particular, in Section 5, we considered a utility function with other-regarding preference terms. It would be interesting to study how the stability of the system would change if players with different other-regarding preferences were included.

In addition, there is evidence that players might use behavioral rules affected by institutional devices, such as peer punishment. For instance, Ostrom et al. (1992) showed that communication combined with opportunities for punishment works quite well to promote cooperation and increase the efficiency of a system. It is possible that such behavioral rules affect the stability of the Nash equilibria of CPR dilemma games. WGO, Walker and Gardner (1992), OGW, and Casari and Plott (2003) all use experimental designs with n=8. Hence, our theory predicts that their systems are unstable. Although they do not provide individual data in their papers, their studies seemingly share the pulsing behavior. On the other hand, Cason and Gangadharan (2014) used an experimental design with n=4, where our theory predicts that the Nash equilibrium is unstable. However, the standard errors of the individual data are quite low, and they report that peer punishment works well. It is possible that subjects used behavioral rules influenced by institutional devices, and that this stabilized the equilibrium by an unknown mechanism. Alternatively, the standard errors may have been low just because n=4 is the threshold for instability. These remain open questions. Vyrastekova and Van Soest (2008) and Hayo and Vollan (2012) used n=5, but it is uncertain if they found pulsing behavior.

Our theory reveals that CPR systems with myopic players are stable only under very special conditions, namely, with linear payoff functions, or with very small numbers of appropriators. Given that real CPR systems are unlikely to satisfy such special conditions, we suspect that the results of many laboratory experiments in previous studies, in which linear payoff functions and a very small number of appropriators are used, have only limited applicability. In future experiments, researchers should focus on the number of appropriators and the payoff structure in order to capture the essential behavior of real systems. We should also analyze data from dynamic viewpoints, rather than from static viewpoints, as we did in Section 5. That is, we should focus on how, or to what extent players’ strategies fluctuate.