Transcript
Finitely repeated simultaneous move game. Consider a normal form game (simultaneous move game) ΓN which is played repeatedly a finite number (T) of times. The normal form game which is played repeatedly is called a "(one) stage game" (or "one shot game"). Players may play mixed strategies in the stage game if they wish. After each round, all players observe the (pure) strategies actually played in the previous round and then play the next round. The entire T-stage game is a dynamic game of complete and imperfect information.
For t > 1, let h_t be the history of the game (i.e., what players played) as observed up to the end of (t − 1) rounds of play and before the t-th round is played. Each t = 1, ..., T and each possible h_t (for t > 1) defines a distinct decision node for each player. In this dynamic game, the strategy of a player specifies what she is going to choose for each t = 1, ..., T and each possible h_t (for t > 1). Payoff in the dynamic game: sum of payoffs over the T stages (can also look at the discounted sum).
Proposition. If the stage game ΓN has a unique NE, then there is a unique SPNE of the game where for all t = 1, ...T, and all ht, players play the NE of the stage game. Proof: Generalized backward induction.
More generally: Proposition. If the stage game ΓN has multiple NE, then any strategy profile of the dynamic game where for each t = 1, ...T, players play one of the NE of the stage game independent of ht, is a SPNE.
However, if ΓN has multiple NE, then there may be SPNE where players do not play any of the NE of the stage game ΓN for some t.
Example. Suppose the following normal form game is repeated twice. Payoff: sum of payoffs in the two stages.

        L       C       R
T     1, 1    5, 0    0, 0
M     0, 5    4, 4    0, 0
B     0, 0    0, 0    3, 3

The stage game has two pure-strategy NE: (T, L), (B, R).
SPNE1: {(T in stage 1, and in stage 2, play T whatever be the history), (L in stage 1, and in stage 2, play L whatever be the history)}
SPNE2: {(B in stage 1, and in stage 2, play B whatever be the history), (R in stage 1, and in stage 2, play R whatever be the history)}
SPNE3: {(T in stage 1, and in stage 2, play B whatever be the history), (L in stage 1, and in stage 2, play R whatever be the history)}
SPNE4: {(B in stage 1, and in stage 2, play T whatever be the history), (R in stage 1, and in stage 2, play L whatever be the history)}
These four SPNE correspond to playing some NE of the stage game in each period.
SPNE5 (same stage game):
Player 1: Play M in stage 1. In stage 2, play B if (M, C) has been played in stage 1 and play T otherwise.
Player 2: Play C in stage 1. In stage 2, play R if (M, C) has been played in stage 1 and play L otherwise.
Generalized backward induction. Subgames in the second stage are of two types: (i) The one following (M, C) being played in stage 1 (ii) The ones following (M, C) not being played in stage 1 The above strategies induce NE in both classes of subgames.
The reduced game in stage 1 (given the above strategies): each stage 1 payoff plus the stage 2 continuation payoff, which is (3, 3) after (M, C) and (1, 1) otherwise.

        L       C       R
T     2, 2    6, 1    1, 1
M     1, 6    7, 7    1, 1
B     1, 1    1, 1    4, 4

The specified strategies for the first stage, (M, C), are clearly a NE in the reduced game. Thus, this is a SPNE.
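The check above can be reproduced mechanically. The following is a minimal sketch, not part of the original notes: it enumerates pure-strategy NE of the stage game and of the reduced stage 1 game induced by SPNE5. The helper names (`pure_nash`, `continuation`, `REDUCED`) are made up for this illustration.

```python
from itertools import product

# Stage-game payoffs from the example: (row player, column player).
ROWS, COLS = ["T", "M", "B"], ["L", "C", "R"]
STAGE = {
    ("T", "L"): (1, 1), ("T", "C"): (5, 0), ("T", "R"): (0, 0),
    ("M", "L"): (0, 5), ("M", "C"): (4, 4), ("M", "R"): (0, 0),
    ("B", "L"): (0, 0), ("B", "C"): (0, 0), ("B", "R"): (3, 3),
}

def pure_nash(game):
    """Return all pure-strategy Nash equilibria of a two-player game."""
    equilibria = []
    for r, c in product(ROWS, COLS):
        u1, u2 = game[(r, c)]
        best_row = all(game[(r2, c)][0] <= u1 for r2 in ROWS)
        best_col = all(game[(r, c2)][1] <= u2 for c2 in COLS)
        if best_row and best_col:
            equilibria.append((r, c))
    return equilibria

def continuation(profile):
    """Stage-2 payoffs prescribed by SPNE5 after a given stage-1 profile."""
    return STAGE[("B", "R")] if profile == ("M", "C") else STAGE[("T", "L")]

# Reduced stage-1 game: stage payoff plus the continuation payoff.
REDUCED = {
    p: (STAGE[p][0] + continuation(p)[0], STAGE[p][1] + continuation(p)[1])
    for p in STAGE
}

print(pure_nash(STAGE))    # pure NE of the one-shot game
print(pure_nash(REDUCED))  # pure NE of the reduced stage-1 game
```

(M, C) is not an equilibrium of the one-shot game but does appear among the equilibria of the reduced game, which is the content of the SPNE5 argument.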
In SPNE5, players do better than if they played the best (Pareto-efficient) NE of the stage game twice: each gets 4 + 3 = 7 rather than 3 + 3 = 6. They behave cooperatively in the first round (even though playing cooperatively, i.e., (M, C), is not a NE in the one stage game). In stage 2 (the last period), players must play one of the NE of the stage game, as it is essentially a one shot game. However, multiplicity of NE here allows players to incorporate the threat of playing the bad NE rather than the good one in case they do not play cooperatively in the first round. This is a credible threat (if we ignore renegotiation possibilities). This illustrates: finitely repeated interaction can induce "cooperation" in early periods when there are multiple NE in the stage game.
In infinitely repeated games, we will see that there are SPNE that involve cooperative play even though there is a unique NE in the stage game.
A deep problem with sequential rationality. SPNE: players should play an SPNE wherever they find themselves in the game tree, even after a sequence of events that is contrary to the prediction of the theory (i.e., the actual play that ought to be induced if the players played the SPNE strategies).
Example. (Centipede game). Finite game of perfect information. Two players, 1 and 2. Each player starts with $1 in front of her. They alternate saying "stop" or "continue". When a player (whose turn it is to move) says "continue", $1 is taken by a referee from her pile and $2 is added to her rival's pile. When a player says "stop", play is terminated and each player receives the money currently in her pile. Play stops in any case once both players' piles reach $100.
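Game tree (S = stop, C = continue; payoffs listed as (player 1, player 2)):

Mover    Payoffs if S    C leads to
1        (1, 1)          player 2
2        (0, 3)          player 1
1        (2, 2)          player 2
...      ...             ...
2        (97, 100)       player 1
1        (99, 99)        player 2
2        (98, 101)       end of game with payoffs (100, 100)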
Backward induction: unique SPNE is that players choose stop at each decision node where they are asked to move.
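The backward induction argument can be traced in code. This is a minimal sketch under the rules stated in the example (piles start at (1, 1), continuing costs the mover $1 and gives the rival $2, play ends when both piles reach the cap). The function name and the tie-breaking rule (stop when indifferent) are assumptions of the sketch; no ties actually arise in this game.

```python
def centipede_backward_induction(cap=100):
    """Solve the centipede game by backward induction.

    Returns (plan, value, terminal): the action chosen at every decision
    node, the resulting equilibrium payoffs, and the payoffs reached if
    both players always continue.
    """
    # Enumerate decision nodes forward: (mover index, piles at that node).
    nodes = []
    piles, mover = [1, 1], 0
    while not (piles[0] >= cap and piles[1] >= cap):
        nodes.append((mover, tuple(piles)))
        piles[mover] -= 1        # referee takes $1 from the mover
        piles[1 - mover] += 2    # and adds $2 to the rival's pile
        mover = 1 - mover
    terminal = tuple(piles)      # payoffs if everyone always continues

    # Walk backward: the mover continues only if that is strictly better.
    plan, value = [], terminal
    for mover, stop_payoff in reversed(nodes):
        action = "continue" if value[mover] > stop_payoff[mover] else "stop"
        if action == "stop":
            value = stop_payoff
        plan.append((mover + 1, stop_payoff, action))
    plan.reverse()
    return plan, value, terminal

plan, value, terminal = centipede_backward_induction()
print(plan[0], value, terminal)
```

Every node in `plan` carries the action "stop", so the equilibrium payoffs are those of the first node even though always continuing would reach `terminal`.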
In actual play of this SPNE, play ends at the first move, with player 1 stopping the game and both players getting $1 each. This is a really bad outcome, considering that they could get $100 each if they chose to continue every time. Is the SPNE a reasonable prediction? Player 1 says stop in stage 1 because she thinks player 2 will choose stop at her first turn. But suppose player 1 thinks that either (i) player 2 is not fully rational and therefore does not compute the SPNE by backward induction, or (ii) player 2 is rational but does not know whether player 1 is rational, so that, on observing player 1 choose to continue (when the SPNE says player 1 should stop), she supposes that player 1 is not rational (in the sense of playing the SPNE) and therefore that, if she chose to continue, player 1 would not stop the game in the next stage but would continue it further, allowing player 2 to move again. Then it may be optimal for an actually rational player 1 to continue in the first stage. Note that arguments like (i) or (ii) involve some contradiction of common knowledge of rationality. SPNE denies this possibility. However, it then leaves open the question of how players think about the game and about other players when they find themselves at decision nodes that ought not to be reached if players played by sequential rationality. One resolution: treat deviations from SPNE as mistakes that occur with extremely small probability and are unlikely to be repeated. This is the approach taken by a somewhat different refinement concept called "trembling hand perfection".
Application of SPNE: Bilateral Bargaining with Alternating Offers. Finite time horizon: T > 1, t = 1, ..., T. Two players, 1 and 2, bargain to determine how to split a dollar. They make offers in alternating time periods. First, player 1 offers a split (a share in [0, 1]). Then, player 2 either accepts or rejects. If she accepts, the split is immediately implemented and the game ends. If she rejects, nothing happens till period 2. In period 2, player 2 makes an offer.
Then, player 1 accepts or rejects. If she accepts, the split is immediately implemented and the game ends. If she rejects, nothing happens till period 3. In period 3, player 1 makes an offer. And so on. If by the end of period T no offer is accepted, the bargaining is terminated and both players get 0. Each player has a time discount factor δ ∈ (0, 1): getting $x in period t yields a (present value) payoff of δ^{t−1} x to the player.
Unique SPNE through backward induction. In this equilibrium, players accept an offer if they are indifferent between accepting and rejecting. Suppose T is odd. Then, player 1 offers in period T. In the last subgame, player 2 decides whether to accept or reject. Optimal choice: accept any split (as the alternative is getting 0). Seeing this, player 1 offers 0 to player 2 in period T. So, the payoffs from equilibrium play in the subgame beginning in period T are (δ^{T−1}, 0).
Now, consider period (T − 1), where player 2 makes an offer. In period (T − 1), player 1 knows that she can get a payoff of δ^{T−1} by rejecting and moving to period T. So, she will accept an offer if and only if it gives her a payoff ≥ δ^{T−1}. On the other hand, player 2 knows that she will get zero if the bargaining moves to period T. So, her optimal offer to player 1 at the beginning of period T − 1 is $δ, which will be accepted (yielding player 1 a payoff equal to δ^{T−2} · δ = δ^{T−1}). The payoffs arising if the game reaches period T − 1 are (δ^{T−1}, δ^{T−2}(1 − δ)).
Working backwards, in period 1, player 1 makes an offer that is accepted by player 2 (leaving the latter indifferent between accepting and rejecting). Player 1's payoff:

v1*(T) = 1 − δ + δ^2 − ... + δ^{T−1}
       = (1 − δ) (1 − δ^{T−1}) / (1 − δ^2) + δ^{T−1}
       = 1 − [δ / (1 + δ)] (1 − δ^{T−1})

and player 2's payoff:

v2*(T) = 1 − v1*(T) = [δ / (1 + δ)] (1 − δ^{T−1}).

To derive the above expressions, use induction on T. Observe the first mover's advantage (v2*(T) < 1/2) in the division of the dollar, and that it shrinks as T increases.
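The induction on T can be checked numerically. The sketch below assumes the recursion v(1) = 1, v(T) = 1 − δ v(T − 1) for the first proposer's payoff (after a rejection, the responder becomes the first proposer of a game that is one period shorter), and compares it with the closed form for odd T. The function names are made up for this illustration, and δ should be passed as a `Fraction` so the comparison is exact.

```python
from fractions import Fraction

def proposer_share(T, delta):
    """First proposer's SPNE payoff in the T-period alternating-offers game,
    computed from v(1) = 1 and v(T) = 1 - delta * v(T - 1)."""
    v = Fraction(1)
    for _ in range(T - 1):
        v = 1 - delta * v
    return v

def closed_form_odd(T, delta):
    """Closed form for odd T: 1 - delta / (1 + delta) * (1 - delta**(T - 1))."""
    return 1 - delta / (1 + delta) * (1 - delta ** (T - 1))

delta = Fraction(9, 10)
for T in (1, 3, 5, 51):
    assert proposer_share(T, delta) == closed_form_odd(T, delta)

# For a long horizon the share is close to 1 / (1 + delta).
limit = 1 / (1 + delta)
print(float(proposer_share(201, delta)), float(limit))
```

Using exact rational arithmetic avoids having to pick a floating-point tolerance when comparing the recursion with the closed form.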
If T is even, player 2 makes the first offer in the subgame beginning in period 2. This subgame is a bargaining game with an odd number of periods (T − 1), so her payoff in it, valued at period 1, is δ v1*(T − 1). Hence, in any SPNE, player 1 offers her exactly this amount, and player 1's payoff is 1 − δ v1*(T − 1).
As T → ∞, player 1's payoff converges to 1/(1 + δ), while player 2's payoff converges to δ/(1 + δ).
The asymptotic share of the dollar for player 2 is increasing in δ.
As δ → 1, the asymptotic division converges to (1/2, 1/2). Infinite horizon: Rubinstein (1982). Same game, but no longer terminated at any finite T. If the game goes on forever (no player accepts in any time period), both players receive zero payoff.
Unique SPNE: immediate agreement in period 1 with payoffs (1/(1 + δ), δ/(1 + δ)). Observe: Time Stationarity. For the proposer and the responder, the subgame beginning in period 1 is exactly identical to that beginning in any odd period t > 1. The subgame beginning in period 2 is exactly identical to that beginning in any even period t > 1. Further, the subgame starting in any odd period is identical to that starting in any even period, but with the players' roles reversed. This stationarity is used in establishing the SPNE payoffs and their uniqueness.
Let \bar{v}_1 denote the largest payoff that player 1 gets in any SPNE. This is also the largest payoff that player 1 gets in any SPNE of any subgame beginning in an odd period (when evaluated in terms of the present value at the commencement of the subgame). The largest payoff that player 2 can expect in any SPNE of a subgame beginning in any even period (when evaluated in terms of the present value at the commencement of the subgame) is also \bar{v}_1.
Claim: Player 1's payoff in any SPNE ≥ 1 − δ\bar{v}_1. [Why? If player 1's payoff in some SPNE were < 1 − δ\bar{v}_1, then player 1 could deviate and offer δ\bar{v}_1 + ε (for small ε > 0) to player 2 in period 1, and this would be accepted, as player 2 can make at most \bar{v}_1 in any SPNE of the subgame beginning next period.] Define: \underline{v}_1 = 1 − δ\bar{v}_1.
Next, we claim that \bar{v}_1 ≤ 1 − δ\underline{v}_1. To see this, note that in any SPNE, player 2 will reject an offer in period 1 which is less than δ\underline{v}_1, because she can always get at least that much payoff in any SPNE of the continuation game beginning next period. So, if player 1 makes an offer in period 1 which is accepted, she can do no better than 1 − δ\underline{v}_1. Further, if player 1 makes an offer that is rejected by player 2, then, since player 2 can get a payoff of at least δ\underline{v}_1 by moving to the subgame beginning in period 2, player 1 can earn no more than δ(1 − \underline{v}_1) by making an offer that is rejected. Since δ(1 − \underline{v}_1) < 1 − δ\underline{v}_1, we have the above inequality.
Thus,
\bar{v}_1 ≤ 1 − δ\underline{v}_1 = \underline{v}_1 + δ\bar{v}_1 − δ\underline{v}_1    (using 1 = \underline{v}_1 + δ\bar{v}_1),
so that \bar{v}_1 (1 − δ) ≤ \underline{v}_1 (1 − δ), i.e., \bar{v}_1 ≤ \underline{v}_1, which implies \bar{v}_1 = \underline{v}_1 = v_1^e. Thus, the SPNE payoff is uniquely determined.
Using \underline{v}_1 = 1 − δ\bar{v}_1, we have v_1^e = 1 − δ v_1^e, i.e.,
v_1^e = 1/(1 + δ).
So, player 2's payoff is v_2^e = δ/(1 + δ). Finally, agreement is reached in period 1 in any SPNE: it is optimal for player 1 to make an offer that is exactly equal to v_2^e = δ v_1^e.
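As a sanity check on the stationary payoffs, the sketch below verifies three conditions for the shares (1/(1 + δ), δ/(1 + δ)): the responder is indifferent between accepting and rejecting, the proposer does not gain from making an offer that is rejected, and the proposer's share solves v = 1 − δv. It only checks these one-shot conditions and is not a proof of uniqueness. The function names are made up for this illustration.

```python
from fractions import Fraction

def rubinstein_shares(delta):
    """Stationary SPNE shares (proposer, responder) of the current-period dollar."""
    return 1 / (1 + delta), delta / (1 + delta)

def check_one_shot_deviations(delta):
    proposer, responder = rubinstein_shares(delta)
    # Responder: accepting yields `responder` now; rejecting makes her the
    # proposer next period, which is worth delta * proposer today.
    responder_indifferent = responder == delta * proposer
    # Proposer: an offer that is rejected makes her the responder next period,
    # worth delta * responder today, which must not beat agreeing now.
    rejected_offer_worse = delta * responder < proposer
    # The proposer's share is the fixed point of v = 1 - delta * v.
    fixed_point = proposer == 1 - delta * proposer
    return responder_indifferent, rejected_offer_worse, fixed_point

print(check_one_shot_deviations(Fraction(3, 4)))
```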
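Beyond Subgame Perfection. Example. Firm E chooses Out, In1 or In2. If E enters, firm I chooses F or A at a single information set containing both post-entry nodes (I does not observe which entry strategy E used). Payoffs listed as (E, I):

Out:          (0, 2)
In1, then F:  (−1, −1)      In1, then A:  (1, 0)
In2, then F:  (−1, −1)      In2, then A:  (2, 1)

Two pure strategy NE: NE1: {Out, F if not out}; NE2: {In2, A if not out}.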
NE1 is not credible. There is no subgame other than the entire game, so subgame perfection cannot be used to rule out NE1.
To rule out NE1, we need to impose some kind of discipline on what player I's strategy can claim she would do if she were required to act at her information set. One way (in the spirit of sequential rationality): the incumbent firm's action at her information set must be optimal for some belief about the probability distribution over the nodes in her information set (e.g., her belief about the relative likelihood of the two entry strategies of the entrant firm). One can check that F is not optimal for the incumbent at her information set no matter what her belief is about the relative likelihood of In1 and In2.
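The last claim can be checked on a grid of beliefs. This is a minimal sketch that assumes firm I's payoffs as read from the example's tree: F gives −1 at both nodes, while A gives 0 after In1 and 1 after In2. The names `PAYOFF_I` and `expected_payoff` are made up for this illustration.

```python
from fractions import Fraction

# Firm I's payoffs at its information set (assumed reading of the tree):
# first entry = node after In1, second entry = node after In2.
PAYOFF_I = {"F": (-1, -1), "A": (0, 1)}

def expected_payoff(action, mu_in1):
    """Firm I's expected payoff when it believes In1 occurred with prob. mu_in1."""
    after_in1, after_in2 = PAYOFF_I[action]
    return mu_in1 * after_in1 + (1 - mu_in1) * after_in2

# Scan beliefs 0, 0.01, ..., 1 and ask whether F is ever at least as good as A.
beliefs = [Fraction(k, 100) for k in range(101)]
f_ever_optimal = any(
    expected_payoff("F", mu) >= expected_payoff("A", mu) for mu in beliefs
)
print(f_ever_optimal)
```

Since A yields 1 − μ ≥ 0 and F yields −1 for every belief μ, the grid is only an illustration; the comparison holds for all μ in [0, 1].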
Weak Perfect Bayesian Equilibrium. Definition. A system of beliefs μ in extensive form game ΓE is a specification of a probability μ(x) ∈ [0, 1] for each decision node x in ΓE such that

Σ_{x∈H} μ(x) = 1

for all information sets H. For each information set, a system of beliefs indicates the relative likelihood of being at each of the information set's decision nodes, conditional on the play having reached the information set.
Let E[u_i | H, μ, σ_i, σ_{−i}] denote player i's expected utility starting at her information set H if her beliefs regarding the conditional probabilities of being at the various nodes in H are given by μ, if she follows strategy σ_i and her rivals use strategy σ_{−i}. Definition. A strategy profile σ = (σ_1, ..., σ_I) in an extensive form game ΓE is sequentially rational at information set H given a system of beliefs μ if, denoting by ι(H) the player who moves at information set H, we have

E[u_{ι(H)} | H, μ, σ_{ι(H)}, σ_{−ι(H)}] ≥ E[u_{ι(H)} | H, μ, \hat{σ}_{ι(H)}, σ_{−ι(H)}]

for all \hat{σ}_{ι(H)} ∈ ∆(S_{ι(H)}). If strategy profile σ satisfies this condition for all information sets H, then we say that σ is sequentially rational given belief system μ.
The above definition ensures that the actions that players' strategies specify at unreached information sets are optimal for some system of beliefs (with no restriction on which system of beliefs).
However, we also need to ensure that for information sets that are reached with positive probability under a given strategy profile, the strategies chosen by the players at the information sets are optimal - not just for any arbitrary system of beliefs - but for beliefs that are consistent with the strategies.
In the spirit of Nash equilibrium, players should hold correct beliefs about opponents' strategy choices. Therefore, for any information set reached with positive probability (when players play according to their strategies), it should be possible to correctly forecast the relative likelihood of being at each of the decision nodes in the information set by using Bayes' rule.
Definition. A strategy profile and a system of beliefs (σ, μ) is a Weak Perfect Bayesian Equilibrium (weak PBE) in an extensive form game ΓE if the following holds: (i) The strategy profile σ is sequentially rational given system of beliefs μ. (ii) The system of beliefs μ is derived from strategy profile σ through Bayes' rule whenever possible. That is, for any information set H such that Pr(H | σ) > 0, we must have

μ(x) = Pr(x | σ) / Pr(H | σ)

for all x ∈ H.
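Condition (ii) is a one-line computation once the probability of reaching each node under σ is known. The helper below is a minimal sketch; its name and the dictionary interface are assumptions of this illustration. It returns `None` when the information set is reached with probability zero, the case in which weak PBE leaves beliefs unrestricted.

```python
def beliefs_from_strategies(reach_prob):
    """Beliefs at one information set H from node reach probabilities.

    `reach_prob` maps each node x in H to Pr(x | sigma). Returns
    mu(x) = Pr(x | sigma) / Pr(H | sigma), or None if Pr(H | sigma) = 0.
    """
    total = sum(reach_prob.values())  # Pr(H | sigma)
    if total == 0:
        return None
    return {node: p / total for node, p in reach_prob.items()}

# Entrant enters via the second node for sure: all weight goes to that node.
print(beliefs_from_strategies({"left": 0, "right": 1}))
# Entrant stays out: the information set is unreached, beliefs are unrestricted.
print(beliefs_from_strategies({"left": 0, "right": 0}))
```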
* Every weak PBE is a NE. The only difference between a weak PBE and a NE is that the latter imposes no restriction at all on choice of actions by players at information sets that are reached with zero probability during actual play of the strategy profile.
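Example. Consider again the entry game above, with payoffs listed as (E, I): Out gives (0, 2); after In1, F gives (−1, −1) and A gives (1, 0); after In2, F gives (−1, −1) and A gives (2, 1). Firm I's information set contains the node after In1 (left) and the node after In2 (right).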
In any weak PBE, firm I must play A if entry occurs because that is the optimal action for firm I starting at his information set for any system of beliefs. So, the SPNE {out,F if enter} is not a weak PBE. The SPNE {In2,A if enter} is a weak PBE.
In the strategy profile {In2, A if enter}, the information set for firm I is reached with positive probability (in fact, probability one), and therefore the system of beliefs for nodes in this information set must be derived from the strategy of player E, i.e., must assign probability one to being at the right node of the information set.
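Three-player game with players E1, E2 and firm I. Payoffs are listed as (E1, E2, I).
E1 chooses Out, Ind or Joint. Out ends the game with payoffs (0, 0, 3).
After Ind, firm I moves (left node of its information set).
After Joint, E2 chooses Accept or Decline. After Accept, firm I moves (middle node of its information set).
After Decline, E1 chooses In or Out. Out ends the game, with 0 for both E1 and E2. After In, firm I moves (right node of its information set).
Firm I chooses F or A without knowing which node of its information set has been reached:

Left node:    F gives (−1, 0, 2)     A gives (2, 0, 1)
Middle node:  F gives (1, 1, −2)     A gives (4, 4, 0)
Right node:   F gives (−1, 0, 2)     A gives (2, 0, 1)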
In any weak PBE, player E2 must accept an offer of Joint by E1 because he is guaranteed a strictly positive payoff. This, in turn, implies that player E1 must propose Joint, anticipating that E2 will accept the offer (E1's payoff under both the F and the A strategy of player I is thereby higher). Thus, firm I's information set is reached with positive probability in any weak PBE, and Bayesian updating requires that (given the strategies played by E1 and E2) he assign probability one to the middle node, so his strategy in any weak PBE must be A if entry occurs. Unique weak PBE: {(Joint, In if E2 declines), Accept, A if entry occurs} with a system of beliefs that assigns probability one to the middle node of the information set.
There are other SPNE; for example, {(Out, Out if E2 declines), Decline, Fight if entry occurs} is a SPNE (the only subgame is the whole game, so we just need to check that this is a NE).
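Example. Firm E chooses Out, In1 or In2. If E enters, firm I chooses F or A at a single information set containing the node after In1 (left) and the node after In2 (right). Here γ > 0. Payoffs listed as (E, I):

Out:          (0, 2)
In1, then F:  (−1, −1)      In1, then A:  (3, −2)
In2, then F:  (γ, −1)       In2, then A:  (2, 1)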
No strategy for firm I is optimal independent of the system of beliefs, and the optimal strategy for E depends on how she thinks I will play at its information set. Look for a fixed point: a system of beliefs for firm I's information set such that the optimal behavior of firm I given this belief generates an optimal behavior of firm E that, in turn, is consistent with the system of beliefs.
Unique weak PBE: firm E plays a mixed strategy: In1 with prob 2/3 and In2 with prob 1/3. Firm I plays a mixed strategy: F with prob 1/(γ + 2) and A with prob (γ + 1)/(γ + 2).
System of beliefs: left node with prob 2/3 and right node with prob 1/3.
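The fixed point can be verified by checking the two indifference conditions. The sketch below assumes the payoffs as read from the tree ((−1, −1) and (3, −2) after In1, (γ, −1) and (2, 1) after In2, (0, 2) for Out). The function name is made up for this illustration, and γ should be passed as a `Fraction` so the equalities are exact.

```python
from fractions import Fraction

def mixed_weak_pbe_payoffs(gamma):
    """Expected payoffs in the candidate weak PBE for a given gamma > 0."""
    in1, in2 = Fraction(2, 3), Fraction(1, 3)   # firm E's entry mix
    prob_F = 1 / (gamma + 2)                    # firm I's probability of F
    prob_A = 1 - prob_F
    # Firm I's payoffs under beliefs (in1, in2); Bayes' rule pins these
    # beliefs down because E enters with probability one.
    u_I = {"F": in1 * (-1) + in2 * (-1), "A": in1 * (-2) + in2 * 1}
    # Firm E's payoff from each pure strategy against firm I's mix.
    u_E = {
        "Out": Fraction(0),
        "In1": prob_F * (-1) + prob_A * 3,
        "In2": prob_F * gamma + prob_A * 2,
    }
    return u_I, u_E

u_I, u_E = mixed_weak_pbe_payoffs(Fraction(1))
print(u_I)  # F and A give firm I the same payoff
print(u_E)  # In1 and In2 give firm E the same payoff, above Out
```

Firm I is indifferent between F and A, and firm E is indifferent between In1 and In2 and strictly prefers either to Out, so both mixtures are best replies.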
Problems with Weak PBE: No restrictions at all are placed on the system of beliefs off the equilibrium path i.e. in information sets that are reached with probability zero during play of the equilibrium strategies. (Such beliefs are referred to as "off-equilibrium beliefs"). This leads to unreasonable predictions.
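Nature moves first, choosing left or right with probability 1/2 each. Player 1, who does not observe nature's move (one information set, beliefs (.5) and (.5)), chooses x or y. After x, the game ends with payoffs (2, 10) at either node. After y, player 2 moves at an information set containing both nodes (beliefs (.9) on the left node and (.1) on the right node) and chooses l or r. Payoffs listed as (player 1, player 2):

Left node:    l gives (0, 5)     r gives (5, 2)
Right node:   l gives (0, 5)     r gives (5, 10)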
Weak PBE:{x, l} with the system of beliefs shown in brackets.
The beliefs specified for player 2's information set are not sensible: this information set can be reached only if player 1 deviates, choosing y with strictly positive probability, and this deviation must be independent of nature's move (since that move is not observed by player 1). In that case, the two nodes ought to be equally likely. Even though an information set is reached with zero probability, the probabilities underlying the beliefs over its nodes need to be consistent with the strategies in some sense.
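Example. A weak PBE may not be a subgame perfect Nash equilibrium. Firm E first chooses Out or In. Out ends the game with payoffs (0, 2). After In, firm E chooses F or A; firm I, without observing this choice, then chooses F or A at a single information set (left node after E's F, right node after E's A). Payoffs listed as (E, I):

E plays F:    I plays F gives (−3, −1)     I plays A gives (1, −2)
E plays A:    I plays F gives (−2, −1)     I plays A gives (3, 1)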
One weak PBE of this game is {(Out, A if In), F if In} with system of belief for player I’s information set that puts probability 1 on the left node.
Note: Player I's information set is reached with probability zero in this equilibrium, so weak PBE places no restriction on the system of beliefs there. These strategies are not a SPNE: they do not induce a NE in the post-entry subgame. [Unique SPNE: {(In, A if In), A if In}.] Firm I's post-entry belief about firm E's post-entry play is unrestricted by the weak PBE concept.
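The claim about the post-entry subgame can be checked by enumerating its pure-strategy NE. The sketch below assumes the payoffs as read from the tree, indexed by (firm E's post-entry action, firm I's action). The names `SUBGAME` and `is_nash` are made up for this illustration.

```python
from itertools import product

# Post-entry subgame payoffs (firm E, firm I), keyed by (E's action, I's action).
SUBGAME = {
    ("F", "F"): (-3, -1), ("F", "A"): (1, -2),
    ("A", "F"): (-2, -1), ("A", "A"): (3, 1),
}
ACTIONS = ["F", "A"]

def is_nash(e_action, i_action):
    """True if neither firm gains from a unilateral deviation in the subgame."""
    u_e, u_i = SUBGAME[(e_action, i_action)]
    e_ok = all(SUBGAME[(a, i_action)][0] <= u_e for a in ACTIONS)
    i_ok = all(SUBGAME[(e_action, a)][1] <= u_i for a in ACTIONS)
    return e_ok and i_ok

subgame_ne = [p for p in product(ACTIONS, ACTIONS) if is_nash(*p)]
print(subgame_ne)         # pure NE of the post-entry subgame
print(is_nash("A", "F"))  # post-entry play prescribed by the weak PBE above
```

The weak PBE prescribes (A, F) after entry, which fails the Nash check because firm I would rather play A against E's A.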
Therefore, need to strengthen weak PBE by imposing additional consistency restrictions on beliefs. * Perfect Bayesian Equilibrium, * Sequential Equilibrium (Kreps and Wilson, 1982).
Sequential Equilibrium. Definition. A strategy profile and a system of beliefs (σ, μ) is said to be a sequential equilibrium of an extensive form game ΓE if the following hold: (i) Strategy profile σ is sequentially rational given belief system μ. (ii) There exists a sequence of completely mixed strategies {σ^k}_{k=1}^∞ with lim_{k→∞} σ^k = σ, such that μ = lim_{k→∞} μ^k, where μ^k denotes the beliefs derived from the strategy profile σ^k using Bayes' rule. Note that if strategies are completely mixed, then all information sets are reached with positive probability and the probability distribution over nodes in all information sets can be derived directly from the strategies of players using Bayes' rule. In a sequential equilibrium, beliefs off the equilibrium path are justifiable as coming from some set of totally mixed strategies that are close to the equilibrium strategies. This can be viewed as requiring that players justify their beliefs (approximately) by some story in which, with some small probability, players make mistakes in choosing their strategies.
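Example. Consider again the game in which nature moves left or right with probability 1/2 each; player 1, not observing nature's move, chooses x, which ends the game with payoffs (2, 10), or y, after which player 2 chooses l or r at an information set containing both nodes. Payoffs listed as (player 1, player 2):

Left node:    l gives (0, 5)     r gives (5, 2)
Right node:   l gives (0, 5)     r gives (5, 10)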
Any system of beliefs that can be derived from a sequence of completely mixed strategies must assign equal probability to both nodes of player 2’s information set. In other words, if player 1 plays strategy y with probability > 0 no matter how small, player 2 must assign prob. 0.5 to the two nodes. Given this, player 2 will play r optimally and this implies that player 1 plays y. Unique sequential equilibrium: {y, r} with system of beliefs that assign probability 0.5 to both nodes of player 2’s information set.
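The tremble argument can be illustrated numerically. The sketch below assumes the payoffs as read from the tree (for player 2: l gives 5 at both nodes, r gives 2 at the left node and 10 at the right node). It computes player 2's Bayes' rule beliefs when player 1 plays y with a small probability, and her best reply under the weak PBE beliefs (.9, .1) and under the beliefs (.5, .5). The function names are made up for this illustration.

```python
from fractions import Fraction

# Player 2's payoffs at her two nodes (left, right) for each action.
U2 = {"l": (5, 5), "r": (2, 10)}

def beliefs_after_y(prob_y):
    """Bayes' rule beliefs at player 2's information set when player 1,
    who does not observe nature's move, plays y with probability prob_y > 0."""
    left = Fraction(1, 2) * prob_y    # nature left, then y
    right = Fraction(1, 2) * prob_y   # nature right, then y
    return left / (left + right), right / (left + right)

def best_reply(mu):
    """Player 2's best action and her expected payoffs under beliefs mu."""
    expected = {a: mu[0] * u[0] + mu[1] * u[1] for a, u in U2.items()}
    return max(expected, key=expected.get), expected

# However small the tremble, the induced beliefs are (1/2, 1/2).
for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**9)):
    assert beliefs_after_y(eps) == (Fraction(1, 2), Fraction(1, 2))

print(best_reply((Fraction(9, 10), Fraction(1, 10))))  # weak PBE beliefs
print(best_reply((Fraction(1, 2), Fraction(1, 2))))    # limit of tremble beliefs
```

Under (.9, .1) player 2 prefers l, which supports {x, l}; under (.5, .5) she prefers r, after which y gives player 1 a payoff of 5 rather than the 2 from x.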
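Consider again the entry game in which firm E chooses Out, with payoffs (0, 2), or In; after In, firm E chooses F or A and firm I, not observing this choice, chooses F or A. Payoffs listed as (E, I):

E plays F:    I plays F gives (−3, −1)     I plays A gives (1, −2)
E plays A:    I plays F gives (−2, −1)     I plays A gives (3, 1)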
Consider any totally mixed strategy profile. Then, the node following firm E choosing "In" is reached with positive probability, and so is Firm I's information set.
Using Bayes’s rule, the posterior probability of being at each of the two nodes in Firm I’s information set (conditional on having reached it) is exactly the relative likelihood with which firm E actually plays F and A (conditional on having entered the industry). The limiting belief used in a sequential equilibrium must therefore reflect the actual play of firm E in the subgame. So, in the sequential equilibrium, firm I must be best responding to the actual play of firm E in the subgame following entry and vice-versa. In other words, the strategies must induce a NE in the subgame following entry. The unique SPNE {(In,A if In),A if In} is the unique sequential equilibrium.
* The equilibrium strategy profile of every sequential equilibrium constitutes a SPNE.
* Concept of sequential equilibrium strengthens both weak PBE and SPNE.