Eco514|Game Theory Lecture 2: (Iterated) Best Response Operators Marciano Siniscalchi September 21, 1999 Introduction This lecture continues the analysis of normal-form games. We analyze general, non-zerosum games, emphasizing the informal \equation": Rational Behavior + Assumptions about Beliefs = Solution Concepts Before we tackle the new material, let us review what we have learned about zerosum games in light of this \equation". Rational behavior in the context of normal-form games (zerosum or otherwise) means that players choose strategies which maximize their expected payo , given their beliefs about their opponents’ choices. Today we shall examine a few technical aspects related to the existence and characteristics of the best reply correspondence, but the basics should be familiar to you from decision theory. The minmax theory makes the following assumption about beliefs: each player expects her opponent to (1) understand that she will best-respond to his strategy, and (2) play so as to minimize her expected payo . That is, each player reasons as if she were to observe the (possibly random) choice of her opponent and best-respond to it, and as if her opponent, anticipating this, chose a strategy in order to make her payo as low as possible. Formally, the problem min ij2 (Aj) max ii2 (Ai) ui( ii; ij) de nes the belief ij of Player i about Player j. Best Replies Let us begin with a formal de nition of best replies in the case of nite games. 1 De nition 1 Fix a nite normal-form game G = (N;(Ai;ui)i2N), a player i 2 I and a probability measure i2 (A i). An action ai2Ai is a best reply to the belief i i X A i i(a i)ui(ai;a i) X A i i(a i)ui(a0i;a i) 8a0i2Ai: In nite games, expected payo s are always well-de ned. Not so for games with in nite action spaces: even assuming a nice measure-theoretic structure, it is not hard to construct (mildly pathological) examples in which payo functions are not integrable. Thus, a better de nition for in nite games is the following. Assume that, for every player i2N, the set of actions (Ai;Ai) is a measure space. Endow the product sets A and A i with the natural product sigma-algebras, denotedAandA i. Finally, assume that ui isA- measurable. The sigma-algebras on the set of actions will be indicated explicitly for in nite games. De nition 2 Fix a normal-form game G = (N;(Ai;Ai;ui)i2N), a player i2N and a belief i 2 (A i;A i). An action ai 2 Ai is a best reply to the belief i i the Lebesgue integral RA iui(ai;a i) i(dai) exists, and moreover Z A i ui(ai;a i) i(dai) Z A i ui(a0i;a i) i(dai) 8a0i2Ai whenever the right-hand side integral exists. I emphasize that integrals are taken (here and in the following) in the Lebesgue sense. Even if expected payo s are well-de ned, the very structure of the game might lead to the non-existence of best replies. An obvious example arises in the Bertrand price competition game. Example. Two rms produce and sell the same good, facing a market of size Q, and zero xed and marginal cost; nally, the two rms compete by posting unit prices pi, i = 1;2. Assume that consumers have unit demand and buy from the cheapest producer; if p1 = p2, consumers buy from each rm with probability 12. Suppose that Firm 1 expects Firm 2 to post a price equal to p2 = 2; this corresponds to a degenerate belief, concentrated on 2. Then any price p1 > 2 yields 0, any price p1 < 2 yields Qp1, and p1 = p2 = 2 yields 2Q2 = Q. Clearly no price p1 satis es our de nition of a best reply. The conditions for the existence of best replies are related to those which guarantee the existence of an optimum. We need to specify a topology Ti on every action space Ai; 2 furthermore, we assume that the relevant sigma-algebra in the de nition of a best reply is the corresponding Borel sigma-algebra B(Ti) (hence, ui( ) is Borel-measurable if it is continuous). Finally, we assume product sets are endowed with the product topology and the induced Borel product sigma-algebra. Proposition 0.1 Consider a game G = (N;(Ai;Ti;ui)i2N) and x a player i2N. If ui( ) is bounded and continuous, and Ai is a compact metrizable space, then for every measure i2 (A i;B(Ti)) there exists a best reply ai2Ai. Proof: We need to show that the map ’ : Ai7!RA iui(ai;a i) i(da i) is continuous. Since Ai is a metric space, sequences characterize continuity. Thus, x a sequence aki !ai in Ai. By assumption, ui(aki;a i) !ui(ai;a i) for every a i 2A i, and ui( ) is bounded. Thus, by Lebesgue’s Dominated Convergence theorem, ’(aki )!’(ai). Observe that, in the above proof, it is essential that expectations be de ned in the Lebesgue sense: otherwise, stronger conditions are required on the payo functions and action spaces. There are two reasons why we are interested in in nite games. First, for Nash equilibria to exist one must enlarge the players’ action spaces to include all mixtures of strategies; thus, e ectively, one is looking at an in nite game, in which every strategy set is a simplex, hence a compact subset of Euclidean space, and every payo function is continuous (because the expected payo from a mixed strategy pro le is linear in the probabilities). Thus, the preceding result applies. Second, several interesting economic models lend themselves to a formulation involving in nite strategy spaces. Thus, getting some exposure to the issues that arise in such settings can be useful. The Best Reply Correspondence Recall that a correspondence is a set-valued function. Since, in general, Player i may have more than one best reply to a belief i2 (A i;A i), we use correspondences to describe the mapping from beliefs to best replies. De nition 3 Fix a game G = (N;(Ai;Ai)i2N) and a player i2N. The best-reply corre- spondence for player i, ri : (A i;A i))Ai, is de ned by 8 i2 (A i;A i); ai2ri( i),ai is a best reply given i The preceding proposition then gives conditions under which the best-reply correspon- dence is nonempty. 3 Correlated Rationalizability and Iterated Dominance Having disposed of all required technicalities, let us go back to our \informal equation" Rational Behavior + Assumptions about Beliefs = Solution Concepts. Fix a game G = (N;(Ai;Ti;ui)i2N) and assume that each (Ai;Ti) is a compact metrizable space (you can also assume that each Ai is nite, as far as the substantive arguments are concerned) and that payo functions are continuous. We begin by asking: what is a plausible outcome of the game, if we are willing to assume that players are (Bayesian) rational? The answer should be clear: we can only expect an action pro le (ai)i2N to be chosen i , for every player i 2 N, ai 2 ri( i) for some i2 (A i;B(Ti)). That is: if we think that players are \good Bayesians," we should expect them to formu- late a belief about their opponents’ choice, and play a best reply given that belief. Equiva- lently, we should expect a rational player to choose an action only if it that action is justi ed by some belief. This exhausts all the implications of Bayesian rationality: in particular, we do not pos- tulate any relationship between the beliefs justifying the actions in any given pro le. In very simple games, Bayesian rationality is su cient to isolate a unique solution: for instance, this is true in the Prisoner’s Dilemma. However, in Matching Pennies, the Row player (who wants to match) may choose H because she expects the Column player to also choose H, whereas the Column player (who wants to avoid matches) actually chooses T because he (correctly, it turns out) expects the Row player to choose H. Indeed, it is easy to see that any action pro le is consistent with Bayesian rationality in Matching Pennies. However, one might reason as follows: If I, the theorist, am able to rule out certain actions because they are never best-replies, perhaps the players will be able to do so, too. But, if so, it seems reasonable to assume that their beliefs should assign zero probability to the col- lection of eliminated actions. This entails a further restriction on the collection of actions that they might choose. That is, in addition to assuming that players are Bayesian rational, we may be willing to assume that they believe that their own opponents are also Bayesian rational. This is precisely the type of assumption about beliefs referred to in our \informal equation." It is also a very powerful idea: 4 Example: Consider a two- rm version of the Cournot quantity competition game, with action spaces Ai = [0;1] and payo s ui(ai;a i) = (2 ai a i)ai: that is, marginal costs are zero, and inverse demand is given by P(q) = 2 q. Notice that Player i’s expected payo depends linearly on Player i’s expected quantity; thus, our analysis can be restricted to degenerate (point) beliefs without loss of generality. For any a i2A i, Player i’s best reply is found by solving max ai2[0;1] (2 ai a i)ai; so ri(a i) = 1 12a i Then notice that no ai2[0; 12) is ever a best reply: after all, even if a i = 1, there would still be residual demand in the market, and consumers would be willing to pay 1 dollar for an additional marginal unit of the good. That is, Player i is actually facing the residual inverse demand curve PR(ai) = 1 ai, so setting ai = 12 (the corresponding monopoly quantity) is optimal. This is true for i = 1;2. Thus, assume that Player i understands that a i2[12;1]: that is, Player i believes that, since her opponent is rational, he will never produce less than 12 units of the good. But then, producing ai 2 (34;1] units cannot be a best reply against a belief satisfying this restriction. Since, moreover, we have already concluded that quantities ai2[0; 12) must be discarded on the basis of rationality considerations alone, the assumptions of Bayesian rationality + Belief in the opponent’s Bayesian rationality already imply that only action pro les (a1;a2) with ai2[12; 34] should be expected. Once we are willing to make this assumption about the players’ beliefs, it becomes natural to at least conceive the possibility that the players themselves might contemplate it. This leads to a hierarchy of hypotheses about mutual belief in rationality. That is, we might think that: (1) players are Bayesian rational; (2) players believe that their opponents are also Bayesian rational; (3) players believe that their opponents believe that their own opponents are Bayesian rational; and so on. The following de nition captures the implications of these assumptions. De nition 4 Fix a game G = (N;(Ai;Ti;ui)i2N). For every player i2N, let A0i = Ai. For k 1 and for every i2I, let ai2Aki i there exists i2 (A i;B(T i)) such that 5 (1) ai2ri( i); (2) i(Qj6=iAk 1j ) = 1. An action ai2Aki is said to be k-correlated rationalizable; an action ai2Tk 1Aki is said to be correlated rationalizable. Part (2) in the above de nition captures the idea that, at each step k, we are willing to impose the assumption that players are able to performk 1 steps of eductive reasoning about their opponents. Although this is not explicitly required in the de nition, this corresponds to progressively more stringent assumptions about rationality; if action sets are nite, this implies that correlated rationalizable strategies exist (why?). Proposition 0.2 Fix a game G = (N;(Ai;Ti;ui)i2N). Then, for every i2N and k 1, Aki Ak 1i . Proof: The claim is true for k = 1. Assume it is true for some k, x i2N and pick ai 2 Ak+1i . By de nition, there exists i 2 (A i;B(T i)) such that ai 2 ri( i) and i(Ak i) = 1. Since by the induction hypothesis Ak i Ak 1 i , this implies i(Ak 1 i ) = 1; hence, ai2Aki , as claimed. Independent rationalizability, or (in keeping with standard terminology) \rationalizabil- ity" tout-court, incorporates the additional restriction that players’ beliefs are actually in- dependent product measures on their opponents’ action pro les; that players believe that their opponents’ beliefs are independent; that they believe that their opponents believe that their respective opponents’ beliefs are independent; and so on and so forth. De nition 5 Fix a game G = (N;(Ai;Ti;ui)i2N). For every player i2N, let R0i = Ai. For k 1 and for every i2I, let ai2Rki i there exists i2 (A i;B(T i)) such that (1) ai2ri( i); (2) i(Qj6=iRk 1j ) = 1; (3) i is a product measure on A i;B(T i). An action ai 2Rki is said to be k-rationalizable; an action ai 2Tk 1Rki is said to be rationalizable. I conclude this section with two notes pertaining to the interpretation of (correlated) rationalizability. First, I emphasize that, although we have motivated (correlated) rationalizability with informal assumptions about rationality and beliefs, we have been quite vague in de ning what we mean by statements such as, \Player i believes that Player i is rational." We have claimed that this \implies" that Player i’s beliefs should assign positive probability only to those strategies of Player i which can themselves be justi ed by some belief. 6 However, this is an informal statement. Recent contributions have shown how to con- struct formal models (based on the classical decision-theoretic setting) which allow one to formalize ideas such as \rationality" and \belief". In the framework of those models, the \implication" we have used to de ne rationalizability can actually be proved|i.e. it is a theorem. This places rationalizability on very solid methodological ground. Second, it might seem rather natural to assume that beliefs should be independent (and that moreover this ought to be \common belief"). After all, the basic model of simultaneous games postulates that players choose their actions independently, without any possibility of coordination. Thus, correlated beliefs might even appear to be unwarranted. The debate on \causal vs. epistemic independence," as the issue is often referred to, is lengthy and beyond the scope of these notes. Let me just invite you to re ect on the subject, based on the following example. Consider the following three-player \coordination" game: Players 1 and 2 choose L or R, and get 1 if they coordinate and 0 if they do not. Player 3 chooses C or N; C yields 1 if Players 1 and 2 coordinate, and 0 otherwise, while N yields 0 if they coordinate and 1 if they do not. That is, Player 3’s choice represents a bet on whether or not Players 1 and 2 manage to coordinate. Now suppose that Player 3 is fully aware that Players 1 and 2 choose independently; however, she expects them to coordinate, but is unsure as to which action they will choose (i.e. L or R). Thus, she assigns probability 12 to each of the pro les (L,L) and (R,R). This is a correlated belief, but it could be explained with the following simple story: Players 1 and 2 have been playing this game for a long time, and have \learned to coordinate." Player 3 knows this, but has never observed their actual play|perhaps because she has also been playing the game for a long time, but she has only observed her own payo in each play of the game (and hence ascertained whether or not there was coordination). I welcome your comments on this issue! Rationalizability and Dominance Consider the following de nition: De nition 6 Fix a nite game G = (N;(Ai;ui)i2N) and a player i2N. An action ai2Ai is strictly dominated for Player i i there exists i2 (Ai) such that 8a i2A i; X a0i2Ai ui(a0i;a i) i(ai) >ui(ai;a i); 7 ai is weakly dominated for Player i i there exists i2 (Ai) such that 8a i2A i; X a0i2Ai ui(a0i;a i) i(ai) ui(ai;a i) and there exists a i2A i such that X a0i2Ai ui(a0i;a i) i(ai) >ui(ai;a i): It is (relatively) easy to see that a strictly dominated strategy cannot be a best reply against any belief. The converse is also true: Proposition 0.3 Fix a nite game G = (N;(Ai;ui)i2N) and a player i 2 N. An action ai 2 Ai is strictly dominated for Player i i there is no belief i 2 (A i) such that ai2ri( i). Proof: Number the actions in Ai except ai from 1 tojAij, and the action pro les in A i from 1 to jA ij. De ne a jAij jA ij matrix U by letting Umn = ui(ani;am i) ui(ai;am i), where ani is the n-th action in Ainfaig, and am i is the m-th action pro le in A i. Consider the problem: max x 0 0 x s.to U x 0;10x = 1 where x is a jA ij-dimensional column vector. If such a vector exists, then it represents a belief for i, and ai2ri(x). The dual problem is min y 0;w w s.to y0 U +w10 0 where y is a jAi 1j-dimensional column vector and w is a scalar (not sign-constrained). The min problem reads as follows: nd a mixture y of actions for i, assigning zero weight to ai, such that the expected payo to the mixture is at least w higher than the payo to ai for any action pro le of i’s opponents. Note that w is unconstrained, and the problem seeks to minimize w (hence make w as large as possible). If the optimal value of the problem is 0, so w = 0, then ai is not strictly dominated. Observe that the max program is always bounded, while the min program is always feasible (take y = 0, w = 0). Thus, suppose ai is a best reply, so the max program is feasible: then, by the Existence-Duality theorem, both problems must have optimal solutions, and the values must coincide. This implies that w = 0, so ai is not strictly dominated. Suppose 8 instead that ai is not a best reply to any belief, so the max program is infeasible; again, this implies that the min program must be unbounded, i.e. w can be made arbitrarily negative (positive values cannot be optimal, because 0 is a feasible value). But this means that ai is strictly dominated: for any K < 0 there exists a feasible pair (y;w) such that w K: but for this y, Pui(ani;a i)yn ui(ai;a i) > K > 0 for any a i2A i. Let me emphasize that it is crucial to look at domination by mixtures of strategies for the equivalence result to hold. If one is not willing to allow for actual randomization, then iterated strict dominance as de ned above may make little sense, and one may wish to consider domination by pure strategies: Tilman Borgers, among others, has worked on this problem. I will ask you for an alternative proof of the preceding proposition, based on the Sepa- rating Hyperplane theorem. I will also ask you to prove the following (easy) result, which is one reason why correlated rationalizability is worth mentioning (even if you believe in independent beliefs). First, consider the following de nition: De nition 7 Fix a nite game G = (N;(Ai;ui)i2N). For every player i2N, let SD0i = Ai; next, for k 1, and for every i2N, say that ai2SDki i ai is not strictly dominated in the game Gk 1 = (N;(SDk 1i ;uk 1i )i2N) (where uk 1i denotes the appropriate restriction of ui). The idea is to discard strictly dominated strategies, then look at the game resulting from G by removing them, and iterate the de nition. Proposition 0.4 Fix a nite game G = (N;(Ai;ui)i2N). Then, for every player i2N, and for every k 1, Aki = SDki . I will have more to say about weak dominance and iterated weak dominance when we discuss extensive games. 9