Eco514 Game Theory

Lecture 1: Game Theory = Multiperson Decision Theory; Zero-Sum Games

Marciano Siniscalchi, September 16, 1999

Administrative Stuff

Class: Tue-Thu 10:40-12:10 [?], Room 317, Bendheim. Office hours: by appointment.

The Big Picture

Most of you will already have used some of the tools of GT in your core courses. You will probably be familiar with the notions of simultaneous vs. extensive-form games, perfect vs. imperfect information, complete vs. incomplete information, Nash equilibrium, perfect Bayesian equilibrium, sequential equilibrium, etc. (but don't worry if you are not).

This course has two main objectives: (1) introduce additional, perhaps more sophisticated analytical tools, which you will most likely find useful in your own work; (2) enhance your understanding of the main ideas (as opposed to the main tools) of GT, which includes (among other things) a critical appraisal of the applicability of the techniques you have learned.

Under point (1), I will discuss, among other issues: sophisticated models of incomplete-information games which, for instance, allow one to think clearly about currency speculation, bank runs and similar phenomena having to do with "market sentiment"; forward induction, a powerful idea which pops up in many applied models, from political economy to industrial organization; non-equilibrium solution concepts, which we may view as a way to test the robustness of strategic predictions to deviations from "textbook" settings; reputation phenomena, which may shed some light on, say, a certain software behemoth's unduly aggressive response to new entrants; and so on.

However, in my mind, (2) is much more important than (1). The reason is that, in my perhaps-not-so-humble opinion, GT is best viewed not as a "bag of tricks", but as a methodology: a select collection of internally consistent, general principles which helps you structure your own reasoning about strategic interaction in a given setting.
Games as Multiperson Decision Problems

Decision theory enjoys an enviable status in economics. It is often seen as the prototypical example of a rigorous, well-founded theory, which is easy to apply to the analysis of a wide variety of relevant decision situations in the social sciences: from Econ 101-type models to GE and finance.

The link with GT might not be apparent. Indeed, it has been somewhat downplayed in the past, because certain conclusions one can draw from strategic reasoning seem to be so much at odds with decision theory that the very existence of such a link might appear questionable (think about: "more information can hurt you"). However, the link is definitely out there, and recent research has brought it to the fore. What might be surprising is that the link was quite evident in the early days of GT, when people were mostly interested in zero-sum games. I shall presently state my case and, in the process, review a related mathematical tool whose importance I simply cannot overemphasize: linear programming.

Making the connection

First, I need to define a decision problem.

Definition 1 A decision problem is a tuple $\{\Omega, C, F\}$, where $\Omega$ is a set of possible states of the world, $C$ is a set of consequences, and $F \subseteq C^\Omega$ is a set of acts.

The idea is that, if the decision-maker (DM) chooses act $f \in F$, and the prevailing state is $\omega \in \Omega$, then the consequence $f(\omega) \in C$ obtains. Consequences are supposed to represent all the DM cares about.

The DM does not observe the prevailing state of the world: that is, there is uncertainty about it. Still, she might be assumed to exhibit preferences (represented by a binary relation $\succeq \subseteq F \times F$) among available acts. The classical "representation" result exhibits a set of joint assumptions about $\succeq$ (and $F$) which are equivalent to the existence of a probability measure $\mu \in \Delta(\Omega)$ and a utility function $u : C \to \mathbb{R}$ (unique up to a positive affine transformation) which represents $\succeq$:

\[ \forall f, g \in F: \quad f \succeq g \iff \int_\Omega u(f(\omega))\,\mu(d\omega) \geq \int_\Omega u(g(\omega))\,\mu(d\omega). \]
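The representation result can be illustrated with a tiny concrete decision problem. The following sketch (the umbrella example, its state names, probabilities and utility numbers are all ours, not from the lecture) computes expected utilities of two acts and ranks them:

```python
# A finite decision problem: states Omega, consequences C, and acts as
# dicts mapping each state to a consequence.  All names and numbers here
# are illustrative assumptions, not part of the lecture notes.
states = ["rain", "sun"]                                  # Omega
mu = {"rain": 0.25, "sun": 0.75}                          # belief mu in Delta(Omega)
util = {"wet": 0, "dry_burdened": 6, "dry_free": 10}      # utility u : C -> R

# Two acts f : Omega -> C
take_umbrella = {"rain": "dry_burdened", "sun": "dry_burdened"}
leave_it      = {"rain": "wet",          "sun": "dry_free"}

def expected_utility(act):
    # Integral of u(f(omega)) against mu (a finite sum here)
    return sum(mu[w] * util[act[w]] for w in states)

print(expected_utility(take_umbrella))  # 6.0
print(expected_utility(leave_it))       # 7.5 -> leave_it is preferred
```

Under this particular belief, the DM prefers `leave_it`; raising the probability of rain reverses the ranking, which is exactly the sense in which the pair $(\mu, u)$ pins down the preference over acts.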
Next, let us define simultaneous games.

Definition 2 A finite simultaneous game is a tuple $\{N, (A_i, u_i)_{i \in N}\}$, where $N$ is a finite set of players, and for each player $i \in N$, $A_i$ is a finite set of actions and $u_i$ is a payoff function $u_i : A_i \times A_{-i} \to \mathbb{R}$.

(I use the conventional notation $A_{-i} = \prod_{j \neq i} A_j$ and $A = \prod_{i \in N} A_i$.)

We could be more general and assume that each player is endowed with a preference relation over action profiles, but let's not worry about this right now.

The above is the "standard" definition of a game. As you may see, it is not quite consistent with the definition of a decision problem, so we need to reinterpret things a bit. Indeed, the most important difference is that a game implicitly defines not just one, but $N$ decision problems (with a conventional abuse of notation, I will denote by $N$ both the set and its cardinality $|N|$).

Let's take the point of view of Player $i$. She has no control over her opponents' choices, and she will not find out what they were until the game is over. Thus, $A_{-i}$ seems a good candidate for the state space: $\Omega = A_{-i}$.

Next, utilities are attached to points $(a_i, a_{-i}) \in A$ according to Definition 2. This amounts to saying that there is a one-to-one correspondence between the actual outcome of the game (which is what players care about) and action profiles. Thus, we may as well imagine that action profiles are the relevant consequences: $C = A$.

By a process of elimination, acts must be actions: "$F = A_i$". Formally, action $a_i \in A_i$ defines an act $f_{a_i} : A_{-i} \to A_i \times A_{-i}$ (i.e. $f_{a_i} : \Omega \to C$) by $f_{a_i}(a_{-i}) = (a_i, a_{-i})$.

However, the definition of a game incorporates additional information: specifically, it includes a utility function for each player. This implies that we have a complete description of players' preferences among consequences. However, in view of the representation result cited above, this is only one part of the description of players' preferences among acts, because the probabilities are missing.
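The construction of the act $f_{a_i}$ is mechanical enough to spell out in code. A minimal sketch (the action labels T, B, L, R are our illustrative assumptions) for a two-player game, from Player 1's viewpoint:

```python
# Player 1's state space is Player 2's action set; the act induced by an
# action a1 maps each state a2 to the full profile (a1, a2), i.e.
# f_{a1}(a2) = (a1, a2).  Labels are illustrative, not from the lecture.
A1 = ["T", "B"]   # Player 1's actions
A2 = ["L", "R"]   # Player 2's actions = Player 1's states Omega

def act_of(a1):
    # f_{a1} : A2 -> A1 x A2, represented as a dict state -> consequence
    return {a2: (a1, a2) for a2 in A2}

f_T = act_of("T")
print(f_T)  # {'L': ('T', 'L'), 'R': ('T', 'R')}
```

Note that the act is a genuine function of the state: Player 1 controls only the first coordinate of the consequence, which is precisely why beliefs over $A_{-i}$ are the missing ingredient.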
This trivial observation is actually crucial: in some sense, the traditional role of game theory has been that of specifying those probabilities in a manner consistent with assumptions or intuitions concerning strategic rationality.

Nash Equilibrium

To make this more concrete, recall what you may have heard about Nash equilibrium: a player's strategy doubles as her opponents' beliefs about her strategy. Thus, in a 2-player game, a Nash equilibrium is a specification of beliefs $(\alpha_1, \alpha_2)$ for both players, with the property that if Player $i$'s belief assigns positive probability to an action $a_j \in A_j$ ($j \neq i$), then the act $f_{a_j}$ is preferred by $j$ to any other act, given that $j$'s preferences are represented by the utility function $u_j : A \to \mathbb{R}$ and the probability measure $\alpha_i \in \Delta(A_i)$ (note the indices!). In short, we say "$a_j$ is a best reply to $\alpha_i$".

Note: at first sight, Nash equilibrium seems to say that players' beliefs are required to be "consistent with rationality". However, note that, since the rationality of an act can only be assessed with respect to a given (utility function and) probability measure over states of the world, the definition really says that players' beliefs are required to be "consistent with rationality, given the equilibrium beliefs." We shall be more specific about this point in a subsequent lecture.

Also, note that this interpretation of Nash equilibrium has no behavioral content: it is simply a restriction on beliefs. If we really wish to push this to the extreme, we can say that, if players' beliefs satisfy the restrictions in the definition of a Nash equilibrium, then the players may choose an action that is a best reply to those beliefs. Again, we will return to this point later in the course.
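The best-reply restriction on beliefs can be checked mechanically. A minimal sketch (using matching pennies as the example; the payoff matrix and the candidate belief are our illustrative choices) verifies that under the uniform belief every action in the support is indeed a best reply:

```python
# Matching pennies payoffs for Player 1: u1[a1][a2].
# We check the Nash restriction: actions receiving positive probability
# must be best replies to the belief about the opponent.
u1 = [[1, -1], [-1, 1]]

def expected_payoffs(payoff, belief):
    # Expected payoff of each own action against a belief over the
    # opponent's actions (the act-evaluation from the decision problem).
    return [sum(belief[a2] * payoff[a1][a2] for a2 in range(len(belief)))
            for a1 in range(len(payoff))]

belief_about_2 = [0.5, 0.5]   # candidate equilibrium belief alpha_2
payoffs = expected_payoffs(u1, belief_about_2)
best = max(payoffs)
best_replies = [a1 for a1, p in enumerate(payoffs) if p == best]
print(best_replies)  # [0, 1]: both actions are best replies
```

Since both actions are best replies to the uniform belief, any mixture over them, in particular $(1/2, 1/2)$, is consistent with the equilibrium restriction; this is the belief-based reading of the mixed equilibrium of matching pennies.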
Zero-Sum Games and the Minmax Solution

Game theorists were originally interested in a much simpler class of games, namely two-player games in which interests were diametrically opposed:

Definition 3 A two-player simultaneous game $\{\{1,2\}, (A_i, u_i)_{i \in \{1,2\}}\}$ is zero-sum iff, for all $a \in A$, $u_1(a) + u_2(a) = 0$.

To put this in perspective, think about board games such as chess or checkers, or war. In the latter case, it appears that doing one's best in the worst-case scenario might be the "right" thing to do. I am told this is what they teach you in military academy, anyway. In a zero-sum game, the worst-case scenario for Player 1 is also the best scenario for Player 2, and vice versa. Thus, for Player 1 to assume that Player 2 is "out to get her" is not inconceivable; after all, Player 2 has all the right incentives to do so.

This idea leads to the notion of maxmin and minmax strategies. Suppose first that both players can randomize (commit to spinning roulette wheels or something like that) so as to choose strategies according to any pre-specified probability $\alpha_i \in \Delta(A_i)$. We will get back to this later in the course: for now, let's just assume that this is possible; the reason why this is convenient will become clear momentarily. Again using a conventional notation, let

\[ u_1(\alpha_1, \alpha_2) := \sum_{(a_1, a_2) \in A} u_1(a_1, a_2)\,\alpha_1(a_1)\,\alpha_2(a_2) \]

for any $(\alpha_1, \alpha_2) \in \Delta(A_1) \times \Delta(A_2)$.

Consider first a pair of mixed strategies $(\alpha_1^1, \alpha_2^1)$, with $\alpha_i^1 \in \Delta(A_i)$ for $i = 1, 2$, such that

\[ u_1(\alpha_1^1, \alpha_2^1) = \min_{\alpha_2 \in \Delta(A_2)} \max_{\alpha_1 \in \Delta(A_1)} u_1(\alpha_1, \alpha_2). \]

In words, $u_1(\alpha_1^1, \alpha_2^1)$ is the minimum payoff that Player 1 can secure; equivalently, it is the minimum payoff to Player 1 when Player 2 attempts to damage her as much as possible, while taking into account the fact that Player 1 best-responds. Note that, since $\max f = -\min(-f)$,

\[ u_2(\alpha_1^1, \alpha_2^1) = -u_1(\alpha_1^1, \alpha_2^1) = \max_{\alpha_2 \in \Delta(A_2)} \min_{\alpha_1 \in \Delta(A_1)} u_2(\alpha_1, \alpha_2), \]

i.e. $(\alpha_1^1, \alpha_2^1)$ also yields Player 2 the maximum payoff he can obtain if Player 1 chooses her mixed strategy so as to hold Player 2's payoff down. Such a pair of mixed strategies can always be found (why?).
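The bilinear payoff $u_1(\alpha_1, \alpha_2)$ and the zero-sum identity $u_2 = -u_1$ are easy to compute directly. A minimal sketch (matching pennies payoffs and the particular mixtures are our illustrative choices):

```python
# Expected payoff under a pair of mixed strategies:
#   u1(alpha1, alpha2) = sum over (a1, a2) of alpha1(a1)*alpha2(a2)*u1(a1, a2).
# Payoff matrix and mixtures below are illustrative assumptions.
u1 = [[1, -1], [-1, 1]]   # matching pennies, Player 1's payoffs

def mixed_payoff(u, alpha1, alpha2):
    return sum(alpha1[i] * alpha2[j] * u[i][j]
               for i in range(len(alpha1)) for j in range(len(alpha2)))

alpha1, alpha2 = [0.75, 0.25], [0.75, 0.25]
v1 = mixed_payoff(u1, alpha1, alpha2)
u2 = [[-x for x in row] for row in u1]   # zero-sum: u2 = -u1
print(v1, mixed_payoff(u2, alpha1, alpha2))  # 0.25 -0.25
```

At every pair of mixtures the two payoffs sum to zero, which is the extension of Definition 3 from action profiles to mixed-strategy profiles.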
Similarly, one can consider the symmetric problem for Player 2, and define a new pair $(\alpha_1^2, \alpha_2^2)$ such that

\[ u_2(\alpha_1^2, \alpha_2^2) = \min_{\alpha_1 \in \Delta(A_1)} \max_{\alpha_2 \in \Delta(A_2)} u_2(\alpha_1, \alpha_2), \]

so $u_2(\alpha_1^2, \alpha_2^2)$ is the minimum payoff that Player 2 can secure; as above,

\[ u_1(\alpha_1^2, \alpha_2^2) = \max_{\alpha_1 \in \Delta(A_1)} \min_{\alpha_2 \in \Delta(A_2)} u_1(\alpha_1, \alpha_2), \]

the maximum payoff that Player 1 can obtain if Player 2 attempts to hold Player 1's payoff down.

It may be helpful to think about this as a "pseudo-dynamic" story: in order to define $(\alpha_1^1, \alpha_2^1)$, one imagines that Player 2 chooses first and Player 1 moves next; the latter best-responds to Player 2's choice, and the former, anticipating 1's best response, attempts to minimize Player 1's payoff. On the other hand, to define $(\alpha_1^2, \alpha_2^2)$, one imagines that Player 1 chooses first and Player 2 moves next; the latter observes 1's choice and attempts to minimize 1's payoff, while the former, anticipating this, attempts to maximize her own payoff. [Of course there is a dual to this story, with 1's and 2's roles inverted.]

It is natural to ask whether Player 1's resulting payoff under the two definitions is the same. The answer is affirmative:

Theorem 0.1 (a.k.a. "the minmax theorem") There exists a pair $(\alpha_1, \alpha_2) \in \Delta(A_1) \times \Delta(A_2)$ such that

\[ u_1(\alpha_1, \alpha_2) = \max_{\alpha_1 \in \Delta(A_1)} \min_{\alpha_2 \in \Delta(A_2)} u_1(\alpha_1, \alpha_2) = \min_{\alpha_2 \in \Delta(A_2)} \max_{\alpha_1 \in \Delta(A_1)} u_1(\alpha_1, \alpha_2). \]

The easiest proof of this theorem uses linear programming. Let $n_i$ be the cardinality of $A_i$ and number its elements from 1 to $n_i$; define the $n_1 \times n_2$ matrix $U$ by letting $U_{k\ell} = u_1(a_1^k, a_2^\ell)$, where $a_1^k$ and $a_2^\ell$ are the $k$-th and $\ell$-th elements of $A_1$ and $A_2$, respectively. Then Player 1's (resp. 2's) beliefs about Player 2's (resp. 1's) choices may be represented as nonnegative vectors $\alpha_2 \in \mathbb{R}^{n_2}$ (resp. $\alpha_1 \in \mathbb{R}^{n_1}$) whose elements add up to one.

Consider the following problem:

\[ \min_{\alpha_2,\, u}\ u \quad \text{s.to} \quad U\alpha_2 - \mathbf{1}u \leq 0, \quad \mathbf{1}'\alpha_2 = 1, \quad \alpha_2 \geq 0, \]

which we can interpret as follows: Player 2 chooses $\alpha_2$ so that every strategy of Player 1 yields $u$ or less; indeed, Player 2 attempts to make $u$ as small as possible. Note that $u$ is unconstrained in sign.
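For the special case where Player 2 has only two actions, the primal can be solved without an LP solver: her mixture is $(p, 1-p)$, each row of $U$ induces a line in $p$, and minimizing the upper envelope of these lines over $[0,1]$ reduces to checking the endpoints and the pairwise intersections. A minimal pure-Python sketch (the function name and example matrices are ours):

```python
# Primal LP specialized to a 2-column payoff matrix U (u1's payoffs):
# Player 2 mixes (p, 1-p); row k of U gives Player 1 the linear payoff
#   g_k(p) = U[k][0]*p + U[k][1]*(1-p),
# and Player 2 minimizes the upper envelope max_k g_k(p) over p in [0, 1].
def minmax_value(U):
    def envelope(p):
        return max(row[0] * p + row[1] * (1 - p) for row in U)
    # The minimum of a maximum of linear functions on [0, 1] is attained
    # at an endpoint or where two of the lines cross.
    candidates = {0.0, 1.0}
    for i in range(len(U)):
        for j in range(i + 1, len(U)):
            slope_gap = (U[i][0] - U[i][1]) - (U[j][0] - U[j][1])
            if slope_gap != 0:
                p = (U[j][1] - U[i][1]) / slope_gap
                if 0 <= p <= 1:
                    candidates.add(p)
    return min(envelope(p) for p in candidates)

# Matching pennies: Player 2 can hold Player 1 to exactly 0.
print(minmax_value([[1, -1], [-1, 1]]))  # 0.0
```

This is only a sanity check on the interpretation of the primal, not the general algorithm; with more than two columns one genuinely needs an LP solver.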
The dual program is

\[ \max_{\alpha_1,\, v}\ v \quad \text{s.to} \quad \alpha_1' U - v\mathbf{1}' \geq 0, \quad \alpha_1'\mathbf{1} = 1, \quad \alpha_1 \geq 0, \]

i.e. Player 1 chooses $\alpha_1$ so that each of Player 2's strategies yields at least $v$ to Player 1 (think about it!), i.e. at most $-v$ to Player 2. Again, $v$ is unconstrained in sign.

Note that both programs are feasible (easy); also, both programs are bounded, because, say, in the dual program, $v$ cannot be larger than the largest entry of the matrix $U$ (which, if you think about it, is also obvious!). But then, by the existence-duality theorem of LP, both programs have optimal solutions, the optimal values are the same (so $u = v$), and the optimal solutions are complementary. In particular, this means that $\alpha_1(a_1) > 0$ implies $u_1(a_1, \alpha_2) = u$, and similarly for $\alpha_2$. Thus, only actions which achieve the maximal payoff $u$ against $\alpha_2$ receive positive probability; in other words, $\alpha_1$ only assigns positive probability to best responses (and similarly for $\alpha_2$). Hence, the optimal solutions of the dual LPs identify a Nash equilibrium; it is then easy to see that any Nash equilibrium satisfies the equality in Theorem 0.1 (see Proposition 22.2 in OR if you don't see this), and we are done.
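The conclusion that $u = v$ can also be checked numerically, without any LP machinery, by brute force over a grid of mixed strategies. A rough sketch for matching pennies (the grid-search approach and all names are our illustration, not the LP proof above):

```python
# Numerical check of the minmax theorem for matching pennies:
#   max_{alpha1} min_{alpha2} u1  ==  min_{alpha2} max_{alpha1} u1,
# approximated over a grid of mixtures (p, 1-p) and (q, 1-q).
U = [[1, -1], [-1, 1]]   # u1; Player 1 picks the row, Player 2 the column

grid = [k / 100 for k in range(101)]

def mixed_u1(p, q):
    # u1 when Player 1 mixes (p, 1-p) and Player 2 mixes (q, 1-q)
    return (p * q * U[0][0] + p * (1 - q) * U[0][1]
            + (1 - p) * q * U[1][0] + (1 - p) * (1 - q) * U[1][1])

maxmin = max(min(mixed_u1(p, q) for q in grid) for p in grid)
minmax = min(max(mixed_u1(p, q) for p in grid) for q in grid)
print(maxmin, minmax)  # both are ~0, the value of matching pennies
```

The two numbers agree (up to floating-point noise) at the value of the game, $0$, attained at the uniform mixtures; for generic matrices the grid only approximates the value, which is another reason the LP argument is the right general tool.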