An Introduction to Mathematical Analysis in Economics$^1$

Dean Corbae and Juraj Zeman

December 2002

$^1$ Still Preliminary. Not to be photocopied or distributed without permission of the authors.

Contents

1 Introduction
  1.1 Rules of logic
  1.2 Taxonomy of Proofs
  1.3 Bibliography for Chapter 1
2 Set Theory
  2.1 Set Operations
    2.1.1 Algebraic properties of set operations
  2.2 Cartesian Products
  2.3 Relations
    2.3.1 Equivalence relations
    2.3.2 Order relations
  2.4 Correspondences and Functions
    2.4.1 Restrictions and extensions
    2.4.2 Composition of functions
    2.4.3 Injections and inverses
    2.4.4 Surjections and bijections
  2.5 Finite and Infinite Sets
  2.6 Algebras of Sets
  2.7 Bibliography for Chapter 2
  2.8 End of Chapter Problems
3 The Space of Real Numbers
  3.1 The Field Axioms
  3.2 The Order Axioms
  3.3 The Completeness Axiom
  3.4 Open and Closed Sets
  3.5 Borel Sets
  3.6 Bibliography for Chapter 3
  3.7 End of Chapter Problems
4 Metric Spaces
  4.1 Convergence
    4.1.1 Convergence of functions
  4.2 Completeness
    4.2.1 Completion of a metric space
  4.3 Compactness
  4.4 Connectedness
  4.5 Normed Vector Spaces
    4.5.1 Convex sets
    4.5.2 A finite dimensional vector space: $\mathbb{R}^n$
    4.5.3 Series
    4.5.4 An infinite dimensional vector space: $\ell_p$
  4.6 Continuous Functions
    4.6.1 Intermediate value theorem
    4.6.2 Extreme value theorem
    4.6.3 Uniform continuity
  4.7 Hemicontinuous Correspondences
    4.7.1 Theorem of the Maximum
  4.8 Fixed Points and Contraction Mappings
    4.8.1 Fixed points of functions
    4.8.2 Contractions
    4.8.3 Fixed points of correspondences
  4.9 Appendix - Proofs in Chapter 4
  4.10 Bibliography for Chapter 4
  4.11 End of Chapter Problems
5 Measure Spaces
  5.1 Lebesgue Measure
    5.1.1 Outer measure
    5.1.2 L-measurable sets
    5.1.3 Lebesgue meets Borel
    5.1.4 L-measurable mappings
  5.2 Lebesgue Integration
    5.2.1 Riemann integrals
    5.2.2 Lebesgue integrals
  5.3 General Measure
    5.3.1 Signed Measures
  5.4 Examples Using Measure Theory
    5.4.1 Probability Spaces
    5.4.2 $L_1$
  5.5 Appendix - Proofs in Chapter 5
  5.6 Bibliography for Chapter 5
6 Function Spaces
  6.1 The set of bounded continuous functions
    6.1.1 Completeness
    6.1.2 Compactness
    6.1.3 Approximation
    6.1.4 Separability of $C(X)$
    6.1.5 Fixed point theorems
  6.2 Classical Banach spaces: $L_p$
    6.2.1 Additional Topics in $L_p(X)$
    6.2.2 Hilbert Spaces ($L_2(X)$)
  6.3 Linear operators
  6.4 Linear Functionals
    6.4.1 Dual spaces
    6.4.2 Second Dual Space
  6.5 Separation Results
    6.5.1 Existence of equilibrium
  6.6 Optimization of Nonlinear Operators
    6.6.1 Variational methods on infinite dimensional vector spaces
    6.6.2 Dynamic Programming
  6.7 Appendix - Proofs for Chapter 6
  6.8 Bibliography for Chapter 6
7 Topological Spaces
  7.1 Continuous Functions and Homeomorphisms
  7.2 Separation Axioms
  7.3 Convergence and Completeness

Acknowledgements

To my family: those who put up with me in the past - Jo and Phil - and especially those who put up with me in the present - Margaret, Bethany, Paul, and Elena. D.C.

To my family. J.Z.

Preface

The objective of this book is to provide a simple introduction to mathematical analysis with applications in economics. There is increasing use of real and functional analysis in economics, but few books cover that material at an elementary level. Our rationale for writing this book is to bridge the gap between basic mathematical economics books (which deal with introductory calculus and linear algebra) and advanced economics books such as Stokey and Lucas' Recursive Methods in Economic Dynamics that presume a working knowledge of functional analysis. The major innovations in this book relative to classic mathematics books in this area (such as Royden's Real Analysis or Munkres'
Topology) are that we provide: (i) extensive simple examples (we believe strongly that examples provide the intuition necessary to grasp difficult ideas); (ii) sketches of complicated proofs (followed by the complete proof at the end of the book); and (iii) only material that is relevant to economists (which means we drop some material and add other topics; e.g. we focus extensively on set valued mappings instead of just point valued ones). It is important to emphasize that while we aim to make this material as accessible as possible, we have not excluded demanding mathematical concepts used by economists and that the book is self-contained (i.e. virtually any theorem used in proving a given result is itself proven in our book).

Road Map

Chapter 1 is a brief introduction to logical reasoning and how to construct direct versus indirect proofs. Proving the truth of the compound statement "If $A$, then $B$" captures the essence of mathematical reasoning; we take the truth of statement "$A$" as given and then establish logically that the truth of statement "$B$" follows. We do so by introducing logical connectives and the idea of a truth table.

We introduce set operations, relations, functions and correspondences in Chapter 2. Then we study the "size" of sets and show the differences between countable and uncountable infinite sets. Finally, we introduce the notion of an algebra (just a collection of sets that satisfy certain properties) and "generate" (i.e. establish that there always exists) a smallest collection of subsets of a given set where all results of set operations (like complements, union, and intersection) remain in the collection.

Chapter 3 focuses on the set of real numbers (denoted $\mathbb{R}$), which is one of the simplest but most economic (both literally and figuratively) sets to introduce students to the ideas of algebraic, order, and completeness properties. Here we expose students to the most elementary notions of distance, openness and closedness, boundedness, and simple facts such as that between any two real numbers there is another real number. One critical result we prove is the Bolzano-Weierstrass Theorem, which says that every bounded infinite subset of $\mathbb{R}$ has a point with infinitely many points of the subset in any open interval around it. This result has important implications for issues like convergence of a sequence of points, which is introduced in more general metric spaces. We end by generating the smallest $\sigma$-algebra containing all open sets in $\mathbb{R}$, known as the Borel ($\sigma$-)algebra.

In Chapter 4 we introduce sequences and the notions of convergence, completeness, compactness, and connectedness in general metric spaces, where we augment an arbitrary set with an abstract notion of a "distance" function. Understanding these "C" properties is absolutely essential for economists. For instance, the completeness of a metric space is a very important property for problem solving. In particular, one can construct a sequence of approximate solutions that get closer and closer together and, provided the space is complete, the limit of this sequence exists and is the solution of the original problem. We also present properties of normed vector spaces and study two important examples, both of which are used extensively in economics: finite dimensional Euclidean space (denoted $\mathbb{R}^n$) and the space of (infinite dimensional) sequences (denoted $\ell_p$). Then we study continuity of functions and hemicontinuity of correspondences.
Particular attention is paid to the properties of a continuous function on a connected domain (a generalization of the Intermediate Value Theorem) as well as a continuous function on a compact domain (a generalization of the Extreme Value Theorem). We end by providing fixed point theorems for functions and correspondences that are useful in proving, for instance, the existence of general equilibrium with competitive markets or a Nash Equilibrium of a noncooperative game.

Chapter 5 focuses primarily on Lebesgue measure and integration since almost all applications that economists study are covered by this case and because it is easy to conceptualize the notion of distance through that of the restriction of an outer measure. We show that the collection of Lebesgue measurable sets is a $\sigma$-algebra and that the collection of Borel sets is a subset of the Lebesgue measurable sets. Then we provide a set of convergence theorems for the existence of a Lebesgue integral which are applicable under a wide variety of conditions. Next we introduce general and signed measures, where we show that a signed measure can be represented simply by an integral (the Radon-Nikodym Theorem). To prepare for the following chapter, we end by studying a simple function space (the space of integrable functions) and prove it is complete.

We study properties such as completeness and compactness in two important function spaces in Chapter 6: the space of bounded continuous functions (denoted $C(X)$) and the space of $p$-integrable functions (denoted $L_p(X)$). A fundamental result on approximating continuous functions in $C(X)$ is given in a very general set of Theorems by Stone and Weierstrass. Also, the Brouwer Fixed Point Theorem of Chapter 4 on finite dimensional spaces is generalized to infinite dimensional spaces in the Schauder Fixed Point Theorem. Moving on to the $L_p(X)$ space, we show that it is complete in the Riesz-Fischer Theorem.
Then we introduce linear operators and functionals, as well as the notion of a dual space. We show that one can construct bounded linear functionals on a given set $X$ in the Hahn-Banach Theorem, which is used to prove certain separation results such as the fact that two disjoint convex sets can be separated by a linear functional. Such results are used extensively in economics; for instance, they are employed to establish the Second Welfare Theorem. The chapter ends with nonlinear operators and focuses particularly on optimization in infinite dimensional spaces. First we introduce the weak topology on a normed vector space and develop a variational method of optimizing nonlinear functions. Then we consider another method of finding the optimum of a nonlinear functional by dynamic programming.

Chapter 7 provides a brief overview of general topological spaces and the idea of a homeomorphism (i.e. when two topological spaces $X$ and $Y$ have "similar topological structure", which occurs when there is a one-to-one and onto mapping $f$ from elements in $X$ to elements in $Y$ such that both $f$ and its inverse are continuous). We then compare and contrast topological and metric properties, as well as touch upon the metrizability problem (i.e. finding conditions on a topological space $X$ which guarantee that there exists a metric on the set $X$ that induces the topology of $X$).

Uses of the book

We taught this manuscript in the first year PhD core sequence at the University of Pittsburgh and as a PhD class at the University of Texas. The program at the University of Pittsburgh begins with an intensive, one month remedial summer math class that focuses on calculus and linear algebra. Our manuscript was used in the Fall semester class. Since we were able to quickly explain theorems using sketches of proofs, it was possible to teach the entire book in one semester. If the book were used for upper level undergraduates, we would suggest teaching only Chapters 1 to 4.
While we used the manuscript in a classroom, we expect it will be beneficial to researchers; for instance, anyone who reads a book like Stokey and Lucas' Recursive Methods must understand the background concepts in our manuscript. In fact, it was because one of the authors found that his students were ill prepared to understand Stokey and Lucas in his upper level macroeconomics class that this project began.

Chapter 1 Introduction

In this chapter we hope to introduce students to applying logical reasoning to prove the validity of economic conclusions ($B$) from well-defined premises ($A$). For example, $A$ may be the statement "An allocation-price pair $(x,p)$ is a Walrasian equilibrium" and $B$ the statement "the allocation $x$ is Pareto efficient". In general, statements such as $A$ and/or $B$ may be true or false.

1.1 Rules of logic

In many cases, we will be interested in establishing the truth of statements of the form "If $A$, then $B$." Equivalently, such a statement can be written as: "$A \Rightarrow B$"; "$A$ implies $B$"; "$A$ only if $B$"; "$A$ is sufficient for $B$"; or "$B$ is necessary for $A$." Applied to the example given in the previous paragraph, "If $A$, then $B$" is just a statement of the First Fundamental Theorem of Welfare Economics. In other cases, we will be interested in the truth of statements of the form "$A$ if and only if $B$." Equivalently, such a statement can be written: "$A \Rightarrow B$ and $B \Rightarrow A$", which is just "$A \Leftrightarrow B$"; "$A$ implies $B$ and $B$ implies $A$"; "$A$ is necessary and sufficient for $B$"; or "$A$ is equivalent to $B$."

Notice that a statement of the form "$A \Rightarrow B$" is simply a construct of two simple statements connected by "$\Rightarrow$". Proving the truth of the statement "$A \Rightarrow B$" captures the essence of mathematical reasoning; we take the truth of $A$ as given and then establish logically that the truth of $B$ follows. Before actually setting out on that path, let us define a few terms. A Theorem or Proposition is a statement that we prove to be true. A Lemma is a theorem we use to prove another theorem.
A Corollary is a theorem whose proof is obvious from the previous theorem. A Definition is a statement that is true by interpreting one of its terms in such a way as to make the statement true. An Axiom or Assumption is a statement that is taken to be true without proof. A Tautology is a statement which is true without assumptions (for example, $x = x$). A Contradiction is a statement that cannot be true (for example, $A$ is true and $A$ is false).

There are other important logical connectives for statements besides "$\Rightarrow$" and "$\Leftrightarrow$": "$\wedge$" means "and"; "$\vee$" means "or"; and "$\sim$" means "not". The meaning of these connectives is given by a truth table, where "T" stands for a true statement and "F" stands for a false statement. One can consider the truth table as an Axiom.

Table 1.1
\[
\begin{array}{cc|ccccc}
A & B & \sim A & A \wedge B & A \vee B & A \Rightarrow B & A \Leftrightarrow B \\ \hline
T & T & F & T & T & T & T \\
T & F & F & F & T & F & F \\
F & T & T & F & T & T & F \\
F & F & T & F & F & T & T
\end{array}
\]

To read the truth table, consider row two where $A$ is true and $B$ is false. Then $\sim A$ is false since $A$ is true, $A \wedge B$ is false since $B$ is, $A \vee B$ is true since at least one statement ($A$) is true, and $A \Rightarrow B$ is false since $A$ can't imply $B$ when $A$ is true and $B$ isn't. Notice that if $A$ is false, then $A \Rightarrow B$ is always true since $B$ can be anything.

Manipulating these connectives, we can prove some useful tautologies. The first set of tautologies are the commutative, associative, and distributive laws. To prove these tautologies, one can simply generate the appropriate truth table. For example, the truth table to prove $(A \vee (B \wedge C)) \Leftrightarrow ((A \vee B) \wedge (A \vee C))$ is:
\[
\begin{array}{ccc|ccccc}
A & B & C & B \wedge C & A \vee (B \wedge C) & A \vee B & A \vee C & (A \vee B) \wedge (A \vee C) \\ \hline
T & T & T & T & T & T & T & T \\
T & T & F & F & T & T & T & T \\
T & F & T & F & T & T & T & T \\
T & F & F & F & T & T & T & T \\
F & T & T & T & T & T & T & T \\
F & T & F & F & F & T & F & F \\
F & F & T & F & F & F & T & F \\
F & F & F & F & F & F & F & F
\end{array}
\]
Since in every case in which $A \vee (B \wedge C)$ is true (or false), so is $(A \vee B) \wedge (A \vee C)$, the two statements are equivalent.

Theorem 1 Let $A$, $B$, and $C$ be any statements. Then
\[ (A \vee B) \Leftrightarrow (B \vee A) \text{ and } (A \wedge B) \Leftrightarrow (B \wedge A) \tag{1.1} \]
\[ ((A \vee B) \vee C) \Leftrightarrow (A \vee (B \vee C)) \text{ and } ((A \wedge B) \wedge C) \Leftrightarrow (A \wedge (B \wedge C)) \tag{1.2} \]
\[ (A \vee (B \wedge C)) \Leftrightarrow ((A \vee B) \wedge (A \vee C)) \text{ and } (A \wedge (B \vee C)) \Leftrightarrow ((A \wedge B) \vee (A \wedge C)) \tag{1.3} \]

Exercise 1.1.1 Complete the proof of Theorem 1.

The next set of results form the basis of the methods of logical reasoning we will be pursuing in this book. The first (direct) approach (1.4) is the syllogism, which says that "if $A$ is true and $A$ implies $B$, then $B$ is true". The second (indirect) approach (1.5) is the contradiction, which says in words that "if not $A$ leads to a false statement of the form $B$ and not $B$, then $A$ is true". That is, one way to prove $A$ is to hypothesize $\sim A$, and show this leads to a contradiction. Another (indirect) approach (1.6) is the contrapositive, which says that "$A$ implies $B$ is the same as whenever $B$ is false, $A$ is false".

Theorem 2
\[ (A \wedge (A \Rightarrow B)) \Rightarrow B \tag{1.4} \]
\[ ((\sim A) \Rightarrow (B \wedge (\sim B))) \Rightarrow A \tag{1.5} \]
\[ (A \Rightarrow B) \Leftrightarrow ((\sim B) \Rightarrow (\sim A)). \tag{1.6} \]

Proof. Before proceeding, we need a few results (we could have established these in the form of a lemma, but we're just starting here). The first result$^1$ we need is that
\[ (A \Rightarrow B) \Leftrightarrow ((\sim A) \vee B) \tag{1.7} \]
and the second is
\[ \sim(\sim A) \Leftrightarrow A. \tag{1.8} \]
In the case of (1.4),
\[ (A \wedge (A \Rightarrow B)) \overset{(1.7)}{\Leftrightarrow} (A \wedge ((\sim A) \vee B)) \overset{(1.3)}{\Leftrightarrow} ((A \wedge (\sim A)) \vee (A \wedge B)) \Rightarrow B \]
by Table 1.1. In the case of (1.5),
\[ ((\sim A) \Rightarrow (B \wedge (\sim B))) \overset{(1.7),(1.8)}{\Leftrightarrow} (A \vee (B \wedge (\sim B))) \Rightarrow A \]
by Table 1.1. In the case of (1.6),
\[ (A \Rightarrow B) \overset{(1.7)}{\Leftrightarrow} ((\sim A) \vee B) \overset{(1.1)}{\Leftrightarrow} (B \vee (\sim A)) \overset{(1.8)}{\Leftrightarrow} (\sim(\sim B) \vee (\sim A)) \overset{(1.7)}{\Leftrightarrow} ((\sim B) \Rightarrow (\sim A)). \]

$^1$ The result follows from
\[
\begin{array}{cc|cc}
A & B & A \Rightarrow B & (\sim A) \vee B \\ \hline
T & T & T & T \\
T & F & F & F \\
F & T & T & T \\
F & F & T & T
\end{array}
\]

Note that the contrapositive of "$A \Rightarrow B$" is not the same as the converse of "$A \Rightarrow B$", which is "$B \Rightarrow A$".

Another important way to "construct" complicated statements from simple ones is by the use of quantifiers. In particular, a quantifier allows a statement $A(x)$ to vary across elements $x$ in some universe $U$. For example, $x$ could be a price (whose universe is the set of positive numbers) with the property that demand equals supply. When there is an $x$ with the property $A(x)$, we write $(\exists x)A(x)$ to mean that for some $x$ in $U$, $A(x)$ is true.$^2$ In the context of the previous example, this establishes that there exists an equilibrium price.
When all $x$ have the property $A(x)$, we write $(\forall x)A(x)$ to mean that for all $x$, $A(x)$ is true.$^3$ There are obvious relations between "$\forall$" and "$\exists$". In particular
\[ \sim((\exists x)A(x)) \Leftrightarrow (\forall x)(\sim A(x)) \tag{1.9} \]
\[ \sim((\forall x)A(x)) \Leftrightarrow (\exists x)(\sim A(x)). \tag{1.10} \]

$^2$ Thus, we let "$\exists$" denote "for some" or "there exists a".
$^3$ Thus, we let "$\forall$" denote "for all".

The second tautology is important since it illustrates the concept of a counterexample. In particular, (1.10) states "If it is not true that $A(x)$ is true for all $x$, then there must exist a counterexample (that is, an $x$ satisfying $\sim A(x)$), and vice versa". Counterexamples are an important tool, since while hundreds of examples do not make a theorem, a single counterexample kills one.

One should also note that the symmetry we experienced with "$\vee$" and "$\wedge$" in (1.1) to (1.3) may break down with quantifiers. Thus while
\[ (\exists x)(A(x) \vee B(x)) \Leftrightarrow ((\exists x)A(x) \vee (\exists x)B(x)) \tag{1.11} \]
can be expressed as a tautology (i.e. "$\Leftrightarrow$"), it's the case that
\[ (\exists x)(A(x) \wedge B(x)) \Rightarrow ((\exists x)A(x) \wedge (\exists x)B(x)) \tag{1.12} \]
cannot be expressed that way (i.e. it is only "$\Rightarrow$"). To see why (1.12) cannot hold as an "if and only if" statement, suppose $x$ ranges over the set of countries in the world, $A(x)$ is the property that $x$ has above average gross domestic product and $B(x)$ is the property that $x$ has below average gross domestic product. Then there will be at least one country above the mean and at least one country below the mean (i.e. $((\exists x)A(x) \wedge (\exists x)B(x))$ is true), but clearly there cannot be a country that is both above and below the mean (i.e. $(\exists x)(A(x) \wedge B(x))$ is false).

We can make increasingly complex statements by adding more variables (e.g. the statement $A(x,y)$ can vary across elements $x$ and $y$ in some universe $\widetilde{U}$). For instance, when $A(x,y)$ states that "$y$ is larger than $x$" where $x$ and $y$ are in the universe of real numbers, the statement $(\forall x)(\exists y)(x<y)$ says "for every $x$ there is a $y$ that is larger than $x$", while the statement $(\exists y)(\forall x)(x<y)$ says "there is a $y$ which is larger than every $x$".
Note, however, the former statement is true, but the latter is false.

1.2 Taxonomy of Proofs

While the previous section introduced the basics of the rules of logic (how to manipulate connectives and quantifiers to establish the truth of statements), here we will discuss broadly the methodology of proofs you will frequently encounter in economics. The most intuitive is the direct proof in the form of "$A \Rightarrow B$", discussed in (1.4). The work is to fill in the intermediate steps so that $A \Rightarrow A_1$ and $A_1 \Rightarrow A_2$ and ... $A_{n-1} \Rightarrow B$ are all tautologies.

In some cases, it may be simpler to prove a statement like $A \Rightarrow B$ by splitting $B$ into cases. For example, if we wish to prove the uniqueness of the least upper bound of a set $A \subset \mathbb{R}$, we can consider two candidate least upper bounds $x_1$ and $x_2$ in $\mathbb{R}$ and split $B$ into the case where we assume $x_1$ is the least upper bound, implying $x_1 \le x_2$, and another case where we assume $x_2$ is the least upper bound, implying $x_2 \le x_1$. But $(x_1 \le x_2) \wedge (x_2 \le x_1) \Rightarrow (x_1 = x_2)$ so that the least upper bound is unique. In other instances, one might want to split $A$ into cases (call them $A_1$ and $A_2$), show $A \Leftrightarrow (A_1 \vee A_2)$ and then show $A_1 \Rightarrow B$ and $A_2 \Rightarrow B$. For example, to prove $(0 \le x \le 1) \Rightarrow (x^2 \le x)$ we can use the fact that
\[ (0 \le x \le 1) \Leftrightarrow (x = 0 \vee (0 < x \le 1)) \]
where the latter case allows us to consider the truth of $B$ by dividing through by $x$.

Another direct method of proof, called induction, works only for the natural numbers $\mathbb{N} = \{0,1,2,3,\ldots\}$. Suppose we wish to show $(\forall n \in \mathbb{N})A(n)$ is true. This is equivalent to proving $A(0) \wedge (\forall n \in \mathbb{N})(A(n) \Rightarrow A(n+1))$. This works since $A(0)$ is true and $A(0) \Rightarrow A(1)$ and $A(1) \Rightarrow A(2)$ and so on. In the next chapter, after we introduce set theory, we will show why induction works.

As discussed before, two indirect forms of proof are the contrapositive (1.6) and the contradiction (1.5). In the latter case, we use the fact that $\sim(A \Rightarrow B) \Leftrightarrow (A \wedge (\sim B))$ and show $(A \wedge (\sim B))$ leads to a contradiction $(B \wedge (\sim B))$.
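
The tautologies behind these proof methods can also be checked mechanically, in the spirit of the truth tables of Section 1.1. The following Python sketch is an illustration only and is not part of the original text; the helper names `implies` and `is_tautology` are our own hypothetical choices. It enumerates every truth assignment to confirm (1.4), (1.5), (1.6) and the fact $\sim(A \Rightarrow B) \Leftrightarrow (A \wedge (\sim B))$, and it shows that an implication is not equivalent to its converse.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication, matching the A => B column of the truth table."""
    return (not a) or b


def is_tautology(statement, n_vars: int) -> bool:
    """Return True if `statement` is true in every row of its truth table."""
    return all(statement(*row) for row in product([True, False], repeat=n_vars))


def syllogism(a, b):            # (1.4): (A and (A => B)) => B
    return implies(a and implies(a, b), b)


def contradiction(a, b):        # (1.5): (~A => (B and ~B)) => A
    return implies(implies(not a, b and not b), a)


def contrapositive(a, b):       # (1.6): (A => B) <=> (~B => ~A)
    return implies(a, b) == implies(not b, not a)


def negated_implication(a, b):  # ~(A => B) <=> (A and ~B)
    return (not implies(a, b)) == (a and not b)


def converse_equivalent(a, b):  # (A => B) <=> (B => A); this one fails
    return implies(a, b) == implies(b, a)


for check in (syllogism, contradiction, contrapositive,
              negated_implication, converse_equivalent):
    print(check.__name__, is_tautology(check, 2))
```

Because each statement involves only two variables, every check inspects just four rows; the same function handles three-variable laws such as (1.3) by passing `n_vars=3`.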
Since direct proofs seem more natural than indirect proofs, we now give an indirect proof of the First Welfare Theorem, perhaps one of the most important things you will learn in all of economics. It is so simple that it is hard to find a direct counterpart.$^4$

$^4$ See Debreu (1959, p.94).

Definition 3 Given a finite vector of endowments $y$, an allocation $x$ is feasible if for each good $k$,
\[ \sum_i x_{i,k} \le \sum_i y_{i,k} \tag{1.13} \]
where the summation is over all individuals in the economy.

Definition 4 A feasible allocation $x$ is a Pareto efficient allocation if there is no feasible allocation $x'$ such that all agents prefer $x'$ to $x$.

Definition 5 An allocation-price pair $(x,p)$ in a competitive exchange economy is a Walrasian equilibrium if it is feasible and each agent $i$ is maximizing in his budget set; that is, if $x'_i$ is preferred by $i$ to $x_i$, then
\[ \sum_k p_k x'_{i,k} > \sum_k p_k y_{i,k} \tag{1.14} \]
(i.e. $i$'s tastes outweigh his pocketbook).

Theorem 6 (First Fundamental Theorem of Welfare Economics) If $(x,p)$ is a Walrasian equilibrium, then $x$ is Pareto efficient.

Proof. By contradiction. Suppose $x$ is not Pareto efficient. Let $x'$ be a feasible allocation that all agents prefer to $x$. Then by the definition of Walrasian equilibrium, we can sum (1.14) across all individuals to obtain
\[ \sum_i \left( \sum_k p_k x'_{i,k} \right) > \sum_i \left( \sum_k p_k y_{i,k} \right) \Leftrightarrow \sum_k p_k \left( \sum_i x'_{i,k} \right) > \sum_k p_k \left( \sum_i y_{i,k} \right). \tag{1.15} \]
Since $x'$ is a feasible allocation, multiplying (1.13) for $x'$ by the price $p_k$ (taken to be nonnegative) and summing over all goods we have
\[ \sum_k \sum_i p_k x'_{i,k} \le \sum_k \sum_i p_k y_{i,k}. \tag{1.16} \]
But (1.15) and (1.16) imply
\[ \sum_k \sum_i p_k y_{i,k} > \sum_k \sum_i p_k y_{i,k}, \]
which is a contradiction.

Here $B$ is the statement "$x$ is Pareto efficient". So the proof by contradiction assumes $\sim B$, which is "Suppose $x$ is not Pareto efficient". In that case, by Definition 4, there's a preferred allocation $x'$ which is feasible. But if $x'$ is preferred to $x$, then it must cost too much if it wasn't chosen in the first place (this is (1.14)). But this contradicts that $x'$ was feasible.
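
The accounting step at the heart of this proof can be illustrated numerically. The Python sketch below is not from the original text: the two-agent, two-good economy, the prices, and the function names are all hypothetical, and preferences are not modeled at all. It only checks the arithmetic fact the proof relies on, namely that no allocation can satisfy feasibility (1.13) while costing every agent strictly more than his endowment as in (1.14), when prices are nonnegative.

```python
from itertools import product


def is_feasible(x, y):
    """Condition (1.13): for each good k, total allocation <= total endowment."""
    n_goods = len(y[0])
    return all(
        sum(x_i[k] for x_i in x) <= sum(y_i[k] for y_i in y)
        for k in range(n_goods)
    )


def cost(bundle, p):
    """Value of a bundle at prices p: sum over goods of p_k * quantity_k."""
    return sum(p_k * q_k for p_k, q_k in zip(p, bundle))


def all_over_budget(x_new, y, p):
    """Condition (1.14) for every agent: x'_i costs strictly more than y_i."""
    return all(cost(x_i, p) > cost(y_i, p) for x_i, y_i in zip(x_new, y))


# Hypothetical data: 2 agents (rows), 2 goods (columns), nonnegative prices.
p = [1, 2]
y = [[4, 1], [2, 3]]
x_new = [[5, 1], [2, 4]]  # each agent's bundle costs more than his endowment

# Search a small integer grid for an allocation satisfying both conditions.
grid = range(7)
violations = [
    cand
    for cand in product(grid, repeat=4)
    if is_feasible([cand[:2], cand[2:]], y)
    and all_over_budget([cand[:2], cand[2:]], y, p)
]

print(all_over_budget(x_new, y, p), is_feasible(x_new, y), violations)
```

The grid search is only a finite spot check under these assumed numbers; the general statement is what (1.15) and (1.16) establish.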

1.3 Bibliography for Chapter 1

An excellent treatment of this material is in McAffee (1986, Economics 241 handout). See also Munkres (1975, p. 7-9) and Royden (1988, p. 2-3).

Chapter 2 Set Theory

The basic notions of set theory are those of a group of objects and the idea of membership in that group. In what follows, we will fix a given universe (or space) $X$ and consider only sets (or groups) whose elements (or members) are elements of $X$. We can express the notion of membership by "$\in$" so that "$x \in A$" means "$x$ is an element of the set $A$" and "$x \notin A$" means "$x$ is not an element of $A$". Since a set is completely determined by its elements, we usually specify its elements explicitly by saying "The set $A$ is the set of all elements $x$ in $X$ such that each $x$ has the property $A$ (i.e. that $A(x)$ is true)" and write $A = \{x \in X : A(x)\}$.$^1$ This also makes it clear that we identify sets with statements.

$^1$ In those instances where the space is understood, we sometimes abbreviate this as $A = \{x : A(x)\}$.

Example 7 Agent $i$'s budget set, denoted $B_i(p, y_i) = \{x_i \in X : \sum_k p_k x_{i,k} \le \sum_k p_k y_{i,k}\}$, is the set of all consumption bundles that can be purchased with endowments $y_i$.

Definition 8 If each $x \in A$ is also in the set $B$ (i.e. $x \in A \Rightarrow x \in B$), then we say $A$ is a subset of $B$ (denoted $A \subset B$). If $A \subset B$ and $\exists x \in B$ such that $x \notin A$, then $A$ is a proper subset of $B$. If $A \subset B$, then it is equivalent to say that $B$ contains $A$ (denoted $B \supset A$).

Definition 9 A collection is a set whose elements are subsets of $X$. The power set of $X$, denoted $P(X)$, is the set of all possible subsets of $X$ (it has $2^{\#(X)}$ elements, where $\#(X)$ denotes the number of elements (or cardinality) of the set $X$). A family is a set whose elements are collections of subsets of $X$.

Definition 10 Two sets $A$ and $B$ are equal if $(A \subset B) \wedge (B \subset A)$ (denoted $A = B$).

Definition 11 A set that has no elements is called empty (denoted $\emptyset$). Thus, $\emptyset = \{x \in X : A(x) \wedge (\sim A(x))\}$.
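
Definitions 8 through 11 translate almost literally into code. The following Python sketch is an illustration under our own assumptions: the three-element universe, the defining properties, and the function names are hypothetical and not taken from the text. It builds sets from properties, tests the subset and equality definitions, and confirms the $2^{\#(X)}$ count for the power set.

```python
from itertools import combinations


def is_subset(a, b):
    """Definition 8: every element of a is also an element of b."""
    return all(x in b for x in a)


def sets_equal(a, b):
    """Definition 10: a and b are equal if each is a subset of the other."""
    return is_subset(a, b) and is_subset(b, a)


def power_set(universe):
    """Definition 9: the collection of all subsets of `universe`."""
    items = list(universe)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


# A small hypothetical universe and sets defined by properties A(x), B(x).
X = frozenset({"a", "b", "c"})
A = frozenset(x for x in X if x in ("a", "b"))  # {x in X : A(x)}
B = frozenset(x for x in X if x == "c")         # {x in X : B(x)}

# Definition 11: a contradictory property A(x) and ~A(x) yields the empty set.
empty = frozenset(x for x in X if (x == "a") and not (x == "a"))

P = power_set(X)
print(len(P), is_subset(A, X), sets_equal(A, B), empty in P)
```

Using `frozenset` rather than `set` is a design choice that lets subsets themselves be elements of a collection such as `P`, mirroring the distinction between sets and collections in Definition 9.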
The empty set serves the same role in the theory of sets as 0 serves in the counting numbers; it is a placeholder.

Example 12 Let the universe be given by $X = \{a,b,c\}$. We could let $A = \{a,b\}$, $B = \{c\}$ be subsets of $X$; $C = \{A,B\}$, $D = \{\emptyset\}$, $P(X) = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, X\}$ be collections; and $F = \{C\}$ be a family.

The next result provides the first example of the relation between set theory and the logical rules we developed in Chapter 1. In particular, it relates "$\subset$" and "$\Rightarrow$" as well as "$=$" and "$\Leftrightarrow$".

Theorem 13 Let $A = \{x \in X : A(x)\}$ and $B = \{x \in X : B(x)\}$. Then (a) $A \subset B \Leftrightarrow (\forall x \in X)(A(x) \Rightarrow B(x))$ and (b) $A = B \Leftrightarrow (\forall x \in X)(A(x) \Leftrightarrow B(x))$.

Proof. Just use Definition 8 in (a): $A \subset B \Leftrightarrow (x \in A \Rightarrow x \in B) \Leftrightarrow (A(x) \Rightarrow B(x))$, and Definition 10 in (b): $A = B \Leftrightarrow (A \subset B) \wedge (B \subset A) \Leftrightarrow (\forall x \in X)(A(x) \Leftrightarrow B(x))$.

The following are some of the most important sets we will encounter in this book:
• $\mathbb{N} = \{1,2,3,\ldots\}$, the natural or "counting" numbers.
• $\mathbb{Z} = \{\ldots,-2,-1,0,1,2,\ldots\}$, the integers. $\mathbb{Z}_+ = \{0,1,2,\ldots\}$, the non-negative integers.
• $\mathbb{Q} = \{\frac{m}{n} : m,n \in \mathbb{Z}, n \neq 0\}$, the rational numbers.
• Chapter 3 will discuss the real numbers, which we denote $\mathbb{R}$. This set just adds what are called irrational numbers to the above rationals.

There are several important results you will see at the end of this chapter. The first establishes that there are fundamentally different sizes of infinite sets. While some infinite sets can be counted, others are uncountable. These results are summarized in Theorem 71 and Theorem 80. The second result establishes that there always exists a smallest collection of subsets of a given set where all results of set operations (like complements, union, and intersection) remain in the collection (Theorem 87).

2.1 Set Operations

The following operations help us construct new sets from old ones. The first three play the same role for sets as the connectives "$\sim$", "$\wedge$", and "$\vee$" played for statements.

Definition 14 If $A \subset X$, we define the complement of $A$ (relative to $X$) (denoted $A^c$) to be the set of all elements of $X$ that do not belong to $A$. That is, $A^c = \{x \in X : x \notin A\}$.

Definition 15 If $A, B \subset X$, we define their intersection (denoted $A \cap B$) to be the set of all elements that belong to both $A$ and $B$. That is, $A \cap B = \{x \in X : x \in A \wedge x \in B\}$.

Definition 16 If $A, B \subset X$ and $A \cap B = \emptyset$, then we say $A$ and $B$ are disjoint.

Definition 17 If $A, B \subset X$, we define their union (denoted $A \cup B$) to be the set of all elements that belong to $A$ or $B$ or both (i.e. or is inclusive). That is, $A \cup B = \{x \in X : x \in A \vee x \in B\}$.

Definition 18 If $A, B \subset X$, we define their difference (or relative complement of $B$ in $A$) (denoted $A \backslash B$) to be the set of all elements of $A$ that do not belong to $B$. That is, $A \backslash B = \{x \in X : x \in A \wedge x \notin B\}$.

Each of these definitions can be visualized in Figure 2.1.1 through the use of Venn Diagrams. These definitions can easily be extended to arbitrary collections of sets. Let $\Lambda$ be an index set (e.g. $\Lambda = \mathbb{N}$ or a finite subset of $\mathbb{N}$) and let $A_i$, $i \in \Lambda$, be subsets of $X$. Then $\cup_{i \in \Lambda} A_i = \{x \in X : (\exists i)(x \in A_i)\}$. Indexed families of sets will be defined formally after we develop the notion of a function in Section 5.2.

2.1.1 Algebraic properties of set operations

The following commutative, associative, and distributive properties of sets are natural extensions of Theorem 1 and easily seen in Figure 2.1.2.

Theorem 19 Let $A$, $B$, $C$ be any sets. Then (i) (C) $A \cap B = B \cap A$, $A \cup B = B \cup A$; (ii) (A) $(A \cap B) \cap C = A \cap (B \cap C)$, $(A \cup B) \cup C = A \cup (B \cup C)$; and (iii) (D) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.

Exercise 2.1.1 Prove Theorem 19. This amounts to applying the logical connectives and above definitions. Besides using Venn Diagrams, we can just use the definition of $\cap$ and $\cup$. For example, to show $A \cap B = B \cap A$, it is sufficient to note $x \in A \cap B \Leftrightarrow (x \in A) \wedge (x \in B) \overset{(1.1)}{\Leftrightarrow} (x \in B) \wedge (x \in A) \Leftrightarrow x \in B \cap A$.

The following properties are used extensively in probability theory and are easily seen in Figure 2.1.3.

Theorem 20 (DeMorgan's Laws) If $A$, $B$, $C$ are any sets, then (a) $A \backslash (B \cup C) = (A \backslash B) \cap (A \backslash C)$, and (b) $A \backslash (B \cap C) = (A \backslash B) \cup (A \backslash C)$.

Proof. (a) 2 parts. (i, $\subset$) Suppose $x \in A \backslash (B \cup C)$. Then $x \in A$ and $x \notin (B \cup C)$. Thus $x \in A$ and ($x \notin B$ and $x \notin C$). This implies $x \in A \backslash B$ and $x \in A \backslash C$. But this is just $x \in (A \backslash B) \cap (A \backslash C)$. (ii, $\supset$) Suppose $x \in (A \backslash B) \cap (A \backslash C)$. Then $x \in (A \backslash B)$ and $x \in (A \backslash C)$. Thus $x \in A$ and ($x \notin B$ and $x \notin C$). This implies $x \in A$ and $x \notin (B \cup C)$. But this is just $x \in A \backslash (B \cup C)$.

Exercise 2.1.2 Finish the proof of Theorem 20.

2.2 Cartesian Products

There is another way to construct new sets out of given ones; it involves the notion of an "ordered pair" of objects. That is, in the set $\{a,b\}$ there is no preference given to $a$ over $b$; i.e. $\{a,b\} = \{b,a\}$ so that it is an unordered pair. We can also consider ordered pairs $(a,b)$ where we distinguish between the first and second elements.$^2$

$^2$ Don't confuse this notation with the interval consisting of all real numbers $x$ such that $a < x < b$.

Definition 21 If $A$ and $B$ are nonempty sets, then the cartesian product (denoted $A \times B$) is just the set of all ordered pairs $\{(a,b) : a \in A \text{ and } b \in B\}$.

Example 22 $A = \{1,2,3\}$, $B = \{4,5\}$, $A \times B = \{(1,4),(1,5),(2,4),(2,5),(3,4),(3,5)\}$.

Example 23 $A = [0,1] \cup [2,3]$, $B = [1,2] \cup [3,4]$, $A \times B$ in Figure 2.2.1.

This set operation also generalizes to finite and infinite index sets.

2.3 Relations

To be able to compare elements of a set, we need to define how they are related. The general concept of a relation underlies all that will follow. For instance, just comparing the real numbers 1 and 2 requires such a definition. Furthermore, a correspondence or function is just a special case of a relation. In what follows, our definitions of relations, correspondences, and functions are meant to emphasize that they are simply special kinds of sets.

Definition 24 Given two sets $A$ and $B$, a binary relation between members of $A$ and members of $B$ is a subset $R \subset A \times B$. We use the notation $(a,b) \in R$ to denote the relation $R$ on $A \times B$ and read it "$a$ is in the relation $R$ to $b$".
If A = B we say that R is a relation on the set A.

Example 25 Let A = {Austin, Des Moines, Harrisburg} and B = {Texas, Iowa, Pennsylvania}. Then the relation R = {(Austin, Texas), (Des Moines, Iowa), (Harrisburg, Pennsylvania)} expresses "is the state capital of".

In general, we can consider n-ary relations between members of sets A_1, A_2, ..., A_n, each of which is just a subset R ⊂ A_1×A_2×...×A_n. A relation is characterized by a certain set of properties that it possesses. We next consider important types of relations that differ in their symmetry properties.

2.3.1 Equivalence relations

Definition 26 An equivalence relation on a set A is a relation "∼" having the following three properties: (i) Reflexivity, x ∼ x, ∀x ∈ A; (ii) Symmetry, if x ∼ y, then y ∼ x, ∀x,y ∈ A; and (iii) Transitivity, if x ∼ y and y ∼ z, then x ∼ z, ∀x,y,z ∈ A.

Example 27 Equality is an equivalence relation on R.

Example 28 Define the congruence modulo 4 relation "M" on Z by: ∀x,y ∈ Z, xMy if the remainders obtained by dividing x and y by 4 are equal. For example, 13M65 because dividing 13 and 65 by 4 gives the same remainder of 1.

Exercise 2.3.1 Show that congruence modulo 4 is an equivalence relation.

Definition 29 Given an equivalence relation ∼ on a set A and an element x ∈ A, we define a certain subset E of A, called the equivalence class determined by x, by the equation E = {y ∈ A : y ∼ x}. Note that the equivalence class determined by x contains x since x ∼ x.

Example 30 The equivalence classes of Z for the relation congruence modulo 4 are determined by x ∈ {0,1,2,3}, where E_x = {z ∈ Z : z = 4k + x, k ∈ Z} (i.e. x is the remainder when z is divided by 4).

Equivalence classes have the following property.

Theorem 31 Two equivalence classes E and E′ are either disjoint or equal.

Proof. Let E = {y ∈ A : y ∼ x} and E′ = {y ∈ A : y ∼ x′}. Consider E∩E′. It can be either empty (in which case E and E′ are disjoint) or nonempty. Suppose it is nonempty and let z ∈ E∩E′. We show that E = E′. Let w ∈ E. Then w ∼ x.
Since z ∈ E∩E′, we know z ∼ x and z ∼ x′, so that by symmetry and transitivity x ∼ x′. Also by transitivity w ∼ x′, so that w ∈ E′. Thus E ⊂ E′. The same argument with the roles of E and E′ exchanged allows us to conclude that E′ ⊂ E as well. Hence E = E′.

Given an equivalence relation on A, let us denote by ℰ the collection of all equivalence classes. Theorem 31 shows that distinct elements of ℰ are disjoint. On the other hand, the union of all the elements of ℰ equals all of A because every element of A belongs to an equivalence class. In this case we say that ℰ is a partition of A.

Definition 32 A partition of a set A is a collection of disjoint subsets of A whose union is all of A.

Example 33 It is clear that the equivalence classes of Z in Example 30 form a partition since E_0 = {...,−8,−4,0,4,8,...}, E_1 = {...,−7,−3,1,5,...}, E_2 = {...,−6,−2,2,6,...}, E_3 = {...,−5,−1,3,7,...} are disjoint and their union is all of Z. Another simple example is a coin toss experiment where the sample space S = {Heads, Tails} has mutually exclusive events (i.e. {Heads}∩{Tails} = ∅).

2.3.2 Order relations

A relation that is reflexive and transitive but not symmetric is said to be an order relation. If we consider special types of non symmetry, we have special types of order relations.

Definition 34 A relation "R" on A is said to be a partial ordering of a set A if it has the following properties: (i) Reflexivity, xRx, ∀x ∈ A; (ii) Antisymmetry, if xRy and yRx, then x = y, ∀x,y ∈ A; and (iii) Transitivity, if xRy and yRz, then xRz, ∀x,y,z ∈ A. We call (R,A) a partially ordered set.

Example 35 "≤" is a partial ordering on R and "⊂" is a partial ordering on P(A). It is clear that ≤ is not symmetric on R; just take x = 1 and y = 2. It is also clear that ⊂ is not symmetric on P(A); if A = {a,b}, then while {a} ⊂ A it is not the case that A ⊂ {a}. Finally, "≾_1" on R×R given by (x_1,x_2) ≾_1 (y_1,y_2) if x_1 ≤ y_1 and x_2 ≤ y_2 is a partial ordering; it is clear that ≾_1 is not symmetric on R×R because ≤ is not symmetric even on R.
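
The three properties in Definition 34 can be checked mechanically on a finite set. The following is a minimal Python sketch, not part of the original text: the helper names (is_partial_order, leq1, and so on) and the choice of a 3×3 integer grid as a finite stand-in for R×R are illustrative assumptions. It checks the componentwise ordering and set inclusion from Example 35 and confirms that neither is symmetric.

```python
from itertools import product

def is_reflexive(elements, rel):
    return all(rel(x, x) for x in elements)

def is_symmetric(elements, rel):
    return all(rel(y, x) for x, y in product(elements, repeat=2) if rel(x, y))

def is_antisymmetric(elements, rel):
    return all(x == y for x, y in product(elements, repeat=2)
               if rel(x, y) and rel(y, x))

def is_transitive(elements, rel):
    return all(rel(x, z) for x, y, z in product(elements, repeat=3)
               if rel(x, y) and rel(y, z))

def is_partial_order(elements, rel):
    # Definition 34: reflexive, antisymmetric and transitive.
    return (is_reflexive(elements, rel)
            and is_antisymmetric(elements, rel)
            and is_transitive(elements, rel))

def leq1(x, y):
    # Componentwise ordering on pairs (the ordering of Example 35).
    return x[0] <= y[0] and x[1] <= y[1]

def subset(s, t):
    # Set inclusion; for frozensets, s <= t means "s is a subset of t".
    return s <= t

# Finite stand-ins (assumptions for illustration only).
grid = list(product(range(3), repeat=2))
power_set = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]

print(is_partial_order(grid, leq1), is_symmetric(grid, leq1))
print(is_partial_order(power_set, subset), is_symmetric(power_set, subset))
```

A finite check of this kind is only a sanity test; it does not replace the proof for R×R or for an arbitrary power set.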
Definition 36 A partial ordering "R" on A is said to be a total (or linear) ordering of A if it also satisfies (i) Completeness: for any two elements x,y ∈ A we have either xRy or yRx. We call (R,A) a totally ordered set. A chain in a partially ordered set is a subset on which the order is total.

Thus, a total ordering means that any two elements x and y in A can be compared, unlike a partial ordering where there may be elements that are noncomparable.

Exercise 2.3.2 Show that if A ⊂ B and B is totally ordered, then A is totally ordered.

We write x ≺ y if x ≾ y and x ≠ y, and call "≺" a strict partial or strict total ordering.

Example 37 "<" is a strict total ordering on R while "≤" is a total ordering on R, both of which follow from the order axioms of the real numbers. "⊂" is not a total ordering on P(A) since if A = {a,b}, there is no inclusion relation between the sets {a} and {b}. "≾_1" on R×R given in Example 35 is not a total ordering because we can't compare elements where x_1 < y_1 and x_2 > y_2. However, a line passing through the origin having positive slope is a chain. On the other hand, the relation "≾_2" on R×R given by (x_1,x_2) ≾_2 (y_1,y_2) if x_1 < y_1 or if x_1 = y_1 and x_2 ≤ y_2 is a total ordering.[3] This is also known as a lexicographic ordering since the first element of the ordered pair has the highest priority in determining the ordering, just as the first letter of a word does in the ordering of a dictionary.

We compare "≾_1" to "≾_2" in Figure 2.3.1 for the following three elements x = (1/4, 1/4), y = (1/2, 1/2), z = (1/4, 3/4) in R×R. There are 3 pairwise comparisons for each relation. First consider "≾_1". We have x ≾_1 y and x ≾_1 z, but y and z are not comparable under "≾_1", which is why we call it a partial ordering. Next consider "≾_2", where each pair is comparable (i.e. we have x ≾_2 z, x ≾_2 y, and z ≾_2 y), which is why we call it a total ordering.
Notice that by transitivity they can be ranked (all can be placed in the dictionary).

There are other types of order relations.

Definition 38 A weak order relation assumes: (i) transitivity; (ii) completeness; and (iii) non symmetry (just the negation of symmetry defined in Definition 26).[4]

Weak order relations form the basis for consumer choice.

Example 39 Preference relations: We can represent consumer preferences by the binary relation ≿ defined on a non-empty, closed, convex consumption set X. If (x_1,x_2) ∈ ≿, or x_1 ≿ x_2, we say "consumption bundle x_1 is at least as good as x_2". We embody rationality or consistency by completeness and transitivity.[5]

Exercise 2.3.3 Why aren't preference relations just total orderings? Why are they weak orderings? Show why indifference is an equivalence relation.

Because elements of a partially ordered set are not necessarily comparable, it may be the case that a maximum and/or minimum of a two element set doesn't even exist. We turn to this next.

[3] Don't be confused that we have left out a case (i.e. x_1 > y_1) by considering only x_1 < y_1 or x_1 = y_1 and x_2 ≤ y_2. For instance, if the two elements we are considering are (2,3) and (1,7), simply take x = (1,7) and y = (2,3). The point is that any two real numbers can be compared using "≤".

[4] Reflexivity is implied by completeness.

[5] Experiments show that transitivity is often violated.

Definition 40 Let ≾ be a partial ordering of X. An upper bound for a set A ⊂ X is an element u ∈ X satisfying x ≾ u, ∀x ∈ A. The supremum of a set is its least upper bound and when the set contains its supremum we call it a maximum. A lower bound for a set A ⊂ X is an element l ∈ X satisfying l ≾ x, ∀x ∈ A. The infimum of a set is its greatest lower bound and when the set contains its infimum we call it a minimum.
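
Definition 40 can be made concrete on a finite partially ordered set by brute force. The sketch below is an illustration under stated assumptions rather than anything from the original text: the function names, the integer grid, and the four-point set X_small are all hypothetical choices. It computes upper bounds, the supremum, and the infimum under the componentwise ordering, and shows one case where a supremum exists but is not a maximum and one where no supremum exists at all.

```python
from itertools import product

def leq1(x, y):
    # Componentwise ordering on pairs.
    return x[0] <= y[0] and x[1] <= y[1]

def upper_bounds(X, A, leq):
    # All u in X with a <= u for every a in A.
    return [u for u in X if all(leq(a, u) for a in A)]

def lower_bounds(X, A, leq):
    # All l in X with l <= a for every a in A.
    return [l for l in X if all(leq(l, a) for a in A)]

def supremum(X, A, leq):
    # Least upper bound of A in X, or None if there is none.
    ubs = upper_bounds(X, A, leq)
    least = [u for u in ubs if all(leq(u, v) for v in ubs)]
    return least[0] if least else None

def infimum(X, A, leq):
    # Greatest lower bound of A in X, or None if there is none.
    lbs = lower_bounds(X, A, leq)
    greatest = [l for l in lbs if all(leq(v, l) for v in lbs)]
    return greatest[0] if greatest else None

# Illustrative data (assumptions): a 3x3 grid and a two-point subset.
grid = list(product(range(3), repeat=2))
A = [(1, 0), (0, 1)]
sup_A = supremum(grid, A, leq1)
inf_A = infimum(grid, A, leq1)
has_maximum = sup_A in A

# In this smaller ambient set the upper bounds of A are not comparable,
# so there is no least one.
X_small = [(0, 1), (1, 0), (1, 2), (2, 1)]
sup_small = supremum(X_small, A, leq1)
```

Note that the answer depends on the ambient set X as well as on A, which is exactly why the definition quantifies over u ∈ X.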
Definition 41 A set S is bounded above if it has an upper bound; bounded below if it has a lower bound; bounded if it has an upper and a lower bound; unbounded if it lacks either an upper or a lower bound.

We define the operators x ∨ y to denote the supremum and x ∧ y the infimum of the two point set {x,y}.[6] If X is totally ordered, then x and y are comparable, so that one must be bigger or smaller than the other, in which case x ∨ y = max{x,y} and x ∧ y = min{x,y}. However, if X is only partially ordered, then x and y may not be comparable but we may still be able to find their supremum and infimum.

Definition 42 A lattice is a partially ordered set in which every pair of elements has a supremum and an infimum.

Exercise 2.3.4 Show that: (i) every finite set in a lattice has a supremum and an infimum; and (ii) if a lattice is totally ordered, then every pair of elements has a minimum and a maximum. Hint: (i) sup{x_1,x_2,x_3} = sup{sup{x_1,x_2},x_3}.

Exercise 2.3.5 Show that a totally ordered set L is always a lattice.

Next we give examples of partially ordered sets that are not totally ordered yet have a lattice structure. For any set X, P(X) with "⊂" is a lattice where, if A,B ∈ P(X), then A∨B = A∪B and A∧B = A∩B.

Example 43 Let X = {a,b}, so that P(X) = {∅,{a},{b},{a,b}}. Then, for instance, {a}∨{b} = {a,b}, {a}∧{b} = ∅, {a}∧{a,b} = {a}, and {a}∨∅ = {a}.

[6] Here's another place where we don't have enough good symbols to go around. Don't confuse "∨" and "∧" here with the logical connectives in Chapter 1.

Example 44 R×R is a lattice with the ordering "≾_1". The supremum and infimum of any two points x,y are given by x∨y = (max{x_1,y_1}, max{x_2,y_2}) and x∧y = (min{x_1,y_1}, min{x_2,y_2}). See Figure 2.3.2, where we consider the noncomparable elements x = (1,0) and y = (0,1).

Example 45 The next example shows that not every partially ordered set is a lattice. We show this by resorting to the following subset X = {(x_1,x_2) ∈ R×R : x_1^2 + x_2^2 ≤ 1}.
For ≾_1 on X, sup{(0,1),(1,0)} does not exist. See Figure 2.3.3.

While the next result is stated as a lemma, we will take it as an axiom.[7] It will prove useful in separation theorems, which are used extensively in economics.

Lemma 46 (Zorn) If A is a partially ordered set such that each totally ordered subset (a chain) has an upper bound in A, then A has a maximal element.

Example 47 (1,1) is the maximal element of "≾_1" on A = [0,1]×[0,1]. The upper bounds of each chain in A are given by the intersection of the lines (chains) with the lines x = 1 or y = 1. See Figure 2.3.4.

2.4 Correspondences and Functions

In your first economics classes you probably saw downward sloping demand and upward sloping supply functions, and perhaps even correspondences (e.g. backward bending labor supply curves). Given that we have already introduced the idea of a relation, here we will define correspondences and functions simply as relations which have certain properties.

Definition 48 Let A and B be any two sets. A correspondence G, denoted G : A →→ B, is a relation between A and P(B) (i.e. G ⊂ A×P(B)). That is, G is a rule that assigns a subset G(a) ⊂ B to each element a ∈ A.

[7] It can be shown to be equivalent to the Axiom of Choice.

Definition 49 Let A and B be any two sets. A function (or mapping) f, denoted f : A → B, is a relation between A and B (i.e. f ⊂ A×B) satisfying the following property: if (a,b) ∈ f and (a,b′) ∈ f, then b = b′. That is, f is a rule that assigns a unique element f(a) ∈ B to each a ∈ A. A is called the domain of f, sometimes denoted D(f). The range of f, denoted R(f), is {b ∈ B : ∃a ∈ A such that (a,b) ∈ f}. The graph of f is G(f) = {(a,b) ∈ A×B : b = f(a), a ∈ A}.

Thus, a function can be thought of as a single valued correspondence. A function is defined if the following is given: (i) The domain D(f). (ii) An assignment rule a → f(a) = b, a ∈ D(f). Then R(f) is determined by these two.
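
Since Definition 49 treats a function as a special kind of set of ordered pairs, the defining property can be tested directly on a finite relation. The following Python sketch is illustrative only: the names is_function, domain, and range_of and the sample relations f and G are assumptions introduced here, and the domain is taken to be the set of first components actually appearing in the relation.

```python
def is_function(relation):
    """Definition 49: (a, b) and (a, b') in the relation force b == b'."""
    assigned = {}
    for a, b in relation:
        if a in assigned and assigned[a] != b:
            return False  # a is related to two distinct elements
        assigned[a] = b
    return True

def domain(relation):
    # First components appearing in the relation.
    return {a for a, _ in relation}

def range_of(relation):
    # R(f) = {b : there is some a with (a, b) in the relation}.
    return {b for _, b in relation}

# A single-valued relation, hence a function.
f = {(1, "x"), (2, "y"), (3, "x")}

# A relation that sends 1 to both "x" and "y"; it is not a function,
# though it can be read as a correspondence with G(1) = {"x", "y"}.
G = {(1, "x"), (1, "y"), (2, "y")}

print(is_function(f), is_function(G))
```

The range of f is determined once the domain and the assignment rule are fixed, which is the point made at the end of the definition.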
See Figure 2.4.1a for a function and 2.4.1b for a correspondence, as well as Figure 2.4.2a and Figure 2.4.2b for another interpretation which emphasizes "mapping".

Example 50 A sequence is a function f : N → B for some set B.

Definition 51 Let f be an arbitrary function with domain A and R(f) ⊂ B. If E ⊂ A, then the (direct) image of E under f, denoted f(E), is the subset {f(a) | a ∈ E∩D(f)} ⊂ R(f). See Figure 2.4.3a.

Theorem 52 Let f be a function with domain A and R(f) ⊂ B and let E,F ⊂ A. (a) If E ⊂ F, then f(E) ⊂ f(F). (b) f(E∩F) ⊂ f(E)∩f(F). (c) f(E∪F) = f(E)∪f(F). (d) f(E\F) ⊂ f(E).

Proof. (a) If a ∈ E, then a ∈ F so f(a) ∈ f(F). But this is true ∀a ∈ E, hence f(E) ⊂ f(F).

Exercise 2.4.1 Finish the proof of Theorem 52.

Definition 53 If H ⊂ B, then the inverse image of H under f, denoted f⁻¹(H), is the subset {a | f(a) ∈ H} ⊂ D(f). See Figure 2.4.3b.

It is important to note that the inverse image is different from the inverse function (to be discussed shortly). The inverse function need not exist when the inverse image does. See Example 65.

Theorem 54 Let G,H ⊂ B. (a) If G ⊂ H, then f⁻¹(G) ⊂ f⁻¹(H). (b) f⁻¹(G∩H) = f⁻¹(G)∩f⁻¹(H). (c) f⁻¹(G∪H) = f⁻¹(G)∪f⁻¹(H). (d) f⁻¹(G\H) = f⁻¹(G)\f⁻¹(H).

Proof. (a) If a ∈ f⁻¹(G), then f(a) ∈ G ⊂ H so a ∈ f⁻¹(H).

Exercise 2.4.2 Finish the proof of Theorem 54.

Exercise 2.4.3 Let f : A → B be a function and let b,b′ ∈ B with b ≠ b′. Prove that the inverse images f⁻¹({b}) and f⁻¹({b′}) are disjoint.

2.4.1 Restrictions and extensions

Example 55 Let A = R\{0} and f(a) = 1/a. Then R(f) = R\{0}. See Figure 2.4.4a.

To deal with the above "hole" in the domain of f in Example 55, we can employ the idea of restricting or extending the function to a given set.

Definition 56 Let Δ ⊂ D(f). The restriction of f to the set Δ, which we will denote f|r Δ, is given by {(a,b) ∈ f : a ∈ Δ}. Let Δ ⊃ D(f). The extension of f on the set Δ, which we will denote f|e Δ, is given by {(a,b) : b = f(a) if a ∈ D(f), and b = g(a) if a ∈ Δ\D(f)}, where g is any function defined on Δ\D(f).
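
Definition 56 can be sketched on a finite domain by storing a function as a Python dictionary. This is a hedged illustration, not the authors' construction: the helper names restrict and extend, the four-point domain, and the two choices of g are assumptions made here. It restricts f(a) = 1/a to its positive arguments and extends it to 0 in two different ways, which shows that an extension depends on the choice of g.

```python
from fractions import Fraction

def restrict(f, subset):
    """Restriction of f to subset: keep only pairs (a, b) with a in subset."""
    return {a: b for a, b in f.items() if a in subset}

def extend(f, superset, g):
    """Extension of f on superset: agree with f on D(f), use g elsewhere."""
    return {a: (f[a] if a in f else g(a)) for a in superset}

# f(a) = 1/a on a finite domain that excludes 0 (an illustrative stand-in
# for R minus {0}); Fraction keeps the arithmetic exact.
f = {a: Fraction(1, a) for a in (-2, -1, 1, 2)}

# Restriction to the positive part of the domain.
f_pos = restrict(f, {1, 2})

# Two different extensions to a domain that includes 0.
bigger = (-2, -1, 0, 1, 2)
ext_zero = extend(f, bigger, lambda a: Fraction(0))
ext_one = extend(f, bigger, lambda a: Fraction(1))
```

Both extensions agree with f wherever f was already defined and differ only at the new point 0.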
Example 57 See Figure 2.4.4b for a restriction of 1/a to Δ = R_++ and Figure 2.4.4c for an extension of 1/a on Δ = R, given by b = 1/a if a ≠ 0 and b = 0 if a = 0. Note that extensions are not generally unique.

2.4.2 Composition of functions

Definition 58 Let f : A → B and g : B′ → C. Let R(f) ⊂ B′. The composition g∘f is the function from A to C given by g∘f = {(a,c) ∈ A×C : ∃b ∈ R(f) ⊂ B′ such that (a,b) ∈ f and (b,c) ∈ g}.[8] See Figure 2.4.5.

Note that order matters, as the next example shows.

Example 59 Let A ⊂ R, f(a) = 2a, and g(a) = 3a^2 − 1. Then g∘f = 3(2a)^2 − 1 = 12a^2 − 1 while f∘g = 2(3a^2 − 1) = 6a^2 − 2.

[8] Alternatively, we create a new function h(a) = g(f(a)).

2.4.3 Injections and inverses

Definition 60 f : A → B is one-to-one or an injection if whenever (a,b) ∈ f and (a′,b) ∈ f for a,a′ ∈ D(f), then a = a′.[9]

Definition 61 Let f be an injection. If g = {(b,a) ∈ B×A : (a,b) ∈ f}, then g is an injection with D(g) = R(f) and R(g) = D(f). The function g is called the inverse to f and denoted f⁻¹. See Figure 2.4.6.

2.4.4 Surjections and bijections

Definition 62 If R(f) = B, f maps A onto B (in this case, we call f a surjection). See Figure 2.4.7.

Definition 63 f : A → B is a bijection if it is one-to-one and onto (or an injection and a surjection).

Example 64 Let E = [0,1] ⊂ A = R, H = [0,1] ⊂ B = R, and f(a) = 2a. See Figure 2.4.8. R(f) = R so that f is a surjection, the image set is f(E) = [0,2], the inverse image set is f⁻¹(H) = [0,1/2], f is an injection and has inverse f⁻¹(b) = b/2, and, as a consequence of being one-to-one and onto, f is a bijection. Notice that if F = [−1,0], then f(F)∩f(E) = {0} and f(E∩F) = f({0}) = {0}, so that in the special case of injections statement (b) of Theorem 52 holds with equality.

Example 65 Let E = [0,1] ⊂ A = R, H = [0,1] ⊂ B = R, and f(a) = a^2.
See Figure 2.4.9. R(f) = R_+ so that f is not a surjection, the image set is f(E) = [0,1], the inverse image set is f⁻¹(H) = [−1,1], f is not an injection (since, for instance, f(−1) = f(1) = 1), and f is obviously not a bijection. However, the restriction of f to R_+ or R_− (in particular, let f_+ ≡ f|r R_+ and f_− ≡ f|r R_−) is an injection, with (f_+)⁻¹(b) = √b while (f_−)⁻¹(b) = −√b. Finally, notice that if F = [−1,0], then f(F)∩f(E) = [0,1] but f(E∩F) = f({0}) = {0}, which is why we cannot generally prove equality in statement (b) of Theorem 52.

The next theorem shows that composition preserves surjection. It is useful in proving statements about infinite sets.

[9] Alternatively, we can say f is one-to-one if f(a) = f(a′) only when a = a′.

Theorem 66 Let f : A → B and g : B → C be surjections. Their composition g∘f is a surjection.

Exercise 2.4.4 Prove Theorem 66. Answer: We must show that for g∘f : A → C given by (g∘f)(a) = g(f(a)), it is the case that ∀c ∈ C, there exists a ∈ A such that (g∘f)(a) = c. To see this, let c ∈ C. Since g is a surjection, ∃b ∈ B such that g(b) = c. Similarly, since f is a surjection, ∃a ∈ A such that f(a) = b. Then (g∘f)(a) = g(f(a)) = g(b) = c.

2.5 Finite and Infinite Sets

The purpose of this section is to compare sizes of sets with respect to the number of elements they contain. Take two sets A = {1,2,3} and B = {a,b,c,d}. The number of elements of the set A (also called the cardinality of A, denoted card(A)) is three and of the set B is four. In this case we say that the set B is bigger than the set A. It is hard, however, to apply this same concept in comparing, for instance, the set of all natural numbers N with the set of all integers Z. Both are infinite. Is the "infinity" that represents card(N) smaller than the "infinity" that represents card(Z)? One might think the statement was true because there are integers that are not natural numbers (e.g. −1,−2,−3,...).
We will show however that this statement is false, but first we have to introduce a different concept of the size of a set known as countability and uncountability.

To illustrate it, one of the authors placed a set of 3 coins in front of his 3 year old daughter and asked her "Is that collection of coins countable?". She proceeded to pick up the first coin with her right hand, put it in her left hand, and say "1", pick up the second coin, put it in her left hand, and say "2", and pick up the final coin, put it in her left hand, and say "3". Thus, she put the set of coins into a one-to-one assignment with the first three natural numbers. We will now make use of one-to-one assignments between elements of two sets.

Definition 67 Two sets A and B are equivalent if there is a bijection f : A → B.

Definition 68 An initial segment (or section) of N is the set a_n = {i ∈ N : i ≤ n}.

Definition 69 A set A is finite if it is empty or there exists a bijection f : A → a_n for some n ∈ N. In the former case A has zero elements and in the latter case A has n elements.

Lemma 70 Let B be a proper subset of a finite set A. There does not exist a bijection f : A → B.

Proof. (Sketch) Since A is finite, there exists a bijection f : A → a_n. If B is a proper subset of A, then it contains m < n elements. But there cannot be a bijection between n and m elements.

Exercise 2.5.1 Prove Lemma 70 more formally. See Lemma 6.1 in Munkres.

Lemma 70 says that a proper subset of a finite set cannot be equivalent with the whole set. This is quite clear. But is it true for any set? Let's consider N = {1,2,3,4,...} and a proper subset N\{1} = {2,3,4,...}. We can construct a one-to-one assignment from N onto N\{1} (i.e. 1 → 2, 2 → 3, ...). Thus, in this case, it is possible for a set to be equivalent with its proper subset. Given Lemma 70, we must conclude the following.

Theorem 71 N is not finite.

Proof. By contradiction. Suppose N is finite. Then f : N → N\{1} defined by f(n) = n + 1 is a bijection of N with a proper subset of itself.
This contradicts Lemma 70.

Definition 72 A set A is infinite if it is not finite. It is countably infinite if there exists a bijection f : N → A.

Thus, N is countably infinite since f can be taken to be the identity function (which is a bijection).

Definition 73 A set is countable if it is finite or countably infinite. A set that is not countable is uncountable.

Next we examine whether the set of integers, Z, is countable. That is, are N and Z equivalent? This isn't apparent since N = {1,2,...} has one end of the set that goes to infinity, while Z = {...,−2,−1,0,1,2,...} has two ends of the set that go to infinity. But it is possible to reorganize Z in a way that looks like N since we can simply write Z = {0,1,−1,2,−2,...}. One can think of this set as being constructed from two rows {0,1,2,...} and {−1,−2,...} by alternating between the first and second rows. This is formalized in the next example.

Example 74 The set of integers, Z, is countably infinite. The function f : Z → N defined by f(z) = 2z if z > 0 and f(z) = −2z + 1 if z ≤ 0 is a bijection.

Exercise 2.5.2 Prove that a finite union of countable sets is countable.

Next we examine whether N×N ∼ N. As in the preceding example where we had two rows, we can think about enumerating the set N×N in Figure 2.5.1. As in the preceding example, each row has infinitely many elements, but now there are an infinite number of rows. Yet all of the elements of this "infinite matrix" can be enumerated if we start from (1,1) and then continue by following the arrows. This enumeration provides us with the desired bijection, as shown next.

Example 75 The cartesian product N×N is countably infinite. First, let the bijection g : N×N → A, where A ⊂ N×N consists of pairs (x,y) for which y ≤ x, be given by g(x,y) = (x+y−1, y). Next construct a bijection h : A → N given by h(x,y) = (1/2)(x−1)x + y. Then the composition f = h∘g is the desired bijection.

We can actually weaken the condition for proving countability of a given set A.
The next theorem accomplishes this.

Theorem 76 Let A be a non-empty set. The following statements are equivalent: (i) There is a surjection f : N → A. (ii) There is an injection g : A → N. (iii) A is countable.

Proof. (Sketch) (i)⇒(ii). Given f, define g : A → N by g(a) = smallest element of f⁻¹({a}). Since f is a surjection, the inverse image f⁻¹({a}) is non-empty so that g is well defined. If a ≠ a′, the sets f⁻¹({a}) and f⁻¹({a′}) are disjoint (recall Exercise 2.4.3), so their smallest elements are distinct, proving g : A → N is an injection. (ii)⇒(iii). Since g : A → R(g) is a surjection by definition, g : A → R(g) is a bijection. Since R(g) ⊂ N, A must be countable. (iii)⇒(i). By definition.

Exercise 2.5.3 Finish parts (ii)⇒(iii) and (iii)⇒(i) of the proof of Theorem 76. See Munkres 7.1 (this uses well ordering).

Example 77 The set of positive rationals, Q_++, is countably infinite. Define a surjection g : N×N → Q_++ by g(n,m) = n/m. Since N×N is countable (Example 75), there is a surjection h : N → N×N. Then f = g∘h : N → Q_++ is a surjection (Theorem 66), so by Theorem 76, Q_++ is countable.

The intuition for the preceding example follows simply from Figure 2.5.1 if you replace the "," with "/". That is, replace (1,1) with the rational 1/1, (1,2) with the rational 1/2, (3,2) with the rational 3/2, etc.

Theorem 78 A countable union of countable sets is countable.

Proof. Let {A_i, i ∈ Λ} be an indexed family of countable sets where Λ is countable. Because each A_i is countable, for each i we can choose a surjection f_i : N → A_i. Similarly, we can choose a surjection g : N → Λ. Define h : N×N → ∪_{i∈Λ} A_i by h(n,m) = f_{g(n)}(m), which is a surjection. Since N×N is in bijective correspondence with N (recall Example 75), the countability of the union follows from Theorem 76.

The next theorem provides an alternative proof of Example 75.

Theorem 79 A finite product of countable sets is countable.

Proof.
Let A and B be two non-empty, countable sets. Choose surjective functions g : N → A and h : N → B. Then the function f : N×N → A×B defined by f(n,m) = (g(n),h(m)) is surjective. By Theorem 76, A×B is countable. Proceed by induction for any finite product.

While it's tempting to think that this result could be extended to show that a countable product of countable sets is countable, the next theorem shows this is false. Furthermore, it gives us our first example of an uncountable set.

Theorem 80 Let X = {0,1}. The set of all functions x : N → X, denoted X^ω, is uncountable.[10]

[10] An alternative statement of the theorem is that the set of all infinite sequences of elements of X is uncountable.

Proof. We show that any function g : N → X^ω is not a surjection. Let g(n) = (x_n1, x_n2, ..., x_nm, ...) where each x_ij is either 0 or 1. Define a point y = (y_1, y_2, ..., y_n, ...) of X^ω by letting y_n = 0 if x_nn = 1 and y_n = 1 if x_nn = 0. Now y ∈ X^ω and y is not in the image of g. That is, given n, g(n) and y differ in at least one coordinate, namely the nth. Thus g is not a surjection.

The diagonal argument used above (see Figure 2.5.2) will be useful to establish the uncountability of the reals, which we save until Chapter 3.

Exercise 2.5.4 Consider the following game known as "matching pennies". You (A) and I (B) each hold a penny. We simultaneously reveal either "heads" (H) or "tails" (T) to each other. If both faces match (i.e. both heads or both tails) you receive a penny, otherwise I get the penny. The action sets for each player are S_A = S_B = {H,T}. Now suppose we decide to play this game every day for the indefinite (infinite) future (we're optimistic about medical technology). Before you begin, you should think of all the different combinations of actions you may employ in the infinitely repeated game. For instance, you may alternate H and T starting with H in the first round.
Prove that although the number of actions you play in the infinitely repeated game is countable and the set of actions S_A is finite, the set of possible combinations of actions (S_A×S_A×...) is uncountable.

2.6 Algebras of Sets

An algebra is just a collection of sets (which could be infinite) that is closed under (finite) union and complementation. It is used extensively in probability and measure theory.

Definition 81 A collection 𝒜 of subsets of X is called an algebra of sets if (i) A^c ∈ 𝒜 if A ∈ 𝒜 and (ii) A∪B ∈ 𝒜 if A,B ∈ 𝒜.

Note that ∅, X ∈ 𝒜 since, for instance, A ∈ 𝒜 ⇒ A^c ∈ 𝒜 by (i) and then A∪A^c = X ∈ 𝒜 by (ii). It also follows from De Morgan's laws that (iii) A∩B ∈ 𝒜 if A,B ∈ 𝒜. The definition extends to larger finite collections (just take unions two at a time).

Theorem 82 Given any collection 𝒞 of subsets of X, there is a smallest algebra 𝒜 which contains 𝒞.

Proof. (Sketch) It is sufficient to show there is an algebra 𝒜 containing 𝒞 such that if ℬ is any algebra containing 𝒞, then 𝒜 ⊂ ℬ. Let ℱ be the family of all algebras that contain 𝒞 (which is nonempty since P(X) ∈ ℱ). Let 𝒜 = ∩{ℬ : ℬ ∈ ℱ}. Then 𝒞 is a subcollection of 𝒜 since each ℬ in ℱ contains 𝒞. All that remains to be shown is that 𝒜 is an algebra (i.e. if A and B are in 𝒜, then A∪B and A^c are in ∩{ℬ : ℬ ∈ ℱ}). It follows from the definition of 𝒜 that 𝒜 ⊂ ℬ for every ℬ ∈ ℱ. See Figure 2.6.1.

Exercise 2.6.1 Finish the proof of Theorem 82. Answer: If A and B are in 𝒜, then for each ℬ ∈ ℱ, we have A ∈ ℬ and B ∈ ℬ. Since ℬ is an algebra, A∪B ∈ ℬ. Since this is true for every ℬ ∈ ℱ, we have A∪B ∈ ∩{ℬ : ℬ ∈ ℱ}. Similarly, if A ∈ 𝒜, then A^c ∈ 𝒜.

We say that the smallest algebra containing 𝒞 is the algebra generated by 𝒞. By construction, the smallest algebra is unique. Notice the proof makes clear that the intersection of any collection of algebras is itself an algebra.

Example 83 Let X = {a,b,c}. The following three collections are algebras: 𝒞_1 = {∅,X}, 𝒞_2 = {∅,{a},{b,c},X}, 𝒞_3 = P(X).
The following two collections are not algebras: 𝒞_4 = {∅,{a},X} since, for instance, {a}^c = {b,c} ∉ 𝒞_4, and 𝒞_5 = {∅,{a},{b},{b,c},X} since {b}^c = {a,c} ∉ 𝒞_5. However, the smallest algebra which contains 𝒞_4 is just 𝒞_2. To see this, we can apply the argument in Theorem 82. Let ℱ = {𝒞_2, P(X)} be the family of all algebras that contain 𝒞_4. Then 𝒜 = 𝒞_2∩P(X) = 𝒞_2.

Exercise 2.6.2 Let X = N. Show that the collection 𝒜 = {A_i : A_i is finite or N\A_i is finite} is an algebra on N and that it is a proper subset of P(N).

The next theorem proves that it is always possible to construct a new collection of disjoint sets from an existing algebra with the property that its union is equal to the union of the subsets in the existing algebra. This will become very useful when we begin to think about probability measures.

Theorem 84 Let 𝒜 be an algebra comprised of subsets {A_i : i ∈ Λ}.[11] Then there is a collection of subsets {B_i : i ∈ Λ} in 𝒜 such that B_n∩B_m = ∅ for n ≠ m and ∪_{i∈Λ} B_i = ∪_{i∈Λ} A_i.

[11] Note that the index set Λ can be countably or even uncountably infinite.

Proof. (Sketch) The theorem is trivial when the collection is finite (see Example 85 below). When the collection is indexed on N, we let B_1 = A_1 and for each n ∈ N\{1} define B_n = A_n\[A_1∪A_2∪...∪A_{n−1}] = A_n∩A_1^c∩A_2^c∩...∩A_{n−1}^c. Since the complements and intersections of sets in 𝒜 are in 𝒜, B_n ∈ 𝒜 and by construction B_n ⊂ A_n. The remainder of the proof amounts to showing that the above constructed sets are disjoint and yield the same union as the algebra.

Exercise 2.6.3 Finish the proof of Theorem 84 above. See Royden, Prop. 2, p. 17.

Note that Theorem 84 does not say that the new collection {B_i : i ∈ Λ} is necessarily itself an algebra. The next example shows this.

Example 85 Let X = {a,b,c} and algebra 𝒜 = P(X) with A_1 = {a}, A_2 = {b}, A_3 = {c}, A_4 = {a,b}, A_5 = {a,c}, A_6 = {b,c}, A_7 = ∅, A_8 = X. Let B_1 = A_1.
By construction B_2 = A_2\A_1 = {b}, B_3 = A_3\(A_1∪A_2) = {c}, and B_n = A_n\(A_1∪A_2∪...∪A_{n−1}) = ∅ for n ≥ 4. Note that the new collection {{a},{b},{c}} is not itself an algebra, since it's not closed under complementation, and that if we chose a different sequence of A_i we could obtain a different collection {B_i : i ∈ Λ}.

In the next chapter, we will learn an important result: any (open) set of real numbers can be represented as a countable union of disjoint open intervals. Hence we cannot guarantee that such a set is in an algebra, which is closed only under finite union, even if all of the intervals belong to the algebra. Thus we extend the notion of an algebra to countable collections that are closed under complementation and countable union.

Definition 86 A collection 𝒳 of subsets of X is called a σ-algebra of sets if (i) A^c (= X\A) ∈ 𝒳 if A ∈ 𝒳 and (ii) ∪_{n∈N} A_n ∈ 𝒳 if each A_n ∈ 𝒳.

As in the case of algebras, ∅, X ∈ 𝒳 and ∩_{n=1}^∞ A_n = (∪_{n=1}^∞ A_n^c)^c ∈ 𝒳, which means that a σ-algebra is closed under countable intersections as well.

Furthermore, we can always construct the unique smallest σ-algebra containing a given collection 𝒞 (called the σ-algebra generated by 𝒞) by forming the intersection of all the σ-algebras containing 𝒞. This result is an extension of Theorem 82.

Theorem 87 Given any collection 𝒞 of subsets of X, there is a smallest σ-algebra that contains 𝒞.

Exercise 2.6.4 Prove Theorem 87.

Exercise 2.6.5 Let 𝒞, 𝒟 be collections of subsets of X. (i) Show that the smallest algebra generated by 𝒞 is contained in the smallest σ-algebra generated by 𝒞. (ii) If 𝒞 ⊂ 𝒟, show the smallest σ-algebra generated by 𝒞 is contained in the smallest σ-algebra generated by 𝒟.
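
For a finite set X, the smallest algebra of Theorem 82 can also be reached from below, by repeatedly adding complements and pairwise unions until nothing new appears. The Python sketch below illustrates that alternative route under stated assumptions: the function name generated_algebra is hypothetical, and the closure procedure is a swapped-in technique for finite X, not the intersection argument used in the proof. On a finite X every algebra is also a σ-algebra, so the same sketch covers Theorem 87 in that special case.

```python
def generated_algebra(X, collection):
    """Smallest algebra on the finite set X containing `collection`.

    Starts from the given sets together with the empty set and X, then
    closes under complement and pairwise union until a fixed point.
    """
    X = frozenset(X)
    algebra = {frozenset(), X} | {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        current = list(algebra)  # snapshot, since algebra grows in the loop
        for A in current:
            complement = X - A
            if complement not in algebra:
                algebra.add(complement)
                changed = True
            for B in current:
                union = A | B
                if union not in algebra:
                    algebra.add(union)
                    changed = True
    return algebra

X = {"a", "b", "c"}

# Generating from the single set {a} gives the four-set algebra
# called C_2 in Example 83.
from_a = generated_algebra(X, [{"a"}])

# Generating from {a} and {b} gives the whole power set of X.
from_a_b = generated_algebra(X, [{"a"}, {"b"}])
```

The loop terminates because the power set of a finite X is finite and the collection only grows.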
Figures for Sections 2.1 to 2.2
Figure 2.1.1: Set Operations
Figure 2.1.2: Distributive Property
Figure 2.1.3: De Morgan's Laws
Figure 2.2.1: Cartesian Product

Figures for Sections 2.3 to 2.4
Figure 2.3.1: Illustrating Partial vs Total Ordering
Figure 2.3.2: A Lattice
Figure 2.3.3: A Partially Ordered Set that's not a Lattice
Figure 2.3.4: Chains and Upper Bounds
Figure 2.4.1a: f : A → B and Figure 2.4.1b: Not a function
Figure 2.4.2a: Graph of f in A×B and Figure 2.4.2b: Not a function in A×B
Figure 2.4.3a: Image Set and Figure 2.4.3b: Inverse Image Set
Figure 2.4.4a: Graph of 1/a
Figure 2.4.4b: Restriction of 1/a and Figure 2.4.4c: Extension of 1/a
Figure 2.4.5: Composition
Figure 2.4.6: Inverse Function
Figure 2.4.7: Surjection
Figure 2.4.8: Graph of 2a
Figure 2.4.9: Graph of a^2

Figures for Section 2.5
Figure 2.5.1: Countability of N×N
Figure 2.5.2: Uncountability of {0,1}^ω

Figures for Section 2.6
Figure 2.6.1: Smallest σ-algebras

2.7 Bibliography for Chapter 2

Sections 2.1 to 2.2 drew on Bartle (1976, Ch. 1), Munkres (1975, Ch. 1), and Royden (1988, Ch. 1, Sec. 1, 3, 4). The material on relations and correspondences in Section 2.3 is drawn from Royden (1988, Ch. 1, Sec. 7), Munkres (1975, Ch. 1, Sec. 3), Aliprantis and Border (1999, Ch. 1, Sec. 2), and Mas-Colell, Whinston, and Green (1995, Ch. 1, Sec. B). The material on functions in Section 2.4 is drawn from Bartle (1976, Ch. 2) and Munkres (1975, Ch. 1, Sec. 2). Section 2.5 drew from Munkres (1975, Ch. 1, Sec. 6-7) and Bartle (1976, Ch. 3).

2.8 End of Chapter Problems

1. Let f : A → B be a function. Prove the following statements are equivalent.
(i) f is one-to-one on A.
(ii) f(C∩D) = f(C)∩f(D) for all subsets C and D of A.
(iii) f⁻¹[f(C)] = C for every subset C of A.
(iv) For all disjoint subsets C and D of A, the images f(C) and f(D) are disjoint.
(v) For all subsets C and D of A with D ⊂ C, we have f(C\D) = f(C)\f(D).

2.
Prove that a finite union of countable sets is a countable set.

Chapter 3
The Space of Real Numbers

In this chapter we introduce the most common set that economists will encounter. The real numbers can be thought of as being built up using the set operations and order relations that we introduced in the preceding chapter. In particular, we can start with the most elementary set $\mathbb{N}$ (the counting numbers we all learned in pre-kindergarten) upon which certain operations like "$+$" and "$\cdot$" are defined. The naturals are closed under these operations (i.e. for any two counting numbers $n_1$ and $n_2$, the results $n_1 + n_2$ and $n_1 \cdot n_2$ are contained in $\mathbb{N}$). However, $\mathbb{N}$ is not closed with respect to certain other operations like "$-$" since, for example, $2 - 4 \notin \mathbb{N}$. To handle that example we need the integers $\mathbb{Z}$, which are closed under "$+$", "$\cdot$", and "$-$" (i.e. $2 - 4 \in \mathbb{Z}$). However, $\mathbb{Z}$ can't handle operations like dividing 2 pies between 3 people (i.e. $\frac{2}{3} \notin \mathbb{Z}$). To handle that example we need the rationals $\mathbb{Q}$, which are closed under "$+$", "$-$", "$\cdot$", and "$\div$" by nonzero elements (i.e. $\frac{2}{3} \in \mathbb{Q}$). But the rationals can't handle something as simple as finding the length of the diagonal of a unit square. That is, $\sqrt{2} \notin \mathbb{Q}$.

To extend $\mathbb{Q}$ to include such cases, besides the operations "$+$", "$-$", "$\cdot$", and "$\div$", we could use Dedekind cuts, which make use of the order relation "$\leq$". A Dedekind cut in $\mathbb{Q}$ is an ordered pair $(D, E)$ of nonempty subsets of $\mathbb{Q}$ with the properties $D \cap E = \emptyset$, $D \cup E = \mathbb{Q}$, and $d < e$, $\forall d \in D$ and $\forall e \in E$. An example of a cut in $\mathbb{Q}$ is, for $\xi \in \mathbb{Q}$,
$$D = \{x \in \mathbb{Q} : x \leq \xi\}, \quad E = \{x \in \mathbb{Q} : x > \xi\}.$$
In this case, we say that $\xi \in \mathbb{Q}$ represents the cut $(D, E)$. If a cut can be represented by a rational number, it is called a rational cut. It is simple to see that there are cuts in $\mathbb{Q}$ which cannot be represented by a rational number. For example, take the cut
$$D_0 = \{x \in \mathbb{Q} : x \leq 0 \text{ or } x^2 \leq 2\}, \quad E_0 = \{x \in \mathbb{Q} : x > 0 \text{ and } x^2 > 2\}.$$
As we will show in this chapter, $(D_0, E_0)$ cannot be represented by a rational number. Such cuts are called irrational cuts.
Each irrational cut defines a unique number. The set of all such numbers is called the irrational numbers. In this way, we can extend the rationals by adding in these irrationals.

Rather than build up the real numbers as discussed above, our approach will simply be to take the real numbers as given, list a set of axioms for them, and derive properties of the real numbers as consequences of these axioms. The first group of axioms describes the algebraic properties, the second group the order properties, and we shall call the third the completeness axiom. With these three groups of axioms we can completely characterize the real numbers.

In the next chapter we will focus on important issues like convergence, compactness, completeness, and connectedness in spaces more general than the real numbers. However, to understand those concepts it is often helpful to provide examples from $\mathbb{R}$, which is why we start here.

In this chapter we focus on four important results in $\mathbb{R}$. The first (see Theorem 108) is that any open set in $\mathbb{R}$ can be written as a countable union of disjoint open intervals. The next two results are proven using the Nested Intervals Property (see Theorem 116), which says that a decreasing sequence of closed, bounded, nonempty intervals "converges" to a nonempty set. The first important result that this is used to prove is the Bolzano-Weierstrass Theorem (Theorem 118), which says that every bounded infinite subset of $\mathbb{R}$ has a cluster point, that is, a point with other points of the set in every interval around it. It is also used to prove the important "size" result (see Theorem 122) that the interval $[0,1]$ in $\mathbb{R}$ is uncountable.

3.1 The Field Axioms

The functions or binary operations "$+$" and "$\cdot$" from $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$ satisfy the following axioms. It shouldn't be surprising that we require the operations to satisfy commutative, associative, and distributive properties as we did in Chapter 2 with respect to the set operations "$\cup$" and "$\cap$".

Axiom 1 (Algebraic Properties of $\mathbb{R}$) All $x, y, z \in \mathbb{R}$ satisfy:
A1.
$x + y = y + x$.
A2. $(x + y) + z = x + (y + z)$.
A3. $\exists\, 0 \in \mathbb{R}$ such that $x + 0 = x$, $\forall x \in \mathbb{R}$.
A4. $\forall x \in \mathbb{R}$, $\exists w \in \mathbb{R}$ such that $x + w = 0$.
A5. $x \cdot y = y \cdot x$.
A6. $(x \cdot y) \cdot z = x \cdot (y \cdot z)$.
A7. $\exists\, 1 \in \mathbb{R}$ such that $1 \neq 0$ and $x \cdot 1 = x$, $\forall x \in \mathbb{R}$.
A8. $\forall x \in \mathbb{R}$ such that $x \neq 0$, $\exists w \in \mathbb{R}$ such that $x \cdot w = 1$.
A9. $x \cdot (y + z) = x \cdot y + x \cdot z$.

Any set that satisfies Axiom 1 is called a field (under "$+$" and "$\cdot$"). If we have a field, we can perform all the operations of elementary algebra, including the solution of simultaneous linear equations. It follows from A1 that the $0$ in A3 is unique, which was used in formulating A4, A7, and A8. It also follows that the $w$ in A4 is unique and denoted "$-x$". Subtraction "$x - y$" is defined as "$x + (-y)$". That the $1$ in A7 is unique follows from A5. The $w$ in A8 can also be shown to be unique.

Exercise 3.1.1 Let $a, b \in \mathbb{R}$. Prove that the equation $a + x = b$ has the unique solution $x = (-a) + b$. With $a \neq 0$, prove that the equation $a \cdot x = b$ has the unique solution $x = \left(\frac{1}{a}\right) \cdot b$. (This is Theorem 4.4 of Bartle.)

In what follows, we drop the "$\cdot$" to denote multiplication and write $xy$ for $x \cdot y$. Furthermore, we write $x^2$ for $xx$ and generally $x^{n+1} = (x^n)x$ with $n \in \mathbb{N}$. It follows by mathematical induction that $x^{n+m} = x^n x^m$ for $x \in \mathbb{R}$ and $n, m \in \mathbb{N}$. We shall also write $\frac{x}{y}$ instead of $\left(\frac{1}{y}\right) \cdot x$ for $y \neq 0$. Recall that we defined the rationals as $\mathbb{Q} = \{\frac{m}{n} : m, n \in \mathbb{Z},\ n \neq 0\}$.

Theorem 88 There does not exist a rational number $q \in \mathbb{Q}$ such that $q^2 = 2$.

Proof. Suppose not. Then $\left(\frac{m}{n}\right)^2 = 2$ for some $m, n \in \mathbb{Z}$, $n \neq 0$. Assume, without loss of generality, that $m$ and $n$ have no common factors. Since $m^2 = 2n^2$ is an even integer, $m$ must be an even integer [footnote 1]. In that case we can represent it as $m = 2k$ for some integer $k$. Hence $(m^2 =)\ 4k^2 = 2n^2$, or $n^2 = 2k^2$, which implies that $n$ is also even. But this implies that $m$ and $n$ are both divisible by 2, which contradicts the assumption that $m$ and $n$ have no common factors.

Theorem 88 says that the cut $(D_0, E_0)$ in the introduction to this chapter is not rational.

Definition 89 All of the elements of $\mathbb{R}$ which are not rational numbers are irrational numbers.
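Theorem 88 can be illustrated, though of course not proven, by a finite search. The sketch below looks for integers $m, n$ with $m^2 = 2n^2$ among all denominators up to a bound; the function name `find_rational_sqrt2` and the bound are our own choices for illustration. It relies on `math.isqrt`, which returns the integer part of a square root exactly.

```python
from math import isqrt

def find_rational_sqrt2(max_den):
    """Return (m, n) with 1 <= n <= max_den and m*m == 2*n*n, or None.

    A finite search only illustrates Theorem 88; the proof by
    contradiction is what rules out every denominator at once.
    """
    for n in range(1, max_den + 1):
        target = 2 * n * n
        m = isqrt(target)          # largest integer with m*m <= target
        if m * m == target:
            return (m, n)
    return None

print(find_rational_sqrt2(100_000))
```

For every denominator tried, $2n^2$ falls strictly between two consecutive perfect squares, which is exactly what the theorem asserts for all $n$.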
In Section 3.3 we provide a complementary result to Theorem 88 to establish the existence of irrational numbers.

Footnote 1. Otherwise, if $m$ is odd we can represent it as $m = 2k + 1$ for some integer $k$. But then $m^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$ is odd, contradicting the fact that $m^2$ is even.

3.2 The Order Axioms

The next class of properties possessed by the real numbers has to do with the fact that they are ordered. The order relation "$\leq$" defined on $\mathbb{R}$ is a special, and most important, case of the more general relations discussed in Chapter 2 [footnote 2].

Axiom 2 (Order Properties of $\mathbb{R}$) The subset $P$ of positive real numbers satisfies [footnote 3]:
B1. If $x, y \in P$, then $x + y \in P$.
B2. If $x, y \in P$, then $x \cdot y \in P$.
B3. If $x \in \mathbb{R}$, then one and only one of the following holds: $x \in P$, $x = 0$, or $-x \in P$.

Note that B3 implies that if $x \in P$, then $-x \notin P$. More importantly, B3 guarantees that $\mathbb{R}$ is totally ordered with respect to the order relation "$\leq$".

Footnote 2. In fact, the order relations in Chapter 2 were developed to generalize these concepts to more abstract spaces than $\mathbb{R}$.

Footnote 3. Later, we will associate $P$ with the notation $\mathbb{R}_{++}$.

Definition 90 Any system satisfying Axiom 1 and Axiom 2 is called an ordered field.

By definition then, $\mathbb{R}$ is an ordered field.

Definition 91 Let $x, y \in \mathbb{R}$. If $x - y \in P$, then we say $x > y$, and if $x - y \in P \cup \{0\}$, then we say $x \geq y$. If $-(x - y) \in P$, then we say $x < y$, and if $-(x - y) \in P \cup \{0\}$, then we say $x \leq y$.

Exercise 3.2.1 Show that $(\mathbb{R}, \leq)$ is a totally ordered set.

The following properties are a consequence of Axiom 2.

Theorem 92 Let $x, y, z \in \mathbb{R}$. (i) If $x > y$ and $y > z$, then $x > z$. (ii) Exactly one of the following holds: $x > y$, $x = y$, $x < y$. (iii) If $x \geq y$ and $y \geq x$, then $x = y$.

Proof. (i) If $x - y \in P$ and $y - z \in P$, then B1 $\Rightarrow (x - y) + (y - z) \in P$, or $(x - z) \in P$.

Exercise 3.2.2 Finish the proof of Theorem 92. (Bartle 5.4)

The next theorem is one of the simplest we will encounter, yet it is one of the most far-reaching.
For one thing, it implies that given any strictly positive real number, there is another smaller and strictly positive real number, so that there is no smallest strictly positive real number! [footnote 4]

Theorem 93 (Half the distance to the goal line) If $x, y \in \mathbb{R}$ with $x > y$, then $x > \frac{1}{2}(x + y) > y$.

Proof. $x > y \Rightarrow x + x > x + y$ and $x + y > y + y \Rightarrow 2x > x + y > 2y$. Multiplying through by $\frac{1}{2} > 0$ gives $x > \frac{1}{2}(x + y) > y$.

Now we define a very useful function on $\mathbb{R}$ that assigns to each real number its distance from the origin.

Definition 94 If $x \in \mathbb{R}$, the absolute value of $x$, denoted $|\cdot| : \mathbb{R} \to \mathbb{R}_+$, is defined by
$$|x| = \begin{cases} x & \text{if } x \geq 0 \\ -x & \text{if } x < 0. \end{cases}$$

Footnote 4. Another thing it proves is that even if a defense is continuously penalized half the distance to the goal line, the offense will never score unless they finally run a play.

This function satisfies the well-known property of triangles; that is, the length of any side of a triangle is no greater than the sum of the lengths of the other two sides.

Theorem 95 (Triangle Inequality) If $x, y \in \mathbb{R}$, then $|x + y| \leq |x| + |y|$.

Proof. (Sketch) Since $x \leq |x|$ and $y \leq |y|$, then $|x| - x \in P \cup \{0\}$ and $|y| - y \in P \cup \{0\}$. By Axiom B1, $(|x| - x) + (|y| - y) \in P \cup \{0\}$. But $(|x| - x) + (|y| - y) = (|x| + |y|) - (x + y)$, so $(|x| + |y|) - (x + y) \in P \cup \{0\}$, or $(x + y) \leq (|x| + |y|)$. Applying the same argument to $-x$ and $-y$ gives $-(x + y) \leq |x| + |y|$. Since $|x + y|$ equals either $x + y$ or $-(x + y)$, we have $|x + y| \leq |x| + |y|$.

3.3 The Completeness Axiom

This axiom distinguishes $\mathbb{R}$ from other totally ordered fields like $\mathbb{Q}$. To begin, we use the definition of upper and lower bounds from Definition 40 with the order relation "$\leq$" on $\mathbb{R}$. If $S \subset \mathbb{R}$ has an upper and/or lower bound, it has infinitely many (e.g. if $u$ is an upper bound (ub) of $S$, then $u + n$ is an ub for every $n \in \mathbb{N}$). The supremum and infimum, which were defined in Definition 40 for the general case, can be characterized in $\mathbb{R}$ by the following lemma.

Lemma 96 Let $S \subset \mathbb{R}$. Then $u \in \mathbb{R}$ is a supremum (or sup, or least upper bound (lub)) of $S$ iff (i) $s \leq u$, $\forall s \in S$ and (ii) $\forall \varepsilon > 0$, $\exists s \in S$ such that $u - \varepsilon < s$ [footnote 5]. Similarly, $\ell \in \mathbb{R}$ is an infimum (or inf, or greatest lower bound (glb)) of $S$ iff (i) $\ell \leq s$, $\forall s \in S$ and (ii) $\forall \varepsilon > 0$, $\exists s \in S$ such that $s < \ell + \varepsilon$. See Figure 3.3.1.

Proof. ($\Rightarrow$) (i) holds by definition (just use "$\leq$" on $\mathbb{R}$ in Definition 40).
To see (ii), suppose $u$ is the least upper bound. Because $u - \varepsilon < u$, $u - \varepsilon$ cannot be an upper bound. This implies $\exists s \in S$ such that $u - \varepsilon < s$.
($\Leftarrow$) (i) implies $u$ is an upper bound, again by definition. To see that (ii) implies $u$ is the least upper bound, consider $v < u$. Then $u - v = \varepsilon > 0$ (or $v = u - \varepsilon$). By (ii), $\exists s \in S$ such that $u - \varepsilon = v < s$; hence $v$ is not an upper bound.

In the case where $S$ does not have an upper (lower) bound, we assign $\sup S = \infty$ ($\inf S = -\infty$).

Footnote 5. Statement (i) makes $u$ an ub while (ii) makes it the lub.

Example 97 A set may not contain its sup. To see this, let $S = \{x \in \mathbb{R} : 0 < x < 1\}$ and $S_0 = \{x \in \mathbb{R} : 0 \leq x \leq 1\}$. Any number $u \geq 1$ is an ub for both sets, but while $S_0$ contains the ub $1$, $S$ does not contain any of its ub! Also, it's clear that no number $c < 1$ can be an ub for $S$. To see this, just apply our famous Theorem 93. That is, if $0 \leq c < 1$, then $s = \frac{1 + c}{2} > c$ and $s \in S$; if $c < 0$, then every $s \in S$ already exceeds $c$.

Theorem 98 There can be only one supremum for any $S \subset \mathbb{R}$.

Proof. If $u_1$ and $u_2$ are lub, then they are both ub. Since $u_1$ is lub and $u_2$ is ub, then $u_1 \leq u_2$. Similarly, since $u_2$ is lub and $u_1$ is ub, then $u_2 \leq u_1$. Then $u_1 = u_2$.

The next axiom is critical to establish that $\mathbb{R}$ does not have any "holes" in it. In particular, it will be sufficient to establish that the set $\mathbb{R}$ is "complete" (a term that will be made precise in Chapter 4). Don't be fooled, however: it takes more work than just stating the axiom to establish completeness.

Axiom 3 (Completeness Property of $\mathbb{R}$) Every non-empty set $S \subset \mathbb{R}$ which has an upper bound has a supremum.

From the completeness axiom, it is easy to establish that every non-empty set which has a lower bound has an infimum. A consequence of Axiom 3 is that $\mathbb{N}$ (a subset of $\mathbb{R}$) is not bounded above in $\mathbb{R}$.

Theorem 99 (Archimedean Property) If $x \in \mathbb{R}$, $\exists n_x \in \mathbb{N}$ such that $x < n_x$.

Proof. Suppose not. Then $x$ is an ub for $\mathbb{N}$ and hence by Axiom 3 $\mathbb{N}$ has a sup, call it $u$, and $u \leq x$ [footnote 6]. Since $u - 1$ is not an ub, $\exists n_1 \in \mathbb{N}$ such that $u - 1 < n_1$.
Then $u < n_1 + 1$, and since $n_1 + 1 \in \mathbb{N}$, this contradicts that $u$ is an ub of $\mathbb{N}$.

It follows from Theorem 99 that there exists a rational number between any two real numbers.

Theorem 100 If $x, y \in \mathbb{R}$ with $x < y$, $\exists q \in \mathbb{Q}$ such that $x < q < y$.

Footnote 6. Note that $u$ is not necessarily in $\mathbb{N}$, which is why we choose to subtract $1 \in \mathbb{N}$ in the next statement.

Exercise 3.3.1 Prove Theorem 100. (Royden p. 35)

The following theorem complements the result in Theorem 88 of Section 3.1 that there are elements of $\mathbb{R}$ which are not rational. It provides an existence proof of an irrational. We present it since it makes use of Axiom 3. Without the axiom, the set $S = \{y \in \mathbb{R}_+ : y^2 \leq 2\}$ need not have a supremum (for instance, its rational elements have no supremum within $\mathbb{Q}$).

Theorem 101 $\exists x \in \mathbb{R}_+$ such that $x^2 = 2$.

Proof. Let $S = \{y \in \mathbb{R}_+ : y^2 \leq 2\}$. Clearly $S$ is non-empty (take $1$) and $S$ is bounded above (take $1.5$). Let $x = \sup S$, which exists by Axiom 3. Suppose $x^2 \neq 2$. Then either $x^2 < 2$ or $x^2 > 2$.
First take $x^2 < 2$. Let $n \in \mathbb{N}$ be sufficiently large so that $\frac{2x + 1}{n} < 2 - x^2$. Then $\left(x + \frac{1}{n}\right)^2 \leq x^2 + \frac{2x + 1}{n} < 2$ [footnote 7]. This means $x + \frac{1}{n} \in S$, which contradicts that $x$ is an upper bound.
Next take $x^2 > 2$. Let $m \in \mathbb{N}$ be sufficiently large so that $\frac{2x}{m} < x^2 - 2$. Then $\left(x - \frac{1}{m}\right)^2 > x^2 - \frac{2x}{m} > 2$. Since $x = \sup S$, $\exists s_0 \in S$ such that $x - \frac{1}{m} < s_0$. Because $x \geq 1$ implies $x - \frac{1}{m} \geq 0$, this gives $(s_0)^2 > \left(x - \frac{1}{m}\right)^2$ (or $(s_0)^2 > 2$), which contradicts $s_0 \in S$.

Exercise 3.3.2 Why doesn't $S = \{y \in \mathbb{R}_+ : y^2 + 1 \leq 0\}$ work?

The next theorem complements the result in Theorem 100 and establishes that between any two real numbers there exists an irrational number.

Theorem 102 Let $x, y \in \mathbb{R}$ with $x < y$. If $\iota$ is any irrational number, then $\exists q \in \mathbb{Q}$ such that the irrational number $\iota q$ satisfies $x < \iota q < y$.

Exercise 3.3.3 Prove Theorem 102. (Bartle)

In fact, there are infinitely many of both kinds of numbers between $x$ and $y$.

Footnote 7. The first weak inequality holds with equality only if $n = 1$.
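The two constructions of this section can be sketched numerically with exact rational arithmetic. The first function brackets $\sup S$ for $S = \{y \in \mathbb{R}_+ : y^2 \leq 2\}$ by bisection, keeping a lower endpoint in $S$ and an upper endpoint that is an upper bound of $S$. The second follows the usual Archimedean argument for Theorem 100 to produce a rational strictly between $x$ and $y$. Both function names are ours, and the inputs are assumed to be exact rationals (`Fraction` values) so that no floating-point rounding enters.

```python
from fractions import Fraction
from math import floor

def sup_bracket(steps):
    """Bisect [1, 3/2], keeping lo*lo <= 2 < hi*hi at every step."""
    lo, hi = Fraction(1), Fraction(3, 2)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid      # mid belongs to S
        else:
            hi = mid      # mid is an upper bound of S
    return lo, hi

def rational_between(x, y):
    """Return q = m/n with x < q < y, assuming x < y are Fractions."""
    n = floor(1 / (y - x)) + 1   # Archimedean step: 1/n < y - x
    m = floor(n * x) + 1         # smallest integer with m > n*x
    return Fraction(m, n)

lo, hi = sup_bracket(30)
print(float(lo), float(hi))
print(rational_between(lo, hi))
```

The bracket shrinks by half at every step, so the endpoints squeeze the supremum as tightly as desired, yet by Theorem 88 no rational endpoint ever squares to exactly 2.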
3.4 Open and Closed Sets

In this section we define the most common subsets of real numbers and determine some of their properties.

Definition 103 If $a, b \in \mathbb{R}$, then the set $\{x \in \mathbb{R} : a < x < b\}$ ($\{x \in \mathbb{R} : a \leq x \leq b\}$, $\{x \in \mathbb{R} : a \leq x < b\}$) is called an open (closed, half-open) cell denoted $(a, b)$ ($[a, b]$, $[a, b)$), respectively, with endpoints $a$ and $b$. If $a \in \mathbb{R}$, then the set $\{x \in \mathbb{R} : a < x\}$ ($\{x \in \mathbb{R} : a \leq x\}$) is called an open (closed) ray denoted $(a, \infty)$ ($[a, \infty)$), respectively. An interval in $\mathbb{R}$ is either a cell, a ray, or all of $\mathbb{R}$.

A generalization of the notion of an open interval is that of an open set.

Definition 104 A set $O \subset \mathbb{R}$ is open if for each $x \in O$, there is some $\delta > 0$ such that the open interval $B_\delta(x) = \{y \in \mathbb{R} : |x - y| < \delta\} \subset O$.

Example 105 $(0, 1) \subset \mathbb{R}$ is open since even for $x$ arbitrarily close to $1$ (i.e. $x = 1 - \varepsilon$, with $\varepsilon > 0$ arbitrarily small), there is an open interval $B_{\varepsilon/2}(1 - \varepsilon) \subset (0, 1)$ by Theorem 93. $(0, 1]$ is not open since there does not exist $\delta > 0$ for which $B_\delta(1) \subset (0, 1]$. That is, no matter how small $\delta > 0$ is, there exists $x_0 = 1 + \frac{\delta}{2} \in B_\delta(1)$ by Theorem 93 which is not contained in $(0, 1]$. See Figure 3.4.1.

Theorem 106 (i) $\emptyset$ and $\mathbb{R}$ are open. (ii) The intersection of any finite collection of open sets in $\mathbb{R}$ is open. (iii) The union of any collection of open sets in $\mathbb{R}$ is open.

Proof. (i) $\emptyset$ contains no points, hence Definition 104 is trivially satisfied [footnote 8]. $\mathbb{R}$ is open since for any $x \in \mathbb{R}$ and any $\delta > 0$, all points of $B_\delta(x)$ are already in $\mathbb{R}$.
(ii) Let $\{O_i : O_i \subset \mathbb{R},\ O_i \text{ open},\ i = 1, \dots, k\}$ be a finite collection of open sets. We must show $O = \cap_{i=1}^{k} O_i$ is open. Assume $x \in O$. By definition of an intersection, $x \in O_i$, $\forall i = 1, \dots, k$. Since each $O_i$ is open, we can find $B_{\delta_i}(x) \subset O_i$ for each $i$. Let $\delta = \min\{\delta_i : i = 1, \dots, k\}$. Then $B_\delta(x) \subset B_{\delta_i}(x) \subset O_i$, $\forall i$. This implies $B_\delta(x) \subset O$.
(iii) Take $x \in O = \cup_{i \in \Lambda} O_i$, where $\Lambda$ is either a finite or infinite index set. Then $x \in O_i$ for some $i \in \Lambda$. Since $O_i$ is open, $\exists B_\delta(x) \subset O_i \subset \cup_{i \in \Lambda} O_i$.

Footnote 8. In particular, the statement $x \in \emptyset$ is always false. Thus, according to the truth table, any implication of the form $x \in \emptyset \Rightarrow P(x)$ is true.
Example 107 Property (ii) of Theorem 106 does not necessarily hold for infinite intersections. Consider the following counterexample. Let $O_n = \{x \in \mathbb{R} : -\frac{1}{n} < x < \frac{1}{n}\}$ for $n \in \mathbb{N}$. Then $\cap_{n=1}^{\infty} O_n = \{0\}$, but a singleton set is not open since there does not exist $\delta > 0$ such that $B_\delta(0) \subset \{0\}$. See Figure 3.4.2.

The following theorem provides a characterization of open sets in $\mathbb{R}$.

Theorem 108 (Open Sets Property in $\mathbb{R}$) Every open set in $\mathbb{R}$ is the union of a countable collection of disjoint open intervals.

Proof. The proof is in several steps.
First, construct an open interval around each $y \in O$. Let $O$ be open. Then, for each $y \in O$, $\exists$ an open interval $(x, z)$ such that $x < y < z$ and $(x, z) \subset O$. Let $b = \sup\{z : (y, z) \subset O\}$ and $a = \inf\{x : (x, y) \subset O\}$ (allowing $b = \infty$ and $a = -\infty$). Then $a < y < b$ and $I_y = (a, b)$ is an open interval containing $y$.
Second, show the constructed interval is contained in $O$. Take any $w \in (a, b)$ with $w > y$. Then $y < w < b$ and by the definition of $b$ (i.e. it is the sup), we know $w \in O$. An identical argument establishes that if $w < y$, then $w \in O$.
Third, show that the endpoints of the constructed interval are not in $O$ (i.e. $a, b \notin O$). If $b \in O$, then since $O$ is open, $\exists \varepsilon > 0$ such that $(b - \varepsilon, b + \varepsilon) \subset O$ and hence $(y, b + \varepsilon) \subset O$, which contradicts the definition of $b$. The argument for $a$ is identical.
Fourth, show the union of constructed intervals is $O$. Let $w \in O$. Then $w \in I_w$ and hence $w \in \cup_{y \in O} I_y$.
Fifth, establish that the intervals are disjoint. Suppose $y \in (a_1, b_1) \cap (a_2, b_2)$. Since $b_1 = \sup\{z : (y, z) \subset O\}$ and $(y, b_2) \subset O$, then $b_2 \leq b_1$. Since $b_2 = \sup\{z : (y, z) \subset O\}$ and $(y, b_1) \subset O$, then $b_1 \leq b_2$. But $b_1 \leq b_2$ and $b_1 \geq b_2$ implies $b_1 = b_2$. A similar argument establishes that $a_1 = a_2$. Thus, two different intervals in $\{I_y\}$ are disjoint.
Sixth, establish that $\{I_y\}$ is countable. In each $I_y$, $\exists q \in \mathbb{Q}$ such that $q \in I_y$ by Theorem 100. Since distinct $I_y$ are disjoint, $q \in I_y$ and $q_0 \in I_{y_0}$ with $I_y \neq I_{y_0}$ implies $q \neq q_0$. Hence there exists a one-to-one correspondence between the collection $\{I_y\}$ and a subset of the rational numbers.
Thus, $\{I_y\}$ is countable by an argument similar to that in Example 77.

Figure 3.4.3 illustrates the theorem for the open set $O = O_1 \cup O_2$ where $O_1 = (-1, 0)$ and $O_2 = (\sqrt{2}, \infty)$. Part (a) of the figure illustrates steps 1 to 4. For example, take $y = -\frac{1}{4} \in O_1$. Then the supremum of the set of upper interval endpoints around $-\frac{1}{4}$ contained in $O_1$ is $b_{-1/4} = 0$ and the infimum of the set of lower interval endpoints around $-\frac{1}{4}$ contained in $O_1$ is $a_{-1/4} = -1$, so that $I_{-1/4} = (-1, 0)$, which is just $O_1$. Similarly take $y = \frac{3}{2}$. Then the supremum of the set of upper interval endpoints around $\frac{3}{2}$ contained in $O_2$ is $b_{3/2} = \infty$ and the infimum of the set of lower interval endpoints around $\frac{3}{2}$ contained in $O_2$ is $a_{3/2} = \sqrt{2}$, so that $I_{3/2} = (\sqrt{2}, \infty)$, which is just $O_2$. Part (b) of the figure illustrates step 6, where the injection is finite (and hence countable).

Now we move on to closed sets.

Definition 109 $C \subset \mathbb{R}$ is closed if its complement (i.e. $\mathbb{R} \setminus C$) is open.

Example 110 $[0, 1] \subset \mathbb{R}$ is closed since its complement $\mathbb{R} \setminus [0, 1] = (-\infty, 0) \cup (1, \infty)$ is open, because the union of open sets is open by Theorem 106. $(0, 1]$ is not closed since its complement, $(-\infty, 0] \cup (1, \infty)$, is not open. The singleton set $\{1\}$ is closed since its complement, $(-\infty, 1) \cup (1, \infty)$, is open. The set $\mathbb{N}$ is closed since its complement, $(-\infty, 1) \cup \left(\cup_{n=1}^{\infty} (n, n+1)\right)$, is a countable union of open sets and hence by Theorem 106 is open.

There is another way to describe closed sets which uses cluster points.

Definition 111 A point $x \in \mathbb{R}$ is a cluster point of a subset $A \subset \mathbb{R}$ if any open ball around $x$ intersects $A$ at some point other than $x$ itself (i.e. $(B_\delta(x) \setminus \{x\}) \cap A \neq \emptyset$ for every $\delta > 0$).

Note that the point $x$ may lie in $A$ or not. A cluster point must have points of $A$ sufficiently near to it, as the next examples show.

Example 112 (i) Let $A = (0, 1]$. Then every point in the interval $[0, 1]$ is a cluster point of $A$. In particular, the point $0$ is a cluster point since for any $\delta \in (0, 1]$, the point $y = \frac{\delta}{2}$ satisfies $y \in (B_\delta(0) \setminus \{0\}) \cap A$. (ii) Let $A = \{\frac{1}{n}, n \in \mathbb{N}\}$.
Then $0$ is the only cluster point of $A$. To see why $0$ is a cluster point, for any $\delta > 0$ just take $n_\delta \in \mathbb{N}$ with $n_\delta > \frac{1}{\delta}$ (which exists by Theorem 99), in which case $y = \frac{1}{n_\delta} \in (B_\delta(0) \setminus \{0\}) \cap A$. (iii) Let $A = \{0\} \cup (1, 2)$. Then the points of $[1, 2]$ are the only cluster points of $A$; the point $0$ is not a cluster point since for any $\delta \in (0, 1)$, $(B_\delta(0) \setminus \{0\}) \cap A = \emptyset$. (iv) $\mathbb{N}$ has no cluster points for the same reason as (iii). (v) Let $A = \mathbb{Q}$. The set of cluster points of $A$ is $\mathbb{R}$. This follows from Theorem 100 that between any two real numbers lies a rational. See Figure 3.4.4.

We next use Axiom 3 to prove a very important property of $\mathbb{R}$: every nested sequence of closed, bounded intervals has a common point (and we can take that common point to be either the sup of the lower endpoints or the inf of the upper endpoints). First we must make that statement precise.

Example 113 Returning to Example 97, where the open interval $S = (0, 1)$ and the closed interval $S_0 = [0, 1]$ are both bounded (and hence both possess a supremum by Axiom 3), only the closed interval $S_0$ contains its supremum of $1$ (i.e. has a maximum).

Definition 114 A set of intervals $\{I_n, n \in \mathbb{N}\}$ is nested if $I_1 \supset I_2 \supset \dots \supset I_n \supset I_{n+1} \supset \dots$

Example 115 A nested set of intervals does not necessarily have a common point (i.e. it may be that $\cap_{n=1}^{\infty} I_n = \emptyset$). For example, neither $I_n = (n, \infty)$ (so that $(1, \infty) \supset (2, \infty) \supset \dots$) nor $I_n = (0, \frac{1}{n})$ (so that $(0, 1) \supset (0, \frac{1}{2}) \supset \dots$) have common points. Why? It follows from the Archimedean Property (Theorem 99) that for any $x \in \mathbb{R}$, $\exists n \in \mathbb{N}$ such that $x < n$, and that for any $x > 0$, $\exists n \in \mathbb{N}$ such that $0 < \frac{1}{n} < x$. See Figure 3.4.5.

Theorem 116 (Nested Intervals Property in $\mathbb{R}$) If $\{I_n, n \in \mathbb{N}\}$ is a set of non-empty, closed, bounded, nested intervals in $\mathbb{R}$, then $\exists x \in \mathbb{R}$ such that $x \in \cap_{n=1}^{\infty} I_n$ (i.e. $\cap_{n=1}^{\infty} I_n \neq \emptyset$).

Proof. Let $I_n = [a_n, b_n]$ with $a_n \leq b_n$. Since $I_1 \supset I_n$, then $b_1 \geq b_n \geq a_n$. Hence $\{a_n : n \in \mathbb{N}\}$ is bounded above; let $\alpha$ be its sup. To establish the claim, it is sufficient to show $\alpha \leq b_n$, $\forall n \in \mathbb{N}$. Suppose not. Then $\exists m \in \mathbb{N}$ such that $b_m < \alpha$. Since $\alpha = \sup\{a_n : n \in \mathbb{N}\}$, $\exists a_p > b_m$. Let $q = \max\{p, m\}$. Then $b_q \leq b_m < a_p \leq a_q$. But $b_q < a_q$ contradicts that $I_q$ is a non-empty interval.
Thus $a_n \leq \alpha \leq b_n$, or $\alpha \in I_n$, $\forall n \in \mathbb{N}$. If $I_n$ is not closed, then the last statement ($\alpha \in I_n$) doesn't necessarily hold. See Figure 3.4.6.

Note that the same arguments can be applied so that $\beta = \inf\{b_n \mid n \in \mathbb{N}\}$ is in every interval.

Example 117 Let us return to Example 115. Instead of the open interval $I_n = (0, \frac{1}{n})$ consider the closed interval $I_n = [0, \frac{1}{n}]$, for which $\sup\{a_n \mid n \in \mathbb{N}\} = 0$. It is clear that $0$ is indeed in every nested interval. Another example of Theorem 116 is $I_n = [-\frac{1}{n}, 1 + \frac{1}{n}]$. Obviously this is nested since $[-1, 2] \supset [-\frac{1}{2}, \frac{3}{2}] \supset [-\frac{1}{3}, \frac{4}{3}] \supset \dots$ In this case $\sup\{a_n : n \in \mathbb{N}\} = \sup\{-1, -\frac{1}{2}, -\frac{1}{3}, \dots\} = 0$, which is again in every interval. See Figure 3.4.7.

We need the following important result to show that $\mathbb{R}$ doesn't have any "holes" in it [footnote 9]. In Section 4.2, we will show the precise meaning of this "absence of holes" property known as completeness. For now, one should simply recognize that to rule out holes, we need to draw out the implications of the Completeness Axiom 3. We do this through the next theorem.

Theorem 118 (Bolzano-Weierstrass) Every bounded infinite subset $A \subset \mathbb{R}$ has a cluster point.

Proof. (Sketch) If $A$ is bounded, then there is a closed interval $I$ such that $A \subset I$. Bisect $I$. There are infinitely many elements of $A$ in at least one of the bisections. Denote such a bisection $I_1 \subset I$. Bisect $I_1$. Again, there are infinitely many elements of $A$ in at least one of the bisections. Denote such a bisection $I_2 \subset I_1$. By continuing this process we construct a set $\{I_n, n \in \mathbb{N}\}$ of non-empty, closed, nested intervals in $\mathbb{R}$. By Theorem 116, there is a point $x^* \in \cap_{n=1}^{\infty} I_n$, which is a cluster point of $A$.

Exercise 3.4.1 Show that $x^* \in \cap_{n=1}^{\infty} I_n$ in Theorem 118 is a cluster point of $A$ to finish the proof.

In the proof we enclosed $A$ in a closed interval $I = [a, b]$ and showed that any infinite subset of $I$ has a cluster point. This special property of $[a, b]$ is called the Bolzano-Weierstrass property.

Definition 119 A subset $A \subset$
$\mathbb{R}$ has the Bolzano-Weierstrass property if every infinite subset of $A$ has a cluster point belonging to $A$.

Theorem 118 does not show that every infinite subset of an arbitrary set $A$ has a cluster point belonging to $A$. The next example illustrates this.

Example 120 Let $A = (a, b)$ with $b - a > 1$. Define $B = \{a + \frac{1}{n}, n \in \mathbb{N}\} \subset A$. The only cluster point of $B$ is $a$, which doesn't belong to $A$. Thus open sets like $(a, b)$ don't have the Bolzano-Weierstrass property. Boundedness is also important. Let $A = \mathbb{R}$. Then $\mathbb{N}$ is an infinite subset of $\mathbb{R}$ which does not have a cluster point.

Footnote 9. In Section 4.2, we will show the precise meaning of the "absence of holes" property known as completeness. For now, one should simply recognize that to rule out holes, we need to draw out the implications of the completeness axiom. We do this through the Bolzano-Weierstrass Theorem.

We next present a necessary and sufficient condition for a subset of $\mathbb{R}$ to have the Bolzano-Weierstrass property. This important result is known as the Heine-Borel theorem [footnote 10].

Theorem 121 (Heine-Borel) $A \subset \mathbb{R}$ has the Bolzano-Weierstrass property iff $A$ is closed and bounded.

Proof. ($\Leftarrow$) If $A$ is finite, then $A$ has the B-W property since "there is an infinite subset of a finite set" is a false statement [footnote 11]. Let $A$ be infinite and let $B$ be an infinite subset of $A$. Since $B$ is bounded, it can be enclosed in a closed interval. Using the same procedure as in Theorem 118 we construct a cluster point $x^*$ of $B$ and hence also of $A$. Since $A$ is closed, $x^* \in A$.
($\Rightarrow$) "Closedness". Let $x^*$ be a cluster point of $A$. Then for each $\delta = \frac{1}{n}$, $\exists x_n \in A$ with $x_n \neq x^*$ such that $|x^* - x_n| < \frac{1}{n}$. The set $\{x_n\}_{n \in \mathbb{N}}$ is an infinite subset of $A$ whose only cluster point is $x^*$; since $A$ has the B-W property, $x^* \in A$.
"Boundedness". By contradiction. Suppose $A$ is unbounded. Then for any $n$, $\exists x_n \in A$ such that $|x_n| > n$, and the $x_n$ can be chosen so that $|x_{n+1}| > |x_n| + 1$. Then $\{x_n\}$ is an infinite subset of $A$ which doesn't have a cluster point (since any two of its elements with the same sign are more than 1 apart). But this contradicts the B-W property.
We next use the Nested Intervals Property in $\mathbb{R}$ (Theorem 116) to establish the uncountability of the set of real numbers.

Theorem 122 $[0, 1]$ is uncountable.

Proof. Suppose not. Then there exists a bijection $b : \mathbb{N} \to [0, 1]$. Then all elements of $[0, 1]$ can be numbered $\{x_1, x_2, \dots, x_n, \dots\}$. Divide $[0, 1]$ into three closed intervals: $I^1_1 = [0, \frac{1}{3}]$, $I^1_2 = [\frac{1}{3}, \frac{2}{3}]$, $I^1_3 = [\frac{2}{3}, 1]$. This implies $x_1$ is not contained in at least one of these three intervals [footnote 12]. WLOG, say it is $I^1_1$, and call it $I_1$. Divide $I^1_1$ into three closed intervals: $I^2_1 = [0, \frac{1}{9}]$, $I^2_2 = [\frac{1}{9}, \frac{2}{9}]$, $I^2_3 = [\frac{2}{9}, \frac{1}{3}]$. This implies $\exists I_2$ among them such that $x_2 \notin I_2$. Notice that $I_2 \subset I_1$ and that $x_1, x_2 \notin I_2$. In this way we can construct a sequence $\{I_n\}_{n=1}^{\infty}$ with the following properties: (i) $I_n$ is closed; (ii) $I_1 \supset I_2 \supset \dots \supset I_n \supset \dots$ (i.e. nested intervals); and (iii) $x_i \notin I_n$, $\forall i = 1, \dots, n$. From (i) and (ii), Theorem 116 implies $\exists x_0 \in \cap_{n=1}^{\infty} I_n \subset [0, 1]$. So we have found a real number $x_0 \in [0, 1]$ which is different from any $x_i$, $i = 1, 2, \dots$ This contradicts our assumption that $\{x_1, x_2, \dots\}$ are all the real numbers in $[0, 1]$.

Footnote 10. Experienced readers may associate Heine-Borel with compactness. Since we wanted to keep this section simple, we'll put off the treatment of compactness until we work with more general metric spaces in Section 4.3.

Footnote 11. And from a false statement, the implication is true by the truth table.

Footnote 12. It is possible that $x_1$ is an element of 2 closed intervals (e.g. $x_1 = \frac{1}{3}$).

While the above theorem establishes that $[0, 1]$ is uncountable (and hence really big in one sense), we next provide an example of an uncountable subset of $[0, 1]$ that is somehow small in another sense. This concrete example is known as the Cantor set and is constructed in the following way (see Figure 3.4.8). First, divide $[0, 1]$ into three "equal" parts: $[0, \frac{1}{3}]$, $(\frac{1}{3}, \frac{2}{3})$, $[\frac{2}{3}, 1]$.
(See footnote 13.) Define $F_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$ or equivalently $F_1 = [0, 1] \setminus A_1$ where $A_1 = (\frac{1}{3}, \frac{2}{3})$. That is, to construct $F_1$ we take out the center of $[0, 1]$. Second, divide each part of $F_1$ into three equal parts (giving us now 6 intervals). Define $F_2 = [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{3}{9}] \cup [\frac{6}{9}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$ or $F_2 = [0, 1] \setminus A_2$ where $A_2 = (\frac{1}{9}, \frac{2}{9}) \cup (\frac{3}{9}, \frac{6}{9}) \cup (\frac{7}{9}, \frac{8}{9})$. That is, to construct $F_2$ we take out the center of each of the two intervals in $F_1$. By this process of removing the open "middle third" intervals, we construct $F_n$, $\forall n \in \mathbb{N}$. The Cantor set is just the intersection of the sets $F_n$. That is,
$$F = \bigcap_{n \in \mathbb{N}} F_n \equiv [0, 1] \setminus \Big( \bigcup_{n \in \mathbb{N}} A_n \Big).$$

The Cantor set has the following properties:
1. $F$ is nonempty (by Theorem 116).
2. $F$ is closed because it is the intersection of the closed sets $F_n$ (by (iii) of Corollary ??, each $F_n$ is closed because it is the union of finitely many closed intervals).
3. $F$ doesn't contain any interval $(a, b)$ with $a < b$ (by construction).
4. $F$ is uncountable (by the same argument used in the proof of Theorem 122).

There are two important things to note about the Cantor set. First, while Theorem 108 says that any open set can be expressed as a countable union of disjoint open intervals, properties (1)-(4) of the Cantor set show that there is no analogous result for closed sets. That is, a closed set may not in general be written as a countable union of closed intervals. In this sense, closed sets can have a more complicated structure than open sets. Second, property (4) above shows that even though the $F_n$ seem to be getting smaller and smaller in one sense (i.e. each has more and more holes in it) in Figure 3.4.9, $F$ is uncountable (and hence large in another sense).

Footnote 13. The sense in which we mean equal parts is that while the sets are different (some are closed, some open), they have the same distance between endpoints of $\frac{1}{3}$ (more formally, they have the same measure).
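The first few stages of the construction can be generated mechanically. The sketch below represents each $F_n$ as a list of closed intervals with exact rational endpoints and removes the open middle third of every interval at each step; the function names `cantor_step` and `cantor_level` are ours. It also reports the total length of $F_n$, which is one way to make precise the sense in which the sets are getting smaller.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third from each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))   # left closed third
        result.append((b - third, b))   # right closed third
    return result

def cantor_level(n):
    """Return F_n as a sorted list of closed intervals; F_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

for n in range(4):
    level = cantor_level(n)
    total = sum(b - a for a, b in level)
    print(n, len(level), total)
```

Each step doubles the number of intervals and multiplies the total length by $\frac{2}{3}$, so $F_n$ consists of $2^n$ closed intervals of total length $(\frac{2}{3})^n$, which tends to zero even though the intersection $F$ is uncountable.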
3.5 Borel Sets

Since the intersection of a countable collection of open sets need not be open (e.g. Example 107), the collection of all open sets in $\mathbb{R}$ is not a $\sigma$-algebra. By Theorem 87, however, there exists a smallest $\sigma$-algebra containing all open sets.

Definition 123 The smallest $\sigma$-algebra generated by the collection of all open sets in $\mathbb{R}$, denoted $\mathcal{B}$, is called the Borel $\sigma$-algebra in $\mathbb{R}$.

Just as Example 83 showed in the case of algebras, even though $\mathcal{B}$ is the smallest $\sigma$-algebra containing all open sets, it is bigger than just the collection of open sets. For example, we have to add back in singleton sets like those in Example 107 (i.e. the closed set $\{0\} = \cap_{n \in \mathbb{N}} (-\frac{1}{n}, \frac{1}{n})$) in order to keep it closed under countable intersection [footnote 14]. In fact, almost any set that you can conceive of is contained in the Borel $\sigma$-algebra: open sets, closed sets, half-open intervals $(a, b]$, sets of the form $\cap_{n \in \mathbb{N}} O_n$ with $O_n$ open (which we saw is not necessarily open), sets of the form $\cup_{n \in \mathbb{N}} F_n$ with $F_n$ closed (which we saw is not necessarily closed), and more. On the other hand, while finding a subset of $\mathbb{R}$ which is not Borel requires a rather sophisticated construction (see p.??? of Jain and Gupta (1986)), the size of the collection of non-Borel sets is much bigger than the size of $\mathcal{B}$. Loosely speaking, $\mathcal{B}$ is as thin in $\mathcal{P}(\mathbb{R})$ as $\mathbb{N}$ is in $\mathbb{R}$ (as we will see in Chapter 5).

Exercise 3.5.1 Prove that the following sets in $\mathbb{R}$ belong to $\mathcal{B}$: (i) any closed set; (ii) $(a, b]$.

Borel sets can be generated by even smaller collections than all open sets, as the next theorem shows.

Footnote 14. Recall in Example 83, for underlying set $X = \{a, b, c\}$, we showed that while $\mathcal{C}_4 = \{\emptyset, \{a\}, X\}$ was not an algebra (just as the collection of all open sets is not an algebra), we can create an algebra generated by $\{a\}$ (whose analogue is the Borel $\sigma$-algebra), which is just $\mathcal{C}_2 = \{\emptyset, \{a\}, \{b, c\}, X\} \subset \mathcal{P}(X)$ and is "bigger" in the sense of $\mathcal{C}_4 \subset \mathcal{C}_2$ (where $\{b, c\}$ plays the analogue of the other sets we have to add in).
Theorem 124 The collection of all open rays $\{(a, \infty) : a \in \mathbb{R}\}$ generates $\mathcal{B}$.

Proof. It is sufficient to show that any open set $A$ can be constructed in terms of open rays. By Theorem 108, we know that $A = \cup_{n=1}^{\infty} I_n$ where the $I_n$ are disjoint open intervals. But
$$(a, b) = (a, \infty) \setminus \Big[ \bigcap_{n=1}^{\infty} \left(b - \tfrac{1}{n}, \infty\right) \Big] \quad \text{with } a < b.$$

Exercise 3.5.2 Using the same idea, show that $\mathcal{B}$ can be generated by the collection of all closed intervals $\{[a, b] : a, b \in \mathbb{R},\ a < b\}$.

Figures for Chapter 3
Figure 3.3.1: ub, lb, sup, inf
Figure 3.4.1: Open and Half-open unit intervals
Figure 3.4.2: Example where Countable Intersection of Open Intervals is not Open
Figure 3.4.3a&b: Open Sets as a Countable Union of Disjoint Intervals
Figure 3.4.4: Examples of Cluster points
Figure 3.4.5: Examples of Nested Cells without a Common Point
Figure 3.4.6: Nested Cells Property
Figure 3.4.7: Example of a Common Point in Nested Cells
Figure 3.4.8: Cantor Set

3.6 Bibliography for Chapter 3

Sections 3.1 to 3.3 are based on Bartle (Sec. 4-6) and Royden (Ch. 2, Sec. 1 and 2).

3.7 End of Chapter Problems

1. Let $D$ be non-empty and let $f : D \to \mathbb{R}$ have bounded range. If $D_0$ is a non-empty subset of $D$, prove that
$$\inf\{f(x) : x \in D\} \leq \inf\{f(x) : x \in D_0\} \leq \sup\{f(x) : x \in D_0\} \leq \sup\{f(x) : x \in D\}.$$

2. Let $X$ and $Y$ be non-empty sets and let $f : X \times Y \to \mathbb{R}$ have bounded range in $\mathbb{R}$. Let
$$f_1(x) = \sup\{f(x, y) : y \in Y\}, \quad f_2(y) = \sup\{f(x, y) : x \in X\}.$$
Establish the Principle of Iterated Suprema:
$$\sup\{f(x, y) : x \in X, y \in Y\} = \sup\{f_1(x) : x \in X\} = \sup\{f_2(y) : y \in Y\}.$$
(We sometimes express this as $\sup_{x,y} f(x, y) = \sup_x \sup_y f(x, y) = \sup_y \sup_x f(x, y)$.)

3. Let $f$ and $f_1$ be as in the preceding exercise and let $g_2(y) = \inf\{f(x, y) : x \in X\}$. Prove that
$$\sup\{g_2(y) : y \in Y\} \leq \inf\{f_1(x) : x \in X\}.$$
(We sometimes express this as $\sup_y \inf_x f(x, y) \leq \inf_x \sup_y f(x, y)$.)
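Problems 2 and 3 can be checked on a small example before attempting the proofs. The sketch below assumes finite sets $X$ and $Y$, so that every sup is a max and every inf is a min, and uses the hypothetical test function $f(x, y) = (x - y)^2$; neither the function nor the name `iterated_extrema` comes from the text. A finite check of this kind is only a sanity test, not a proof.

```python
def iterated_extrema(f, xs, ys):
    """Compute the quantities in Problems 2 and 3 on finite sets xs, ys."""
    joint_sup = max(f(x, y) for x in xs for y in ys)
    f1 = {x: max(f(x, y) for y in ys) for x in xs}   # f1(x) = sup over y
    f2 = {y: max(f(x, y) for x in xs) for y in ys}   # f2(y) = sup over x
    g2 = {y: min(f(x, y) for x in xs) for y in ys}   # g2(y) = inf over x
    return {
        "sup_xy": joint_sup,
        "sup_x_sup_y": max(f1.values()),
        "sup_y_sup_x": max(f2.values()),
        "sup_y_inf_x": max(g2.values()),
        "inf_x_sup_y": min(f1.values()),
    }

result = iterated_extrema(lambda x, y: (x - y) ** 2, [0, 1, 2], [0, 1, 2])
for name, value in result.items():
    print(name, value)
```

In this example the three suprema in Problem 2 coincide, while the inequality in Problem 3 is strict, which shows that the order of sup and inf cannot in general be exchanged.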
Chapter 4 Metric Spaces

There are three basic theorems about continuous functions in the study of calculus (upon which most of calculus depends) that will prove extremely useful in your study of economics. They are the following:

1. The Intermediate Value Theorem. If $f : [a,b] \to \mathbb{R}$ is continuous and if $r \in \mathbb{R}$ is such that $f(a) \leq r \leq f(b)$, then $\exists c \in [a,b]$ such that $f(c) = r$.

2. The Extreme Value Theorem. If $f : [a,b] \to \mathbb{R}$ is continuous, then $\exists c \in [a,b]$ such that $f(x) \leq f(c)$, $\forall x \in [a,b]$.

3. The Uniform Continuity Theorem. If $f : [a,b] \to \mathbb{R}$ is continuous, then given $\varepsilon > 0$, $\exists \delta > 0$ such that $|f(x_1) - f(x_2)| < \varepsilon$, $\forall x_1, x_2 \in [a,b]$ for which $|x_1 - x_2| < \delta$.

These theorems are used in a number of places. The intermediate value theorem forms the basis for fixed point problems such as the existence of equilibrium. The extreme value theorem is useful since we often seek solutions to problems where we maximize a continuous objective function over a compact constraint set. The uniform continuity theorem is used to prove that every continuous function on $[a,b]$ is integrable, which is important for proving properties of the value function in stochastic dynamic programming problems.

While we write these theorems in terms of real numbers, they can be formulated in more general spaces than $\mathbb{R}$. To this end, we will introduce (alliteratively) the 6 C's: convergence, closedness, completeness, compactness, connectedness, and continuity. In this chapter, we formulate these properties in terms of sequences. Each of the C properties uses some notion of distance. For instance, convergence requires the distance between a limit point and elements in the sequence to eventually get smaller. Our goal in this chapter is to consider theorems like those above but for an arbitrary set $X$. To do so, however, requires $X$ to be equipped with a distance function.

How will we proceed? First we will clarify what is meant by a distance function on an arbitrary set $X$.
Then using the notion of convergence, which relies on distance, we will define closed sets in $X$. The collection of all closed (or, by complementation, open) sets is called a topology on the set $X$ and it is the main building block in real analysis. That means properties such as continuity, compactness, and connectedness are defined directly or indirectly in terms of closed or open sets and for this reason are called topological properties. While there is an even more general way of defining a topology on $X$ that doesn't use the notion of distance, we will wait until Chapter 7 to discuss it.

Definition 125 A metric space $(X,d)$ is a nonempty set $X$ of elements (called points) together with a function $d : X \times X \to \mathbb{R}$ such that $\forall x,y,z \in X$: (i) $d(x,y) \geq 0$; (ii) $d(x,y) = 0$ iff $x = y$; (iii) $d(x,y) = d(y,x)$; and (iv) $d(x,z) \leq d(x,y) + d(y,z)$. The function $d$ is called a metric.

Example 126 We give three examples. First, let $X$ be a set (e.g. $X = \{a,b,c,d\}$) and define a metric $d(x,y) = 0$ for $x = y$, and $d(x,y) = 1$ for $x \neq y$. This is called the "discrete metric". It is easy to check that $(X,d)$ is a metric space. Second, $(\mathbb{R}, |\cdot|)$, where $d(x,y) = |x - y|$ is given by the absolute value function and property (iv) is simply a statement of the triangle inequality. Thus, Chapter 3 should be seen as a special case of this chapter. Third, let $X$ be the set of all continuous functions on $[a,b]$ and $d(f,g) = \sup\{|f(x) - g(x)| : x \in [a,b]\}$. In Chapter 6, we will see that this, as well as other choices of metric, yields a valid metric space.

It should be emphasized that a metric space is not just the set of points $X$ but the metric $d$ as well. To see this, we introduce the notion of the cartesian product of metric spaces. Let $(X,d_x)$ and $(Y,d_y)$ be two metric spaces; then we can construct a metric $d$ on $X \times Y$ from the metrics $d_x$ and $d_y$. In fact, there are many metrics we can construct. Writing $x = (x_1,x_2)$ and $y = (y_1,y_2)$ for two points of $X \times Y$, two of them are
$$d_2(x,y) = \sqrt{(d_x(x_1,y_1))^2 + (d_y(x_2,y_2))^2} \quad \text{and} \quad d_\infty(x,y) = \sup\{d_x(x_1,y_1), d_y(x_2,y_2)\}.$$

Exercise 4.0.1 Show that $d_2$ and $d_\infty$ are metrics in $X \times Y$.
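Before attempting Exercise 4.0.1, it can help to test properties (i) to (iv) of Definition 125 on a small finite sample. The sketch below is a numerical sanity check under our own assumptions, not a proof: the helper `is_metric`, the sample sets, and the deliberately bad candidate `not_a_metric` are all hypothetical names introduced for illustration.

```python
from itertools import product
from math import sqrt, isclose

def is_metric(points, d, tol=1e-12):
    """Check properties (i)-(iv) of Definition 125 on a finite sample of points."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0:                                  # (i) non-negativity
            return False
        if (d(x, y) == 0) != (x == y):                   # (ii) d(x,y)=0 iff x=y
            return False
        if not isclose(d(x, y), d(y, x), abs_tol=tol):   # (iii) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:            # (iv) triangle inequality
            return False
    return True

def discrete(x, y):
    # Discrete metric from Example 126.
    return 0 if x == y else 1

def absolute(x, y):
    # Metric on R induced by the absolute value.
    return abs(x - y)

def d2(p, q):
    # Product metric d_2 built from the discrete and absolute-value metrics.
    return sqrt(discrete(p[0], q[0]) ** 2 + absolute(p[1], q[1]) ** 2)

def d_inf(p, q):
    # Product metric d_infinity built from the same two metrics.
    return max(discrete(p[0], q[0]), absolute(p[1], q[1]))

def not_a_metric(x, y):
    # Squared distance: fails the triangle inequality.
    return (x - y) ** 2

X = ["a", "b", "c"]
Y = [0.0, 0.5, 2.0]
XY = list(product(X, Y))

print(is_metric(X, discrete), is_metric(Y, absolute))
print(is_metric(XY, d2), is_metric(XY, d_inf))
print(is_metric(Y, not_a_metric))
```

A `True` result only says that no violation was found on the sample, whereas a `False` result is a genuine counterexample.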
Next, metrics provide us with the ability to measure the distance between two sets (if one of the sets is a singleton, then we can measure the distance of a point from a set).

Definition 127 Let $A \subset X$ and $B \subset X$. The distance between sets $A$ and $B$ is $d(A,B) = \inf\{d(x,y) : x \in A, y \in B\}$.

We note that any subset of a metric space is a metric space itself.

Definition 128 If $(X,d)$ is a metric space and $H \subset X$, then $(H, d|_H)$ is also a metric space called the subspace of $(X,d)$. [1]

[1] Note that we restrict the metric function to the set $H$ using Definition 56.

Example 129 $([0,1], |\cdot|)$ is a metric space which is a subspace of $(\mathbb{R}, |\cdot|)$.

In a metric space, we can extend the notion of open intervals in Definition 104.

Definition 130 For $x \in X$, we call the set $B_\delta(x) = \{y \in X : d(x,y) < \delta\}$ an open ball with center $x$ and radius $\delta$. In this case, $G$ is open if $\forall x \in G$, $\exists B_\delta(x) \subset G$.

Don't assume that an open ball is an open set. We still don't know what an open set is. We will prove this in the next section. Also note that a ball is defined relative to the space $X$, so that if for example $X = \mathbb{N}$, then a ball of size $\delta = 1.5$ around 5 is just $\{4,5,6\}$. The next example shows that balls don't need to be "round". Their shape depends on their metric.

Example 131 In $\mathbb{R}^2$, Figure 4.1 illustrates a ball with metric $d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|$, one with a Euclidean metric $d_2(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$, and one with a sup metric $d_\infty(x,y) = \sup\{|x_1 - y_1|, |x_2 - y_2|\}$.

Before proceeding, we briefly mention some of the important results that you will see in this chapter. Here we extend the Heine-Borel Theorem 121 of Chapter 3 to provide necessary and sufficient conditions for compactness in general metric spaces in Theorem 198. We also introduce the notion of a Banach space (a complete normed vector space) and for the first time give an example of an infinite dimensional Banach space. In many theorems that follow, the dimensionality of a Banach space plays a crucial role. Another important set of results pertains to the properties of a continuous function on a connected domain (a generalization of the Intermediate Value Theorem is given in Theorem 254) as well as a continuous function on a compact domain (a generalization of the Extreme Value Theorem is given in Theorem 261 and the Uniform Continuity Theorem in general metric spaces is given in Theorem ??). Since many applications in economics result in correspondences, we spend considerable time on upper and lower hemicontinuous correspondences. Probably one of the most important theorems in economics is Berge's Theorem of the Maximum (Theorem 295). The chapter concludes with a set of fixed point theorems that are useful in proving the existence of general equilibrium or existence of a solution to a dynamic programming problem.

4.1 Convergence

In this section we will build all the topological properties of a metric space in terms of convergent sequences (as an alternative to building upon open sets). In many cases, the sequence version (of definitions and theorems) is more convenient, easier to verify, and/or easier to picture.

Definition 132 If $X$ is any set, a finite sequence (or ordered $N$-tuple) in $X$ is a function $f : \Psi_N \to X$ denoted $\langle x_n \rangle_{n=1}^{N}$. An infinite sequence in $X$ is a function $f : \mathbb{N} \to X$ denoted $\langle x_n \rangle_{n=1}^{\infty}$ (or $\langle x_n \rangle$ for short).

When there is no misunderstanding, we assume all sequences are infinite unless otherwise noted. We use the $\langle x_n \rangle$ notation to reinforce the difference from $\{x_n \mid n \in \mathbb{N}\}$ since order matters for a sequence.

Example 133 There are many ways of defining sequences. Consider the sequence of even numbers $\langle 2,4,6,\ldots \rangle$. One way to list it is $\langle 2n \rangle_{n \in \mathbb{N}}$. Another way is to specify an initial value $x_1$ and a rule for obtaining $x_{n+1}$ from $x_n$. In the above case $x_1 = 2$ and $x_{n+1} = x_n + 2$, $n \in \mathbb{N}$.
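The two descriptions in Example 133 can be written down directly. The following minimal sketch (with hypothetical names `closed_form` and `recursive`) lists the first few terms both ways, and also illustrates why order matters for a sequence but not for a set.

```python
from itertools import islice

def closed_form(n):
    # The sequence viewed as a function f : N -> X with f(n) = 2n.
    return 2 * n

def recursive():
    # The same sequence from x_1 = 2 and the rule x_{n+1} = x_n + 2.
    x = 2
    while True:
        yield x
        x = x + 2

first_closed = [closed_form(n) for n in range(1, 6)]
first_recursive = list(islice(recursive(), 5))
print(first_closed, first_recursive)

# Order matters for sequences (tuples) but not for sets.
as_sequences_differ = (-1, 1) != (1, -1)
as_sets_equal = {-1, 1} == {1, -1}
print(as_sequences_differ, as_sets_equal)
```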
It is possible that a sequence lacks some desired property while a subsequence of it has that property.

Definition 134 A mapping $g : \mathbb{N} \to \mathbb{N}$ is monotone if $n > m$ implies $g(n) > g(m)$. If $f : \mathbb{N} \to X$ is an (infinite) sequence, then $h$ is an (infinite) subsequence of $f$ if there is a monotone mapping $g : \mathbb{N} \to \mathbb{N}$ such that $h = f \circ g$, denoted $\langle x_{g(n)} \rangle$.

Example 135 Consider the sequence $f : \mathbb{N} \to \{-1,1\}$ given by $\langle (-1)^n \rangle_{n \in \mathbb{N}}$. If $g(n) = 2n$ for $n \in \mathbb{N}$ (i.e. the even indices), then the subsequence $h = f \circ g$ is simply $\langle 1,1,\ldots \rangle$ while if $g(n) = 2n - 1$ (i.e. the odd indices), then the subsequence is $\langle -1,-1,\ldots \rangle$. See Figure 4.1.1.

Definition 136 A sequence $\langle x_n \rangle$ from a metric space $(X,d)$ converges to the point $x \in X$ (or has $x$ as a limit), if given any $\delta > 0$, $\exists N$ (which may depend on $\delta$) such that $d(x,x_n) < \delta$, $\forall n \geq N(\delta)$.

In geometric terms, this says that $\langle x_n \rangle$ converges to $x$ if every ball around $x$ contains all but a finite number of terms of the sequence. We write $x = \lim x_i$ or $x_i \to x$ to mean that $x$ is the limit of $\langle x_i \rangle$. If a sequence has no limit, we say it diverges.

Example 137 To see an example of a limit, consider the sequence $f : \mathbb{N} \to \mathbb{R}$ given by $\langle \frac{1}{n} \rangle_{n \in \mathbb{N}}$. In this case $\lim \langle \frac{1}{n} \rangle_{n \in \mathbb{N}} = 0$. To see why, notice that for any $\delta > 0$, it is possible to find an $N(\delta)$ such that $d(0,x_n) = |x_n| < \delta$, $\forall n \geq N(\delta)$. For instance, if $\delta = 1$, then $N(1) = 2$ (which respects the strict inequality), if $\delta = \frac{1}{2}$, then $N\left(\frac{1}{2}\right) = 3$, etc. In general, let $N(\delta) = w\left(\frac{1}{\delta}\right) + 1$ where $w(x)$ denotes the operator which takes the whole part of the real number $x$. Such natural numbers always exist by the Archimedean Property (Theorem 99). See Figure 4.1.2.

Theorem 138 (Uniqueness of Limit Points) A sequence in $(X,d)$ can have at most one limit.

Proof. Suppose, to the contrary, $x'$ and $x''$ are limits of $\langle x_n \rangle$ and $x' \neq x''$. Let $B_\delta(x')$ $(= \{x \in X : d(x,x') < \delta\})$ and $B_\delta(x'')$ be disjoint open balls around $x'$ and $x''$, respectively.
[3] Furthermore, let $N', N'' \in \mathbb{N}$ be such that if $n \geq N'$ and $n \geq N''$, then $x_n \in B_\delta(x')$ and $x_n \in B_\delta(x'')$, respectively. Let $k = \max\{N', N''\}$. But then $x_k \in B_\delta(x') \cap B_\delta(x'')$, a contradiction.

[3] It is always possible to construct such disjoint balls. Just let $\delta = \frac{1}{4} d(x',x'')$.

Lemma 139 If $\langle x_n \rangle$ in $(X,d)$ converges to $x \in X$, then any subsequence $\langle x_{g(n)} \rangle$ also converges to $x$.

Proof. By Definition 136, $\exists N(\delta)$ such that $d(x,x_n) < \delta$, $\forall n \geq N(\delta)$. Let $\langle x_{g(n)} \rangle$ be a subsequence of $\langle x_n \rangle$. Since $g(n) \geq n$, for $n \geq N(\delta)$ we have $g(n) \geq N(\delta)$, in which case $d(x,x_{g(n)}) < \delta$.

The next definition gives another notion of convergence to a point which is the sequential version of Definition 111.

Definition 140 A sequence $\langle x_n \rangle$ from a metric space $(X,d)$ has a cluster point $x^* \in X$ if given any $\delta > 0$ and given any $N$, $\exists n \geq N$ such that $d(x^*,x_n) < \delta$.

In geometric terms, this says that $x^*$ is a cluster point of $\langle x_n \rangle$ if each ball around $x^*$ contains infinitely many terms of the sequence. Thus if $x = \lim \langle x_n \rangle$, then it is a cluster point [4]: given $N$, just take $n$ in Definition 140 as $n = \max\{N(\delta), N\}$. However, if $x^*$ is a cluster point, it need not be a limit. The key difference between Definitions 136 and 140 lies in which terms of the sequence must be close to the point. If $x$ is a limit point we know $\exists N(\delta)$ for which $d(x,x_n) < \delta$ for all $n \geq N(\delta)$. For a cluster point, given $N$, it is sufficient to find just one term $x_n$ of the sequence with $n \geq N$ that satisfies $d(x^*,x_n) < \delta$. To see this, consider the next example.

[4] One shouldn't confuse cluster points for a set with cluster points for a sequence. For instance, singletons like $\{1\}$ do not have cluster points, whereas the constant sequence $\langle 1,1,1,\ldots \rangle$ does (which is 1).

Example 141 Consider the sequence $\langle (-1)^n \rangle_{n \in \mathbb{N}}$ from Example 135. This sequence has no limit point but two cluster points. To see why, notice that the only candidate limit points are $\{-1,1\}$. Consider $x = 1$. For all $\delta \in (0,1)$, $d(1,x_i) = |1 - x_i| > \delta$ for any odd index $i = 2n - 1$, $n \in \mathbb{N}$. A similar argument holds for $x = -1$. To see why $x^*$
$= 1$ satisfies the definition of a cluster point, notice that for any $N$, there exists $i = 2N$ (an even index) such that for any $\delta > 0$, $d(1,x_i) = 0 < \delta$. For this particular sequence, there are actually an infinite number of such indices. See Figure 4.1.1.

Example 141 provides a sequence which does not have a limit point (and hence the assumption of Lemma 139 does not apply). However, it is easy to see that there is a subsequence $\langle (-1)^{g(n)} \rangle_{n \in \mathbb{N}}$ of odd indices that has a limit point (which is one of the cluster points of the original sequence). The following lemma applies to such cases.

Lemma 142 $x^*$ is a cluster point of $\langle x_n \rangle$ iff there exists a subsequence $\langle x_{n_k} \rangle$ of $\langle x_n \rangle$ such that $\langle x_{n_k} \rangle \to x^*$ as $k \to \infty$.

Proof. ($\Rightarrow$) If $x^*$ is a cluster point, then for each $k \in \mathbb{N}$ apply Definition 140 with $\delta = \frac{1}{k}$ and $N = n_{k-1} + 1$ (with $n_0 = 0$) to obtain an index $n_k > n_{k-1}$ such that $d(x^*,x_{n_k}) < \frac{1}{k}$. Then $\langle x_{n_k} \rangle \to x^*$. ($\Leftarrow$) is trivial.

Definition 143 Let $(X,d)$ be a metric space and $A \subset X$. $A$ is closed if any convergent sequence of elements from $A$ has its limit point in $A$.

Theorem 144 (Closed Sets Properties) (i) $\emptyset$ and $X$ are closed. (ii) The intersection of any collection of closed sets in $X$ is closed. (iii) The union of any finite collection of closed sets in $X$ is closed.

Proof. (i) Trivial. (ii) Let $A = \cap_{i \in \Lambda} A_i$ where $A_i \subset X$ is closed $\forall i \in \Lambda$, which is any index set. Take any convergent sequence from $A$ and show its limit point is in $A$ as well. That is, let $\langle x_n \rangle \subset A$ and $\langle x_n \rangle \to x$. Then $\langle x_n \rangle \subset A_i$ $\forall i \in \Lambda$, and because each $A_i$ is closed, $\langle x_n \rangle \to x$ gives $x \in A_i$ $\forall i \in \Lambda$, which implies $x \in \cap_{i \in \Lambda} A_i$. (iii) Let $A = \cup_{i=1}^{n} A_i$ where $A_i \subset X$ is closed $\forall i \in \{1,\ldots,n\}$. Again, take any convergent sequence from $A$ and show its limit point is in $A$ as well. In particular, let $\langle x_n \rangle \subset A$ and $\langle x_n \rangle \to x$. There exists $A_j$ containing infinitely many elements of $\langle x_n \rangle$ (i.e.
$A_j$ contains a subsequence $\langle x_{n_k} \rangle$). By Lemma 139, $\langle x_{n_k} \rangle \to x$ and because $\langle x_{n_k} \rangle \subset A_j$ and $A_j$ is closed, $x \in A_j$, which implies $x \in A = \cup_{i=1}^{n} A_i$.

Example 145 Property (iii) of Theorem 144 does not necessarily hold for an infinite union. Consider the following counterexample. Let $F_n = \left[-1, -\frac{1}{n}\right] \cup \left[\frac{1}{n}, 1\right]$. Then $\cup_{n=1}^{\infty} F_n = [-1,0) \cup (0,1]$. See Figure 4.1.3.

Closed sets can also be described in terms of cluster points.

Theorem 146 A subset of $X$ is closed iff it contains all its cluster points.

Proof. [5] ($\Rightarrow$) By contradiction. Let $x$ be a cluster point of a closed set $A$ and let $x \notin A$. Then $x \in X \setminus A$. Because $X \setminus A$ is open, there exists an open ball $B_{\delta_x}(x)$ such that $B_{\delta_x}(x) \subset X \setminus A$. Thus $B_{\delta_x}(x)$ is a neighborhood of $x$ having empty intersection with $A$. This contradicts the assumption that $x$ is a cluster point of $A$. ($\Leftarrow$) Let $x \in X \setminus A$. Then $x$ is not a cluster point of $A$ since $A$ contains all its cluster points by assumption. Then there exists an open ball $B_{\delta_x}(x)$ such that $A \cap B_{\delta_x}(x) = \emptyset$. This implies $B_{\delta_x}(x) \subset X \setminus A$ or that $X \setminus A$ is open, in which case $A$ is closed.

[5] From Munkres Theorem 6.6, p. 97.

Exercise 4.1.1 Explain why a singleton set $\{x\}$ is consistent with Theorem 146.

We now introduce another topological notion that permits us to characterize closed sets in other terms.

Definition 147 Given a set $A \subset X$, the union of all its points and all its cluster points is called the closure of $A$, denoted $\overline{A}$ (i.e. $\overline{A} = A \cup A'$ where $A'$ is the set of all cluster points of $A$).

Notice that $A$ and $A'$ do not form a partition of $\overline{A}$ since they are not necessarily disjoint. Take (iii) of Example 112 where $A = \{0\} \cup (1,2)$, in which case $A' = [1,2]$ and $\overline{A} = \{0\} \cup [1,2]$.

As an exercise, prove the following theorems.

Theorem 148 Let $A \subset X$. $x \in \overline{A}$ iff any open ball around $x$ has a non-empty intersection with $A$.

Theorem 149 The closure of $A$ is the intersection of all closed sets containing $A$.

Theorem 150 $A$ is closed iff $A = \overline{A}$.

Exercise 4.1.2 Prove that $A \subset \overline{A}$ and $\overline{A \cup B} = \overline{A} \cup \overline{B}$.
Give an example to show that $\overline{A \cap B} = \overline{A} \cap \overline{B}$ may not hold.

Example 151 (i) $A = \{2,3\}$. Then $\overline{A} = \{2,3\}$. (ii) $A = \mathbb{N}$. Then $\overline{A} = \mathbb{N}$. (iii) $A = (0,1]$. Then $\overline{A} = [0,1]$. (iv) $A = \{x \in \mathbb{Q} : x \in (0,1)\}$. Then $\overline{A} = [0,1]$.

Intuitively, one would expect that a point $x$ lies in the closure of $A$ if there is a sequence of points in $A$ converging to $x$. This is not necessarily true in a general topological space, but it is true in a metric space as the next lemma shows.

Lemma 152 Let $(X,d)$ be a metric space and $A \subset X$. Then $x \in \overline{A}$ iff it is a limit point of a sequence $\langle x_n \rangle$ of points from $A$ (i.e. $\exists \langle x_n \rangle \subset A$ such that $\langle x_n \rangle \to x$).

Proof. ($\Rightarrow$) Take any $x \in \overline{A}$. By Theorem 148, $\forall \delta = \frac{1}{n} > 0$, $\exists x_n \in A$ such that $x_n \in B_{\frac{1}{n}}(x)$. Hence this sequence $\langle x_n \rangle \to x$ (a limit point). ($\Leftarrow$) If $\langle x_n \rangle \to x$ with $\langle x_n \rangle \subset A$, then in every open ball around $x$ there is an $x_n$ (actually, infinitely many of them). Then by Theorem 148, $x \in \overline{A}$.

Exercise 4.1.3 (i) Show that if $A$ is closed and $d(x,A) = 0$, then $x \in A$. Does (i) hold without assuming closedness of $A$? (ii) Show that if $A$ is closed and $x_0 \notin A$, then $d(x_0,A) > 0$.

In the previous Example 151, we see that taking the closure of a set may: return the set itself ((i) and (ii)); "bring" finitely many new points to the set (iii); or bring uncountably many points (iv). This leads us to the notion of density.

Definition 153 Given the metric space $(X,d)$, a subset $A \subset X$ is dense in $X$ if $\overline{A} = X$.

Example 154 To see that $\mathbb{Q}$ is dense in $\mathbb{R}$, we know that in any ball around $x \in \mathbb{R}$ there is a rational number. Hence by Theorem 148, $x$ is from $\overline{\mathbb{Q}}$. Thus we have $\mathbb{R} \subset \overline{\mathbb{Q}}$. Obviously, $\overline{\mathbb{Q}} \subset \mathbb{R}$ as well, so $\mathbb{R} = \overline{\mathbb{Q}}$. A similar argument establishes that the set of irrationals is dense in $\mathbb{R}$.

Intuitively, if $A$ is dense in $X$ then for any $x \in X$, there exists a point in $A$ that is sufficiently close to (or approximates) $x$. From the previous example, since $\mathbb{Q}$ is dense in $\mathbb{R}$, any real number can be approximated by a rational number, and the set of rational numbers is countable.
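As a concrete illustration of density, the following sketch produces, for a given radius $\delta$, a rational number inside the ball $B_\delta(\sqrt{2})$. It uses exact rational and integer arithmetic only; the function name `rational_near_sqrt2` and the choice of $\sqrt{2}$ as the target are assumptions made for this example.

```python
from fractions import Fraction
from math import isqrt

def rational_near_sqrt2(delta):
    """Return a rational q with q < sqrt(2) < q + delta (delta a positive Fraction)."""
    n = int(1 / delta) + 1                 # guarantees 1/n < delta
    # isqrt(2 n^2) is the whole part of sqrt(2) * n, so q <= sqrt(2) < q + 1/n.
    return Fraction(isqrt(2 * n * n), n)

for delta in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    q = rational_near_sqrt2(delta)
    # Verified without floating point: q^2 < 2 < (q + delta)^2.
    print(delta, q, q * q < 2 < (q + delta) ** 2)
```

The check compares squares, so it never needs a decimal value of $\sqrt{2}$.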
More importantly for applied economists, we might take $X$ to be the set of continuous functions and $A$ the set of polynomials with rational coefficients, which is again countable. Then, provided the set of such polynomials is dense in the set of all continuous functions, working with polynomials will yield a good approximation to the continuous function we are interested in.

Definition 155 A metric space $(X,d)$ is separable if it contains a dense subset that is countable.

Example 156 $(\mathbb{R}, |\cdot|)$ is separable since $\mathbb{Q}$ is a countable dense subset of $\mathbb{R}$.

So far in a general metric space we have dealt only with closed sets. Now we can introduce open sets as follows.

Definition 157 A set $A \subset X$ is open if its complement is closed.

Exercise 4.1.4 Show that an open ball is an open set.

Example 158 Let $A = (0,1) \times \{2\} = \{(x,y) : 0 < x < 1, y = 2\} \subset \mathbb{R}^2$ equipped with $d_2$. $A$ is not open since no matter how small $\delta$ is, there exist $y' = 2 \pm \frac{\delta}{2}$ such that $(x,y') \in B_\delta((x,2))$ is not contained in $A$. See Figure 4.1.4.

We could have proven the properties of open sets as we did in Theorem 106, but we will not repeat it. Here we will simply mention a few concepts that will be useful. The first concept is that of a neighborhood.

Definition 159 A neighborhood of $x \in X$ is an open set containing $x$.

Sometimes it is more convenient to use the concept of a neighborhood of $x$ rather than an open ball around $x$, but you should realize that these two concepts are equivalent since an open ball $B_\varepsilon(x)$ is a neighborhood of $x$ and conversely, if $V_x$ is a neighborhood of $x$, then there is a $B_\varepsilon(x) \subset V_x$. See Figure 4.1.5.

There is another way to describe closed sets which uses boundary points.

Definition 160 A point $x \in X$ is a boundary point of $A$ if every open ball around $x$ contains points in $A$ and in $X \setminus A$ (i.e. $B_\delta(x) \cap A \neq \emptyset$ and $B_\delta(x) \cap (X \setminus A) \neq \emptyset$ for every $\delta > 0$). A point $x \in X$ is an interior point of $A$ if $\exists B_\delta(x) \subset A$. See Figure 4.1.6.

Note that a boundary point need not be contained in the set.
For example, the boundary points of $(0,1]$ are 0 and 1.

Example 161 The set of boundary points of $\mathbb{Q}$ is $\mathbb{R}$ since in any open ball around a real number there are both rationals and irrationals by Theorems 100 and 102.

The next theorem provides an alternative characterization of a closed set.

Theorem 162 A set $A \subset X$ is closed iff it contains its boundary points.

Proof. ($\Rightarrow$) Suppose $A$ is closed and $x$ is a boundary point. If $x \notin A$, then $x \in X \setminus A$ (which is open), so some ball around $x$ lies in $X \setminus A$ and contains no point of $A$, contrary to $x$ being a boundary point. ($\Leftarrow$) Suppose $A$ contains all its boundary points. If $y \notin A$, then $y$ is not a boundary point, so $\exists B_\delta(y)$ contained in $X \setminus A$. Since this is true $\forall y \notin A$, $X \setminus A$ is open so $A$ is closed.

Unlike properties like closedness and openness, boundedness is defined relative to the distance measure and hence is a metric property rather than a topological property.

Definition 163 Given $(X,d)$, $A \subset X$ is bounded if $\exists M > 0$ such that $d(x,y) \leq M$, $\forall x,y \in A$.

Boundedness cannot be defined only in terms of open sets. It requires the notion of distance. Thus it is not a topological property.

Theorem 164 A convergent sequence in the metric space $(X,d)$ is bounded.

Proof. Taking $\delta = 1$, we know by Definition 136, $\exists N(1)$ such that $|x_n - x| < 1$, $\forall n \geq N(1)$. [6] By the triangle inequality, we know $|x_n| = |x_n - x + x| \leq |x_n - x| + |x| < 1 + |x|$, $\forall n \geq N(1)$. Since there are a finite number of indices $n < N(1)$, we set $M = \sup\{|x_1|, |x_2|, \ldots, |x_{N(1)-1}|, |x| + 1\}$. Hence, $|x_n| \leq M$, $\forall n \in \mathbb{N}$, so that $\langle x_n \rangle$ is bounded.

4.1.1 Convergence of functions

While we will focus on convergence of functions in Chapter 6, it will be necessary for some results in the upcoming sections to introduce a form of functional convergence. A sequence of functions is simply a sequence whose elements $f_n(x)$ contain two variables, $n$ and $x$, where $n$ indicates the order in the sequence and $x$ is the variable of a function. For example, $\langle f_n(x) \rangle = \langle x^n \rangle = \langle x, x^2, x^3, \ldots \rangle$ for $x \in [0,1]$. What does it mean for a sequence of functions to be convergent?
There are basically two different answers to this question. If we work in a metric space whose elements are functions themselves with a certain metric, then convergence of functions is nothing other than convergence of elements (the element being a function) with respect to the given metric. We will deal with this type of convergence in Chapter 6 on function spaces. The second approach is to take any set $X$ along with a metric space $(Y,d_Y)$ and let $f_n : X \to Y$ for all $n$. Fix $x_0 \in X$. Then $\langle f_n(x_0) \rangle$ is a sequence of elements (the element being a point) in $Y$. If this sequence is convergent, then it converges to a certain point $y_0$ (i.e. $\langle f_n(x_0) \rangle \to y_0 = f(x_0)$). This leads to the following definition.

[6] Since the sequence converges, we are free to choose any $\delta > 0$. Here we simply choose $\delta = 1$.

Definition 165 Given any set $X$ and a metric space $(Y,d_Y)$, let $\langle f_n \rangle$ be a sequence of functions from $X$ to $Y$. The sequence $\langle f_n \rangle$ is said to converge pointwise to a function $f : X \to Y$ if for every $x_0 \in X$, $\lim_{n \to \infty} f_n(x_0) = f(x_0)$. We call $f$ a pointwise limit of $\langle f_n \rangle$. In other words, $\langle f_n \rangle$ converges pointwise to $f$ on $X$ if $\forall x_0 \in X$ and $\forall \varepsilon > 0$, $\exists N(x_0,\varepsilon)$ such that $\forall n > N(x_0,\varepsilon)$ we have $d_Y(f_n(x_0),f(x_0)) < \varepsilon$.

Notice that if $x_0$ is fixed, then $\langle f_n(x_0) \rangle$ is simply a sequence of elements in the metric space $(Y,d_Y)$.

Example 166 Let $f_n : \mathbb{R} \to \mathbb{R}$ be given by $f_n(x) = \frac{x}{n}$ and $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = 0$. Thus, this example is a simple generalization of Example 137. Then $\langle f_n \rangle$ converges pointwise to $f$ since we can always find a natural number $N(x,\varepsilon) = w\left(\left|\frac{x}{\varepsilon}\right|\right) + 1$ by the Archimedean Property. See Figure 4.1.7.

Example 167 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = x^n$ and $f : [0,1] \to \mathbb{R}$ be given by
$$f(x) = \begin{cases} 0 & 0 \leq x < 1 \\ 1 & x = 1. \end{cases}$$
It is clear that when $x = 1$, then $f_n(x) = 1^n = 1 = f(x)$ so that $f_n(1) \to 1$ trivially. To see that for $x \in [0,1)$, $f_n(x) \to f(x)$, note that the case $x = 0$ is trivial, and that for $x \in (0,1)$ we can write $x = \frac{1}{1+a}$ with $a > 0$. Then we can use Bernoulli's inequality, $(1+a)^n \geq 1 + na$, to obtain
$$0 < x^n = \left(\frac{1}{1+a}\right)^n \leq \frac{1}{1+na} < \frac{1}{na}$$
so that we can take $N(x,\varepsilon) = w\left(\frac{1}{a\varepsilon}\right) + 1$. See Figure 4.1.8.

Notice that the rate of convergence $N(x_0,\varepsilon)$ can be very different for each $x_0$. In Example 166, the rate is very low for very large $x$ (we say the rate of convergence is smaller the larger is $N$). However, if we restrict the domain for $f_n$, say $f_n : [0,2] \to \mathbb{R}$, then the smallest possible rate for a given $\varepsilon$ is $N(2,\varepsilon) = w\left(\left|\frac{2}{\varepsilon}\right|\right) + 1$. If it is possible, for a given $\varepsilon$, to find the rate independently of $x$, then we call this type of convergence uniform.

Definition 168 Given $X$ and a metric space $(Y,d_Y)$, let $\langle f_n \rangle$ be a sequence of functions from $X$ to $Y$. The sequence $\langle f_n \rangle$ is said to converge uniformly to a function $f : X \to Y$ if $\forall \varepsilon > 0$, $\exists N(\varepsilon)$ such that $\forall n > N(\varepsilon)$ we have $d_Y(f_n(x),f(x)) < \varepsilon$, $\forall x \in X$.

It is apparent from the definition that uniform convergence implies pointwise convergence, but as Examples 166 and 167 show the converse does not necessarily hold (i.e. the above two sequences of functions do not converge uniformly but do converge pointwise). In the first case, $\forall \varepsilon > 0$ and $\forall n \in \mathbb{N}$, $\exists x_0 \in \mathbb{R}$ such that $x_0 > \varepsilon n$ by the Archimedean property, so that $\frac{x_0}{n} > \varepsilon$. Similarly, in the second example, $\forall \varepsilon \in (0,1)$ and $\forall n \in \mathbb{N}$, we have $1 > \varepsilon^{\frac{1}{n}} > 0$. Then $\exists x_0 \in (0,1)$ such that $1 > x_0 > \varepsilon^{\frac{1}{n}} > 0$, in which case $x_0^n > \varepsilon$. On the other hand, if we restrict the domain of the first example to $[-1,1]$, $\langle f_n \rangle$ is uniformly convergent since for any $\varepsilon$ we can take $N(\varepsilon) = w\left(\frac{1}{\varepsilon}\right) + 1$ (and a similar choice works on any bounded set).

4.2 Completeness

The completeness of a metric space is a very important property for problem solving. For instance, to prove the existence of the solution of a problem, we usually manage to find the solution of an approximate problem. That is, we construct a sequence of solutions that are getting closer and closer to one another via the method of successive approximations.
But for this method to work, we need a guarantee that the limit point exists. If the space is complete, then the limit of this sequence exists and is the solution of the original problem. For this reason, we turn to establishing when a given space is complete.

Definition 169 A sequence $\langle x_n \rangle$ from a metric space $(X,d)$ is a Cauchy sequence if given $\delta > 0$, $\exists N(\delta)$ such that $d(x_m,x_n) < \delta$, $\forall m,n \geq N(\delta)$.

Note that if $\langle x_n \rangle$ is convergent, then there is a limit point $x$ which the elements of $\langle x_n \rangle$ eventually approach. If $\langle x_n \rangle$ is Cauchy, then the elements of $\langle x_n \rangle$ eventually approach one another, but the point they approach may or may not exist in $X$. Hence all Cauchy sequences can be divided into two different classes: those for which $\exists x$ such that $\langle x_n \rangle \to x$ (i.e. convergent Cauchy sequences); and those for which $\nexists x$ such that $\langle x_n \rangle \to x$ (i.e. nonconvergent Cauchy sequences).

Example 170 Suppose we did not know that there existed a limit in Example 137 where $\langle \frac{1}{n} \rangle_{n \in \mathbb{N}}$ in $(\mathbb{R}, |\cdot|)$. We can, however, establish that this sequence of real numbers is a Cauchy sequence and hence has a limit. Let $m,n \geq N(\delta)$ and without loss of generality let $m \leq n$. Then $d(x_n,x_m) = \left|\frac{1}{m} - \frac{1}{n}\right| < \frac{1}{m}$. Hence a sufficient condition for this sequence to be Cauchy for any $\delta > 0$ is that $N(\delta) = w\left(\frac{1}{\delta}\right) + 1$.

Example 171 Consider the metric space $(X,d)$ with $X = (0,1]$ and $d(x,y) = |x - y|$. Then by Example 170, we've established that $\langle \frac{1}{n} \rangle_{n \in \mathbb{N}}$ is a Cauchy sequence that converges (in $\mathbb{R}$) to a limit $0 \notin X$.

We now list some results that are not so useful in and of themselves but will be used repeatedly to prove important theorems in the next few sections.

Lemma 172 Given $(X,d)$, if $\langle x_n \rangle$ converges, then $\langle x_n \rangle$ is a Cauchy sequence.

Proof. Let $x = \lim \langle x_n \rangle$. Then given $\delta > 0$, $\exists N\left(\frac{\delta}{2}\right)$ such that if $n \geq N\left(\frac{\delta}{2}\right)$, then $d(x,x_n) < \frac{\delta}{2}$. Thus if $n,m \geq N\left(\frac{\delta}{2}\right)$, then $d(x_m,x_n) \leq d(x_m,x) + d(x,x_n) < \frac{\delta}{2} + \frac{\delta}{2} = \delta$.

Lemma 173 If a subsequence $\langle x_{g(n)} \rangle$ of a Cauchy sequence $\langle x_n \rangle$ converges to $x$, then $\langle x_n \rangle$ also converges to $x$.
Exercise 4.2.1 Prove Lemma 173.

Lemma 174 A Cauchy sequence in $(X,d)$ is bounded.

Exercise 4.2.2 Prove Lemma 174. It is similar to Theorem 164 in the preceding section.

The converse of Lemma 172 is not necessarily true. Those spaces for which the converse of Lemma 172 is true are called complete.

Definition 175 If $(X,d)$ has the property that every Cauchy sequence converges to some point in the metric space, then $(X,d)$ is complete.

Establishing that a metric space is complete is a difficult task since we must show that every Cauchy sequence converges. In fact, due to Lemma 173 we can (somewhat) weaken this definition, which gives us the following lemma.

Lemma 176 $(X,d)$ is complete if every Cauchy sequence has a convergent subsequence.

Proof. It is sufficient to show that if $\langle x_n \rangle$ is a Cauchy sequence that has a subsequence $\langle x_{g(n)} \rangle$ which converges to $x$, then $\langle x_n \rangle$ converges to $x$. Since $\langle x_n \rangle$ is a Cauchy sequence, given $\delta > 0$, we can choose $N\left(\frac{\delta}{2}\right)$ large enough such that $d(x_m,x_n) < \frac{\delta}{2}$, $\forall m,n \geq N\left(\frac{\delta}{2}\right)$ by Definition 169. Since $\langle x_{g(n)} \rangle$ is a convergent subsequence, given $\delta > 0$, we can choose $N\left(\frac{\delta}{2}\right)$ large enough such that $d(x_{g(n)},x) < \frac{\delta}{2}$, $\forall g(n) \geq N\left(\frac{\delta}{2}\right)$ by Definition 136. Combining these two facts and using (iv) of Definition 125, $d(x_n,x) \leq d(x_n,x_{g(n)}) + d(x_{g(n)},x) < \delta$ for all $n \geq N\left(\frac{\delta}{2}\right)$ (recall that $g(n) \geq n$).

Another useful fact is that if we know a space is complete, then we know a closed subspace is complete.

Theorem 177 A closed subset of a complete metric space is complete.

Proof. Any Cauchy sequence in the closed subset is a Cauchy sequence in the metric space. Since the metric space is complete, the sequence is convergent. Since the subset is closed, the limit also must be from this set.

Establishing that a metric space is not complete is an easier task since we must only show that one Cauchy sequence does not converge to a point in the space. Just take $((0,1], |\cdot|)$ in Example 171 since the limit of $\langle \frac{1}{n} \rangle$ is 0 which is not contained in $(0,1]$.
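The bound $N(\delta) = w\left(\frac{1}{\delta}\right) + 1$ from Example 170 can be spot-checked with exact rational arithmetic. The sketch below is only a finite check on a window of terms (the names `N`, `x`, and `cauchy_on_window`, and the window width, are assumptions of this example); it also confirms that every term lies in $(0,1]$ even though the limit 0 does not.

```python
from fractions import Fraction
from itertools import combinations

def N(delta):
    # w(1/delta) + 1, where w takes the whole part (delta a positive Fraction).
    return int(1 / delta) + 1

def x(n):
    # The n-th term of the sequence <1/n>.
    return Fraction(1, n)

def cauchy_on_window(delta, width=50):
    """Check d(x_m, x_n) < delta for all m, n in [N(delta), N(delta) + width)."""
    start = N(delta)
    terms = [x(n) for n in range(start, start + width)]
    return all(abs(a - b) < delta for a, b in combinations(terms, 2))

for k in (1, 2, 10, 1000):
    print(Fraction(1, k), N(Fraction(1, k)), cauchy_on_window(Fraction(1, k)))

# Every term lies in X = (0, 1], while the limit 0 does not.
all_in_X = all(0 < x(n) <= 1 for n in range(1, 1001))
print(all_in_X)
```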
Example 178 Consider the sequence $f : \mathbb{N} \to \mathbb{R}$ given by $\langle (1 + \frac{1}{n})^n \rangle_{n \in \mathbb{N}}$. It can be shown that this sequence is increasing and bounded above. Then by the Monotone Convergence Theorem 324, which is proven in the End of Chapter Exercises, this sequence converges in $(\mathbb{R}, |\cdot|)$. The limit of this sequence is called the Euler number $e$ ($e = 2.71828\ldots$), which is irrational. But then $(\mathbb{Q}, |\cdot|)$ is not complete; the sequence $\langle (1 + \frac{1}{n})^n \rangle_{n \in \mathbb{N}} \subset \mathbb{Q}$ is Cauchy (because it is convergent in $\mathbb{R}$) but is not convergent in $\mathbb{Q}$.

Example 179 While $\mathbb{Q}$ is not complete, $\mathbb{N}$ is complete because the only Cauchy sequences in $\mathbb{N}$ are eventually constant sequences (e.g. $\langle 1,1,1,\ldots \rangle$), which are also convergent.

We next take up the important question of completeness of $(\mathbb{R}, |\cdot|)$. This takes some work.

Theorem 180 (Bolzano-Weierstrass for Sequences) A bounded sequence in $\mathbb{R}$ has a convergent subsequence.

Proof. Let $A = \langle x_n \rangle$ be bounded. If there are only a finite number of distinct values in the sequence, then at least one of these values must occur infinitely often. If we define a subsequence $\langle x_{g(n)} \rangle$ of $\langle x_n \rangle$ by selecting this element each time it appears we obtain a convergent (constant) subsequence. If the sequence $\langle x_n \rangle$ contains infinitely many distinct values, then $A = \langle x_n \rangle$ is infinite and bounded. By the Bolzano-Weierstrass Theorem 118 for sets (which rested upon the Nested Cells Property, which in turn rested upon the Completeness Axiom), there is a cluster point $x^*$ of $A = \langle x_n \rangle$. Then by Lemma 142 there is a subsequence $\langle x_{g(n)} \rangle \to x^*$.

Theorem 181 (Cauchy Convergence Criterion) A sequence in $\mathbb{R}$ is convergent iff it is a Cauchy sequence.

Proof. ($\Rightarrow$) is true in any metric space by Lemma 172. ($\Leftarrow$) Let $\langle x_n \rangle$ be a Cauchy sequence in $\mathbb{R}$. Then it is bounded by Lemma 174 and by Theorem 180 there is a convergent subsequence $\langle x_{n_k} \rangle \to x$. Then by Lemma 173, the whole sequence $\langle x_n \rangle$ converges to $x$.

Hence, since completeness requires that every Cauchy sequence converges, we know from Theorem 181 that $(\mathbb{R}, |\cdot|)$ is complete.
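The sequence of Example 178 can be explored with exact rational arithmetic. The following sketch (variable names are our own) computes the first 200 terms as fractions, so each term visibly belongs to $\mathbb{Q}$, and checks on this finite range that the terms increase, stay below 3, and approach the floating point value of $e$. It is a numerical illustration, not a proof of convergence.

```python
from fractions import Fraction
from math import e

def term(n):
    # (1 + 1/n)^n computed exactly; the result is a rational number.
    return (1 + Fraction(1, n)) ** n

terms = [term(n) for n in range(1, 201)]

increasing = all(a < b for a, b in zip(terms, terms[1:]))
bounded = all(t < 3 for t in terms)
gap = e - float(terms[-1])     # distance from the 200th term to e (as a float)

print(terms[:3])               # first terms as exact fractions
print(increasing, bounded, gap)
```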
4.2.1 Completion of a metric space

Every metric space can be made complete. The idea is a simple one. Let $(X,d)$ be a metric space that is not complete. Let $CS[X]$ be the set of all Cauchy sequences on the incomplete metric space and let $\langle a_n \rangle, \langle b_n \rangle \in CS[X]$. Define (as in Definition 26) the equivalence relation "$\sim$" by $\langle x_n \rangle \sim \langle y_n \rangle$ iff $\lim_{n \to \infty} d(x_n,y_n) = 0$. This relation forms a partition of $CS[X]$ where every equivalence class contains all sequences which have the same limit. Let $X^*$ be the set of all equivalence classes of $CS[X]$. Then $X^*$ with the metric $\hat{d}([\langle a_n \rangle],[\langle b_n \rangle]) = \lim_{n \to \infty} d(a_n,b_n)$ is a complete metric space.

Example 182 Reconsider Example 171, now with the open interval $(0,1)$. The completion of $((0,1), |\cdot|)$ is $([0,1], |\cdot|)$. Notice that we added the classes of two Cauchy sequences, $\langle \frac{1}{n} \rangle$ and $\langle 1 - \frac{1}{n} \rangle$.

Next we demonstrate the process of completion of the metric space $(\mathbb{Q}, |\cdot|)$, which we know by Example 178 is not complete since $\langle (1 + \frac{1}{n})^n \rangle$ is a nonconvergent Cauchy sequence (in $\mathbb{Q}$). Let $CS(\mathbb{Q})$ be the set of all Cauchy sequences. An equivalence relation, defined in Definition 26, partitions $CS(\mathbb{Q})$ into classes like those shown in Figure 4.2.1: classes of convergent Cauchy sequences such as $\langle 1 + \frac{1}{n} \rangle$ and $\langle 1 - \frac{1}{n} \rangle$ (which converge to the rational number 1) and classes of non-convergent (in $\mathbb{Q}$) Cauchy sequences like $\langle (1 + \frac{1}{n})^n \rangle$ (which converges to $e$, which is not in $\mathbb{Q}$). Loosely speaking, we can then assign the number 1 to the class of convergent Cauchy sequences and can assign $e$ to the class of the non-convergent Cauchy sequence.

How can we compare two metric spaces with completely different objects (e.g. one containing classes of Cauchy sequences and the other containing real numbers)?

Definition 183 Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces. Let $f : X \to Y$ have the following property:
$$d_X(x,y) = d_Y(f(x),f(y)), \quad \forall x,y \in X. \qquad (4.1)$$
A function $f$ having this property is called an isometry.

By (4.1) it is clear that an isometry is always an injection.
If $f$ is also a surjection, then it is a bijection and in this case we say that $(X,d_X)$ and $(Y,d_Y)$ are isometric. Two isometric spaces might have completely different objects, but due to (4.1) and the fact that $f$ is a bijection, they are exact replicas of one another. Their objects just have different names.

Getting back to our example, consider the spaces $(\mathbb{R},|\cdot|)$ and $(CS(\mathbb{Q}),\hat{d})$. Let $f : \mathbb{R} \to CS(\mathbb{Q})$ be given by $f(x) = \{\langle x_n \rangle^1, \langle x_n \rangle^2, \ldots\}$ where $\langle x_n \rangle^1 \to x$, $\langle x_n \rangle^2 \to x$, etc. are Cauchy sequences of one class converging to $x$ (e.g. $\langle x_n \rangle^1 = \langle 1+\frac{1}{n} \rangle$ and $\langle x_n \rangle^2 = \langle 1-\frac{1}{n} \rangle$, which converge to 1). $f$ is a surjection because $(\mathbb{R},|\cdot|)$ is complete. One can show that $f$ is an isometry. Thus, these two metric spaces are isometric.

The above construction implies that for every real number $x \in \mathbb{R}$ there exists a sequence $\langle x_n \rangle$ of rational numbers converging to $x$ (i.e. $\lim x_n = x$ where $x_n \in \mathbb{Q}$).

4.3 Compactness

When we listed the three important theorems at the beginning of this chapter, there was a common assumption: the domain of $f$ was taken to be the closed interval $[a,b]$. What properties of $[a,b]$ guarantee the validity of these theorems? What properties of the domain of $f$ are necessary for the validity of comparable theorems in more general metric spaces? In more general metric spaces, closed intervals may not even be defined. If we replaced the closed interval above with a closed ball, would the results continue to hold? As we will see later in the chapter, they may not.

In Chapter 3 we showed (Theorem 121) that $[a,b]$ has the Bolzano-Weierstrass property: any sequence of elements of $[a,b]$ has a subsequence that converges to a point in $[a,b]$. As we will see, this property can also be defined for a subset $A$ of a general metric space $(X,d)$ to guarantee the validity of the theorems we're interested in.
In fact, if we dealt with metric spaces only, we could have defined compactness only in terms of sets which satisfy the Bolzano-Weierstrass property. The more general approach we take next, which can be applied in any topological space, uses a different definition which is seemingly unrelated to the Bolzano-Weierstrass property.

Definition 184. A collection $\mathcal{C} = \{A_i : i \in \Lambda, A_i \subset X\}$ covers a metric space $(X,d)$ if $X = \bigcup_{i\in\Lambda} A_i$. $\mathcal{C}$ is called an open covering if its elements $A_i$ are open subsets of $X$.

Definition 185. A metric space $(X,d)$ is compact if every open covering $\mathcal{C}$ of $X$ contains a finite subcovering of $X$ (fn. 7). A subset $H$ of $(X,d)$ is compact if every open covering of $H$ by open sets of $X$ has a finite subcovering of $H$.

Footnote 7. That is, if every open covering $\mathcal{C}$ of $X$ contains a finite subcollection $\{A_{i_1},A_{i_2},\ldots,A_{i_k}\}$ with $A_{i_j} \in \mathcal{C}$ that also covers $X$.

In order to apply this definition to show that a set $H$ is compact we must examine every open covering of $H$, and hence it is virtually impossible to use it in determining compactness of a set. The exception is the case of a finite subset $H = \{x_1,\ldots,x_m\}$ of a metric space $X$. For if every point $x_n \in H$ is in some open set $A_i \in \mathcal{C}$, then at most $m$ carefully selected sets of $\mathcal{C}$ will have the property that their union contains $H$. Thus any finite subset $H \subset X$ is compact. On the other hand, to show that a set $H$ is not compact, it is sufficient to exhibit a single open covering that cannot be replaced by a finite subcollection that also covers $H$.

Example 186. Let $(X,d) = (\mathbb{R},|\cdot|)$ and $H = \{x \in \mathbb{R} : x \ge 0\} = [0,\infty)$. Let $\mathcal{C} = \{A_n : n \in \mathbb{N}, A_n = (-1,n)\}$ so that every $A_n \subset \mathbb{R}$ and $H \subset \bigcup_{n\in\mathbb{N}} A_n$. If $\{A_{n_1},A_{n_2},\ldots,A_{n_k}\}$ is a finite subcollection, let $M = \max\{n_1,n_2,\ldots,n_k\}$ so $A_{n_j} \subset A_M$ and hence $A_M = \bigcup_{j=1}^{k} A_{n_j}$. However, since $A_M = (-1,M)$ is open, $M \notin A_M$, and hence the real number $M > 0$, which lies in $H$, does not belong to any set of the finite subcollection.
Thus we have provided one particular covering of $H$ by open sets $(-1,n)$ which cannot be replaced by a finite subcollection that also covers $H$. This was sufficient to show that $H$ is not compact.

This example shows that boundedness of a set is a likely necessary condition for compactness (fn. 8). See Figure 4.3.1.

Footnote 8. Unboundedness in this example was really just an application of the Archimedean Theorem 99.

Lemma 187. Let $H \subset X$. If $H$ is compact, then $H$ is bounded.

Proof. Let $x_0 \in H$. Let $A_m = \{x \in X : d(x_0,x) < m\}$. Here we construct an increasing nested sequence of open sets $A_m$ whose countable union contains $H$. That is, $H \subset \bigcup_{m=1}^{\infty} A_m = X$ and $A_1 \subset A_2 \subset \ldots \subset A_m \subset \ldots$ It follows from the definition of compactness that there is a finite number $M$ such that $A_1 \cup A_2 \cup \ldots \cup A_M$ covers $H$. Then $H \subset A_M$ and hence $H$ is bounded.

Example 188. $H = [0,1)$ cannot be covered by a finite subcollection of the sets $A_n = (-1, 1-\frac{1}{n})$ for $n \in \mathbb{N}$. It is simple to see that $H \subset \bigcup_{n\in\mathbb{N}} A_n$ and each $A_n \subset \mathbb{R}$. However, if $\{A_{n_1},A_{n_2},\ldots,A_{n_k}\}$ is a finite subcollection and we let $M = \max\{n_1,n_2,\ldots,n_k\}$, then $A_{n_j} \subset A_M$ and hence $A_M = \bigcup_{j=1}^{k} A_{n_j}$. However, since $A_M$ is open, $1-\frac{1}{M} \notin A_M$, and hence the real number $1-\frac{1}{M} \in H$ does not belong to any set of the finite subcollection.

This example shows that closedness of a set is a likely necessary condition for compactness (fn. 9). See Figure 4.3.2.

Footnote 9. Lack of closedness in this example was really just an application of the Corollary to the Archimedean Theorem 100.

Lemma 189. Let $H \subset X$. If $H$ is compact, then $H$ is closed.

Proof. $H$ is closed $\Leftrightarrow$ $X\backslash H$ is open. Let $x \in X\backslash H$ and construct an increasing nested sequence of open sets $A_k$ around but not including $x$ with the property that their countable union is $X\backslash\{x\}$. That is, $A_k = \{y \in X : d(x,y) > \frac{1}{k}\}$, $k \in \mathbb{N}$, so that $X\backslash\{x\} = \bigcup_{k\in\mathbb{N}} A_k$. Since $x \notin H$, each element of $H$ is in some set $A_k$ by an application of Corollary 100, so that $H \subset \bigcup_{k\in\mathbb{N}} A_k$. Since $H$ is compact, it follows from the definition of compactness that there is a finite $K \in \mathbb{N}$ such that $H \subset \bigcup_{k=1}^{K} A_k = A_K$. In that case, there is an open ball around $x$ such that $B_{\frac{1}{K+1}}(x) \subset X\backslash H$ with $B_{\frac{1}{K+1}}(x) \cap H = \emptyset$. Since $x$ was arbitrary, each point in the complement of $H$ is contained in an open ball in $X\backslash H$. Thus $X\backslash H$ is open, in which case $H$ is closed. See Figure 4.3.3.

Theorem 190. A closed subset of a compact set is compact.

Proof. Let $X$ be compact and $H \subset X$ be closed. Let $\mathcal{C} = \{A_i\}$ be an open covering of $H$. Then $\mathcal{G} = \{A_i\} \cup \{X\backslash H\}$ is an open covering of $X$ since $X\backslash H$ is open (because $H$ is closed). Since $X$ is compact, there exists a finite subcollection $\mathcal{F}$ of $\mathcal{G}$ covering $X$. Since $\mathcal{F}$ also covers $H$, $\mathcal{F}\backslash\{X\backslash H\}$ also covers $H$ and is a subcollection of $\mathcal{C}$. Then $H$ is compact.

Lemmas 187 and 189 provide necessary conditions for a set to be compact. But we would like to have sufficient conditions that guarantee compactness of a set. To that end, Theorem 190 is useful but has limited applicability. The original space has to be compact in order to be able to use it. Are the necessary conditions of Lemmas 187 and 189 in fact sufficient? Not necessarily, as the next example shows.

Example 191. Consider the metric space $(\mathbb{R},d_0)$ with $d_0(x,y) = \min\{|x-y|,1\}$. In this case, $\mathbb{R}$ is bounded since $d_0(x,y) \le 1$, $\forall x,y \in \mathbb{R}$. We also know that $\mathbb{R}$ is closed. It is clear, however, that $\mathbb{R}$ is not compact in $(\mathbb{R},d_0)$ since the collection $\mathcal{A} = \{(-n,n), n \in \mathbb{N}\}$ covers $\mathbb{R}$ but doesn't contain a finite subcollection that also covers $\mathbb{R}$.

Exercise 4.3.1. Show that $d_0$ is a metric on $\mathbb{R}$.

The space $(\mathbb{R},|\cdot|)$ provides a clue as to a set of sufficient conditions to establish compactness. Since Lemmas 187 and 189 apply to any metric space, we know compactness implies boundedness and closedness. But in $(\mathbb{R},|\cdot|)$, boundedness and closedness are equivalent to the Bolzano-Weierstrass property by Theorem 121 (which we attributed to Heine-Borel). Thus, isn't the Bolzano-Weierstrass property sufficient for compactness?
In $(\mathbb{R},|\cdot|)$, this is true, and we now show that the Bolzano-Weierstrass property is also sufficient in any metric space. Before we do this, we begin by formulating compactness in terms of sequences (consistent with the approach we are taking in this chapter).

Definition 192. A subset $H$ of a metric space $X$ is sequentially compact if every sequence in $H$ has a subsequence that converges to a point in $H$.

Next we turn to establishing that the Bolzano-Weierstrass property, sequential compactness, and compactness are equivalent in any metric space.

Theorem 193. Let $(X,d)$ be a metric space. Let $H \subset X$. The following are equivalent: (i) $H$ is compact; (ii) every infinite subset of $H$ has a cluster point; (iii) every sequence in $H$ has a convergent subsequence.

Proof. (Sketch) (i $\Rightarrow$ ii) It is sufficient to prove the contrapositive: if $A \subset H$ has no cluster point, then $A$ must be finite. If $A$ has no cluster point, then $A$ contains all its cluster points (because the set of cluster points is empty and every set contains the empty set). Therefore $A$ is closed. Since $A$ is a closed subset of a compact space $H$, it is compact. For each $x \in A$, $\exists \varepsilon > 0$ such that $B_\varepsilon(x) \cap \{A\backslash\{x\}\} = \emptyset$ since $x$ is not a cluster point of $A$ by Definition 140. Thus, the collection $\{B_\varepsilon(x), x \in A\}$ forms an open covering of $A$. Since $A$ is compact, it is covered by finitely many $B_\varepsilon(x)$. Since each $B_\varepsilon(x)$ contains only one point of $A$, $A$ is finite.

(ii $\Rightarrow$ iii) Given $\langle s_i \rangle$, consider the set $S = \{s_i \in H : i \in \mathbb{N}\}$. If $S$ is finite, then $s^* = s_i$ for infinitely many values of $i$, in which case $\langle s_i \rangle$ has a subsequence that is constant (and hence converges automatically). If $S$ is infinite, then by (ii) it has a cluster point $s^*$. Since $s^*$ is a cluster point, we know by Definition 140 that for every $\varepsilon = \frac{1}{n}$ there exists $s_{i_n} \in B_{\frac{1}{n}}(s^*)$ such that $s_{i_n} \ne s^*$. This allows us to construct a subsequence $\langle s_{i_1},s_{i_2},\ldots \rangle$ which converges to $s^*$ (fn. 10).

(iii $\Rightarrow$ i) First, we show that $\forall \varepsilon > 0$, $\exists$ a finite covering of $H$ (by $\varepsilon$-balls).
Once again, it is sufficient to prove the contrapositive: if for some $\varepsilon > 0$, $H$ has no finite cover by $\varepsilon$-balls, then some sequence in $H$ has no convergent subsequence. If $H$ cannot be covered with a finite number of balls, construct $\langle s_i \rangle$ as follows. Choose any $s \in H$, say $s_1$. Since $B_\varepsilon(s_1)$ is not all of $H$ (which would contradict that $H$ has no finite subcover), choose $s_2 \notin B_\varepsilon(s_1)$. In general, given $\langle s_1,s_2,\ldots,s_n \rangle$, choose $s_{n+1} \notin B_\varepsilon(s_1) \cup B_\varepsilon(s_2) \cup \ldots \cup B_\varepsilon(s_n)$, since these balls don't cover $H$. By construction $d(s_{n+1},s_i) \ge \varepsilon$ for $i = 1,\ldots,n$. Thus, $\langle s_i \rangle$ can have no convergent subsequence. The above procedure can be used to construct a finite subcollection that covers $H$ (fn. 11).

Footnote 10. More specifically, define the subsequence $\langle s_{g(i)} \rangle$ approaching $s^*$ inductively as follows. Choose $i_1$ such that $s_{i_1} \in B_1(s^*)$. Since $s^*$ is a cluster point of the set $S$, it is also a cluster point of the set $S_2 = \{s_i \in S : i \in \mathbb{N}, i \ge 2\}$ obtained by deleting a finite number of elements of $S$. Therefore, there is an element $s_{i_2}$ of $S_2$ which is an element of $B_{\frac{1}{2}}(s^*)$ with $i_2 > i_1$. Continuing by induction, given $i_{n-1}$, choose $i_n > i_{n-1}$ such that $s_{i_n} \in B_{\frac{1}{n}}(s^*)$.

When we put this theorem together with the result that every closed and bounded set has the Bolzano-Weierstrass property, we get a simple criterion for determining compactness of a subset in $(\mathbb{R},|\cdot|)$. In particular, all we have to do is establish that a set in $(\mathbb{R},|\cdot|)$ is closed and bounded to know it is compact.

Corollary 194 (Heine-Borel). Given $(\mathbb{R},|\cdot|)$, $H \subset \mathbb{R}$ is compact iff $H$ is closed and bounded.

Proof. Follows from Theorem 193 together with the Heine-Borel Theorem 121.

Corollary 194 is the "more familiar" version of the Heine-Borel Theorem and can easily be extended to $\mathbb{R}^n$ with the Euclidean metric.

Is there any relation between compactness and completeness? While it may not appear so from their definitions, the next result establishes that they are in fact related.

Lemma 195. Let $(X,d)$ be a metric space.
If $X$ is compact, then it is complete.

Proof. From Theorem 193 every Cauchy sequence has a convergent subsequence. Completeness follows by Lemma 173.

The converse of Lemma 195 does not necessarily hold; that is, it doesn't follow that if $X$ is complete, then it is compact, as Example 191 shows. We need a stronger condition than boundedness to prove an analogue of the Heine-Borel Corollary 194 for general metric spaces. The condition was actually already used in part (iii) of Theorem 193.

Definition 196. A metric space $(X,d)$ is totally bounded if $\forall \varepsilon > 0$, there is a finite covering of $X$ by $\varepsilon$-balls.

Footnote 11. See the Lebesgue Number Lemma, p. 179, of Munkres for this construction. For the case of $X = \mathbb{R}^n$, see Bartle Theorem 23.3, p. 160.

As we can see, the definition of total boundedness is quite similar to Definition 185 of compactness. One might ask how to check if a metric space is totally bounded. Though there is no satisfactory answer in a general metric space, there are various criteria for specific spaces. For instance, total boundedness in $\mathbb{R}^n$ is equivalent to boundedness.

Total boundedness of a metric space implies boundedness, but the converse is not true.

Example 197. While we established that $(\mathbb{R},d_0)$ with $d_0(x,y) = \min\{|x-y|,1\}$ was bounded in Example 191, it is not totally bounded. This follows because all of $\mathbb{R}$ cannot be covered with finitely many balls of radius, say, $\frac{1}{4}$.

Next we establish an analogue of the Heine-Borel Theorem for general metric spaces.

Theorem 198. A metric space $(X,d)$ is compact iff it is complete and totally bounded.

Proof. ($\Rightarrow$) Completeness follows by Lemma 195. Total boundedness follows from Definition 196 given $X$ is compact.

($\Leftarrow$) By Theorem 193 it suffices to show that if $\langle x_n \rangle$ is a sequence in $X$, then there exists a subsequence $\langle x_{g(n)} \rangle$ that converges. Since $X$ is complete, it suffices to construct a subsequence that is Cauchy. Since $X$ is totally bounded, there exist finitely many $\varepsilon = 1$ balls that cover $X$.
At least one of these balls, say $B_1$, contains $x_n$ for infinitely many indices $n$. Let $J_1 \subset \mathbb{N}$ denote the set of all such indices for which $x_n \in B_1$. Next cover $X$ by finitely many $\varepsilon = \frac{1}{2}$ balls. Since $J_1$ is infinite, at least one of these balls, say $B_2$, contains $x_n$ for infinitely many indices $n \in J_1$. Let $J_2 \subset J_1$ denote the set of all such indices for which $n \in J_1$ and $x_n \in B_2$. Using this construction, we obtain a sequence $\langle J_k \rangle$ of infinite sets such that $J_k \supset J_{k+1}$. Choose indices $n_1 < n_2 < \ldots$ with $n_k \in J_k$. If $i,j \ge k$, then $n_i,n_j \in J_k$ and $x_{n_i},x_{n_j}$ are contained in a ball $B_k$ of radius $\frac{1}{k}$. Hence $\langle x_{n_i} \rangle$ is Cauchy. See Figure 4.3.4.

So how does one test for total boundedness? ???EXAMPLE????

4.4 Connectedness

Connectedness of a space is very simple. A space is "disconnected" if it can be broken up into separate globs; otherwise it is connected. More formally,

Definition 199. Let $(X,d)$ be a metric space. $S \subset X$ is disconnected (or separated) if there exists a pair of open sets $T,U$ such that $S \cap U$ and $S \cap T$ are disjoint, non-empty, and have union $S$. $S$ is connected if it is not disconnected. See Figure 4.4.1.

Example 200. (a) Let $(X,d) = (\mathbb{R},|\cdot|)$ and $H = \mathbb{N}$. Then $\mathbb{N}$ is disconnected in $\mathbb{R}$ since we can take $T = \{x \in \mathbb{R} : x > \frac{3}{2}\}$ and $U = \{x \in \mathbb{R} : x < \frac{3}{2}\}$. Then $T \cap \mathbb{N} \ne \emptyset \ne U \cap \mathbb{N}$, $(T \cap \mathbb{N}) \cap (U \cap \mathbb{N}) = \emptyset$, and $\mathbb{N} = (T \cap \mathbb{N}) \cup (U \cap \mathbb{N})$. (b) Let $(X,d) = (\mathbb{R},|\cdot|)$ and $H = \mathbb{Q}_+$. Then $\mathbb{Q}_+$ is disconnected in $\mathbb{R}$ since we can take $T = \{x \in \mathbb{R} : x > \sqrt{2}\}$ and $U = \{x \in \mathbb{R} : x < \sqrt{2}\}$. See Figure 4.4.2.

Theorem 201. $I = [0,1]$ is a connected subset of $\mathbb{R}$.

Proof. Suppose, to the contrary, that there are two disjoint non-empty open sets $A,B$ whose union is $I$. Since $A$ and $B$ are open, they do not consist of a single point. WLOG let $a \in A$ and $b \in B$ such that $0 < a < b < 1$. Let $c = \sup\{x \in A : x < b\}$, which exists by Axiom 3. Since $0 < c < 1$, $c \in A \cup B$. If $c \in A$, then $c \ne b$ and, since $A$ is open, there is a point $a_1 \in A$ with $c < a_1$ such that $[c,a_1]$ is contained in $\{x \in A : x < b\}$. But this contradicts the definition of $c$. A similar argument can be made if $c \in B$.
This result can easily be extended to any interval (open, closed, half open, etc.) of $\mathbb{R}$ and to show that $\mathbb{R}$ itself is connected. Furthermore, it is possible to construct cartesian products of connected sets which are themselves connected (fn. 12).

Footnote 12. See Munkres p. 150.

4.5 Normed Vector Spaces

Before moving on to the next topological concept (continuity), we give an example of a specific type of metric space called a normed vector space. Normed vector spaces are by far the most important type of metric space we will deal with in this book. A normed vector space has features that a metric space doesn't have in general; it possesses a certain algebraic structure. Elements of a vector space (called vectors) can be added, subtracted, and multiplied by a number (called a scalar). See Figure 4.5.1 for the relation between metric spaces and vector spaces.

Definition 202. A vector space (or linear space) is a set $V$ of arbitrary elements (called vectors) on which two operations are defined, vector addition and scalar multiplication, such that $V$ is (i) closed under vector addition (if $u,v \in V$, then $u+v \in V$) and (ii) closed under scalar multiplication (if $a \in \mathbb{R}$ and $v \in V$, then $av \in V$), and which satisfy the following axioms:

C1. $u+v = v+u$, $\forall u,v \in V$
C2. $(u+v)+w = u+(v+w)$, $\forall u,v,w \in V$
C3. $\exists\, 0 \in V$ such that $v+0 = v = 0+v$, $\forall v \in V$
C4. For each $v \in V$, $\exists (-v) \in V$ such that $v+(-v) = 0 = (-v)+v$
C5. $1v = v$, $\forall v \in V$
C6. $a(bv) = (ab)v$, $\forall a,b \in \mathbb{R}$ and $\forall v \in V$
C7. $a(u+v) = au+av$, $\forall a \in \mathbb{R}$ and $\forall u,v \in V$
C8. $(a+b)v = av+bv$, $\forall a,b \in \mathbb{R}$ and $\forall v \in V$

Example 203. $\mathbb{R}$ is the simplest vector space. The elements are real numbers, where "$+$" and "$\cdot$" were introduced in Axiom 1. $\mathbb{R}^2$ is also a vector space whose basic elements are 2-tuples, say $(x_1,x_2)$. We interpret $(x_1,x_2)$ not as a point in $\mathbb{R}^2$ with coordinates $(x_1,x_2)$ but as a displacement from some location. For instance, the vector $(1,2)$ means move one unit to the right and two units up from your current location. See Figure 4.5.2a for an example of the vector $(1,2)$ from two different initial locations.
Often we take the initial location to be the origin. Vector addition (see Figure 4.5.2b) is then defined as $(x_1,x_2)+(y_1,y_2) = (x_1+y_1,x_2+y_2)$ and scalar multiplication (see Figure 4.5.2c) is defined as $a(x_1,x_2) = (ax_1,ax_2)$. Let $F(X,\mathbb{R})$ be the set of all real valued functions $f : X \to \mathbb{R}$. Then we can define $(f+g)(x) = f(x)+g(x)$ and $(\alpha f)(x) = \alpha f(x)$. These two operations satisfy Axioms C1 to C8 and hence $F(X,\mathbb{R})$ is a vector space. We will consider such sets extensively in Chapter 6.

Definition 204. A vector subspace $U \subset V$ is a subset of $V$ which is a vector space itself.

Example 205. Let $V = \mathbb{R}^2 = \{(x,y) : x,y \in \mathbb{R}\}$ and $U = \{(x,y) : y = 2x$, where $x,y \in \mathbb{R}\}$. Then $U$ is a vector subspace of $V$. Note that $Z = \{(x,y) : y = 2x+1$, where $x,y \in \mathbb{R}\}$ is a subset of $V$ but it is not a vector subspace of $V$ since $0 \notin Z$.

The algebraic structure of a vector space by itself doesn't allow us to measure distance between elements and hence doesn't allow us to define topological properties. This can be accomplished in a vector space through a distance function called the norm.

Definition 206. If $V$ is a vector space, then a norm on $V$ is a function from $V$ to $\mathbb{R}$, denoted $\|\cdot\| : V \to \mathbb{R}$, which satisfies the following properties $\forall u,v \in V$ and $\forall a \in \mathbb{R}$: (i) $\|v\| \ge 0$, (ii) $\|v\| = 0$ iff $v = 0$, (iii) $\|av\| = |a|\|v\|$, and (iv) $\|u+v\| \le \|u\|+\|v\|$. A vector space in which a norm has been defined is called a normed space.

Notice that the algebraic operations of vector addition and scalar multiplication are used in defining a norm. Thus, a norm cannot be defined in a general metric space which is not equipped with these operations. But a vector space equipped with a norm can be seen as a metric space, and a metric space which has a linear structure is also a normed vector space. The following theorem establishes this relationship.
Theorem 207. Let $V$ be a vector space. Then
(i) If $(V,d)$ is a metric space, then $(V,\|\cdot\|)$ is a normed vector space with the norm $\|\cdot\| : V \to \mathbb{R}$ defined by $\|x\| = d(x,0)$, $\forall x \in V$.
(ii) If $(V,\|\cdot\|)$ is a normed vector space, then $(V,\rho)$ is a metric space with the metric $\rho : V \times V \to \mathbb{R}$ defined by $\rho(x,y) = \|x-y\|$, $\forall x,y \in V$.

Exercise 4.5.1. Prove Theorem 207.

Note that whenever a metric space has the additional algebraic structure given in Definition 202, we will use the norm rather than the metric and hence work in normed vector spaces.

Exercise 4.5.2. Redefine convergence, open balls, and boundedness in terms of normed vector spaces.

Definition 208. A complete normed vector space is called a Banach space.

Some vector spaces are endowed with another operation, called an inner (or dot) product, that assigns a real number to each pair of vectors. The inner product enables us to measure the "angle" between elements of a vector space (fn. 13).

Definition 209. If $V$ is a vector space, then an inner product is a function $\langle\cdot,\cdot\rangle : V \times V \to \mathbb{R}$ which satisfies the following properties $\forall u,v,w \in V$ and $\forall a \in \mathbb{R}$: (i) $\langle v,v \rangle \ge 0$, (ii) $\langle v,v \rangle = 0$ iff $v = 0$, (iii) $\langle u,v \rangle = \langle v,u \rangle$, (iv) $\langle u,(v+w) \rangle = \langle u,v \rangle + \langle u,w \rangle$, (v) $\langle (au),v \rangle = a\langle u,v \rangle = \langle u,(av) \rangle$. A vector space in which an inner product has been defined is called an inner product space.

The inner product can be used to define a norm (in particular the Euclidean measure of distance) in the following way.

Theorem 210. Let $V$ be an inner product space and define $\|v\| = \sqrt{\langle v,v \rangle}$. Then $\|\cdot\| : V \to \mathbb{R}$ is a norm which satisfies the Cauchy-Schwarz inequality $\langle u,v \rangle \le \|u\|\|v\|$.

Proof. (Sketch) Since $\langle v,v \rangle \ge 0$ by part (i) of Definition 209, $\sqrt{\langle v,v \rangle}$ exists and is nonnegative, establishing part (i) of Definition 206. Part (ii) also follows from (ii) of Definition 209. By part (v) of Definition 209, $\sqrt{\langle (av),(av) \rangle} = \sqrt{a^2\langle v,v \rangle} = |a|\sqrt{\langle v,v \rangle} = |a|\|v\|$, establishing part (iii). To establish Cauchy-Schwarz, let $w = au-bv$ for $a,b \in \mathbb{R}$ and $u,v \in V$. By Definition 209, $w \in V$.
Then

$$0 \le \langle w,w \rangle = a^2\langle u,u \rangle - 2ab\langle u,v \rangle + b^2\langle v,v \rangle = \|v\|^2\|u\|^2 - 2\|v\|\|u\|\langle u,v \rangle + \|u\|^2\|v\|^2 = 2\|u\|\|v\|\left(\|u\|\|v\| - \langle u,v \rangle\right)$$

where the second equality follows by letting $a = \|v\|$ and $b = \|u\|$, which were free parameters in the first place. If $\|u\|\|v\| > 0$, dividing by $2\|u\|\|v\|$ gives the inequality; if $\|u\|\|v\| = 0$, then $u = 0$ or $v = 0$ and both sides of the inequality are zero.

To get some intuition for this result, notice that if $\theta$ is the angle between vectors $u$ and $v$, then the relationship between the inner product and norms of the vectors is given by $\langle u,v \rangle = \|v\|\|u\|\cos\theta$. The inequality then follows since $\cos\theta \in [-1,1]$. See Figure 4.5.6 for this geometric interpretation of the Cauchy-Schwarz inequality.

Footnote 13. For instance, orthogonality is just $\langle u,v \rangle = 0$.

Exercise 4.5.3. Finish the proof of Theorem 210 (i.e. establish the triangle inequality in part (iv) of Definition 206).

Whereas some norms (e.g. the Euclidean norm) can be induced from an inner product, other norms (e.g. the sup norm) cannot be.

Definition 211. A complete inner product space is called a Hilbert space.

Note that a Hilbert space is also a Banach space.

4.5.1 Convex sets

Definition 212. A linear combination of $x_1,\ldots,x_n \in V$ is any vector $\sum_{i=1}^{n} \alpha_i x_i$ with $\alpha_i \in \mathbb{R}$, $i = 1,\ldots,n$. A convex combination of $x_1,\ldots,x_n \in V$ is any vector $\sum_{i=1}^{n} \alpha_i x_i$ with $\alpha_i \ge 0$, $i = 1,\ldots,n$, and $\sum_{i=1}^{n} \alpha_i = 1$.

Definition 213. A subset $S$ of a vector space $V$ is a convex set if for every $x,y \in S$, the convex combination $\alpha x+(1-\alpha)y \in S$, for $0 \le \alpha \le 1$ (fn. 14).

Footnote 14. For instance, for $x_1,\ldots,x_n \in S$, the convex combination is $\sum_{i=1}^{n} \alpha_i x_i$, where $\sum_{i=1}^{n} \alpha_i = 1$ and $\alpha_i \ge 0$.

Example 214. In $\mathbb{R}$, any interval (e.g. $(a,b)$) is convex, but $(a,b) \cup (c,d)$ with $b < c$ is not convex. See Figure 4.5.3 for convex sets in $\mathbb{R}^2$.

Definition 215. The sum (difference) of two subsets $S_1$ and $S_2$ of a vector space $V$ is $S_1 \pm S_2 = \{v \in V : v = x \pm y,\ x \in S_1,\ y \in S_2\}$. See Figure 4.5.4.

Theorem 216 (Properties of Convex Sets). If $K_1$ and $K_2$ are convex sets, then the following sets are convex: (i) $K_1 \cap K_2$; (ii) $\lambda K_1$; (iii) $K_1 \pm K_2$.

Proof. (iii) Let $x,y \in K_1+K_2$ so that $x = x_1+x_2$, $x_1 \in K_1$, $x_2 \in K_2$, and $y = y_1+y_2$, $y_1 \in K_1$, $y_2 \in K_2$. Then $\alpha x+(1-\alpha)y = \alpha(x_1+x_2)+(1-\alpha)(y_1+y_2) = (\alpha x_1+(1-\alpha)y_1)+(\alpha x_2+(1-\alpha)y_2) \equiv z_1+z_2$. Since $K_1$ and $K_2$ are convex, $z_1 \in K_1$, $z_2 \in K_2$. Thus, $\alpha x+(1-\alpha)y \in K_1+K_2$.

Example 217. It is simple to show cases where $K_1$ and $K_2$ are convex sets, but $K_1 \cup K_2$ is not convex. See Example 214 in $\mathbb{R}$.

As we will see later, convexity of a set is a desirable property. If a set $S$ is not convex, we may replace it with the smallest convex set containing $S$, called the convex hull.

Definition 218. Let $S \subset V$. The convex hull of $S$ is the set of all convex combinations of elements from $S$, denoted $co(S)$. That is, $co(S) = \{x \in V : x = \sum_{i=1}^{n} \alpha_i x_i,\ x_i \in S,\ \alpha_i \ge 0,\ \sum_{i=1}^{n} \alpha_i = 1\}$.

Example 219. In $\mathbb{R}^n$, consider two vectors $A \ne B$. Then $co(\{A,B\})$ is just a line segment with endpoints $A$, $B$. If $A,B,C$ do not lie on the same line, then $co(\{A,B,C\})$ is the triangle $A,B,C$. See Figure 4.5.5.

Exercise 4.5.4. Show that if $V$ is convex, then (i) $co(V) = V$, and (ii) if $S \subset V$, then $co(S) \subset V$ is the smallest convex set containing $S$.

4.5.2 A finite dimensional vector space: $\mathbb{R}^n$

The most familiar vector space is just $\mathbb{R}^n$, with $n \in \mathbb{N}$ and $n < \infty$. $\mathbb{R}^n$ is the collection of all ordered $n$-tuples $(x_1,x_2,\ldots,x_n)$ with $x_i \in \mathbb{R}$, $i = 1,2,\ldots,n$. Vector addition is defined as $(x_1,x_2,\ldots,x_n)+(y_1,y_2,\ldots,y_n) = (x_1+y_1,x_2+y_2,\ldots,x_n+y_n)$ and scalar multiplication is defined as $a(x_1,x_2,\ldots,x_n) = (ax_1,ax_2,\ldots,ax_n)$.

Exercise 4.5.5. Verify $\mathbb{R}^n$ is a vector space under these operations.

Example 220. Since $(\mathbb{R},|\cdot|)$ is a complete metric space with the absolute value metric, it is a Banach space with the norm $\|x\| = |x|$. Since $(\mathbb{R}^n,\|x\|_2)$ is a complete metric space with the Euclidean metric, it is a Banach space with the Euclidean norm $\|x\|_2 = \sqrt{\sum_{i=1}^{n} (x_i)^2}$. Since $(\mathbb{R}^n,\|x\|_\infty)$ is a complete metric space with the supremum metric, it is also a Banach space with the sup norm $\|x\|_\infty = \max\{|x_1|,\ldots,|x_n|\}$.
The next result provides a useful characterization of the relationship between $\|x\|_\infty$ and $\|x\|_2$.

Theorem 221. If $x = (x_1,\ldots,x_n) \in \mathbb{R}^n$, then $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\|x\|_\infty$.

Proof. Since $(\|x\|_2)^2 = \sum_{i=1}^{n} (x_i)^2$, it is clear that $|x_i| \le \|x\|_2$, $\forall i$, and hence $\|x\|_\infty \le \|x\|_2$. Similarly, if $M = \max\{|x_1|,\ldots,|x_n|\}$, then $(\|x\|_2)^2 \le nM^2$, so $\|x\|_2 \le \sqrt{n}M$.

Example 220 shows that $\mathbb{R}^n$ can be endowed with two different norms. One might ask if these two normed vector spaces are somehow related and, if so, in what sense. Distances between two points with respect to these two norms are generally different. See Figure 4.5.7. In this case, these two normed vector spaces are not isometric. On the other hand, these two spaces have identical topological properties like openness, closedness, compactness, connectedness, and continuity. In this case, we say that these two normed vector spaces are homeomorphic or topologically equivalent.

To show that two metric spaces (or normed vector spaces according to Theorem 207) are topologically equivalent, it suffices to show that the collections of open sets in both spaces are identical. This is because all topological properties can be defined in terms of open sets.

The fact that the open sets are identical follows from Theorem 221. To see this, let $A$ be open in $\mathbb{R}^n$ under the Euclidean norm. Then $\forall x \in A$, $\exists \varepsilon > 0$ such that $\{y \in \mathbb{R}^n : \|y-x\|_2 < \varepsilon\} \subset A$. But from the second part of the inequality in Theorem 221, $\|y-x\|_\infty < \varepsilon/\sqrt{n}$ implies $\|y-x\|_2 \le \sqrt{n}\|y-x\|_\infty < \varepsilon$, so $\{y \in \mathbb{R}^n : \|y-x\|_\infty < \varepsilon/\sqrt{n}\} \subset A$. Hence $A$ is open in $\mathbb{R}^n$ under the sup norm. The converse can be shown the same way using the first part of the inequality. We will discuss this at further length after we introduce continuity.

Example 222. In $\mathbb{R}^n$, define $\langle (x_1,x_2,\ldots,x_n),(y_1,y_2,\ldots,y_n) \rangle = x_1y_1+x_2y_2+\ldots+x_ny_n$. $\mathbb{R}^n$ with the inner product defined this way is a Hilbert space.

Exercise 4.5.6. Verify the dot product in Example 222 defines an inner product on $\mathbb{R}^n$.
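The two-sided bound of Theorem 221 and the inner product of Example 222 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`norm_2`, `norm_inf`, `dot`, `within_bounds`), the random sample, and the floating point tolerance are all choices made here, not part of the text. It also shows two vectors for which the lower and the upper bound, respectively, hold with equality.

```python
import math
import random


def norm_2(x):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in x))


def norm_inf(x):
    """Sup norm: largest absolute component."""
    return max(abs(c) for c in x)


def dot(x, y):
    """Inner product of Example 222."""
    return sum(a * b for a, b in zip(x, y))


def within_bounds(x, tol=1e-12):
    """Check ||x||_inf <= ||x||_2 <= sqrt(n) ||x||_inf up to rounding."""
    n = len(x)
    low, mid, high = norm_inf(x), norm_2(x), math.sqrt(n) * norm_inf(x)
    return low <= mid + tol and mid <= high + tol


random.seed(0)
samples = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
all_ok = all(within_bounds(x) for x in samples)

# Equality cases in R^4: a unit coordinate vector and the all-ones vector.
e1 = [1.0, 0.0, 0.0, 0.0]
ones = [1.0, 1.0, 1.0, 1.0]

print(all_ok)
print(norm_inf(e1), norm_2(e1))                      # lower bound is tight
print(norm_2(ones), math.sqrt(4) * norm_inf(ones))   # upper bound is tight
```

The equality cases suggest that neither constant in Theorem 221 can be improved, while `math.sqrt(dot(x, x))` reproduces `norm_2(x)`, in line with Theorem 210.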
Theorem 223. In the Euclidean space $\mathbb{R}^n$ a sequence of vectors $\langle x_m \rangle$ converges to a vector $x = (x_1,\ldots,x_n)$ if and only if each component $\langle x_m^i \rangle$ converges to $x_i$, $i = 1,\ldots,n$.

Exercise 4.5.7. Prove Theorem 223.

We next introduce the simplest kind of convex set in $\mathbb{R}^n$.

Definition 224. A nondegenerate simplex in $\mathbb{R}^n$ is the set of all points

$$S = \left\{x \in \mathbb{R}^n : x = \alpha_0 v_0+\alpha_1 v_1+\ldots+\alpha_n v_n,\ \alpha_0 \ge 0,\ldots,\alpha_n \ge 0 \text{ and } \sum_{i=0}^{n} \alpha_i = 1\right\} \tag{4.2}$$

where $v_0,v_1,\ldots,v_n$ are vectors from $\mathbb{R}^n$ such that $v_1-v_0,v_2-v_0,\ldots,v_n-v_0$ are linearly independent. The vectors $v_0,v_1,\ldots,v_n$ are called vertices. The numbers $\alpha_0,\ldots,\alpha_n$ are called barycentric coordinates (the weights of the convex combinations with respect to the $n+1$ fixed vertices) of the point $x$.

Example 225. A nondegenerate simplex in $\mathbb{R}^1$ is a line segment, in $\mathbb{R}^2$ is a triangle, in $\mathbb{R}^3$ is a tetrahedron. In $\mathbb{R}^3$, for example, the simplex is determined by 4 vertices; any 3 vertices determine a boundary face, and any 2 vertices determine a boundary segment. See Figure 4.5???(4.8.2.???)

A simplex is just the convex hull of the set of all vertices $V = \{v_0,v_1,\ldots,v_n\}$. By the following theorem, any point of the convex hull of $V$ can be expressed as a convex combination of these vertices. Do not confuse the $n+1$ barycentric coordinates (the $\alpha_i$) of $x$ with the $n$ cartesian coordinates of $x$.

Theorem 226 (Caratheodory). If $X \subset \mathbb{R}^n$ and $x \in co(X)$, then $x = \sum_{i=1}^{n+1} \lambda_i x_i$ for some $\lambda_i \ge 0$, $\sum_{i=1}^{n+1} \lambda_i = 1$, $x_i \in X$, $\forall i$.

Proof. (Sketch) Since $x \in co(X)$, it can be written as a convex combination of $m$ points by Theorem ??. If $m \le n+1$, we are done. If not, then the generated vectors $\begin{pmatrix} x_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ 1 \end{pmatrix}, \ldots, \begin{pmatrix} x_m \\ 1 \end{pmatrix}$ are linearly dependent, so a combination of them will be zero (i.e. $\sum_{i=1}^{m} \mu_i \begin{pmatrix} x_i \\ 1 \end{pmatrix} = 0$ with the $\mu_i$ not all zero). If the $\lambda_i$ are the coefficients of the $x_i$, we can choose $\alpha$ to reduce the number of vectors with nonzero coefficients below $m$ by setting $\theta_i \equiv \lambda_i-\alpha\mu_i$.
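The reduction step in the proof sketch of Theorem 226 can be carried out mechanically. The following Python sketch is one possible reading of that step, not the authors' own procedure: the function names `null_vector` and `caratheodory`, the use of exact fractions, and the choice of $\alpha$ as the smallest ratio $\lambda_i/\mu_i$ over indices with $\mu_i > 0$ are assumptions made here for illustration. With that choice every $\theta_i = \lambda_i-\alpha\mu_i$ stays nonnegative and at least one becomes zero, so each pass removes at least one point.

```python
from fractions import Fraction


def null_vector(rows):
    """Return a nonzero v with (rows) v = 0, via Gauss-Jordan elimination.

    Assumes the matrix has more columns than its rank.
    """
    a = [[Fraction(v) for v in r] for r in rows]
    n_rows, n_cols = len(a), len(a[0])
    pivots = []  # pivots[r] is the pivot column of row r
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        p = next((i for i in range(r, n_rows) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        piv = a[r][c]
        a[r] = [v / piv for v in a[r]]
        for i in range(n_rows):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(n_cols) if c not in pivots)
    v = [Fraction(0)] * n_cols
    v[free] = Fraction(1)
    for row, c in zip(a, pivots):
        v[c] = -row[free]
    return v


def caratheodory(points, weights):
    """Rewrite sum(weights[i] * points[i]) using at most n + 1 of the points."""
    pts = [tuple(Fraction(c) for c in p) for p in points]
    lam = [Fraction(w) for w in weights]
    n = len(pts[0])
    keep = [i for i, w in enumerate(lam) if w > 0]
    pts, lam = [pts[i] for i in keep], [lam[i] for i in keep]
    while len(pts) > n + 1:
        # Columns are the vectors (x_i, 1); mu solves sum mu_i (x_i, 1) = 0.
        rows = [[p[k] for p in pts] for k in range(n)] + [[1] * len(pts)]
        mu = null_vector(rows)
        # The mu_i sum to zero and are not all zero, so some mu_i > 0.
        alpha = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - alpha * m for l, m in zip(lam, mu)]  # theta_i
        keep = [i for i, l in enumerate(lam) if l > 0]
        pts, lam = [pts[i] for i in keep], [lam[i] for i in keep]
    return pts, lam


# Five points in R^2 with equal weights; the result uses at most three.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
weights = [Fraction(1, 5)] * 5
kept, lam = caratheodory(points, weights)
print(kept, lam)
```

Because $\sum_i \mu_i x_i = 0$ and $\sum_i \mu_i = 0$, the new weights still sum to one and still represent the same point $x$.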
We know that in $\mathbb{R}^n$ each vector can be written as a linear combination of $n$ linearly independent vectors (called a basis). That is, $x = \sum_{i=1}^{n} \alpha_i x_i$, where $\{x_1,\ldots,x_n\}$ is a basis. There is no restriction on the coefficients $\alpha_i$. In Theorem 226 there are additional assumptions put on the $\alpha_i$ (i.e. $\sum_{i=1}^{n} \alpha_i = 1$ and $\alpha_i \ge 0$). Now adding one more variable (from $n$ to $n+1$) to the system yields a unique solution for vectors belonging to $co(V)$ and no solution for other vectors. The following two examples demonstrate the difference between cartesian coordinates and barycentric coordinates in $\mathbb{R}^2$.

Example 227. Let $V = \{(0,1),(1,0),(1,1)\}$. Say we want to express the vector $(\frac{2}{3},\frac{2}{3})$ as a linear combination of $(0,1)$ and $(1,0)$, two basis vectors. That is, $(\frac{2}{3},\frac{2}{3}) = \frac{2}{3}(1,0)+\frac{2}{3}(0,1)$, but this is not a convex combination since $\frac{2}{3}+\frac{2}{3} = \frac{4}{3} \ne 1$. But any point from $co(V)$ can be uniquely expressed as a convex combination of vectors from $V$. For instance, $(\frac{2}{3},\frac{2}{3}) = \alpha(1,0)+\beta(0,1)+(1-\alpha-\beta)(1,1)$ where $0 \le \alpha,\beta \le 1$. Letting $\alpha = \beta = \frac{1}{3}$, we have $(\frac{2}{3},\frac{2}{3}) = \frac{1}{3}(1,0)+\frac{1}{3}(0,1)+\frac{1}{3}(1,1)$. On the other hand, a vector outside $co(V)$ (like $(\frac{1}{3},0)$) cannot be expressed as a convex combination of vectors from $V$. See Figure 4.5.8.

Example 228. Let the fixed vertices be given by $v_0 = (0,1)$, $v_1 = (0,3)$, $v_2 = (2,0)$ and consider the point $x_1 = (1,1)$ in the interior of the simplex. See Figure 4.5????(4.8.3.) The barycentric coordinates of $x_1$ with respect to the vertices $v_0,v_1,v_2$ are $(\frac{1}{4},\frac{1}{4},\frac{1}{2})$ since $(1,1) = \frac{1}{4}(0,1)+\frac{1}{4}(0,3)+\frac{1}{2}(2,0)$. In the case of $x_2 = (0,2)$, the barycentric coordinates are $(\frac{1}{2},\frac{1}{2},0)$ since $(0,2) = \frac{1}{2}(0,1)+\frac{1}{2}(0,3)+0\cdot(2,0)$. In the case of $x_3 = (2,0)$, the barycentric coordinates are $(0,0,1)$ since $(2,0) = 0\cdot(0,1)+0\cdot(0,3)+1\cdot(2,0)$.
Notice that $x_1$ is an interior point of the simplex so that all its barycentric coordinates are positive, that $x_2$ is on the boundary so that one barycentric coordinate is 0, and that $x_3$ is a vertex so that it has 2 barycentric coordinates which are zero.

In this section, we will always mean by $\alpha_i$ the barycentric coordinates of a point inside the fixed simplex (including boundary points).

The next result, while purely combinatorial, will be used in the proof of Brouwer's Fixed Point Theorem 302. While this can be proven for $\mathbb{R}^n$, here we present it for $\mathbb{R}^2$. First we must introduce an indexing scheme for points in the simplex as follows. Let $Z$ be a set of labels in $\mathbb{R}^2$ given by $\{0,1,2\}$. While the index function $I : S \to Z$ can take any value from $Z$ for points $x$ inside the simplex, it must satisfy the following restrictions on the boundary:

$$I(x) = \begin{cases} 0 \text{ or } 1 & \text{on the line segment } (v_0,v_1) \\ 0 \text{ or } 2 & \text{on the line segment } (v_0,v_2) \\ 1 \text{ or } 2 & \text{on the line segment } (v_1,v_2) \end{cases} \tag{4.3}$$

For example, on the boundary $(v_0,v_1)$, $I(x)$ can't take the value 2. Thus $I(x) = 0$ or 1 on the line segment $(v_0,v_1)$. See Figure 4.5???(4.8.6???). Note that (4.3) implies that $I(v_0) = 0$, $I(v_1) = 1$, and $I(v_2) = 2$ at the vertices.

Lemma 229 (Sperner). Form the barycentric subdivision of a nondegenerate simplex. Label each vertex with an index $I(x) = 0,1,2$ that satisfies the restrictions (4.3) on the boundary. Then there is an odd number of cells (thus at least 1) in the subdivision that have vertices with the complete set of labels 0,1,2.

Proof. By induction on $n$. We show just the first step (i.e. for $n = 1$) to get the idea. If $n = 1$, a nondegenerate simplex is a line segment and a face is a point. To obey the restrictions (4.3), one end has label 0, the other has label 1, and the rest is arbitrary. See Figure 4.5(4.8.11????). Next define a counting function $F$, where by $F(a,b)$ we mean the number of elements in the simplex of type $(a,b)$.
For example, in Figure 4.5(4.8.11????), $F(0,0) = 2$, $F(0,1) = 3$, $F(1,1) = 1$, $F(0) = 4$, $F(1) = 3$. Permutations don't matter (i.e. $(0,1)$ and $(1,0)$ are the same type, which is why $F(0,1) = 3$: we have two occurrences of $(0,1)$ and one of $(1,0)$). Consider the single points labeled 0. Two labels 0 occur in each cell of type $(0,0)$, and one label 0 occurs in each cell of type $(0,1)$. The sum $2F(0,0) + F(0,1)$ counts every interior 0 twice, since every interior 0 is a point shared by two cells, and counts every boundary 0 once. Therefore
\[ 2F(0,0) + F(0,1) = 2F_i(0) + F_b(0), \]
where $F_i(0)$ is the number of interior 0's and $F_b(0)$ is the number of boundary 0's. Clearly $F_b(0) = 1$. Hence
\[ F(0,1) = 2[F_i(0) - F(0,0)] + 1. \tag{4.4} \]
In Figure 4.5(4.8.9????) these numbers are $F(0,1) = 3$, $F_i(0) = 3$, $F(0,0) = 2$. In one dimension the number of cells having vertices with the complete set of labels $0,1$ is $F(0,1)$, and from (4.4) we see that it is always an odd number.

Example 230 The values of the counting function $F$ for the simplex in Figure 4.5(4.8.9????) are:
$F(0) = 7$, $F(0,0) = 4$, $F(1,1) = 8$, $F(0,0,0) = 0$
$F(1) = 9$, $F(0,1) = 16$, $F(1,2) = 8$, $F(0,0,1) = 5$
$F(2) = 5$, $F(0,2) = 8$, $F(2,2) = 1$, $F(0,0,2) = 2$
$F(0,1,1) = 6$, $F(0,2,2) = 1$, $F(1,1,1) = 2$, $F(2,2,2) = 0$
$F(0,1,2) = 7$, $F(1,1,1) = 2$, $F(1,2,2) = 0$

4.5.3 Series

The fact that a normed vector space is the synthesis of two structures, topological and algebraic, enables us to introduce the notion of an infinite sum (i.e. a sum containing infinitely many terms). These objects are called series. As we will see in the subsection on $\ell^p$ spaces, norms will be defined in terms of functions of infinite sums, so understanding when they converge or diverge is critical.

Let $(V,\|\cdot\|)$ be a normed vector space and let $\langle x_n \rangle$ be a sequence in $V$. We can define a new sequence $\langle y_n \rangle$ by
\[ y_n = \sum_{i=1}^{n} x_i. \]
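The sequence $\langle y_n \rangle$ just defined can be generated incrementally, one term at a time. The sketch below is illustrative only: `partial_sums` is a hypothetical helper name, and the two test sequences ($x_n = 1/2^n$ and $x_n = 1/n$) are chosen here because they are the ones treated in Examples 231 and 232. Exact rational arithmetic is used so that the values can be compared with closed forms without rounding error.

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(term, n):
    """Return [y_1, ..., y_n] where y_k = term(1) + ... + term(k).

    `term` is any function of the index i = 1, 2, ...; the name is hypothetical.
    """
    return list(accumulate(term(i) for i in range(1, n + 1)))


if __name__ == "__main__":
    geometric = partial_sums(lambda i: Fraction(1, 2 ** i), 10)
    harmonic = partial_sums(lambda i: Fraction(1, i), 64)
    # The geometric sums approach 1; the harmonic sums keep growing.
    print([str(y) for y in geometric[:3]], float(geometric[-1]))
    print([float(harmonic[2 ** k - 1]) for k in range(7)])
```

A finite computation can only suggest convergence or divergence; the grouping bound $y_{2^k} \ge 1 + k/2$ for the harmonic sums is what actually proves unboundedness.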
The sequence $\langle y_n \rangle$ is called the sequence of partial sums of $\langle x_n \rangle$. Since $V$ is also a metric space, we can ask if $\langle y_n \rangle$ is convergent (i.e. if there exists an element $y \in V$ such that $\langle y_n \rangle \to y$, or equivalently $\|y_n - y\| \to 0$). If such an element exists we say that the series $\sum_{i=1}^{\infty} x_i$ is convergent and write $y = \sum_{i=1}^{\infty} x_i$. If $\langle y_n \rangle$ is not convergent, we say that $\sum_{i=1}^{\infty} x_i$ is divergent.

Example 231 Consider $(\mathbb{R},|\cdot|)$ and let $\langle x_n \rangle = \langle \tfrac{1}{2^n} \rangle$, which is just a geometric sequence with quotient $\tfrac{1}{2}$. The sequence of partial sums is $y_1 = \tfrac{1}{2}$, $y_2 = \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{3}{4}$, $y_3 = \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} = \tfrac{7}{8}$, ...,
\[ y_n = \frac{1}{2} + \frac{1}{4} + \dots + \frac{1}{2^n} = \frac{1}{2}\left( \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} \right) = 1 - \frac{1}{2^n}. \]
Since $\langle y_n \rangle = \langle 1 - \tfrac{1}{2^n} \rangle \to 1$, we write $\sum_{i=1}^{\infty} \tfrac{1}{2^i} = 1$.

Example 232 While we have already seen that $\langle \tfrac{1}{n} \rangle$ converges (to 0), the harmonic series $\sum_{n=1}^{\infty} \tfrac{1}{n}$ diverges (i.e. its partial sums are not bounded). To see this, note
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \dots \ge 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \dots = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \dots \]
The right hand side is the sum of infinitely many halves, which is not bounded.

Elements of a series can also be functions. We will deal with series of functions in Chapter 6.

4.5.4 An infinite dimensional vector space: $\ell^p$

The example in the above subsection is of a finite dimensional vector space; that is, the Euclidean space $\mathbb{R}^n$ with either norm (there are at most $n$ linearly independent vectors in $\mathbb{R}^n$). Now we introduce an infinite dimensional vector space. As you will see, some results from finite dimensional vector spaces do not generalize to infinite dimensional vector spaces.

Definition 233 Let $\mathbb{R}^{\omega}$ be the set of all sequences in $\mathbb{R}$.$^{15}$ Let $1 \le p < \infty$ and let $\ell^p$ be the subset of $\mathbb{R}^{\omega}$ whose elements satisfy $\sum_{i=1}^{\infty} |x_i|^p < \infty$. The $\ell^p$-norm of a vector $x \in \ell^p$ is defined by
\[ \|x\|_p = \left( \sum_{i=1}^{\infty} |x_i|^p \right)^{\frac{1}{p}} \quad \text{for } 1 \le p < \infty, \]
and $\ell^{\infty}$ is the subset of all bounded sequences equipped with the norm $\|x\|_{\infty} = \sup\{|x_1|,\dots,|x_n|,\dots\}$.
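To get a feel for the $\ell^p$ norms, one can evaluate them on longer and longer truncations of a sequence. The sketch below is an illustration under stated assumptions: `lp_norm` and `truncated_norms` are hypothetical names, the sequence $x_n = 1/n$ is chosen here for illustration, and a truncation can only suggest (never prove) whether the infinite sum is finite.

```python
import math


def lp_norm(x, p):
    """p-norm of a finite list x; pass math.inf for the sup norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def truncated_norms(term, p, sizes):
    """p-norms of the first N terms of the sequence term(1), term(2), ... for each N in sizes."""
    return [lp_norm([term(n) for n in range(1, N + 1)], p) for N in sizes]


if __name__ == "__main__":
    sizes = [10, 100, 1000, 10000]
    harmonic_term = lambda n: 1.0 / n
    # p = 1: the truncated norms keep growing (roughly like log N).
    print(truncated_norms(harmonic_term, 1, sizes))
    # p = 2: the truncated norms stay below pi / sqrt(6).
    print(truncated_norms(harmonic_term, 2, sizes))
    # p = infinity: the sup norm is attained at the first term.
    print(truncated_norms(harmonic_term, math.inf, sizes))
```

The contrast between the first two rows is the numerical shadow of the fact that $\langle \tfrac{1}{n} \rangle$ has a divergent sum of absolute values but a convergent sum of squares.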
15 Recall, $\mathbb{R}^{\omega} = \{f : \mathbb{N} \to \mathbb{R}\}$ where $\omega = card(\mathbb{N})$.

We note that there is a set of infinitely many linearly independent vectors in $\ell^p$, namely $\{e_i = \langle x_j \rangle, i \in \mathbb{N}$, where $x_j = 0$ for $i \neq j$ and $x_j = 1$ for $i = j\}$, which is called a basis.

Before proving that $\ell^p$ is a Banach space, we use the following example to illustrate some differences between finite dimensional Euclidean space $\mathbb{R}^n$ and infinite dimensional $\ell^2$. In particular, convergence by components is not sufficient for convergence in $\ell^2$ (i.e. the result of Theorem 223 is not necessarily true).

Example 234 Let $K = \{e_i = \langle x_j \rangle, i \in \mathbb{N}$, where $x_j = 0$ for $i \neq j$ and $x_j = 1$ for $i = j\}$. That is,
\[ \begin{aligned} e_1 &= \langle 1, 0, 0, 0, \dots \rangle \\ e_2 &= \langle 0, 1, 0, 0, \dots \rangle \\ e_3 &= \langle 0, 0, 1, 0, \dots \rangle \\ e_4 &= \langle 0, 0, 0, 1, \dots \rangle \\ &\;\;\vdots \end{aligned} \]
Reading down any column, the entries converge to 0. That is, each component sequence $\langle x^i_n \rangle$ converges to 0 in $(\mathbb{R},|\cdot|)$ for each $i \in \mathbb{N}$. But the sequence $\langle e_i \rangle$ doesn't converge to 0 since $\|e_i - 0\|_2 = 1$, $\forall i \in \mathbb{N}$. In fact $\langle e_i \rangle$ has no convergent subsequence since the distance between any two elements $e_i$ and $e_j$, $i \neq j$, is $\|e_i - e_j\|_2 = \sqrt{2}$. Thus according to Theorem 193, $K$ is not compact in $\ell^2$. But notice that $K$ is both bounded and closed, and these two properties are sufficient for compactness in $\mathbb{R}^n$. Notice that $K$ is not totally bounded. For if $\varepsilon = \tfrac{1}{2}$, the only non-empty subsets of $K$ with diameter less than $\varepsilon$ are the singleton sets. Accordingly, the infinite set $K$ cannot be covered by a finite number of subsets each with diameter less than $\tfrac{1}{2}$.

Now we prove that the $\ell^p$ space is a complete normed vector space (and hence that it is a Banach space for any $p$ satisfying $1 \le p \le \infty$) and that $\ell^2$ is a Hilbert space with the inner product defined by $\langle x,y \rangle = \sum_{i=1}^{\infty} x_i y_i$. First, we need to show that $\|\cdot\|_p$ defines a norm on $\ell^p$, $1 \le p \le \infty$. An important role in investigating $\ell^p$ is played by another space $\ell^q$ whose exponent $q$ is associated with $p$ by the relation $\tfrac{1}{p} + \tfrac{1}{q} = 1$, where $p,q$ are non-negative extended real numbers. Two such numbers are called (mutually) conjugate numbers. If $p = 1$ the conjugate is $q = \infty$ since $\tfrac{1}{1} + \tfrac{1}{\infty} = 1 + 0 = 1$. Also notice that $q = \tfrac{p}{p-1} > 1$ for $p > 1$. If $p = 2$, then $q = 2$.

It is straightforward to show that $\|\cdot\|_p$ satisfies the first three properties of a norm. The triangle property is a tricky one. Before showing it we shall establish some important inequalities.

Lemma 235 Let $a,b > 0$ and $p,q \in (1,\infty)$ with $\tfrac{1}{p} + \tfrac{1}{q} = 1$. Then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}, \]
with equality if $a^p = b^q$.

Proof. Since the exponential function is convex, we have $\exp(\lambda A + (1-\lambda)B) \le \lambda \exp A + (1-\lambda)\exp B$ for any real numbers $A$ and $B$ and any $\lambda \in [0,1]$. By substituting $A = p\log a$, $\lambda = \tfrac{1}{p}$, $B = q\log b$, and $1-\lambda = \tfrac{1}{q}$, we get the desired inequality. See Figure 4.5.9.

The next result is the analogue of Cauchy-Schwarz in infinite dimensions.

Theorem 236 (Hölder inequality) Let $p,q \in [1,\infty]$ with $\tfrac{1}{p} + \tfrac{1}{q} = 1$. If the sequences $\langle x_n \rangle \in \ell^p$ and $\langle y_n \rangle \in \ell^q$, then $\langle x_n y_n \rangle \in \ell^1$ and
\[ \sum_{n=1}^{\infty} |x_n y_n| \le \|\langle x_n \rangle\|_p \, \|\langle y_n \rangle\|_q \quad \left( = \|x\|_p \|y\|_q \right) \tag{4.5} \]
where $x = \langle x_n \rangle$ and $y = \langle y_n \rangle$.

Proof. For $p = 1$, $q = \infty$, we have
\[ \sum_{n=1}^{\infty} |x_n y_n| \le \sup\{|y_n|, n \in \mathbb{N}\} \cdot \sum_{n=1}^{\infty} |x_n| = \|\langle x_n \rangle\|_1 \|\langle y_n \rangle\|_{\infty}. \]
Next, let $p,q \in (1,\infty)$. If $\langle x_n \rangle$ or $\langle y_n \rangle$ is a zero vector, we have equality in (4.5). Now let $\langle x_n \rangle \neq 0$, $\langle y_n \rangle \neq 0$.$^{16}$ Substituting $a = \tfrac{|x_n|}{\|x\|_p}$ and $b = \tfrac{|y_n|}{\|y\|_q}$ in Lemma 235 and summing over $n$, we have
\[ \sum_{n=1}^{\infty} \frac{|x_n|}{\|x\|_p} \cdot \frac{|y_n|}{\|y\|_q} \le \frac{1}{p} \sum_{n=1}^{\infty} \left( \frac{|x_n|}{\|x\|_p} \right)^p + \frac{1}{q} \sum_{n=1}^{\infty} \left( \frac{|y_n|}{\|y\|_q} \right)^q = \frac{1}{p} \cdot \frac{1}{\|x\|_p^p} \cdot \|x\|_p^p + \frac{1}{q} \cdot \frac{1}{\|y\|_q^q} \cdot \|y\|_q^q = \frac{1}{p} + \frac{1}{q} = 1. \]
By multiplying by $\|x\|_p \cdot \|y\|_q$ we get the result.

16 Note this means that not all terms in the sequence equal 0 (i.e. there is at least one term different from 0).

Note that if $p = q = 2$, Inequality (4.5) is called the Cauchy-Schwarz inequality. Now we can prove that the $\|\cdot\|_p$ norm satisfies the triangle inequality.

Theorem 237 (Minkowski) Let $1 \le p \le \infty$ and $x = \langle x_n \rangle$, $y = \langle y_n \rangle \in \ell^p$.
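Then
\[ \|x+y\|_p \le \|x\|_p + \|y\|_p. \tag{4.6} \]
Proof. If $p = 1$ or $p = \infty$, the proof is trivial. Let $p \in (1,\infty)$; we may assume $\|x+y\|_p \neq 0$, since otherwise (4.6) holds trivially. By multiplying both sides of (4.6) by $\|x+y\|_p^{p-1}$ we get the equivalent inequality $\|x+y\|_p^p \le \left( \|x\|_p + \|y\|_p \right) \|x+y\|_p^{p-1}$. Since $\|x+y\|_p^p = \sum_{i=1}^{\infty} |x_i+y_i|^p \le \sum_{i=1}^{\infty} (|x_i| + |y_i|)\,|x_i+y_i|^{p-1}$, it is enough to show that
\[ \sum_{i=1}^{\infty} |x_i|\,|x_i+y_i|^{p-1} + \sum_{i=1}^{\infty} |y_i|\,|x_i+y_i|^{p-1} \le \|x\|_p \cdot \|x+y\|_p^{p-1} + \|y\|_p \cdot \|x+y\|_p^{p-1}. \]
Due to the symmetry of $x,y$, it now suffices to show that
\[ \sum_{i=1}^{\infty} |x_i|\,|x_i+y_i|^{p-1} \le \|x\|_p \cdot \|x+y\|_p^{p-1}. \tag{4.7} \]
Let $z_i = |x_i+y_i|^{p-1}$. Then
\[ \|z\|_q = \left( \sum_{i=1}^{\infty} z_i^q \right)^{\frac{1}{q}} = \left( \sum_{i=1}^{\infty} |x_i+y_i|^{(p-1)q} \right)^{\frac{1}{q}} = \left( \sum_{i=1}^{\infty} |x_i+y_i|^p \right)^{\frac{p-1}{p}} = \|x+y\|_p^{p-1}, \]
where we used the fact that $q(p-1) = p$ and $\tfrac{1}{q} = \tfrac{p-1}{p}$. Now by the Hölder inequality (4.5), we have $\sum_{i=1}^{\infty} |x_i \cdot z_i| \le \|x\|_p \cdot \|z\|_q$, which by plugging in $z_i$ yields $\sum_{i=1}^{\infty} |x_i|\,|x_i+y_i|^{p-1} \le \|x\|_p \cdot \|x+y\|_p^{p-1}$. This is just inequality (4.7).

Now that we have shown that for $1 \le p \le \infty$, $\ell^p$ with $\|\cdot\|_p$ is a normed vector space, we ask "Is it complete?" The answer is yes, as the following theorem shows.

Theorem 238 For $1 \le p \le \infty$, the $\ell^p$ space is a complete normed vector space (i.e. a Banach space).

Proof. First we show it for $1 \le p < \infty$. Let $\langle x_m \rangle$ be a Cauchy sequence in $\ell^p$, where $x_m = \langle \xi_i^{(m)} \rangle$ (note that $\langle x_m \rangle$ is a sequence of sequences) such that $\sum_{i=1}^{\infty} |\xi_i^{(m)}|^p < \infty$ ($m = 1,2,\dots$). Since $\langle x_m \rangle$ is Cauchy with respect to $\|\cdot\|_p$, this means that for $\varepsilon \in (0,1)$, $\exists N$ such that
\[ \|x_m - x_n\|_p = \left( \sum_{i=1}^{\infty} \left| \xi_i^{(m)} - \xi_i^{(n)} \right|^p \right)^{\frac{1}{p}} < \varepsilon, \quad \forall m,n \ge N \tag{4.8} \]
\[ \Longrightarrow \left| \xi_i^{(m)} - \xi_i^{(n)} \right| < \varepsilon, \quad \forall m,n \ge N, \; i = 1,2,\dots \]
This shows that for each fixed $i$ the sequence $\langle \xi_i^{(m)} \rangle_{m=1}^{\infty}$ (the $i$th component of $\langle x_m \rangle$) is a Cauchy sequence in $\mathbb{R}$. Since $(\mathbb{R},|\cdot|)$ is complete, it converges in $\mathbb{R}$. Let $\xi_i^{(m)} \to \xi_i^*$ as $m \to \infty$, which generates a sequence $x = \langle \xi_1^*, \xi_2^*, \dots \rangle$. We must show that $x \in \ell^p$ and $x_m \to x$ with respect to the $\|\cdot\|_p$ norm. From (4.8) we have $\sum_{i=1}^{k} |\xi_i^{(m)} - \xi_i^{(n)}|^p < \varepsilon^p$ for $m,n \ge N$, $k \in \mathbb{N}$.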
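Letting $n \to \infty$ we obtain $\sum_{i=1}^{k} |\xi_i^{(m)} - \xi_i^*|^p \le \varepsilon^p$, $\forall m \ge N$, $k \in \mathbb{N}$, and letting $k \to \infty$ gives
\[ \sum_{i=1}^{\infty} \left| \xi_i^{(m)} - \xi_i^* \right|^p \le \varepsilon^p \le \varepsilon, \quad \forall m \ge N. \tag{4.9} \]
This shows that $x_m - x = \langle \xi_i^{(m)} - \xi_i^* \rangle \in \ell^p$. Since $x_m \in \ell^p$, it follows by the Minkowski Theorem 237 that $\|x\|_p = \|x_m + (x - x_m)\|_p \le \|x_m\|_p + \|x - x_m\|_p < \infty$, so that $x \in \ell^p$. Furthermore, from (4.9) we obtain $\|x_m - x\|_p \le \varepsilon$, $\forall m \ge N$, which means $x_m \to x$ with respect to the $\|\cdot\|_p$ norm. The case $p = \infty$ follows by the same argument with the sums replaced by suprema.

The proof works by taking a Cauchy sequence in $\ell^p$ (say $\langle \langle x_1 \rangle, \langle x_2 \rangle, \dots, \langle x_m \rangle, \dots \rangle$) and showing that a sequence of components (say the first one, $\langle \xi_1^{(1)}, \xi_1^{(2)}, \dots, \xi_1^{(m)}, \dots \rangle$) is also Cauchy in $\mathbb{R}$ (converging to, say, $\xi_1^*$). Then we show the original sequence of sequences converges to the sequence $\langle \xi_1^*, \xi_2^*, \dots, \xi_m^*, \dots \rangle$.

The following theorem shows that $\ell^p$ spaces can be ordered with respect to the set relation "$\subset$". That is, if a sequence belongs to $\ell^1$, then it belongs to $\ell^2$, etc. For example, $\langle \tfrac{1}{n} \rangle \notin \ell^1$, but $\langle \tfrac{1}{n} \rangle \in \ell^p$ for $p > 1$.

Theorem 239 If $1 < p < q < \infty$, then $\ell^1 \subset \ell^p \subset \ell^q \subset \ell^{\infty}$ and $\|\langle x_n \rangle\|_q \le \|\langle x_n \rangle\|_p$.

Proof. Start with $\ell^p \subset \ell^{\infty}$. Let $x \in \ell^p$ (i.e. $\sum_{i=1}^{\infty} |x_i|^p < \infty$), so that $\langle |x_n| \rangle$ is bounded. Then $\sup\{|x_n|, n \in \mathbb{N}\} < \infty$ so that $x \in \ell^{\infty}$. We also have
\[ \forall j, \quad |x_j|^p \le \sum_{i=1}^{\infty} |x_i|^p \iff |x_j| \le \left( \sum_{i=1}^{\infty} |x_i|^p \right)^{\frac{1}{p}}. \]
Therefore $\sup\{|x_j|, j \in \mathbb{N}\} \le \left( \sum_{i=1}^{\infty} |x_i|^p \right)^{\frac{1}{p}}$, or $\|x\|_{\infty} \le \|x\|_p$.

Next we show $\ell^p \subset \ell^q$ for $p < q$ and $1 \le p,q < \infty$. For $x \neq 0$,
\[ \|x\|_q^q = \sum_{i=1}^{\infty} |x_i|^q = \|x\|_p^q \sum_{i=1}^{\infty} \left( \frac{|x_i|}{\|x\|_p} \right)^q \le \|x\|_p^q \sum_{i=1}^{\infty} \left( \frac{|x_i|}{\|x\|_p} \right)^p = \|x\|_p^q \, \frac{\|x\|_p^p}{\|x\|_p^p} = \|x\|_p^q, \]
where the inequality follows since $\tfrac{|x_i|}{\|x\|_p} \le 1$ and $q > p$. Taking the $q$th root of the above inequality gives $\|x\|_q \le \|x\|_p$. Now if $x \in \ell^p$ (i.e. $\|x\|_p < \infty$), then $\|x\|_q < \infty$ and $x \in \ell^q$.

Example 240 Note that the inclusion $\ell^p \subset \ell^q$ for $p < q$ is strict. To see this, consider the sequence $\langle x_n \rangle = \langle \tfrac{1}{n^{1/p}} \rangle_{n=1}^{\infty}$.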
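It is simpler to work with the $p$th power of a norm to avoid using the $p$th root. Hence, take
\[ \left( \left\| \left\langle \tfrac{1}{n^{1/p}} \right\rangle \right\|_p \right)^p = \sum_{n=1}^{\infty} \left( \frac{1}{n^{1/p}} \right)^p = \sum_{n=1}^{\infty} \frac{1}{n}, \]
which is infinitely large (we showed this in the example of a harmonic series). Hence, $\langle \tfrac{1}{n^{1/p}} \rangle_{n=1}^{\infty} \notin \ell^p$. However $\langle \tfrac{1}{n^{1/p}} \rangle_{n=1}^{\infty} \in \ell^q$. To see this, note that
\[ \left( \left\| \left\langle \tfrac{1}{n^{1/p}} \right\rangle \right\|_q \right)^q = \sum_{n=1}^{\infty} \frac{1}{n^{q/p}}, \]
where $\tfrac{q}{p} > 1$, so this series converges (this can be shown by using the integral criterion; see Bartle).

The fundamental difference between $\ell^p$ with $1 \le p < \infty$ and $\ell^{\infty}$ is the behavior of their tails. It is easy to see that for $1 \le p < \infty$, if $x \in \ell^p$ then $\lim_{n \to \infty} \sum_{i=n}^{\infty} |x_i|^p = 0$. The analogous statement is not true in $\ell^{\infty}$. For instance, the sequence $\langle x_i \rangle = \langle 1,1,\dots,1,\dots \rangle \in \ell^{\infty}$ but the norm of its tail is 1. This is the reason why there are properties of $\ell^{\infty}$ that are different from those of $\ell^p$, $1 \le p < \infty$. One of these properties is separability (i.e. the existence of a dense countable subset).

Theorem 241 $\ell^p$ is separable for $1 \le p < \infty$.

Proof. Let $\{e_i, i \in \mathbb{N}\}$ be a basis of unit vectors. Then the set of all finite linear combinations with rational coefficients, $H = \{ \sum_{i=1}^{n} \alpha_i e_i, \alpha_i \in \mathbb{Q}, n \in \mathbb{N} \}$, is countable and dense in $\ell^p$ because if $x = (x_1,x_2,\dots) \in \ell^p$, then the tail of $x$ satisfies
\[ \left\| x - \sum_{i=1}^{n} x_i e_i \right\|_p = \left( \sum_{i=n+1}^{\infty} |x_i|^p \right)^{\frac{1}{p}} \to 0 \quad \text{as } n \to \infty. \]
Thus, after approximating each of the finitely many $x_i$ by a rational number, $x$ is approximated by an element of $H$.

Theorem 242 $\ell^{\infty}$ is not separable.

Proof. Let $S$ be the set of all sequences containing only 0 and 1; that is, $S = \{0,1\}^{\mathbb{N}}$. Clearly $S \subset \ell^{\infty}$, and if $x = \langle x_n \rangle$, $y = \langle y_n \rangle$ are two distinct elements of $S$, then $\|x - y\|_{\infty} = 1$. Hence $B_{\frac{1}{2}}(x) \cap B_{\frac{1}{2}}(y) = \emptyset$ for any $x,y \in S$, $x \neq y$. Let $A$ be a dense set in $\ell^{\infty}$. Then for $\varepsilon = \tfrac{1}{2}$ and given $x \in S \subset \ell^{\infty}$, there exists an element $a \in A$ such that $\|x - a\|_{\infty} < \tfrac{1}{2}$. Since the balls are disjoint, distinct elements of $S$ are matched with distinct elements of $A$. Because $S$ is uncountable, $A$ must be uncountable; thus any dense set in $\ell^{\infty}$ must be uncountable.

4.6 Continuous Functions

Now we return to another important topological concept in mathematics that is employed extensively in economics.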
Before defining continuity, we amend Definition 49 of a function in Section 5.2 in terms of general metric spaces.

Definition 243 A function $f$ from a metric space $(X,d_X)$ into a metric space $(Y,d_Y)$ is a rule that associates to each $x \in X$ a unique $y \in Y$.

Definition 244 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the function $f : X \to Y$ is (pointwise) continuous at $x$ if, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon,x) > 0$ such that if $d_X(x',x) < \delta(\varepsilon,x)$, then $d_Y(f(x),f(x')) < \varepsilon$. The function is continuous if it is continuous at each $x \in X$. See Figure 4.6.1.

Example 245 Let $(X,d_X) = ((-\infty,0) \cup (0,\infty),|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define
\[ f(x) = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases}. \]
Then $f : X \to Y$ is continuous on $(X,d_X)$. See Figure 4.6.2.

Example 246 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = bx$, $b \in \mathbb{R}\backslash\{0\}$. Then $f : X \to Y$ is continuous on $(X,d_X)$ since we can simply let $\delta(\varepsilon,x) = \tfrac{\varepsilon}{|b|}$. Then, for any $\varepsilon > 0$, if $|x' - x| < \delta(\varepsilon,x)$ we have $|bx' - bx| = |b||x' - x| < \varepsilon$. Notice that in the case of linear functions, $\delta$ is independent of $x$. See Figure 4.6.3.

Example 247 Let $(X,d_X) = (\mathbb{R}\backslash\{0\},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = \tfrac{1}{x}$. For any $x \in X$,
\[ |f(x') - f(x)| = \left| \frac{1}{x'} - \frac{1}{x} \right| = \frac{|x' - x|}{|xx'|}. \]
We wish to find a bound for the coefficient of $|x' - x|$ which is valid for $x'$ near $x$. If $|x' - x| < \tfrac{1}{2}|x|$, then $\tfrac{1}{2}|x| < |x'|$, in which case $|f(x') - f(x)| \le \tfrac{2}{|x|^2}|x' - x|$. In this case, $\delta(\varepsilon,x) = \inf\{\tfrac{1}{2}|x|, \tfrac{1}{2}\varepsilon|x|^2\}$. See Figure 4.6.4.

There is an equivalent way to define pointwise continuity in terms of the inverse image (Definition 53) and in terms of sequences.

Theorem 248 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the following statements are equivalent: (i) the function $f : X \to Y$ is continuous; (ii) for each open subset $V$ of $Y$, the set $f^{-1}(V)$ is an open subset of $X$; and (iii) for every convergent sequence $x_i \to x$ in $X$, the sequence $f(x_i) \to f(x)$.

Proof. (Sketch) (ii)$\Rightarrow$(i) Any $\varepsilon$-ball around $f(x)$ is open, so there is a $\delta$-ball around $x$ inside $f^{-1}(B_{\varepsilon}(f(x)))$.
(iii)$\Rightarrow$(ii) If not, then there is an open set $V$ and an $x \in f^{-1}(V)$ such that in every $\tfrac{1}{n}$-neighborhood of $x$ we can find a point $x_n$ such that $f(x_n) \notin V$. But then $\langle x_n \rangle \to x$ while $\langle f(x_n) \rangle$ does not converge to $f(x) \in V$, which contradicts (iii). (i)$\Rightarrow$(iii) From (i), for $x_n$ close enough to $x$, $f(x_n)$ will be as close to $f(x)$ as we want, so that $f(x_n) \to f(x)$.

Examples 245 and 247 go against the "conventional wisdom" that the graph of a continuous function is not interrupted, and may raise the question of the existence of a function that is not continuous. The following example provides such a function.

Example 249 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define$^{17}$
\[ f(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases}. \]
Then $f^{-1}((-\tfrac{1}{2},\tfrac{1}{2})) = \{0\}$; the inverse image of an open set is closed and not open, therefore this function is not continuous in $(X,d_X)$. See Figure 4.6.5.

17 This is known as the "sgn" function.

Next we show that the composition of continuous functions preserves continuity.

Theorem 250 Given metric spaces $(X,d_X)$, $(Y,d_Y)$, and $(Z,d_Z)$, and continuous functions $f : X \to Y$ and $g : Y \to Z$, then $h : X \to Z$ given by $h = g \circ f$ is continuous.

Proof. Let $U \subset Z$ be open. Then $g^{-1}(U)$ is open in $Y$ and $f^{-1}(g^{-1}(U))$ is open in $X$. But $f^{-1}(g^{-1}(U)) = (g \circ f)^{-1}(U)$.

It follows that certain simple operations with continuous functions preserve continuity.

Theorem 251 Given a metric space $(X,d_X)$ and a normed vector space $(Y,d_Y)$, and continuous functions $f : X \to Y$ and $g : X \to Y$, then the following are also continuous: (i) $f \pm g$; (ii) $f \cdot g$; (iii) $\tfrac{f}{g}$; (iv) $|f|$.

Exercise 4.6.1 Prove Theorem 251.

It should be emphasized that Theorem 250 does not say that if $f$ is continuous and $U$ is open in $X$ then the image $f(U) = \{f(x), x \in U\}$ is open in $Y$.

Example 252 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = x^2$. Then $f((-1,1)) = [0,1)$ is the image of an open set which is not open. See Figure 4.6.7.

Therefore continuity does not preserve openness. It does not preserve closedness either, as the next example shows.
Example 253 Let $(X,d_X) = (\mathbb{R}\backslash\{0\},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = \tfrac{1}{x}$. Then $f([1,\infty)) = (0,1]$ is the image of a closed set which is not closed. See Figure 4.6.8.

There are, however, important properties of a set which are preserved under continuous mapping. The next subsections establish this.

4.6.1 Intermediate value theorem

Theorem 254 (Preservation of Connectedness) The image of a connected space under a continuous function is connected.

Proof. Let $f : X \to Y$ be a continuous function on $X$ and let $X$ be connected. We wish to prove that $Z = f(X)$ is connected. Assume the contrary. Then there exist open disjoint sets $A$ and $B$ such that $Z = (A \cap Z) \cup (B \cap Z)$ and $(A \cap Z)$, $(B \cap Z)$ is a separation of $Z$ into two disjoint, non-empty sets in $Z$. Then $f^{-1}(A \cap Z) = f^{-1}(A) \cap f^{-1}(Z) = f^{-1}(A) \cap X = f^{-1}(A)$ and $f^{-1}(B \cap Z) = f^{-1}(B)$ are disjoint sets whose union is $X$ ($= f^{-1}(A \cap Z) \cup f^{-1}(B \cap Z)$). They are open in $X$ because $f$ is continuous, and non-empty because $f : X \to f(X)$ is a surjection. Therefore $f^{-1}(A)$ and $f^{-1}(B)$ form a separation of $X$, which contradicts the assumption that $X$ is connected.

In the special case where the metric space $(Y,d_Y) = (\mathbb{R},|\cdot|)$, the corollary of this theorem is the well-known Intermediate Value Theorem.

Corollary 255 (Intermediate Value Theorem) Let $f : X \to \mathbb{R}$ be a continuous function of a connected space $X$ into $\mathbb{R}$. If $a,b \in X$ and if $r \in \mathbb{R}$ is such that $f(a) \le r \le f(b)$, then $\exists c \in X$ such that $f(c) = r$. See Figure 4.6.9.

Exercise 4.6.2 Prove Corollary 255.

Note that it is connectedness that is required for the Intermediate Value Theorem and not compactness.

Example 256 Let $(X,d_X) = ([-2,-1] \cup [1,2],|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define
\[ f(x) = \begin{cases} 1 & \text{if } x \in [1,2] \\ -1 & \text{if } x \in [-2,-1] \end{cases}. \]
Then $f : X \to Y$ is continuous on the compact set $X$, but for $r = 0$ there doesn't exist $c \in X$ such that $f(c) = 0$.

A nice one dimensional example of how important the intermediate value theorem is for economics is the following fixed point theorem.
Corollary 257 (One Dimensional Brouwer) Let $f : [a,b] \to [a,b]$ be a continuous function. Then $f$ has a fixed point.

Proof. Let $g : [a,b] \to \mathbb{R}$ be defined by $g(x) = f(x) - x$. Clearly $g(a) = f(a) - a \ge 0$ since $f(a) \in [a,b]$, and $g(b) = f(b) - b \le 0$ for the same reason. Since $g(x)$ is a continuous function$^{18}$ with $g(b) \le 0 \le g(a)$, we know by the Intermediate Value Theorem 255 that $\exists x \in [a,b]$ such that $g(x) = 0$, or equivalently that $f(x) = x$.

The proof is illustrated in Figure 4.6.10. For a more general version of this proof, see Section 4.8.

18 To see $g(x)$ is a continuous function, we must show that $\forall \varepsilon > 0$ and $x \in [a,b]$, $\exists \delta_g > 0$ such that if $|x - y| < \delta_g$ then $|g(x) - g(y)| < \varepsilon$. But $|g(x) - g(y)| = |(f(x) - f(y)) - (x - y)| \le |f(x) - f(y)| + |x - y|$ by the triangle inequality. Continuity of $f$ implies that $\forall \varepsilon > 0$, $\exists \delta_f > 0$ such that $|x - y| < \delta_f$ implies $|f(x) - f(y)| < \tfrac{\varepsilon}{2}$. Thus, if we let $\delta_g = \min\{\delta_f, \tfrac{\varepsilon}{2}\}$, then $|g(x) - g(y)| < \varepsilon$.

The next series of examples shows how connectedness of $\mathbb{R}_+$ can be used to construct a continuous "utility" function $u(x)$ that represents a preference relation $\succsim$. Before establishing this, however, we need to define continuity in terms of relations.

Definition 258 The preference relation $\succsim$ on $X$ is continuous if for any sequence of pairs $\langle (x_n,y_n) \rangle_{n=1}^{\infty}$ with $x_n \succsim y_n$ $\forall n$, $x = \lim_{n \to \infty} x_n$, and $y = \lim_{n \to \infty} y_n$, we have $x \succsim y$.

An equivalent way to state this notion of continuity is that $\forall x \in X$, the upper contour set $\{y \in X : y \succsim x\}$ and the lower contour set $\{y \in X : x \succsim y\}$ are both closed; that is, for any $\langle y_n \rangle_{n=1}^{\infty}$ such that $x \succsim y_n$, $\forall n$, and $y = \lim y_n$, we have $x \succsim y$ (just let $x_n = x$, $\forall n$).

There are some preference relations that are not continuous, as the following example shows.

Example 259 Lexicographic preferences (on $X = \mathbb{R}^2_+$) are defined in the following way: $x \succsim y$ if either "$x_1 > y_1$" or "$x_1 = y_1$ and $x_2 \ge y_2$". To see they are not continuous, consider the sequences of bundles $\langle x_n = (\tfrac{1}{n},0) \rangle$ and $\langle y_n = (0,1) \rangle$. For every $n$ we have $x_n \succ y_n$. But $\lim_{n \to \infty} y_n = (0,1) \succ (0,0) = \lim_{n \to \infty} x_n$.
That is, as long as the first component of $x$ is larger than that of $y$, $x$ is preferred to $y$ even if $y_2$ is much larger than $x_2$. But as soon as the first components become equal, only the second components are relevant, so that the preference ranking is reversed at the limit points.

Now we establish that we can "construct" a continuous utility function.

Example 260 If the rational preference relation $\succsim$ on $X$ is continuous, then there is a continuous utility function $u(x)$ that represents $\succsim$. To see this, by continuity of $\succsim$, we know that the upper and lower contour sets are closed. Then the sets $A^+ = \{\alpha \in \mathbb{R}_+ : \alpha e \succsim x\}$ and $A^- = \{\alpha \in \mathbb{R}_+ : x \succsim \alpha e\}$, where $e$ is the unit vector, are nonempty and closed. By completeness of $\succsim$, $\mathbb{R}_+ \subset (A^+ \cup A^-)$. The nonemptiness and closedness of $A^+$ and $A^-$, along with the fact that $\mathbb{R}_+$ is connected, imply $A^+ \cap A^- \neq \emptyset$. Thus, $\exists \alpha$ such that $\alpha e \sim x$. By monotonicity of $\succsim$, $\alpha_1 e \succ \alpha_2 e$ whenever $\alpha_1 > \alpha_2$. Hence, there can be at most one scalar satisfying $\alpha e \sim x$. This scalar is $\alpha(x)$, which we take as the utility function.

4.6.2 Extreme value theorem

The next result is one of the most important ones for economists that we will come across in the book.

Theorem 261 (Preservation of Compactness) The image of a compact set under a continuous function is compact.

Proof. Let $f : X \to Y$ be a continuous function on $X$ and let $X$ be compact. Let $\mathcal{G}$ be an open covering of $f(X)$ by sets open in $Y$. The collection $\{f^{-1}(G), G \in \mathcal{G}\}$ is a collection of sets covering $X$. These sets are open in $X$ because $f$ is continuous. Hence finitely many of them, say $f^{-1}(G_1),\dots,f^{-1}(G_n)$, cover $X$. Then the sets $G_1,\dots,G_n$ cover $f(X)$.

Again in the special case where $(Y,d_Y) = (\mathbb{R},|\cdot|)$, a direct consequence of this theorem is the well known Extreme Value Theorem of calculus.$^{19}$

Corollary 262 (Extreme Value Theorem) Let $f : X \to \mathbb{R}$ be a continuous function of a compact space $X$ into $\mathbb{R}$. Then $\exists c,d \in X$ such that $f(c) \le f(x) \le f(d)$ for every $x \in X$.
$f(c)$ is called the minimum and $f(d)$ is called the maximum of $f$ on $X$.

Proof. Since $f$ is continuous and $X$ is compact, the set $A = f(X)$ is compact. We show that $A$ has a largest element $M$ and a smallest element $m$. Then since $m$ and $M$ belong to $f(X)$, we must have $m = f(c)$ and $M = f(d)$ for some points $c$ and $d$ of $X$. If $A$ has no largest element, then the collection $\{(-\infty,a), a \in A\}$ forms an open covering of $A$. Since $A$ is compact, some finite subcollection $\{(-\infty,a_1),\dots,(-\infty,a_n)\}$ covers $A$. Let $a_M = \max\{a_1,\dots,a_n\}$; then $a_M \in A$ belongs to none of these sets, which contradicts the fact that they cover $A$. A similar argument can be used to show that $A$ has a smallest element.

Exercise 4.6.3 Let $X = [0,1)$ and $f(x) = x$. Why doesn't a maximum exist? See Figure 4.6.11.

4.6.3 Uniform continuity

One might believe from part (iii) of Theorem 248 that if $\langle x_n \rangle$ is Cauchy and if $f$ is continuous, then $\langle f(x_n) \rangle$ is also Cauchy. The following examples show this is false if $f$ is only pointwise continuous.

Example 263 Take the sequence $\langle x_n \rangle = \langle \tfrac{(-1)^n}{n} \rangle$ and consider the function $f$ defined in Example 245. While $\langle x_n \rangle$ is Cauchy in $(-\infty,0) \cup (0,\infty)$, $\langle f(x_n) \rangle = \langle -1,1,-1,1,\dots \rangle$, which is not Cauchy. See Figure 4.6.12.

Example 264 Let $f(x) = \tfrac{1}{x}$ on $(0,1]$, which was shown to be pointwise continuous in Example 247. Consider the Cauchy sequence $\langle \tfrac{1}{n} \rangle$ on $(0,1]$. It is clear that $\langle f(x_n) \rangle = \langle n \rangle$, which is obviously not Cauchy. See Figure 4.6.13.

19 Sometimes this is called the Maximum and Minimum Value Theorem. Since in the next section we will introduce the Maximum Theorem, we choose the above terminology.

For the above intuition to hold, we need a stronger concept of continuity.

Definition 265 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the function $f : X \to Y$ is uniformly continuous if $\forall \varepsilon > 0$, $\exists \delta(\varepsilon) > 0$ such that $\forall x,x' \in X$ with $d_X(x',x) < \delta(\varepsilon)$, we have $d_Y(f(x),f(x')) < \varepsilon$.
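To make the role of $\delta$ concrete, the following sketch evaluates the $\delta(\varepsilon,x)$ found in Example 247 for $f(x) = \tfrac{1}{x}$ and the $\delta(\varepsilon)$ found in Example 246 for $f(x) = bx$. It is an illustration only: the function names are hypothetical, and the sampling check `check_delta` tests finitely many points, so it can refute a candidate $\delta$ but cannot prove that one works.

```python
def delta_reciprocal(eps, x):
    """delta(eps, x) = min(|x|/2, eps*|x|^2/2) from Example 247, for f(x) = 1/x."""
    return min(abs(x) / 2, eps * x * x / 2)


def delta_linear(eps, b):
    """delta(eps) = eps/|b| from Example 246, for f(x) = b*x; it does not depend on x."""
    return eps / abs(b)


def check_delta(f, x, eps, delta, samples=1000):
    """Sample points x' with |x' - x| < delta and test |f(x') - f(x)| < eps.

    Returns False at the first sampled violation, True if none is found.
    """
    for k in range(1, samples):
        for sign in (-1, 1):
            xp = x + sign * delta * k / samples
            if abs(f(xp) - f(x)) >= eps:
                return False
    return True


if __name__ == "__main__":
    reciprocal = lambda t: 1.0 / t
    eps = 0.1
    for x in (1.0, 0.1, 0.01):
        d = delta_reciprocal(eps, x)
        print(x, d, check_delta(reciprocal, x, eps, d))
    # The delta that works at x = 1 is far too large near 0.
    print(check_delta(reciprocal, 0.01, eps, delta_reciprocal(eps, 1.0)))
```

For the reciprocal, the admissible $\delta$ shrinks toward 0 as $x$ approaches 0, so no single $\delta(\varepsilon)$ serves every point of $(0,1]$; for the linear function one value serves all $x$.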
While this definition looks similar to that of pointwise continuity in Definition 244, the difference is that while $\delta$ generally depends on both $\varepsilon$ and $x$ in the case of pointwise continuity, it is independent of $x$ in the case of uniform continuity.

Theorem 266 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, let the function $f : X \to Y$ be uniformly continuous. If $\langle x_n \rangle$ is a Cauchy sequence in $X$, then $\langle f(x_n) \rangle$ is a Cauchy sequence in $Y$.

Proof. Let $\langle x_n \rangle$ be a Cauchy sequence in $X$. Because $f : X \to Y$ is uniformly continuous, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon) > 0$ such that $\forall x,x' \in X$ with $d_X(x',x) < \delta(\varepsilon)$, we have $d_Y(f(x),f(x')) < \varepsilon$. Since $\langle x_n \rangle$ is Cauchy, for the given $\delta(\varepsilon) > 0$ there $\exists N$ such that $\forall m,n \in \mathbb{N}$ with $m,n > N$ we have $d_X(x_m,x_n) < \delta(\varepsilon)$. But then $d_Y(f(x_m),f(x_n)) < \varepsilon$. Hence $\langle f(x_n) \rangle$ is a Cauchy sequence in $Y$.

According to this theorem the functions in Examples 263 and 264 are not uniformly continuous. Notice that the domains of each of the functions in the examples are not compact in $(\mathbb{R},|\cdot|)$. Let's consider another example.

Example 267 Let $f : [0,\infty) \to \mathbb{R}$ be given by $f(x) = x^2$. This function is continuous on $[0,\infty)$. Is it uniformly continuous? No. We show this by finding an $\varepsilon > 0$ such that $\forall \delta > 0$, $\exists x_1,x_2$ such that $d_X(x_1,x_2) < \delta$ and $d_Y(f(x_1),f(x_2)) \ge \varepsilon$. Let $\varepsilon = 2$ and take any $\delta > 0$. Then $\exists n \in \mathbb{N}$ such that $\tfrac{1}{n} < \delta$. Define $x_1 = n + \tfrac{1}{n}$ and $x_2 = n$. Then $d_X(x_1,x_2) = (n + \tfrac{1}{n} - n) = \tfrac{1}{n} < \delta$ and $d_Y(f(x_1),f(x_2)) = (n + \tfrac{1}{n})^2 - n^2 = 2 + \tfrac{1}{n^2} > 2$. Notice that the domain of this function, $[0,\infty)$, is not compact in $(\mathbb{R},|\cdot|)$.

If the domain of a continuous function is compact, then the function is also uniformly continuous, as the following theorem asserts.

Theorem 268 (Uniform Continuity Theorem) Let $f : X \to Y$ be a continuous function of a compact metric space $(X,d_X)$ to the metric space $(Y,d_Y)$. Then $f$ is uniformly continuous.

Proof.
(Sketch) For a given $\varepsilon > 0$, by continuity of $f$ around any $x \in X$ we can find a $\delta(\tfrac{1}{2}\varepsilon,x)$-ball such that for $x' \in B_{\delta(\frac{1}{2}\varepsilon,x)}(x)$ we have $d_Y(f(x),f(x')) < \tfrac{1}{2}\varepsilon$. Since the collection of open balls around each $x$ with half these radii is an open covering of $X$ and $X$ is compact, there exists a finite (say $n$) subcover of them, with centers $x_1,\dots,x_n$. Then for $x,x' \in X$ such that $d_X(x',x) < \delta(\varepsilon) = \tfrac{1}{2}\min\{\delta(\tfrac{1}{2}\varepsilon,x_1),\dots,\delta(\tfrac{1}{2}\varepsilon,x_n)\}$, there exists $k$ such that $x$ lies in the half-radius ball around $x_k$, and hence both $x \in B_{\delta(\frac{1}{2}\varepsilon,x_k)}(x_k)$ and $x' \in B_{\delta(\frac{1}{2}\varepsilon,x_k)}(x_k)$. Therefore by the triangle inequality $d_Y(f(x),f(x')) \le d_Y(f(x),f(x_k)) + d_Y(f(x_k),f(x')) < \varepsilon$.

The number $\delta(\varepsilon)$ that we constructed in the proof of Theorem 268 is called the Lebesgue number of the covering $\mathcal{G}$.

Exercise 4.6.4 Why is $f(x) = \tfrac{1}{x}$ not uniformly continuous on $X = (0,1]$ but it is on $[10^{-1000},1]$?

4.7 Hemicontinuous Correspondences

Many problems in economics result in set-valued mappings or correspondences as defined in Section 2.3. For instance, if preferences are linear, a household's demand for goods may be described by a correspondence, and in game theory we consider best response correspondences.

Before defining hemicontinuity, we amend Definition 48 of a correspondence in Section 2.3 in terms of general metric spaces.

Definition 269 A correspondence $\Gamma$ from a metric space $(X,d_X)$ into a metric space $(Y,d_Y)$ is a rule that associates to each $x \in X$ a subset $\Gamma(x) \subset Y$. Its graph is the set $A = \{(x,y) \in X \times Y : y \in \Gamma(x)\}$, which we will denote $Gr(\Gamma)$. The image of a set $D \subset X$, denoted $\Gamma(D) \subset Y$, is the set $\Gamma(D) = \cup_{x \in D} \Gamma(x)$. A correspondence is closed valued at $x$ if the image set $\Gamma(x)$ is closed in $Y$. A correspondence is compact valued at $x$ if the image set $\Gamma(x)$ is compact in $Y$. See Figure 4.7.1.

Unlike a (single-valued) function, there are two ways to define the inverse image under a correspondence $\Gamma$ of a subset $D$.

Definition 270 For $\Gamma : X \rightrightarrows Y$ and any subset $D \subset Y$ we define the inverse image (also lower or weak inverse image) as $\Gamma^{-1}(D) = \{x \in X : \Gamma(x) \cap D \neq \emptyset\}$ and the core (also upper or strong inverse image) as $\Gamma^{+1}(D) = \{x \in X : \Gamma(x) \subset D\}$.

It is clear that $\Gamma^{+1}(D) \subset \Gamma^{-1}(D)$ when $\Gamma$ is non-empty valued. Also observe that $\Gamma^{+1}(Y \backslash D) = X \backslash \Gamma^{-1}(D)$ and $\Gamma^{-1}(Y \backslash D) = X \backslash \Gamma^{+1}(D)$. See Figure 4.7.2. These two types of inverse image naturally coincide when $\Gamma$ is single-valued.

To make the notion of correspondence clearer we present a number of examples (see Figure 4.7.3a-3f).

Example 271 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
\[ \Gamma(x) = \begin{cases} 1 & \text{if } x < \tfrac{1}{2} \\ \{0,1\} & \text{if } x = \tfrac{1}{2} \\ 0 & \text{if } x > \tfrac{1}{2} \end{cases}. \]

Example 272 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
\[ \Gamma(x) = \begin{cases} 1 & \text{if } x < \tfrac{1}{2} \\ [0,1] & \text{if } x = \tfrac{1}{2} \\ 0 & \text{if } x > \tfrac{1}{2} \end{cases}. \]

Example 273 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by $\Gamma(x) = [x,1]$.

Example 274 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
\[ \Gamma(x) = \begin{cases} [0,\tfrac{1}{2}] & \text{if } x \neq \tfrac{1}{2} \\ [0,1] & \text{if } x = \tfrac{1}{2} \end{cases}. \]

Example 275 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
\[ \Gamma(x) = \begin{cases} [0,1] & \text{if } x \neq \tfrac{1}{2} \\ [0,\tfrac{1}{2}] & \text{if } x = \tfrac{1}{2} \end{cases}. \]

Example 276 $\Gamma : [0,\infty) \rightrightarrows \mathbb{R}$ defined by $\Gamma(x) = [e^{-x},1]$.

We next define a set valued version of continuity.

Definition 277 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the correspondence $\Gamma : X \rightrightarrows Y$ is lower hemicontinuous (lhc) at $x \in X$ if $\Gamma(x)$ is non-empty and if for every open set $V \subset Y$ with $\Gamma(x) \cap V \neq \emptyset$, there exists a neighborhood $U$ of $x$ such that $\Gamma(x') \cap V \neq \emptyset$ for every $x' \in U$. The correspondence is lower hemicontinuous if it is lhc at each $x \in X$.$^{20}$ See Figure 4.7.4.

20 There are various names given to this concept. In many math books, this is called semicontinuity.

Note that the correspondences presented in Examples 273, 275, and 276 are lhc.

As in the case of continuity of a function, there are equivalent characterizations of lhc in terms of open (closed) sets or sequences, as the next theorem shows.

Theorem 278 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, for a correspondence $\Gamma : X \rightrightarrows Y$ the following statements are equivalent: (i) $\Gamma$ is lhc; (ii) $\Gamma^{-1}(V)$ is open in $X$ whenever $V \subset Y$ is open in $Y$; (iii) $\Gamma^{+1}(U)$ is closed in $X$ whenever $U \subset Y$ is closed in $Y$; and (iv) $\forall x \in X$, $\forall y \in \Gamma(x)$ and every sequence $\langle x_n \rangle \to x$, $\exists N$ and a sequence $\langle y_n \rangle \to y$ such that $y_n \in \Gamma(x_n)$, $\forall n \ge N$.

Proof. (i) $\Longleftrightarrow$
(ii) Let $V$ be open in $Y$, $\Gamma^{-1}(V) = \{x \in X : \Gamma(x) \cap V \neq \emptyset\}$, and take $x \in \Gamma^{-1}(V)$. Since $\Gamma$ is lhc at $x$, $\exists U$ open such that $x \in U$ and $\Gamma(x') \cap V \neq \emptyset$ for every $x' \in U$. Hence $U \subset \Gamma^{-1}(V)$ so that $\Gamma^{-1}(V)$ is open.

(ii) $\Longleftrightarrow$ (iii) follows immediately from $X \backslash \Gamma^{-1}(U) = \Gamma^{+1}(Y \backslash U)$.

(i) $\Longleftrightarrow$ (iv) First start with ($\Rightarrow$). Let $\langle x_n \rangle \to x$ and fix an arbitrary point $y \in \Gamma(x)$. For each $k \in \mathbb{N}$, $B_{\frac{1}{k}}(y) \cap \Gamma(x) \neq \emptyset$. Since $\Gamma$ is lhc at $x$, $\forall k$ there exists an open set $U_k$ containing $x$ such that $\forall x'_k \in U_k$ we have $\Gamma(x'_k) \cap B_{\frac{1}{k}}(y) \neq \emptyset$. Since $\langle x_n \rangle \to x$, $\forall k$ we can find $n_k$ such that $x_n \in U_k$, $\forall n \ge n_k$, and they can be assigned so that $n_{k+1} > n_k$. Also, since $x_n \in U_k$, $\forall n \ge n_k$, we have $\Gamma(x_n) \cap B_{\frac{1}{k}}(y) \neq \emptyset$. Hence we can construct a companion sequence $\langle y_n \rangle$, with $y_n$ chosen from the set $\Gamma(x_n) \cap B_{\frac{1}{k}}(y)$ for each $n \ge n_k$. As $k$, and hence $n$, increases, the radius of the balls $B_{\frac{1}{k}}(y)$ shrinks to zero, implying $\langle y_n \rangle \to y$.

Next we prove ($\Leftarrow$). In this case, it is sufficient to prove the contrapositive. Assume $\Gamma$ is not lhc at $x$. Then $\exists V$ open with $\Gamma(x) \cap V \neq \emptyset$ such that every neighborhood $U$ of $x$ contains a point $x'_U$ with $\Gamma(x'_U) \cap V = \emptyset$. Taking a sequence of such neighborhoods, $U_n = B_{\frac{1}{n}}(x)$, and a point in each of them, we obtain a sequence $\langle x_n \rangle$ which converges to $x$ by construction and has the property $\Gamma(x_n) \cap V = \emptyset$. Hence every companion sequence $\langle y_n \rangle$ with $y_n \in \Gamma(x_n)$ is contained in the complement of $V$, and if $\langle y_n \rangle \to y$ then $y$ is contained in the complement of $V$ since $Y \backslash V$ is closed. Thus no companion sequence of $\langle x_n \rangle$ can converge to a point in $V$.

Thus, $\Gamma$ is lhc at $x$ if any $y \in \Gamma(x)$ can be approached by a sequence from both sides. Also, if the correspondence $F$ is a function, then $F^{-1}(U)$ is the inverse image of a function, so (ii) states that $F$ is lhc iff $F$ is continuous.

Definition 279 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the correspondence $\Gamma : X \rightrightarrows Y$ is upper hemicontinuous (uhc) at $x \in X$ if $\Gamma(x)$ is non-empty and if for every open set $V \subset Y$ with $\Gamma(x) \subset V$, there exists a neighborhood $U$ of $x$ such that $\Gamma(x') \subset$
$V$ for every $x' \in U$. The correspondence is upper hemicontinuous if it is uhc at each $x \in X$. See Figure 4.7.5.

The correspondences presented in Examples 271-274 and 276 are uhc. Example 275 is not: at $x = \tfrac12$ the open set $V = (-1, \tfrac34)$ contains $\Gamma(\tfrac12) = [0,\tfrac12]$, but $\Gamma(x') = [0,1] \not\subset V$ for every $x' \neq \tfrac12$.

Again, uhc can be characterized in terms of open (closed) sets or sequences.

Theorem 280 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, for a correspondence $\Gamma : X \rightrightarrows Y$ the following statements are equivalent: (i) $\Gamma$ is uhc; (ii) $\Gamma^{+1}(V)$ is open in $X$ whenever $V \subset Y$ is open in $Y$; (iii) $\Gamma^{-1}(U)$ is closed in $X$ whenever $U \subset Y$ is closed in $Y$; and if $\Gamma$ is compact valued, then (iv) for every sequence $\langle x_n \rangle \to x$ and every sequence $\langle y_n \rangle$ such that $y_n \in \Gamma(x_n)$, $\forall n$, there exists a convergent subsequence $\langle y_{g(n)} \rangle \to y$ with $y \in \Gamma(x)$.

Proof. (Sketch) (i) and $\Gamma$ compact valued $\Rightarrow$ (iv). First, we must show that the companion sequence $\langle y_n \rangle$ is bounded. Since $\langle y_n \rangle$ is bounded, it is contained in a compact set so that, by Theorem 193, there exists a convergent subsequence. Finally, we must show that the limit of this subsequence is in $\Gamma(x)$. (iv) $\Rightarrow$ (i). Again, it is sufficient to prove the contrapositive: if $\Gamma$ is not uhc at $x$, then there is a sequence $\langle x_n \rangle \to x$ with a companion sequence that has no subsequence converging to a point in $\Gamma(x)$.

Exercise 4.7.1 Finish the proof of Theorem 280. [21]

[21] de la Fuente (Theorem 11.2, p. 110).

Thus, $\Gamma$ is uhc at $x$ if any $y \in \Gamma(x)$ can be approached by a sequence from ????. If the correspondence $F$ is a function, then $F^{+1}(U) = F^{-1}(U)$ is the inverse image of the function and so, by (ii), $F$ is uhc iff $F$ is continuous.

Each type of hemicontinuity can be interpreted in terms of restrictions on the "size" of the set $\Gamma(x)$ as $x$ changes.

• Suppose $\Gamma$ is uhc at $x$ and fix an open set $V \supset \Gamma(x)$. As we move from $x$ to a nearby point $x'$, the set $V$ gives an "upper bound" on the size of $\Gamma(x')$ since we require $\Gamma(x') \subset V$. Hence uhc requires that the image set $\Gamma(x)$ does not "explode" with small changes in $x$, but allows it to suddenly "implode".

• Suppose $\Gamma$ is lhc at $x$ and fix an open set $V$ with $\Gamma(x) \cap V \neq \emptyset$. As we move from $x$ to a nearby point $x'$, the set $V$ gives a "lower bound"
on the size of $\Gamma(x')$ since we require $\Gamma(x') \cap V \neq \emptyset$. Hence lhc requires that the image set $\Gamma(x)$ does not "implode" with small changes in $x$, but allows it to suddenly "explode".

Definition 281 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, a correspondence $\Gamma : X \rightrightarrows Y$ is continuous at $x \in X$ if it is both lhc and uhc at $x$.

The correspondences in Examples 273 and 276 are continuous.

Example 282 Consider the following example of a best response correspondence derived from game theory. The game is played between two individuals who can choose between two actions, say go up ($U$) or go down ($D$). If both choose $U$ or both choose $D$, they meet. If one chooses $U$ and the other chooses $D$, they don't meet. Meetings are pleasurable and yield each player payoff 1, while if they don't meet they receive payoff 0. This is known as a coordination game. The players choose probability distributions over the two actions: say player 1 chooses $U$ with probability $p$ and $D$ with probability $1-p$ while player 2 chooses $U$ with probability $q$ and $D$ with probability $1-q$. We represent this game in "normal form" by the matrix in Table 4.7.1. Agent 1's payoff from playing $U$ with probability $p$ while his opponent is playing $U$ with probability $q$ is denoted $\pi_1(p,q)$ and given by

$\pi_1(p,q) = p\,[q \cdot 1 + (1-q) \cdot 0] + (1-p)\,[q \cdot 0 + (1-q) \cdot 1] = 1 - q - p + 2pq.$

Agent 1 chooses $p \in [0,1]$ to maximize $\pi_1(p,q)$. We call the set of such choices the best response correspondence $p^*(q)$. It is simple to see that: for any $q < \tfrac12$, the payoff is decreasing in $p$ so that $p^* = 0$ is a best response; for any $q > \tfrac12$, the payoff is increasing in $p$ so that $p^* = 1$ is a best response; and at $q = \tfrac12$, the payoff is independent of $p$ so that any choice of $p^* \in [0,1]$ is a best response. [22]

[22] To see this, note that $\frac{d\pi_1}{dp} = 2q - 1$, so that $\frac{d\pi_1}{dp} < 0$ if $q < \tfrac12$, $\frac{d\pi_1}{dp} = 0$ if $q = \tfrac12$, and $\frac{d\pi_1}{dp} > 0$ if $q > \tfrac12$.

Obviously, $p^*(q)$ is a correspondence. It is not lhc at $q = \tfrac12$ since if we let $V = (\tfrac14, \tfrac34)$, then $p^*(\tfrac12) \cap V \neq \emptyset$
and there exists no neighborhood $U$ around $\tfrac12$ such that $p^*(q) \cap V \neq \emptyset$ for all $q \in U$ (e.g. for any $\tfrac12 \ge \varepsilon > 0$, $p^*(\tfrac12 - \varepsilon) = 0 \notin (\tfrac14, \tfrac34)$ and $p^*(\tfrac12 + \varepsilon) = 1 \notin (\tfrac14, \tfrac34)$). It is, however, uhc at $q = \tfrac12$ since we must take $V = (a,b)$ with $a < 0$ and $b > 1$ to satisfy $p^*(\tfrac12) = [0,1] \subset V$. But then there exist many neighborhoods $U$ around $\tfrac12$ such that $p^*(q) \subset V$ for $q \in U$ (e.g. for any $\tfrac12 \ge \varepsilon > 0$, $p^*(\tfrac12 - \varepsilon) = 0 \in (a,b)$ and $p^*(\tfrac12 + \varepsilon) = 1 \in (a,b)$). See Figure 4.7.6. Finally, you should recognize that this game is symmetric, so that agent 2's payoffs and hence best response correspondence are identical to those of agent 1.

Table 4.7.1
                          player 2
                       U (q)     D (1-q)
player 1    U (p)      1,1       0,0
            D (1-p)    0,0       1,1

Just as it was cumbersome to apply Definition 185 to establish compactness, it is similarly cumbersome to apply Definitions 277 and 279 to establish hemicontinuity. In the case of compactness, we provided simple sufficient conditions (e.g. the Heine-Borel Corollary 194). Here we supply another set of simple sufficient conditions to establish hemicontinuity.

Theorem 283 Let $\Gamma : X \rightrightarrows Y$ be a non-empty valued correspondence and let $A$ be its graph. If (i) $A$ is convex and (ii) for any bounded set $\hat X \subset X$, there is a bounded set $\hat Y \subset Y$ such that $\Gamma(x) \cap \hat Y \neq \emptyset$, $\forall x \in \hat X$, then $\Gamma$ is lhc at every interior point of $X$.

Proof. Let $\hat x$ be an interior point of $X$, $\hat y \in \Gamma(\hat x)$, and $\langle x_n \rangle \subset X$ with $x_n \to \hat x$. Since $\hat x$ is an interior point, choose $\varepsilon > 0$ such that $\hat X = B_\varepsilon(\hat x) \subset X$. Let $D$ denote the boundary set of $\hat X$. We can represent $x_n$ as a convex combination of $\hat x$ and a point in $D$. That is, $\exists \alpha_n, d_n$ such that $x_n = \alpha_n d_n + (1 - \alpha_n)\hat x$ where $\alpha_n \in [0,1]$ and $d_n \in D$. Since $D$ is a bounded set, $\alpha_n \to 0$ as $x_n \to \hat x$. Choose $\hat Y$ such that $\Gamma(x) \cap \hat Y \neq \emptyset$, $\forall x \in \hat X$. Then for each $n$, choose $\hat y_n \in \Gamma(d_n) \cap \hat Y$ and let $y_n = \alpha_n \hat y_n + (1 - \alpha_n)\hat y$. Since $(d_n, \hat y_n) \in A$, $\forall n$, $(\hat x, \hat y) \in A$, and $A$ is convex, we have $(x_n, y_n) \in A$, $\forall n$. Since $\alpha_n \to 0$ and $\hat y_n \in \hat Y$, $y_n \to \hat y$. Hence $\langle (x_n, y_n) \rangle \subset A$ and converges to $(\hat x, \hat y)$.
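As a numerical companion to the coordination game of Example 282, the following Python sketch encodes the best response $p^*(q)$ as a closed interval and checks, on a small sample, the two neighborhood conditions examined there at $q = \tfrac12$. The function names (`best_response`, `meets`, `inside`), the sample of $\varepsilon$ values, and the particular open set $(-\tfrac{1}{10}, \tfrac{11}{10})$ used for the uhc check are illustrative assumptions, not part of the text. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def payoff(p, q):
    """Agent 1's expected payoff pi_1(p, q) = 1 - q - p + 2pq (Example 282)."""
    return 1 - q - p + 2 * p * q

def best_response(q):
    """Return p*(q) as a closed interval (lo, hi) of payoff-maximizing p in [0, 1]."""
    slope = 2 * q - 1  # d(pi_1)/dp, which does not depend on p
    if slope < 0:
        return (Fraction(0), Fraction(0))
    if slope > 0:
        return (Fraction(1), Fraction(1))
    return (Fraction(0), Fraction(1))

def meets(interval, a, b):
    """True if the closed interval [lo, hi] intersects the open interval (a, b)."""
    lo, hi = interval
    return lo < b and hi > a

def inside(interval, a, b):
    """True if the closed interval [lo, hi] is contained in the open interval (a, b)."""
    lo, hi = interval
    return a < lo and hi < b

half = Fraction(1, 2)
epsilons = [Fraction(1, 10 ** k) for k in range(1, 6)]  # assumed sample of small epsilons

# lhc fails at q = 1/2: V = (1/4, 3/4) meets p*(1/2) but misses p*(1/2 +/- eps).
V = (Fraction(1, 4), Fraction(3, 4))
lhc_fails = meets(best_response(half), *V) and all(
    not meets(best_response(half - e), *V) and not meets(best_response(half + e), *V)
    for e in epsilons
)

# uhc check at q = 1/2 for one open set containing [0, 1] (an assumed choice of a < 0 < 1 < b).
W = (Fraction(-1, 10), Fraction(11, 10))
uhc_holds_on_sample = inside(best_response(half), *W) and all(
    inside(best_response(half - e), *W) and inside(best_response(half + e), *W)
    for e in epsilons
)
```

A finite sample of $\varepsilon$ values only illustrates the neighborhood statements; it does not replace the argument given in the example.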
Theorem 284 Let $\Gamma : X \rightrightarrows Y$ be a non-empty valued correspondence and let $A$ be its graph. If (i) $A$ is closed and (ii) for any bounded set $\hat X \subset X$, the set $\Gamma(\hat X)$ is bounded, then $\Gamma$ is compact valued and uhc.

Proof. Compactness follows directly from (i) and (ii). Let $x_n \to x \in X$ with $\langle x_n \rangle \subset X$. Since $\Gamma$ is non-empty valued, $\exists y_n \in \Gamma(x_n)$, $\forall n$. Since $x_n \to x$, there is a bounded set $\hat X \subset X$ such that $\langle x_n \rangle \subset \hat X$ with $x \in \hat X$ by Theorem 164. Then by (ii), $\Gamma(\hat X)$ is bounded. Hence $\langle y_n \rangle \subset \Gamma(\hat X)$ has a convergent subsequence, $\langle y_{g(n)} \rangle \to y$. Thus, $\langle (x_{g(n)}, y_{g(n)}) \rangle$ is a sequence in $A$ converging to $(x,y)$. Since $A$ is closed, $(x,y) \in A$, that is, $y \in \Gamma(x)$, so $\Gamma$ is uhc by Theorem 280 (iv).

In a future section we will use the following relationship between uhc of a correspondence and the closedness of its graph.

Theorem 285 The graph of a uhc correspondence $\Gamma : X \rightrightarrows Y$ with closed values is closed.

Proof. We have to prove that $X \times Y \setminus Gr(\Gamma)$ is open. Take $(x,y) \in X \times Y \setminus Gr(\Gamma)$ so that $y \notin \Gamma(x)$. Now we can choose an open neighborhood $V_y$ of $y$ in $Y$ and an open neighborhood $V_{\Gamma(x)}$ of $\Gamma(x)$ in $Y$ such that $V_y \cap V_{\Gamma(x)} = \emptyset$. By (ii) of Theorem 280, $U_x = \Gamma^{+1}(V_{\Gamma(x)})$ is an open neighborhood of $x$ in $X$; consequently $U_x \times V_y$ is an open neighborhood of $(x,y)$ in $X \times Y$. Because $(U_x \times V_y) \cap Gr(\Gamma) = \emptyset$ we have $U_x \times V_y \subset X \times Y \setminus Gr(\Gamma)$ and hence $X \times Y \setminus Gr(\Gamma)$ is open. See Figure 4.7.7.

The converse of this theorem doesn't hold, as the following example indicates.

Example 286 Consider the function $F : \mathbb{R} \to \mathbb{R}$ given by $F(x) = \begin{cases} \frac{1}{x} & x \neq 0 \\ 0 & x = 0 \end{cases}$. $F$ has a closed graph but is not uhc, since for an open set $(-\varepsilon, \varepsilon)$ in $\mathbb{R}$, $F^{+1}((-\varepsilon, \varepsilon)) = (-\infty, -\tfrac{1}{\varepsilon}) \cup \{0\} \cup (\tfrac{1}{\varepsilon}, \infty)$, which is not open.

However, if the image $\Gamma(X)$ is compact, or a subset of a compact set, then the converse of Theorem 285 holds (i.e. a closed graph implies uhc). Hence, closedness of the graph can be used as a criterion of uhc. See Figure 4.7.8.

Theorem 287 Let $\Gamma : X \rightrightarrows Y$ be a correspondence such that $\Gamma(X) \subset K$ where $K$ is compact and the graph $Gr(\Gamma)$ is closed. Then $\Gamma$ is uhc.

Proof.
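Assume to the contrary that $\Gamma$ is not uhc at $x_0$. Then there exists an open neighborhood $V_{\Gamma(x_0)}$ of $\Gamma(x_0)$ in $Y$ such that for every open neighborhood $U_{x_0}$ of $x_0$ in $X$ the set $\Gamma(U_{x_0})$ is not contained in $V_{\Gamma(x_0)}$. We take $U_{x_0} = B_{1/n}(x_0)$, $n \in \mathbb{N}$. Then for every $n$ we get a point $x_n \in B_{1/n}(x_0)$ such that $\Gamma(x_n)$ is not contained in $V_{\Gamma(x_0)}$. Let $y_n \in \Gamma(x_n)$ with $y_n \notin V_{\Gamma(x_0)}$. Then we have $\langle x_n \rangle \to x_0$ and $\langle y_n \rangle \subset K$. Since $K$ is compact, there exists a subsequence $\langle y_{g(n)} \rangle \to y \in K$. Since $y_n \notin V_{\Gamma(x_0)}$, $\forall n$, we have $y_n \in Y \setminus V_{\Gamma(x_0)}$. Since $Y \setminus V_{\Gamma(x_0)}$ is closed, $y \in Y \setminus V_{\Gamma(x_0)}$, so that $y \notin V_{\Gamma(x_0)}$. Then we have $\langle (x_{g(n)}, y_{g(n)}) \rangle \subset Gr(\Gamma)$ and $\langle (x_{g(n)}, y_{g(n)}) \rangle \to (x_0, y)$. Since $Gr(\Gamma)$ is closed, $(x_0, y) \in Gr(\Gamma)$, that is, $y \in \Gamma(x_0) \subset V_{\Gamma(x_0)}$. But this contradicts $y \notin V_{\Gamma(x_0)}$.

Now we state a few lemmas that will be very useful in the next chapter.

Definition 288 Let $(X,d)$ be a metric space and $(Y, \|\cdot\|)$ be a normed vector space. Let $\Gamma : X \rightrightarrows Y$ be a correspondence. Then we can define two new correspondences, $\bar\Gamma$ (the closure of $\Gamma$) and $co(\Gamma)$ (the convex hull of $\Gamma$), by the following:
$\bar\Gamma : X \rightrightarrows Y$ given by $\bar\Gamma(x) = \overline{\Gamma(x)}$, $\forall x \in X$;
$co(\Gamma) : X \rightrightarrows Y$ given by $(co(\Gamma))(x) = co(\Gamma(x))$, $\forall x \in X$.
Note that $\bar\Gamma$ is by definition always closed valued and $co(\Gamma)$ is by definition always convex valued.

Example 289 $\Gamma : [0,1] \rightrightarrows \mathbb{R}$ given by $\Gamma(x) = [0,x)$. Then $\bar\Gamma(x) = [0,x]$. See Figure 4.7.9.

Example 290 $\Gamma : [0,1] \rightrightarrows \mathbb{R}$ given by $\Gamma(x) = \{0,x\}$. Then $co(\Gamma(x)) = [0,x]$. See Figure 4.7.10.

Lemma 291 If $\Gamma : X \rightrightarrows Y$ is lhc then $\bar\Gamma$ is also lhc.

Proof. The proof uses the following result: if $G$ is open in $Y$ and if $A \subset Y$, then

$A \cap G \neq \emptyset$ iff $\bar A \cap G \neq \emptyset$. (4.10)

Since $A \cap G \subset \bar A \cap G$, one direction is clear. For the other, let $\bar A \cap G \neq \emptyset$ and take $y \in \bar A \cap G$. Since $y \in \bar A$, $\exists \langle y_n \rangle \to y$ with $y_n \in A$, $\forall n \in \mathbb{N}$. Since $y \in G$ and $G$ is open, $\exists \varepsilon > 0$ such that $B_\varepsilon(y) \subset G$. Since $\langle y_n \rangle \to y$, we have $y_n \in B_\varepsilon(y) \subset G$ for all $n$ sufficiently large. Hence $y_n \in A \cap G$, so that $A \cap G \neq \emptyset$. Now we need to prove that $\bar\Gamma^{-1}(V)$ is open in $X$ if $V$ is open in $Y$. But from (4.10), $\bar\Gamma^{-1}(V) = \Gamma^{-1}(V)$, which is open because $\Gamma$ is lhc.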
Lemma 292 If $Y$ is a normed vector space and $\Gamma : X \rightrightarrows Y$ is lhc, then $co(\Gamma)$ is lhc.

Proof. Let $x \in X$, $\langle x_n \rangle \to x$, and $y \in co(\Gamma(x))$. We need to show that $\exists \langle y_n \rangle$ such that $\langle y_n \rangle \to y$ and $y_n \in co(\Gamma(x_n))$. Since $y \in co(\Gamma(x))$, we can write $y = \sum_{i=1}^m \lambda_i y^i$, where $y^i \in \Gamma(x)$, $\lambda_i \ge 0$, and $\sum_{i=1}^m \lambda_i = 1$. Since $\Gamma$ is lhc, $\exists \langle y^i_n \rangle_{n=1}^\infty$ such that $y^i_n \in \Gamma(x_n)$ and $\langle y^i_n \rangle \to y^i$ for each $i = 1, \dots, m$. Let $y_n = \sum_{i=1}^m \lambda_i y^i_n$. Then $\langle y_n \rangle \to y$ and $y_n \in co(\Gamma(x_n))$.

Given two correspondences $\Gamma_1 : X \rightrightarrows Y$ and $\Gamma_2 : X \rightrightarrows Y$, provided that $\Gamma_1(x) \cap \Gamma_2(x) \neq \emptyset$, $\forall x \in X$, we can define a new correspondence $\Gamma_1 \cap \Gamma_2 : X \rightrightarrows Y$ given by

$(\Gamma_1 \cap \Gamma_2)(x) = \Gamma_1(x) \cap \Gamma_2(x)$.

Also let $(X,d)$ be a metric space and $A \subset X$. The subset $A$ can be expanded by a non-negative factor $\beta$, denoted by $\beta + A$, where

$\beta + A = \bigcup_{a \in A} B_\beta(a) = \{x \in X : d(x, A) < \beta\}$. [23]

See Figure 4.7.11. Then for a correspondence $\Gamma : X \rightrightarrows Y$ where $Y$ is a normed vector space, we have

$\beta + \Gamma(x) = \{y \in Y : \|\Gamma(x) - y\| < \beta\}$,

where $\|\Gamma(x) - y\|$ denotes the distance from $y$ to the set $\Gamma(x)$. We say that $\beta + \Gamma(x)$ is a $\beta$-band around the set $\Gamma(x)$. See Figure 4.7.12.

[23] Note that this distance between a point and a set is defined in 127.

We need the following lemma for Michael's selection theorem, which is critical for the proof of a fixed point of a correspondence.

Lemma 293 If $Y$ is a normed vector space, if a correspondence $F$ is defined by $F(x) = \beta + f(x)$ where $f$ is a continuous function from $X$ to $Y$, and if $\Gamma : X \rightrightarrows Y$ is a lhc correspondence, then $F \cap \Gamma$ is lhc. [24]

[24] Remember that $F$ is a correspondence, not a function; $F(x) = (f(x) - \beta, f(x) + \beta)$, which is an interval for every $x$.

Proof. If $\langle x_n \rangle \to x$ and $y \in F(x) \cap \Gamma(x)$, then $y \in \Gamma(x)$. Since $\Gamma$ is lhc, $\exists \langle y_n \rangle$ such that $y_n \in \Gamma(x_n)$ and $\langle y_n \rangle \to y$. We need to show that $y_n \in F(x_n)$ (i.e. $y_n \in (f(x_n) - \beta, f(x_n) + \beta)$) for $n$ large enough. But by the triangle property of a norm, we have

$\|y_n - f(x_n)\| \le \|y_n - y\| + \|y - f(x)\| + \|f(x) - f(x_n)\|$. (4.11)
The first term is sufficiently small because $\langle y_n \rangle \to y$ and the third term is sufficiently small because $\langle x_n \rangle \to x$ and $f$ is continuous. Since $y \in F(x) = (f(x) - \beta, f(x) + \beta)$, the second term is less than $\beta$. Hence for $n$ large enough, the right hand side of (4.11) is less than $\beta$ and thus $y_n \in F(x_n)$.

4.7.1 Theorem of the Maximum

In economics, often we wish to solve optimization problems where households maximize their utility subject to constraints on their purchases of goods or firms maximize their profits subject to constraints given by their technology. In particular, consider the following example.

Example 294 A household has preferences over two consumption goods $(c_1, c_2)$ characterized by a utility function $U : \mathbb{R}^2_+ \to \mathbb{R}$ given by $U(c_1, c_2) = c_1 + c_2$. The household has a positive endowment of good 2 denoted $\omega \in \mathbb{R}_+$. The household can trade its endowment on a competitive market to obtain good 1, where the price of good 1 in terms of good 2 is given by $p \in \mathbb{R}_+$. The household's purchases are constrained by its income; its budget set is given by $B(p,\omega) = \{(c_1, c_2) \in \mathbb{R}^2_+ : p c_1 + c_2 \le \omega\}$. Taking prices as given, the household's problem is

$v(p,\omega) = \max_{(c_1,c_2) \in B(p,\omega)} U(c_1, c_2)$. (4.12)

The first question we might ask is: does a solution to this problem exist? When is it unique? How does it change as we vary parameters? The maximum theorem gives us an answer to these questions. Before turning to the theorem, let us continue to work with Problem (4.12). First, let us establish properties of the budget set. In particular, we establish that if $p \in \mathbb{R}_{++}$, then $B(p,\omega)$ is a compact-valued, continuous correspondence. In this case, we will establish that the graph of the budget correspondence $A = \{(p,\omega,c_1,c_2) \in \mathbb{R}^2_+ \times \mathbb{R}^2_+ : (c_1,c_2) \in B(p,\omega)\}$ satisfies the conditions of Theorems 283 and 284 only when $p > 0$. It is obviously non-empty since $(0,0) \in B(p,\omega)$ for any $(p,\omega) \in \mathbb{R}^2_+$. The problem is that for any bounded set $\hat X \subset$
$\mathbb{R}^2_+$ of prices and incomes, there may not be a bounded set $\hat Y \subset \mathbb{R}^2_+$ of consumptions. In particular, if $p > 0$, $B(p,\omega)$ is bounded since $0 \le c_2 \le \omega$ and $0 \le c_1 \le \frac{\omega}{p}$, but if $p = 0$, $c_1$ is unbounded. See Figure 4.7.13. Under the assumption that $p > 0$, however, we have that $B(p,\omega)$ is a non-empty, compact valued, continuous correspondence.

Next we establish continuity of the utility function $U$. In particular, we show that $\forall \varepsilon > 0$, $\exists \delta > 0$ such that if

$\sqrt{(c_1 - x_1)^2 + (c_2 - x_2)^2} < \delta$, (4.13)

then

$|U(c_1,c_2) - U(x_1,x_2)| < \varepsilon$. (4.14)

Now rewrite the lhs of (4.14) as

$|c_1 + c_2 - x_1 - x_2| \le |c_1 - x_1| + |c_2 - x_2|$

where the inequality follows from the triangle inequality. If we let $\delta = \frac{\varepsilon}{2}$, then (4.13) gives $|c_i - x_i| < \delta$ for $i = 1,2$, so the right hand side is less than $2\delta = \varepsilon$. Hence (4.13) implies (4.14), establishing continuity of $U$.

It is also instructive to graph the level sets (or "indifference curves") of $U$. These are just given by the equations $c_2 = \bar U - c_1$ in Figure 4.7.14 as we vary $\bar U$. In the same figure we also plot budget sets with $p > 1$, $p = 1$, and $0 < p < 1$. It is simple to see from the figure that the solution, which we denote by $(c_1^*, c_2^*)$, to the household's problem (4.12) is given by the demand correspondence

$(c_1^*, c_2^*) = \begin{cases} (0, \omega) & \text{if } p > 1 \\ (x, \omega - x) \text{ with } x \in [0,\omega] & \text{if } p = 1 \\ (\frac{\omega}{p}, 0) & \text{if } p < 1 \end{cases}$.

That is, if goods 1 and 2 are perfect substitutes for each other from the household's preference perspective, then if good 1 is expensive (inexpensive), the household consumes none of (only) it, while if the two goods are the same price the possibilities are uncountable! Notice that the value function is continuous, increasing in $\omega$, and nonincreasing in $p$:

$v(p,\omega) = \begin{cases} \omega & \text{if } p \ge 1 \\ \frac{\omega}{p} & \text{if } 1 > p > 0 \end{cases}$.

There is a more formal way of establishing the existence of a solution to such mathematical programming problems and how the solution varies with parameters. In general, let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$, $f : X \times Y \to \mathbb{R}$ be a single valued function, $\Gamma : X \rightrightarrows Y$ be a non-empty valued correspondence, and consider the problem $\sup_{y \in \Gamma(x)} f(x,y)$. If for each $x$, $f(x,\cdot)$ is continuous in $y$ and the
set $\Gamma(x)$ is compact, then we know from the Extreme Value Theorem 262 that for each $x$ the maximum is attained. In this case,

$v(x) = \max_{y \in \Gamma(x)} f(x,y)$ (4.15)

is well defined and the set of values $y$ which attain the maximum,

$G(x) = \{y \in \Gamma(x) : f(x,y) = v(x)\}$, (4.16)

is non-empty (but possibly multivalued). The Maximum theorem puts further restrictions on $\Gamma$ to ensure that $v$ and $G$ vary in a continuous way with $x$.

The proof works in the following way. Consider a convergent sequence of parameters $x_n \to x \in X$. By the extreme value theorem, there is a corresponding sequence of optimizing choices $y_n \in G(x_n)$ and, since $\Gamma$ is compact valued and uhc, a subsequence with $y_n \to y$. We must show that the limit $y$ of that sequence is an optimizing choice in the constraint set defined at $x$. There are two parts to demonstrating this result. First we must show that $y$ is in the constraint set ($y \in \Gamma(x)$). Then we must show that $y$ is an optimizing choice in $\Gamma(x)$.

Theorem 295 (Berge's Theorem of the Maximum) Let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$, $f : X \times Y \to \mathbb{R}$ be a continuous function, and $\Gamma : X \rightrightarrows Y$ be a nonempty, compact-valued, continuous correspondence. Then $v : X \to \mathbb{R}$ defined in (4.15) is continuous and the correspondence $G : X \rightrightarrows Y$ defined in (4.16) is nonempty, compact valued, and uhc.

Proof. The Extreme Value Theorem 262 ensures that for each $x$ the maximum is attained and $G(x)$ is nonempty. Since $G(x) \subset \Gamma(x)$ and $\Gamma(x)$ is compact, $G(x)$ is bounded. To show $G(x)$ is closed, we suppose $y_n \to y$ with $y_n \in G(x)$, $\forall n$, and need to show that $y \in G(x)$. [25] Since $\Gamma(x)$ is closed, $y \in \Gamma(x)$. Since $v(x) = f(x,y_n)$, $\forall n$, and $f$ is continuous, $v(x) = f(x,y)$ and $y \in G(x)$. Thus, $G(x)$ is nonempty and compact for each $x$.

To see that $G$ is uhc, let $x_n \to x$ and choose $y_n \in G(x_n)$. We need to show that there exists a convergent subsequence $\langle y_{g(n)} \rangle \to y$ with $y \in G(x)$. Since $\Gamma$ is uhc and compact valued, $\exists \langle y_{g(n)} \rangle$ converging to $y \in \Gamma(x)$ by Theorem 280. Consider an alternative $z \in \Gamma(x)$. Since $\Gamma$ is lhc, $\exists$
$\langle z_{g(n)} \rangle$ converging to $z$ with $z_{g(n)} \in \Gamma(x_{g(n)})$, $\forall g(n)$, by Theorem 278. Since $f(x_{g(n)}, y_{g(n)}) \ge f(x_{g(n)}, z_{g(n)})$, $\forall g(n)$, by optimality and $f$ is continuous, $f(x,y) \ge f(x,z)$. Since this holds for any $z \in \Gamma(x)$, we have $y \in G(x)$, satisfying uhc.

[25] This follows from Definition 111.

To see that $v$ is continuous, fix $x$ and let $x_n \to x$. Choose $y_n \in G(x_n)$, $\forall n$. Let $\bar v = \limsup v(x_n)$ and $\underline v = \liminf v(x_n)$. We can choose a subsequence $\langle x_{g(n)} \rangle$ (with corresponding $\langle y_{g(n)} \rangle$) such that $\bar v = \lim f(x_{g(n)}, y_{g(n)})$. Since $G$ is uhc, $\exists \langle y_{h(g(n))} \rangle$ converging to $y \in G(x)$. Hence $\bar v = \lim f(x_{h(g(n))}, y_{h(g(n))}) = f(x,y) = v(x)$. An analogous argument establishes that $\underline v = v(x)$. Hence $\langle v(x_n) \rangle$ converges to $v(x)$.

The next three examples illustrate the Maximum theorem with simple mathematical problems.

Example 296 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = y$ and $\Gamma : X \rightrightarrows Y$ be given by

$\Gamma(x) = \begin{cases} [0,1] & \text{if } x \le 1 \\ \{\tfrac12\} & \text{if } x > 1 \end{cases}$.

Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then

$v(x) = \begin{cases} 1 & \text{if } x \le 1 \\ \tfrac12 & \text{if } x > 1 \end{cases}$ and $G(x) = \begin{cases} \{1\} & \text{if } x \le 1 \\ \{\tfrac12\} & \text{if } x > 1 \end{cases}$.

Notice that $v(x)$ is not continuous and that $G(x)$ is not uhc. What condition of Theorem 295 did we violate? The constraint correspondence is not continuous; in particular, while $\Gamma$ is uhc, it is not lhc. See Figure 4.7.15.

Example 297 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = \cos(y)$, and $\Gamma : X \rightrightarrows Y$ be given by $\Gamma(x) = \{y \in Y : -x \le y \le x \text{ for } x \ge 0 \text{ and } x \le y \le -x \text{ for } x < 0\}$. Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then $v(x) = 1$, $\forall x$, and

$G(x) = \begin{cases} \{0\} & |x| < 2\pi \\ \{-2\pi, 0, 2\pi\} & 2\pi \le |x| < 4\pi \\ \{-4\pi, -2\pi, 0, 2\pi, 4\pi\} & 4\pi \le |x| < 6\pi \\ \text{etc.} & \text{etc.} \end{cases}$

Notice that $G$ is uhc but not lhc since, for example, if we take $V = (2\pi - \varepsilon, 2\pi + \varepsilon)$ with $2\pi > \varepsilon > 0$, then $G(2\pi) \cap V \neq \emptyset$ but $\forall \delta > 0$ $\exists x' \in B_\delta(2\pi)$ such that $G(x') \cap V = \emptyset$ (in particular all those $x' < 2\pi$). See Figure 4.7.16.

Example 298 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = y^2$ and $\Gamma : X \rightrightarrows Y$ be given by $\Gamma(x) = \{y \in Y : -x \le y \le x \text{ for } x \ge 0 \text{ and } x \le y \le -x \text{ for } x < 0\}$.
Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then $v(x) = x^2$, $\forall x$, and $G(x) = \{-x, x\}$. Notice that $G$ is uhc and lhc but not convex valued. See Figure 4.7.17.

If we put more restrictions on the objective function and the constraint correspondence, we can show that the set of maximizers $G(x)$ is single-valued and continuous.

Theorem 299 Let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$. Let $\Gamma : X \rightrightarrows Y$ be a nonempty, compact- and convex-valued, continuous correspondence. Let $A$ be the graph of $\Gamma$ and assume $f : A \to \mathbb{R}$ is a continuous function and that $f(x,\cdot)$ is strictly concave, for each $x \in X$. [26] If we define $g^*(x) = \arg\max_{y \in \Gamma(x)} f(x,y)$, then $g^*(x)$ is a continuous function. If $X$ is compact, then $g^*(x)$ is uniformly continuous.

[26] We say $f : \mathbb{R} \to \mathbb{R}$ is strictly concave if $f(\alpha x + (1-\alpha) z) > \alpha f(x) + (1-\alpha) f(z)$ for all $x, z \in \mathbb{R}$ with $x \neq z$ and all $\alpha \in (0,1)$.

Exercise 4.7.2 Prove Theorem 299.

We illustrate Theorem 299 through the next exercise.

Exercise 4.7.3 In Example 294 let the utility function $U : \mathbb{R}^2_+ \to \mathbb{R}$ be given by $u(c_1) + u(c_2)$ where $u : \mathbb{R}_+ \to \mathbb{R}$ is a strictly increasing, continuous, strictly concave function. Establish the following: (i) the objective function $U(c_1,c_2)$ is continuous and strictly concave on $\mathbb{R}^2_+$; (ii) the budget correspondence $B(p,y)$ is compact and convex valued; (iii) existence and uniqueness of the set of maximizers; (iv) $v(p,y)$ is increasing in $y$ and decreasing in $p$; (v) $v(p,y)$ is continuous (try this as a proof by contradiction).

4.8 Fixed Points and Contraction Mappings

One way to prove the existence of an equilibrium of an economic environment amounts to showing there is a zero solution to a system of excess demand equations. In the case of Example 294, households take as given the relative price $p$ and optimization may induce a continuous aggregate excess demand correspondence $ED(p)$ for good 1. If there is excess demand (supply), prices rise (fall) until equilibrium ($ED(p) = 0$) is achieved. We may represent this "tatonnement" process by the mapping $f(p) = p + ED(p)$.
In that case, an equilibrium is equivalent to a fixed point $p = f(p)$.

Definition 300 Let $(X,d)$ be a metric space and $f : X \to X$ be a function or correspondence. We call $x \in X$ a fixed point of the function if $x = f(x)$, or of the correspondence if $x \in f(x)$.

We now present four different fixed point theorems based upon different assumptions on the mapping $f$.

4.8.1 Fixed points of functions

The first fixed point theorem does not require continuity of $f$ but uses only the fact that $f$ is nondecreasing.

Theorem 301 (Tarski) Let $a, b \in \mathbb{R}$ with $a < b$ and let $f : [a,b] \to [a,b]$ be a non-decreasing function (that is, if $x > y$ for $x, y \in [a,b]$, then $f(x) \ge f(y)$). Then $f$ has a fixed point.

Proof. Let $P = \{x \in [a,b] : x \le f(x)\}$. We prove this in 4 parts. (i) Since $f(a) \in [a,b]$ implies $a \le f(a)$, we have $a \in P$ and hence $P$ is non-empty. (ii) Since $P \subset [a,b]$ and $[a,b]$ is bounded, $P$ is bounded. Therefore, by the Completeness Axiom 3, $\bar x = \sup P$ exists. (iii) Since $x \le \bar x$, $\forall x \in P$, by (ii), we have $f(x) \le f(\bar x)$ because $f$ is nondecreasing. Since $x \in P$, $x \le f(x) \le f(\bar x)$, so that $f(\bar x)$ is an upper bound of $P$. Therefore, $\bar x \le f(\bar x)$ since $\bar x$ is the least upper bound, and hence $\bar x \in P$. (iv) Since $\bar x \le f(\bar x)$ implies $f(\bar x) \le f[f(\bar x)]$, we know $f(\bar x) \in P$. Therefore, $\bar x \ge f(\bar x)$ since $\bar x$ is an upper bound of $P$. Given that $\bar x \le f(\bar x)$ and $\bar x \ge f(\bar x)$, we know that $\bar x = f(\bar x)$.

Note that we have not ruled out that there may be other points $x'$ such that $x' = f(x')$. If so, then all such points satisfy $x' \in P$. Our solution $\bar x$ is the maximal fixed point.

The proof is illustrated in Figure 4.8.1. For a more general version of this proof, see Aliprantis and Border (1999, Theorem 1.8).

The next result by Brouwer requires that $f$ be a continuous function. We saw a one dimensional version of it in section 4.6 which used the Intermediate Value Theorem 255. That proof was very simple but the method we used there cannot be extended to higher dimensions. As it turns out, proving it in $\mathbb{R}^n$ where $n \ge 2$ is quite difficult.
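The one dimensional case mentioned above can be made concrete with a short Python sketch. It assumes the usual sign-change argument: if $f$ is continuous and maps $[a,b]$ into itself, then $g(x) = f(x) - x$ satisfies $g(a) \ge 0 \ge g(b)$, so the Intermediate Value Theorem gives a zero of $g$, which bisection can locate. The function name `fixed_point_by_bisection`, the tolerance, and the test map $\cos$ on $[0,1]$ are illustrative choices, not taken from the text.

```python
import math

def fixed_point_by_bisection(f, a, b, tol=1e-12):
    """Approximate x in [a, b] with f(x) = x.

    Assumes f is continuous and maps [a, b] into itself, so that
    g(x) = f(x) - x satisfies g(a) >= 0 and g(b) <= 0.
    """
    def g(x):
        return f(x) - x

    lo, hi = a, b
    if g(lo) == 0:
        return lo
    if g(hi) == 0:
        return hi
    # Invariant: g(lo) >= 0 and g(hi) <= 0, so a zero of g lies in [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# cos maps [0, 1] into [cos(1), 1], a subset of [0, 1], and is continuous.
x_star = fixed_point_by_bisection(math.cos, 0.0, 1.0)
residual = abs(math.cos(x_star) - x_star)
```

The interval halving relies on the order structure of the real line, which is one way to see why this method has no direct analogue in $\mathbb{R}^n$ for $n \ge 2$.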
There are proofs that use calculus but we are going to present an elementary one based on simplexes, which were introduced in section 4.5.2. Brouwer's fixed point theorem could be stated for a non-empty convex, compact subset of $\mathbb{R}^n$. Because a nondegenerate simplex is homeomorphic with (i.e. topologically equivalent to) a nonempty, convex, compact subset of $\mathbb{R}^n$, it suffices to state Brouwer's theorem for the simplex.

Exercise 4.8.1 Show that a simplex is homeomorphic with a nonempty, convex, compact subset $P = \{(p_1,p_2) \in \mathbb{R}^2 : 0 \le p_1, p_2 \le M, M \text{ finite}\}$.

For notational simplicity and better intuition we prove Brouwer's theorem in $\mathbb{R}^2$, but this simplification has no effect whatsoever on the logic of the proof. The proof for general $\mathbb{R}^n$ can be replicated with only minor notational changes.

Theorem 302 (Brouwer) If $f(x)$ maps a nondegenerate simplex continuously into itself then there is a fixed point $x^* = f(x^*)$.

Proof. (Sketch) The farther a point is from a vertex, the smaller is its barycentric coordinate. Thus, in Figure 4.8.12, $a$'s largest barycentric coordinate is the first one while $b$'s largest barycentric coordinate is the second one. For a given $f$, we introduce an indexing function $I(x)$ as follows. Let $y = f(x)$ and $I(x) = \min\{i : x_i > y_i\}$. If $b = f(a)$, then $I(a) = 0$ because $a_0 > b_0$ (the arrow connecting $a$ with $f(a)$ points away from the vertex $v_0$). $x^*$ is a fixed point of $f$ if $\alpha^*_i = \beta^*_i$, $i = 0,1,2$, where $\alpha^*_i$ and $\beta^*_i = f_i(x^*)$ are the barycentric coordinates of $x^*$ and $f(x^*)$. See Figure 4.8.4. In the case of barycentric coordinates, instead of equality (4.24) it suffices to show the following inequalities:

$\alpha^*_i \ge \beta^*_i$, $i = 0,1,2$ (4.17)

because $\alpha^*_i \ge 0$, $\beta^*_i \ge 0$ and $\sum_{i=0}^2 \alpha^*_i = 1 = \sum_{i=0}^2 \beta^*_i$. Specifically, if $f$ doesn't have a fixed point, then $I(x)$ is well defined for all $x \in S$ and obtains values 0, 1, or 2 with certain restrictions on the boundary.
Divide the simplex into $m^2$ equal subsimplexes and index all the vertices of the subsimplexes using $I(x)$, obeying the restrictions on the boundary. Sperner's lemma guarantees that for each $m$ there is at least one subsimplex with a complete set of indices (i.e. arrows originating at these vertices point inside the triangle). By choosing one vertex of such a subsimplex for each $m$ we get an infinite sequence of points from $S$; since $S$ is bounded, the sequence is bounded. Hence by the Bolzano-Weierstrass theorem there exists a convergent subsequence with limit point $x^* \in S$. As $m \to \infty$, the triangle collapses into one point (which is $x^*$); at this point all arrows point inside itself. Since $f$ is continuous, it preserves inequalities, so that $x^*$ is a fixed point of $f$.

In Chapter 6, we will introduce an infinite dimensional version of Brouwer's fixed point theorem by Schauder.

Example 303 (On Existence of Equilibrium) Consider the following 2 period ($t = 1,2$) exchange problem with a large number $I$ of identical agents. Let $c^i_t$, $y^i_t$, $q_t$ denote an element (date $t$) of agent $i$'s consumption and endowment vector, as well as the price vector, respectively. Let a representative agent $i$'s budget set be given by $B(q,y^i) = \{c^i \in \mathbb{R}^2 : \sum_{t=1}^2 q_t c^i_t \le \sum_{t=1}^2 q_t y^i_t\}$. Notice that an agent's budget set is homogeneous of degree zero in $q$. [27] That is, $B(\lambda q, y^i) = B(q, y^i)$. Thus, we are free to take $\lambda = \frac{1}{\sum_{t=1}^2 q_t} > 0$ and set $p = \left( \frac{q_1}{\sum_{t=1}^2 q_t}, \frac{q_2}{\sum_{t=1}^2 q_t} \right)$. This defines a one dimensional price simplex $S^1 = \{p \in \mathbb{R}^2_+ : \sum_{t=1}^2 p_t = 1\}$. Let the representative agent $i$'s utility function be given by $U(c^i) = \sum_{t=1}^2 \log(c^i_t)$. Exercise 4.7.3 establishes that $B(p,y^i)$ is a non-empty, compact- and convex-valued continuous correspondence and that $U(c^i)$ is strictly concave. Thus by Theorem 299, a version of the Theorem of the Maximum, the maximizers $\{c^i_1(p,y), c^i_2(p,y)\}$ are single valued and continuous functions.
Since the sum of continuous functions is continuous, the aggregate excess demand function $z : S^1 \to \mathbb{R}^2$ given by $z(p) = \sum_{i=1}^I \left( c^i(p,y) - y^i \right)$ is continuous. It is a consequence of Walras Law that the inner product $\langle p, z(p) \rangle = 0$. To prove existence of equilibrium, we need to show that at the equilibrium price vector $p^*$ there is no excess demand (i.e. $z(p^*) \le 0$). Specifically, we must show that if $z : S^1 \to \mathbb{R}^2$ is continuous and satisfies $\langle p, z(p) \rangle = 0$, then $\exists p^* \in S^1$ such that $z(p^*) \le 0$ (in the case that all goods are desirable, this is $z(p^*) = 0$). To this end, define the mapping which raises the price of any good for which there is excess demand:

$f_t(p) = \frac{p_t + \max(0, z_t(p))}{1 + \sum_{j=1}^2 \max(0, z_j(p))}$ for $t = 1,2$.

[27] We say a function $f(x)$ is homogeneous of degree $k = 0,1,2,\dots$ if for any $\lambda > 0$, $f(\lambda x) = \lambda^k f(x)$.

Notice that $f_t(p)$ is continuous since $z_t$ and $\max(\cdot,\cdot)$ are continuous functions and that $f(p)$ lies in $S^1$ since $\sum_{t=1}^2 f_t(p) = 1$. By Brouwer's Fixed Point Theorem 302, there is a fixed point where $f(p^*) = p^*$. But this can be shown (by applying Walras Law) to imply that $z_t(p^*) \le 0$. You should convince yourself that with these preferences and endowments, the markets for current and future goods are cleared if $p^*$ implies $\frac{q_2}{q_1} = \frac{y_1}{y_2}$, so that the relative price of future goods in terms of current goods is lower the more plentiful future goods or the less plentiful current goods are. Since $\frac{1}{1+r} = \frac{q_2}{q_1}$, this means that interest rates are higher the smaller is current output relative to future output. In other words, identical (representative) agents would like to borrow against plentiful future output to smooth consumption if current output is low; this would drive up the interest rate.

4.8.2 Contractions

Note that while the above theorems proved existence, they said nothing about uniqueness. The next set of conditions on $f$ provide both.

Definition 304 Let $(X,d)$ be a metric space and $f : X \to X$ be a function.
Then $f$ satisfies a Lipschitz condition if $\exists \gamma > 0$ such that $d(f(x), f(\tilde x)) \le \gamma\, d(x, \tilde x)$, $\forall x, \tilde x \in X$. If $\gamma < 1$, then $f$ is a contraction mapping (with modulus $\gamma$).

One way to interpret the Lipschitz condition is as a restriction on the slope of $f$. That is, $\frac{\Delta y}{\Delta x} = \frac{d(f(x), f(\tilde x))}{d(x, \tilde x)} \le \gamma$. Then a contraction is simply a function whose slope is everywhere bounded by some $\gamma < 1$. If $f$ is Lipschitz, then it is uniformly continuous, since we can take $\delta(\varepsilon) = \frac{\varepsilon}{\gamma}$, in which case $d(x, \tilde x) < \delta \Rightarrow d(f(x), f(\tilde x)) < \varepsilon$. On the other hand, if $f$ is uniformly continuous, it may not satisfy the Lipschitz condition, as the next example shows.

Example 305 Let $f : [0,1] \to \mathbb{R}$ be given by $f(x) = \sqrt{x}$. To see that $f$ is uniformly continuous, for any $\varepsilon > 0$, let $\delta(\varepsilon) = \frac{\varepsilon^2}{2}$. Then if $|x - y| < \delta$, we have $|\sqrt{x} - \sqrt{y}| \le \sqrt{2|x-y|} < \sqrt{2 \cdot \frac{\varepsilon^2}{2}} = \varepsilon$, where the weak inequality follows since $(\sqrt{x} - \sqrt{y})^2 = x - 2\sqrt{xy} + y \le 2(\max\{x,y\} - \min\{x,y\}) = 2|x-y|$. To see that $f$ is not Lipschitz, suppose it is. Then for some $\gamma > 0$, $|\sqrt{x} - \sqrt{y}| \le \gamma |x - y|$, or $\frac{|\sqrt{x} - \sqrt{y}|}{|x-y|} \le \gamma$, $\forall x, y \in [0,1]$ with $x \neq y$. But $\frac{|\sqrt{x} - \sqrt{y}|}{|x-y|} = \frac{|\sqrt{x} - \sqrt{y}|}{|\sqrt{x} - \sqrt{y}|\,|\sqrt{x} + \sqrt{y}|} = \frac{1}{|\sqrt{x} + \sqrt{y}|}$. Choose $x = \frac{1}{(1+\gamma)^2}$ and $y = 0$, so $x, y \in [0,1]$. Then $\frac{1}{|\sqrt{x} + \sqrt{y}|} = 1 + \gamma$, which contradicts $\frac{|\sqrt{x} - \sqrt{y}|}{|x-y|} \le \gamma$.

The next result establishes conditions under which there is a unique fixed point and provides a result on speed of convergence helpful for computational work.

Theorem 306 (Contraction Mapping) If $(X,d)$ is a complete metric space and $f : X \to X$ is a contraction with modulus $\gamma$, then (i) $f$ has a unique fixed point $x \in X$ and (ii) for any $x_0 \in X$, $d(x, f^n(x_0)) \le \frac{\gamma^n}{1-\gamma}\, d(f(x_0), x_0)$, where $f^n$ are iterates of $f$. [28]

Proof. Choose $x_0 \in X$ and define $\langle x_n \rangle_{n=0}^\infty$ by $x_{n+1} = f(x_n)$, so that $x_n = f^n(x_0)$. Since $f$ is a contraction, $d(x_2, x_1) = d(f(x_1), f(x_0)) \le \gamma\, d(x_1, x_0)$. Continuing by induction,

$d(x_{n+1}, x_n) = d(f(x_n), f(x_{n-1})) \le \gamma\, d(x_n, x_{n-1}) \le \gamma^n d(x_1, x_0)$, $n = 1,2,\dots$ (4.18)

For any $m > n$,

$d(x_m, x_n) \le d(x_m, x_{m-1}) + \dots$
$+\; d(x_{n+2}, x_{n+1}) + d(x_{n+1}, x_n)$ (4.19)
$\le \left[ \gamma^{m-1} + \dots + \gamma^{n+1} + \gamma^n \right] d(x_1, x_0)$
$= \gamma^n \left[ \gamma^{m-n-1} + \dots + \gamma + 1 \right] d(x_1, x_0)$
$\le \frac{\gamma^n}{1-\gamma}\, d(x_1, x_0)$

where the first line uses the triangle inequality and the second uses (4.18). It follows from (4.19) that $\langle x_n \rangle$ is a Cauchy sequence. Since $X$ is complete, $x_n \to x$. That $x$ is a fixed point follows since

$d(f(x), x) \le d(f(x), f^n(x_0)) + d(f^n(x_0), x)$
$\le \gamma\, d(x, f^{n-1}(x_0)) + d(f^n(x_0), x)$

where the first line uses the triangle inequality and the second simply uses that $f$ is a contraction. Since $x_n = f^n(x_0) \to x$, we have $\lim_{n \to \infty} d(x, f^{n-1}(x_0)) = 0 = \lim_{n \to \infty} d(f^n(x_0), x)$, so that $d(f(x), x) = 0$, or $x$ is a fixed point.

[28] The iterates of $f$ (mappings $\{f^n\}$) are defined as $n$-fold compositions: $f^0(x) = x$, $f^1(x) = f(x)$, $f^2(x) = f(f^1(x))$, ..., $f^n(x) = f(f^{n-1}(x))$.

To prove uniqueness, suppose to the contrary that there exists another fixed point $x' \neq x$. Then $d(x', x) = d(f(x'), f(x)) \le \gamma\, d(x', x)$ with $d(x', x) > 0$ implies $\gamma \ge 1$, contrary to $\gamma < 1$ for a contraction.

Finally, the speed of convergence follows since

$d(x, f^n(x_0)) \le d(x, f^m(x_0)) + d(f^m(x_0), f^n(x_0))$
$\le \frac{\gamma^n}{1-\gamma}\, d(f(x_0), x_0)$

where the first line follows from the triangle inequality and the second from (4.19) and $\lim_{m \to \infty} d(x, f^m(x_0)) = 0$. See Figure 4.8.13.

Sometimes it is useful to establish a unique fixed point on a given space $X$ and then apply Theorem 306 again on a smaller space to characterize the fixed point more precisely.

Corollary 307 Let $(X,d)$ be a complete metric space and $f : X \to X$ be a contraction with fixed point $x \in X$. If $X'$ is a closed subset of $X$ and $f(X') \subset X'$, then $x \in X'$.

Proof. Let $x_0 \in X'$. Then $\langle f^n(x_0) \rangle$ is a sequence in $X'$ converging to $x$. Since $X'$ is closed, $x \in X'$.

4.8.3 Fixed points of correspondences

In considering fixed points of correspondences we would like to utilize fixed point theorems (particularly Brouwer's fixed point theorem) for functions. How can we reduce the multiple-valued case to the single-valued one?
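Before moving on, the iteration used in the proof of Theorem 306 and the error bound in part (ii) can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions: the map $f(x) = \tfrac12 \cos x$ on $\mathbb{R}$ (a contraction with modulus $\gamma = \tfrac12$ because $|f'(x)| \le \tfrac12$), the starting point $x_0 = 2$, the helper name `iterate`, and the use of the 60th iterate as a stand-in for the true fixed point are all choices made here, not taken from the text.

```python
import math

def iterate(f, x0, n):
    """Return the list [x_0, x_1, ..., x_n] with x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

def f(x):
    # |f'(x)| = 0.5 * |sin x| <= 0.5, so f is a contraction on R with modulus 0.5.
    return 0.5 * math.cos(x)

gamma = 0.5
x0 = 2.0
xs = iterate(f, x0, 60)
x_fix = xs[-1]                      # numerical stand-in for the fixed point
d0 = abs(f(x0) - x0)                # d(f(x_0), x_0)

# Check d(x, f^n(x_0)) <= gamma^n / (1 - gamma) * d(f(x_0), x_0) for n = 0, ..., 39,
# with a small slack for floating-point rounding.
bound_holds = all(
    abs(x_fix - xs[n]) <= gamma ** n / (1 - gamma) * d0 + 1e-12
    for n in range(40)
)
residual = abs(f(x_fix) - x_fix)
```

The bound is typically far from tight: it uses only the global modulus $\gamma$, while the iterates converge at the smaller local rate of $f$ near the fixed point.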
The reduction to the single-valued case can be done by means of a selection (i.e. a single-valued function that is selected from a multiple-valued correspondence). Depending on circumstances we might impose extra conditions on these choice functions. For instance, we might look for a continuous choice function (called a continuous selection) or for a measurable choice function (called a measurable selection, which we will deal with in the next chapter).

Definition 308 Let $\Gamma : X \rightrightarrows Y$ be a correspondence. Then a single-valued function $\Gamma' : X \to Y$ such that $\Gamma'(x) \in \Gamma(x)$, $\forall x \in X$, is called a selection. See Figure 4.8.14.

The existence of a fixed point of a function proven by Brouwer requires continuity. Hence in this section we will deal with the problem of continuous selection. After proving the existence of a continuous selection, we will use Brouwer's fixed point theorem for functions to show that the selection has a fixed point, which is then obviously a fixed point of the original correspondence. There are two main results in this subsection: Michael's continuous selection theorem and Kakutani's fixed point theorem for correspondences.

First we introduce a new notion, a partition of unity, that will be used in the proof of the selection theorem. Existence of a partition of unity is based on a well-known result from topology.

Lemma 309 (Urysohn) Let $A, B$ be two disjoint, closed subsets of a metric space $X$. Then there exists a continuous function $f : X \to [0,1]$ such that $f(x) = 0$, $\forall x \in A$ and $f(x) = 1$, $\forall x \in B$.

For a proof, see Kelley (1957). To continue we need to introduce the following topological concept.

Definition 310 Let $X$ be a metric space and let $\{G_i, i \in \Lambda\}$ be an open cover of $X$. Then a partition of unity subordinate to the cover $\{G_i\}$ is a family of continuous real-valued functions $\varphi_i : X \to [0,1]$ such that $\varphi_i(x) = 0$, $\forall x \in X \setminus G_i$, and such that $\forall x \in X$, $\sum_{i \in \Lambda} \varphi_i(x) = 1$.

Example 311 Let $X = [0,1]$, and let $G_1 = [0, \frac{2}{3})$, $G_2 = (\frac{1}{3}, 1]$ be an open cover of $[0,1]$. Let
$$\varphi_1(x) = \begin{cases} 1, & 0 \leq x \leq \frac{1}{3} \\ -3\left(x - \frac{2}{3}\right), & \frac{1}{3} < x \leq \frac{2}{3} \\ 0, & \frac{2}{3} < x \leq 1 \end{cases}
\qquad
\varphi_2(x) = \begin{cases} 0, & 0 \leq x \leq \frac{1}{3} \\ 3\left(x - \frac{1}{3}\right), & \frac{1}{3} < x \leq \frac{2}{3} \\ 1, & \frac{2}{3} < x \leq 1 \end{cases}$$
See Figure 4.8.15. Then $\{\varphi_1, \varphi_2\}$ is a partition of unity subordinate to $\{G_1, G_2\}$.

Lemma 312 (Partition of unity) Let $X$ be a metric space and let $\{G_1, \ldots, G_n\}$ be a finite open cover of $X$. Then there exists a partition of unity subordinate to this cover.

Proof. We begin by constructing a new cover $\{H_1, \ldots, H_n\}$ of $X$ by open sets such that (i) $H_i \subset \overline{H}_i \subset G_i$ for $i = 1, 2, \ldots, n$ and (ii) $\{H_i : i \leq j\} \cup \{G_i : i > j\}$ is a cover for each $j$. This is done inductively. Let $F_1 = X \setminus \cup_{i=2}^{n} G_i$. Then $F_1$ is closed and $F_1 \subset G_1$. The sets $F_1$ and $X \setminus G_1$ are closed disjoint subsets of the metric space $X$ and hence can be separated by two disjoint open sets (see the separation axioms in Chapter 7), $H_1$ and $X \setminus \overline{H}_1$. We have $F_1 \subset H_1 \subset \overline{H}_1 \subset G_1$. This satisfies (ii) for $j = 1$. Now suppose $H_1, H_2, \ldots, H_{k-1}$ have been constructed. Then since $\{H_i : i \leq k-1\} \cup \{G_i : i > k-1\}$ is a cover for $X$,
$$F_k = X \setminus \left( \left(\cup_{i=1}^{k-1} H_i\right) \cup \left(\cup_{i=k+1}^{n} G_i\right) \right) \subset G_k .$$
Again by separating $F_k$ and $X \setminus G_k$ we get $H_k$ such that $F_k \subset H_k \subset \overline{H}_k \subset G_k$. Clearly the collection $\{H_1, \ldots, H_k\}$ satisfies (ii) with $j = k$. By Urysohn's Lemma 309 we can construct real-valued functions $\psi_i$ on $X$ such that $\psi_i(x) = 0$ if $x \in X \setminus G_i$, $\psi_i(x) = 1$ if $x \in \overline{H}_i$, and $0 \leq \psi_i \leq 1$. Finally, let
$$\varphi_i(x) = \frac{\psi_i(x)}{\sum_{j=1}^{n} \psi_j(x)} .$$
Since the collection $\{H_i, i = 1, 2, \ldots, n\}$ is a cover, we have $\sum_{j=1}^{n} \psi_j(x) \neq 0$ for each $x$ and hence $\varphi_i(x)$ is well-defined. $\{\varphi_i\}_{i=1}^{n}$ is the partition of unity subordinate to the cover $\{G_i\}_{i=1}^{n}$.

Theorem 313 (Michael) Let $X$ be a metric space and $Y$ be a Banach space, let $\Gamma : X \rightrightarrows Y$ be lhc, and let $\Gamma(x)$ be closed and convex for every $x \in X$. Then $\Gamma$ admits a continuous selection.

Proof. We prove the theorem under a stronger assumption than is necessary, namely that $X$ is a compact metric space.$^{29}$ We first show that for each positive real number $\beta$ there exists a continuous function $f_\beta : X \to Y$ such that $f_\beta(x) \in \beta + \Gamma(x)$ for each $x \in X$. The desired selection will then be constructed as the limit of a suitable Cauchy sequence of such functions (that is why we need $Y$ to be a complete normed vector space).

$^{29}$ To prove the more general version we would need to use the concept of paracompactness, which goes beyond the scope of this book. For the more general result, see Aubin-Frankowska (1990).

For each $y \in Y$, let $U_y = \Gamma^{-1}(B_\beta(y))$, where $B_\beta(y)$ is an open ball around $y$ of diameter $\beta$. Since $\Gamma$ is lhc and $B_\beta(y)$ is open in $Y$, $U_y$ is open in $X$. The collection $\{U_y, y \in Y\}$ is then an open cover of $X$. Since $X$ is compact there exists a finite subcollection $\{U_{y_i}, i = 1, \ldots, n, y_i \in Y\}$ which by Lemma 312 (requiring a finite collection) has a partition of unity $\{\pi_i, i = 1, \ldots, n\}$ subordinate to $\{U_{y_i}\}$. Let $f_\beta$ be defined by $f_\beta(x) = \sum_{i=1}^{n} \pi_i(x)\, y_i$, where $y_i$ is chosen in such a way that $\pi_i = 0$ in $X \setminus U_{y_i}$. Since $f_\beta$ is the sum of finitely many continuous functions, it is a continuous function from $X$ to $Y$. Moreover, $f_\beta(x)$ is a convex combination of those points $y_i$ for which $\pi_i(x) \neq 0$. But $\pi_i(x) \neq 0$ only if $x \in U_{y_i}$. Thus $\Gamma(x) \cap B_\beta(y_i) \neq \emptyset$ and so $y_i \in \beta + \Gamma(x)$. Thus $f_\beta(x)$ is a convex combination of points $y_i$ which lie in the convex set $\beta + \Gamma(x)$, and so $f_\beta(x)$ is also in that set (i.e. $f_\beta(x) \in \beta + \Gamma(x)$).

Next we construct a sequence of such functions $f_i$ satisfying the following two conditions:
$$f_i(x) \in \frac{1}{2^{i-2}} + f_{i-1}(x), \quad i = 2, 3, 4, \ldots \tag{4.20}$$
$$f_i(x) \in \frac{1}{2^{i}} + \Gamma(x), \quad i = 1, 2, 3, \ldots \tag{4.21}$$
For $f_1$ we take the function $f_\beta$ already constructed with $\beta = \frac{1}{2}$. Suppose that $f_1, f_2, \ldots, f_n$ have already been constructed. Let $\Gamma_{n+1}(x) = \Gamma(x) \cap \left(\frac{1}{2^n} + f_n(x)\right)$. Then since $f_n(x)$ satisfies condition (4.21), $\Gamma_{n+1}(x)$ is non-empty, and being the intersection of two convex sets it is convex. Moreover, $\Gamma_{n+1}$ is lhc (see Lemma 293).
Therefore by the first part of the proof, with $\beta = \frac{1}{2^{n+1}}$, there exists a function $f_{n+1}$ with the property that $f_{n+1}(x) \in \frac{1}{2^{n+1}} + \Gamma_{n+1}(x)$. Since $\Gamma_{n+1}(x) \subset \Gamma(x)$, we have $f_{n+1}(x) \in \frac{1}{2^{n+1}} + \Gamma(x)$, so that condition (4.21) is satisfied. Furthermore, since $\Gamma_{n+1}(x) \subset \frac{1}{2^n} + f_n(x)$, we have $f_{n+1}(x) \in \left(\frac{1}{2^n} + \frac{1}{2^{n+1}}\right) + f_n(x) \subset \frac{1}{2^{n-1}} + f_n(x)$, which means that condition (4.20) is satisfied.

We have constructed a sequence $\langle f_i, i \in \mathbb{N} \rangle$ of functions for which, by (4.20), $\|f_{n+1}(x) - f_n(x)\|_Y < \frac{1}{2^{n-1}}$ for all $n$ and all $x$. Summing these bounds gives $\sup_{x \in X} \|f_m(x) - f_n(x)\|_Y < \frac{1}{2^{n-2}}$ for all $m, n$ with $m > n$. Thus the sequence $\langle f_i \rangle$ is a Cauchy sequence in the space of bounded continuous functions from $X$ to $Y$, which is complete because $Y$ is complete (as we will see in Theorem 452 in Chapter 6). Then there exists a continuous function $f : X \to Y$ such that $\langle f_i \rangle \to f$ (with respect to the sup norm). Since (4.21) states that $d(f_n(x), \Gamma(x)) < \frac{1}{2^n}$ for all $n$, it follows that the limit function $f$ has the property that $f(x) \in \overline{\Gamma(x)}$ (the closure of $\Gamma(x)$). By the assumption that $\Gamma(x)$ is closed, $\Gamma(x) = \overline{\Gamma(x)}$ and so $f(x) \in \Gamma(x)$.

Note that upper hemicontinuity of a correspondence does not guarantee a continuous selection. See Example 272.

Combining Brouwer's fixed point theorem 302 with Michael's selection theorem 313, we immediately get the existence of a fixed point for lhc correspondences.

Corollary 314 Let $K$ be a non-empty, compact, convex subset of a finite-dimensional space $\mathbb{R}^m$ and let $\Gamma : K \rightrightarrows K$ be a lhc, closed, convex-valued correspondence. Then $\Gamma$ has a fixed point.

Kakutani's theorem is usually stated with the condition that the correspondence $\Gamma$ be uhc and closed-valued. However, since we are dealing with a compact set $K$, by Theorems 285 and 287, uhc together with the closed-valued property is equivalent to having a closed graph. It seems that this condition is somewhat easier to check.
In order to make the switch from uhc (or equivalently from closedness of the graph) to lhc, we use the following lemma.

Lemma 315 Let $X$ and $Y$ be compact subsets of a finite-dimensional normed vector space $\mathbb{R}^m$ and let $\Gamma : X \rightrightarrows Y$ be a convex-valued correspondence which has a closed graph (or equivalently is closed-valued and uhc). Then given $\beta > 0$, there exists a lhc, convex-valued correspondence $F : X \rightrightarrows Y$ such that $Gr(F) \subset \beta + Gr(\Gamma)$.

Proof. Consider first the new correspondences $\hat{F}_\varepsilon$ defined for all $\varepsilon > 0$ by
$$\hat{F}_\varepsilon(x) = \bigcup_{\hat{x} \in X,\ \|x - \hat{x}\| < \varepsilon} \Gamma(\hat{x}) .$$
To see that $\hat{F}_\varepsilon$ is lhc at $x_0$, consider an open set $G$ such that $\hat{F}_\varepsilon(x_0) \cap G \neq \emptyset$. Then there exists $\hat{x} \in X$ with $\|\hat{x} - x_0\| < \varepsilon$ and $\Gamma(\hat{x}) \cap G \neq \emptyset$. If $\mu$ is sufficiently small ($\mu < \varepsilon - \|\hat{x} - x_0\|$) and if $\|x_0 - x\| < \mu$, then $\|\hat{x} - x\| < \varepsilon$, and so $\hat{F}_\varepsilon(x) \cap G \neq \emptyset$ because $\Gamma(\hat{x}) \subset \hat{F}_\varepsilon(x)$. Thus $\hat{F}_\varepsilon$ is lhc at an arbitrary $x_0 \in X$ and hence lhc on $X$. It follows from Lemma 292 that $F_\varepsilon = co(\hat{F}_\varepsilon)$ is also lhc. Since $F_\varepsilon$ is certainly convex-valued, the proof is finished by showing that $Gr(F_\varepsilon) \subset \beta + Gr(\Gamma)$ if $\varepsilon$ is sufficiently small.

Suppose that this is not so. That is, for some $\beta > 0$ and all $n \in \mathbb{N}$, $Gr(F_{1/n})$ is not contained in $\beta + Gr(\Gamma)$. Then there exists a sequence $\langle (x_n, y_n), n \in \mathbb{N} \rangle$ in $X \times Y$ such that $(x_n, y_n) \in Gr(F_{1/n})$ but $d((x_n, y_n), Gr(\Gamma)) \geq \beta$. To say that $(x_n, y_n) \in Gr(F_{1/n})$ means that $y_n = \sum_{i=1}^{m+1} \lambda_{n,i}\, y_{n,i}$ with $\lambda_{n,i} \geq 0$, $\sum_{i=1}^{m+1} \lambda_{n,i} = 1$, and $y_{n,i} \in \Gamma(\hat{x}_{n,i})$ where $\|\hat{x}_{n,i} - x_n\| < \frac{1}{n}$. Here we used Caratheodory's Theorem 226, which says that in $\mathbb{R}^m$, if $y_n$ is a convex combination of certain points, it can always be expressed as a convex combination of $m+1$ of those points. Since $X$ and $Y$ are compact and $\lambda_{n,i} \in [0,1]$ (which is also compact), all the above sequences contain subsequences (we will use the same indices for the subsequences) that converge; that is, $\langle x_n \rangle \to x$, $\langle y_n \rangle \to y$, $\langle \lambda_{n,i} \rangle \to \lambda_i$, $\langle y_{n,i} \rangle \to y_i$, and $\langle \hat{x}_{n,i} \rangle \to \hat{x}_i$. Since $\|\hat{x}_{n,i} - x_n\| < \frac{1}{n}$, $\hat{x}_i = x$ for $i = 1, \ldots, m+1$.
We also have that $\sum_{i=1}^{m+1} \lambda_i = 1$ and $y_n = \sum_{i=1}^{m+1} \lambda_{n,i}\, y_{n,i} \to \sum_{i=1}^{m+1} \lambda_i y_i = y$. Now $(\hat{x}_{n,i}, y_{n,i}) \in Gr(\Gamma)$ and so $(\hat{x}_i, y_i) = (x, y_i) \in \overline{Gr(\Gamma)} = Gr(\Gamma)$ (since $Gr(\Gamma)$ is closed). Thus $y_i \in \Gamma(x)$ and, since $\Gamma(x)$ is convex, $y \in \Gamma(x)$ (being a convex combination of the $y_i$). Hence $(x, y) \in Gr(\Gamma)$. But since $d((x_n, y_n), Gr(\Gamma)) \geq \beta$ for all $n$, this is not possible. This contradiction completes the proof.

Corollary 316 In Lemma 315 we may also take $F$ to be closed-valued.

Proof. Let $F(x) = \overline{F_\varepsilon(x)}$ for sufficiently small $\varepsilon$. Then by Lemma 291, $F$ is lhc. It is of course still convex-valued, and if $Gr(F_\varepsilon) \subset \frac{\beta}{2} + Gr(\Gamma)$, then $Gr(F) \subset \beta + Gr(\Gamma)$.

Theorem 317 (Kakutani) Let $K$ be a non-empty, compact, convex subset of a finite-dimensional space $\mathbb{R}^m$ and let $\Gamma : K \rightrightarrows K$ be a closed, convex-valued, uhc correspondence (or convex-valued with closed graph). Then $\Gamma$ has a fixed point.

Proof. By Lemma 315 and Corollary 316, for each $n \in \mathbb{N}$ there exists a lhc correspondence $F_n : K \rightrightarrows K$ such that $Gr(F_n) \subset \frac{1}{n} + Gr(\Gamma)$ and $F_n$ has values which are closed and convex. Then by Michael's selection theorem 313, there is a continuous selection $f_n$ for $F_n$. The function $f_n$ is a continuous mapping of $K$ into itself and so, by Brouwer's fixed point theorem 302, there exists $x_n \in K$ with $f_n(x_n) = x_n$. The compactness of $K$ means that there exists a convergent subsequence of the sequence $\langle x_n \rangle$, say $\langle x_{g(n)} \rangle \to x^*$. Since $(x_n, x_n) \in Gr(F_n) \subset \frac{1}{n} + Gr(\Gamma)$, it follows that $(x^*, x^*) \in \overline{Gr(\Gamma)} = Gr(\Gamma)$. Thus $x^* \in \Gamma(x^*)$ is a fixed point of $\Gamma$.

We now use an example to illustrate an important result due to Nash (1950). Nash's result says that every finite strategic form game has a mixed strategy equilibrium.

Example 318 Reconsider the finite action coordination game in Example 282. We say that the mixed strategy profile $(p^*, q^*)$ is a Nash Equilibrium if $\pi_1(p^*, q^*) \geq \pi_1(p, q^*)$ and $\pi_2(p^*, q^*) \geq \pi_2(p^*, q)$, $\forall p, q \in [0,1]$. In Example 282, we showed
$$p^*(q) = \begin{cases} 0 & \text{if } q < \frac{1}{2} \\ [0,1] & \text{if } q = \frac{1}{2} \\ 1 & \text{if } q > \frac{1}{2} \end{cases}
\qquad \text{and} \qquad
q^*(p) = \begin{cases} 0 & \text{if } p < \frac{1}{2} \\ [0,1] & \text{if } p = \frac{1}{2} \\ 1 & \text{if } p > \frac{1}{2} \end{cases}$$
Given that the two agents are symmetric, to prove that the above game has a mixed strategy equilibrium it is sufficient to show that $p^* : [0,1] \rightrightarrows [0,1]$ has a fixed point $p \in p^*(p)$. From Kakutani's theorem, it is sufficient to check that $p^*(p)$ is a non-empty, convex-valued, uhc correspondence, all of which was shown in Example 282. See Figure 4.8.16.

Exercise 4.8.2 Using Kakutani's theorem, prove Nash's result generally. See Fudenberg and Tirole p. 29.

4.9 Appendix - Proofs in Chapter 4

Proof of Caratheodory Theorem 226. $x \in co(X)$ implies $x = \sum_{i=1}^{m} \lambda_i x_i$ with $x_1, \ldots, x_m \in X$, $\lambda_i > 0$ $\forall i$, and $\sum_{i=1}^{m} \lambda_i = 1$ by Theorem ??. Suppose $m \geq n+2$. Then the vectors
$$\begin{pmatrix} x_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ 1 \end{pmatrix}, \ldots, \begin{pmatrix} x_m \\ 1 \end{pmatrix} \in \mathbb{R}^{n+1}$$
are linearly dependent. Hence there exist $\mu_1, \ldots, \mu_m$, not all zero, such that
$$\sum_{i=1}^{m} \mu_i \begin{pmatrix} x_i \\ 1 \end{pmatrix} = 0 \quad \left(\text{i.e. } \sum_{i=1}^{m} \mu_i x_i = 0 \text{ and } \sum_{i=1}^{m} \mu_i = 0\right).$$
Since the $\mu_i$ sum to zero and are not all zero, $\mu_j > 0$ for some $j$, $1 \leq j \leq m$. Define $\alpha = \frac{\lambda_j}{\mu_j} = \min\left\{\frac{\lambda_i}{\mu_i} : \mu_i > 0\right\}$ so that $\lambda_j - \alpha\mu_j = 0$. If we define $\theta_i \equiv \lambda_i - \alpha\mu_i$, then $\theta_i \geq 0$ for all $i$, $\theta_j = 0$,
$$\sum_{i=1}^{m} \theta_i = \sum_{i=1}^{m} \lambda_i - \alpha \sum_{i=1}^{m} \mu_i = 1 - \alpha \cdot 0 = 1,$$
and
$$\sum_{i=1}^{m} \theta_i x_i = \sum_{i=1}^{m} \lambda_i x_i - \alpha \sum_{i=1}^{m} \mu_i x_i = \sum_{i=1}^{m} \lambda_i x_i = x .$$
Hence we have expressed $x$ as a convex combination of $m-1$ points of $X$ (since $\theta_j = 0$ for some $j$), reducing it from $m$ points. If $m-1 > n+1$, then the process can be repeated until $x$ is expressed as a convex combination of $n+1$ points of $X$.

Proof of Theorem 248. (i $\Rightarrow$ ii) Let $a \in f^{-1}(V)$. Then $\exists y \in V$ such that $f(a) = y$. Since $V$ is open, $\exists \varepsilon > 0$ such that $B_\varepsilon(y) \subset V$. Since $f$ is continuous, for this $\varepsilon$, $\exists \delta(\varepsilon, a) > 0$ such that $\forall x \in X$ with $d_X(x, a) < \delta(\varepsilon, a)$ we have $d_Y(f(x), f(a)) < \varepsilon$. Hence $f(B_\delta(a)) \subset B_\varepsilon(f(a))$, or equivalently $B_\delta(a) \subset f^{-1}(B_\varepsilon(f(a))) \subset f^{-1}(V)$.

(ii $\Rightarrow$ iii) Let $\langle x_n \rangle \to x$. Take any open $\varepsilon$-ball $B_\varepsilon(f(x)) \subset V$. Then $x \in f^{-1}(B_\varepsilon(f(x)))$ and $f^{-1}(B_\varepsilon(f(x)))$ is open (by assumption ii). Now $\exists \delta > 0$ such that $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$.
Since $\langle x_n \rangle \to x$, $\exists N$ such that for $n \geq N$, $x_n \in B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$. Hence $f(x_n) \in B_\varepsilon(f(x))$ $\forall n \geq N$, so $f(x_n) \to f(x)$. See Figure 4.6.5.

(iii $\Rightarrow$ i) It is sufficient to prove the contrapositive. Thus, suppose $\exists \varepsilon > 0$ such that $\forall \delta = \frac{1}{n}$, $\exists x_n$ such that $d_X(x_n, x) < \frac{1}{n}$ and $d_Y(f(x_n), f(x)) \geq \varepsilon$. Thus we have a sequence $\langle x_n \rangle \to x$ but none of the elements of $\langle f(x_n) \rangle$ is in the $\varepsilon$-ball around $f(x)$. Hence $\langle f(x_n) \rangle$ does not converge to $f(x)$.$^{30}$

$^{30}$ Munkres p. 127, Th 10.1; for sequences see Munkres p. 128, Th 10.3.

Proof of Theorem 268. In the first step, for a given $\varepsilon > 0$, we construct a $\delta$ that depends only on $\varepsilon$. Thus, take $\varepsilon > 0$. Since $f$ is continuous on $X$, for any $x \in X$ there is a number $\delta(\frac{1}{2}\varepsilon, x) > 0$ such that if $x' \in X$ and $d_X(x, x') < \delta(\frac{1}{2}\varepsilon, x)$, then $d_Y(f(x), f(x')) < \frac{1}{2}\varepsilon$. The collection of open balls $\mathcal{G} = \{B_{\frac{1}{2}\delta(\frac{1}{2}\varepsilon, x)}(x), x \in X\}$ is an open covering of $X$. Since $X$ is compact there exists a finite subcollection, say $\{B(x_1), \ldots, B(x_n)\}$, of these balls that covers $X$. Then define
$$\delta(\varepsilon) = \tfrac{1}{2} \min\left\{\delta(\tfrac{1}{2}\varepsilon, x_1), \ldots, \delta(\tfrac{1}{2}\varepsilon, x_n)\right\},$$
which is obviously independent of $x$.

In the second step, we use the $\delta(\varepsilon)$ constructed above to establish uniform continuity. Suppose that $x, x' \in X$ and $d_X(x', x) < \delta(\varepsilon)$. Because $\{B(x_1), \ldots, B(x_n)\}$ covers $X$, $x \in B(x_k)$ for some $k$. That is,
$$d_X(x, x_k) < \tfrac{1}{2}\delta(\tfrac{1}{2}\varepsilon, x_k). \tag{4.22}$$
By the triangle inequality and $\delta(\varepsilon) \leq \frac{1}{2}\delta(\frac{1}{2}\varepsilon, x_k)$, it follows that
$$d_X(x', x_k) \leq d_X(x', x) + d_X(x, x_k) < \delta(\varepsilon) + \tfrac{1}{2}\delta(\tfrac{1}{2}\varepsilon, x_k) \leq \delta(\tfrac{1}{2}\varepsilon, x_k). \tag{4.23}$$
Then (4.22) and continuity of $f$ at $x_k$ imply $d_Y(f(x), f(x_k)) < \frac{1}{2}\varepsilon$, while (4.23) and continuity of $f$ at $x_k$ imply $d_Y(f(x'), f(x_k)) < \frac{1}{2}\varepsilon$. Again by the triangle inequality it follows that
$$d_Y(f(x), f(x')) \leq d_Y(f(x), f(x_k)) + d_Y(f(x'), f(x_k)) < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon .$$
Thus, we have shown that if $x, x' \in X$ satisfy $d_X(x', x) < \delta(\varepsilon)$, then $d_Y(f(x), f(x')) < \varepsilon$.

Proof of Brouwer's Fixed Point Theorem 302 (in $\mathbb{R}^2$). Let $f : S \to S$ be continuous, where $S$ is a fixed nondegenerate simplex with vertices $v_0, v_1, v_2$. Then $x^* = f(x^*)$ implies that
$$\alpha_i^* = \beta_i^*, \quad i = 0, 1, 2 \tag{4.24}$$
where $\alpha_i^*$ and $\beta_i^* = f_i(x^*)$ are the barycentric coordinates of $x^*$ and $f(x^*)$. See Figure 4.8.4. In the case of barycentric coordinates, instead of the equalities (4.24) it suffices to show the following inequalities:
$$\alpha_i^* \geq \beta_i^*, \quad i = 0, 1, 2 \tag{4.25}$$
(4.24) and (4.25) are equivalent because $\alpha_i^* \geq 0$, $\beta_i^* \geq 0$ and $\sum_{i=0}^{2} \alpha_i^* = 1 = \sum_{i=0}^{2} \beta_i^*$. To see this, note that $\alpha_0 \geq \beta_0 \geq 0$, $\alpha_1 \geq \beta_1 \geq 0$, $\alpha_2 \geq \beta_2 \geq 0$, together with $\alpha_0 + \alpha_1 + \alpha_2 = \beta_0 + \beta_1 + \beta_2$ $(= 1)$, gives $(\alpha_0 - \beta_0) + (\alpha_1 - \beta_1) + (\alpha_2 - \beta_2) = 0$ with every term non-negative, which implies $\alpha_0 = \beta_0$, $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$.

Let $y = f(x)$. If $y \neq x$, then some coordinates satisfy $\beta_i \neq \alpha_i$. Therefore, since $\sum \alpha_i = \sum \beta_i$, we must have both some $\alpha_k > \beta_k$ and some $\alpha_i < \beta_i$. Focus on the first inequality, $\alpha_k > \beta_k$; this cannot occur when $\alpha_k = 0$. In other words, this cannot occur on the boundary segment opposite vertex $v_k$ (see the calculations in Example 228). For example, on the boundary line segment joining $v_0$ and $v_1$ opposite $v_2$ (we will denote this line segment $(v_0, v_1)$), the inequality $\alpha_2 > \beta_2$ cannot occur because $\alpha_2 = 0$ for all these points but $0 > \beta_2 \geq 0$ is false. See Figure 4.8.5.

Now we introduce an indexing scheme for points in the simplex as follows. Given the function $y = f(x)$ ($f : S \to S$), for each $x \in S$ such that $x \neq y = f(x)$ we have seen that $\alpha_i > \beta_i$ for some $i$. Now define $I(x)$ as the smallest such $i$ (that is, $I(x) = \min\{i : \alpha_i > \beta_i\}$). Hence $I(x)$ can take the values 0, 1, 2 (in our case for $\mathbb{R}^2$). These values depend on the function $y = f(x)$ of course, but on the boundary we know that $I(x)$ is restricted. For example, on the boundary $(v_0, v_1)$ where $\alpha_2 = 0$ we cannot have $\alpha_2 > \beta_2$, so that $I(x)$ cannot take the value 2. Thus $I(x) = 0$ or $1$ on the line segment $(v_0, v_1)$. In general, $I(x)$ satisfies the same set of restrictions as $I(x)$ in (4.3) in Section 4.5.2 and hence we can use the results of Sperner's Lemma 229.

Why are we doing this? We are looking for a fixed point of $y = f(x)$.
A fixed point is a point $x$ whose barycentric coordinates satisfy all the inequalities $\alpha_0 \geq \beta_0$, $\alpha_1 \geq \beta_1$, $\alpha_2 \geq \beta_2$. To find one, for $m = 2, 3, 4, \ldots$, we form the $m$th barycentric subdivision of our simplex $S$. For example, see Figure 4.8.7 for $m = 2$. The vertices in the subdivision are points $z = \frac{1}{2}(\mu_0, \mu_1, \mu_2)$ where the $\mu_i$ are integers (and $\frac{\mu_i}{2}$, $i = 0, 1, 2$, are barycentric coordinates with respect to the $m$th subdivision) with all $\mu_i \geq 0$ and $\sum_{j=0}^{2} \mu_j = 2$. In general, for the $m$th subdivision, the vertices are the points $x = \frac{1}{m}(\mu_0, \mu_1, \mu_2)$, where the $\mu_i$ are integers satisfying $\mu_i \geq 0$ and $\sum_{i=0}^{2} \mu_i = m$.

We will call a little shaded triangle a cell. The original simplex is the whole body. For $m = 5$ we see 25 cells in Figure 4.8.8. Each cell is small; the diameter of each cell is $\frac{1}{5}$ of the diameter of the body. In general, in the $m$th subdivision of a simplex, the number of cells, $m^2$, tends to infinity as $m \to \infty$ and the diameter of each cell tends to zero. If $\Delta$ is the diameter of the body, then the diameter of each cell is $\frac{\Delta}{m}$.

We are given a continuous function $y = f(x)$ that maps the simplex into itself. We assume that $f(x)$ has no fixed point and we show that this leads to a contradiction. Since we assume $x \neq y = f(x)$ (i.e. no fixed point), we may use the indexing function $I(x)$ for each point $x \in S$. The index takes one of the values 0, 1, 2 at each point of the body and on the boundary of the simplex. The index satisfies the restrictions (??).

For example, in Figure 4.8.9 there are 21 vertices. Label each vertex $x$ with an index $I(x) = 0, 1, 2$ arbitrarily, except that this indexing has to obey the restrictions (??) on the boundary. That means you must use $I = 0$ or $1$ on the bottom side, $I = 0$ or $2$ on the left and $I = 1$ or $2$ on the right. Also $I = 0$ at $v_0$, $I = 1$ at $v_1$ and $I = 2$ at $v_2$. This leaves 6 interior vertices, each to be labeled arbitrarily 0, 1, or 2. Try to label these vertices such that none of the 25 cells has all the labels 0, 1, 2.
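The labeling experiment just described can also be tried numerically. The following Python sketch is only an illustration and not part of the proof: it assumes the subdivision whose vertices are $\frac{1}{m}(\mu_0, \mu_1, \mu_2)$ with non-negative integers $\mu_i$ summing to $m$, draws random labelings that obey the boundary restrictions (label $i$ is never used where the $i$th barycentric coordinate is zero), and counts the cells that carry all three labels. The function names are hypothetical.

```python
import random

def vertices(m):
    """Integer triples (mu0, mu1, mu2) with mu_i >= 0 and mu0 + mu1 + mu2 = m."""
    return [(a, b, m - a - b) for a in range(m + 1) for b in range(m + 1 - a)]

def cells(m):
    """The small triangles of the m-th subdivision, each given by its three vertices."""
    up = [((a + 1, b, c), (a, b + 1, c), (a, b, c + 1))
          for a, b, c in vertices(m - 1)]
    down = [((a + 1, b + 1, c), (a + 1, b, c + 1), (a, b + 1, c + 1))
            for a, b, c in vertices(m - 2)]
    return up + down

def random_labeling(m, rng):
    """Assign each vertex a label i with mu_i > 0 (the boundary restriction)."""
    return {v: rng.choice([i for i in range(3) if v[i] > 0]) for v in vertices(m)}

def complete_cells(m, labels):
    """Cells whose three vertices carry the labels 0, 1 and 2."""
    return [cell for cell in cells(m) if {labels[v] for v in cell} == {0, 1, 2}]

rng = random.Random(0)  # fixed seed so the experiment is repeatable
m = 5
counts = [len(complete_cells(m, random_labeling(m, rng))) for _ in range(1000)]
# number of vertices, number of cells, fewest complete cells found in any trial
print(len(vertices(m)), len(cells(m)), min(counts))
```

For $m = 5$ the sketch produces 21 vertices and 25 cells, matching the counts in the text, and every random labeling it draws contains at least one completely labeled cell.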
No matter how hard you try, at least one of the cells must have a complete set of labels. This is guaranteed by Sperner's Lemma 229 which follows immediately after this proof. In particular, the lemma guarantees that for any $m$, in the $m$th subdivision there is a cell with a complete set of labels, say
$$\begin{aligned}
I &= 0 \text{ at the vertex } x^0(m) \\
I &= 1 \text{ at the vertex } x^1(m) \\
I &= 2 \text{ at the vertex } x^2(m)
\end{aligned} \tag{4.26}$$
What does this mean for the function $y = f(x)$? If $I = j$, then for the barycentric coordinates of the points $x$ and $y$ we have $\alpha_j > \beta_j$. Therefore, (4.26) implies
$$\begin{aligned}
\alpha_0 &> \beta_0 \text{ at } x^0(m) \\
\alpha_1 &> \beta_1 \text{ at } x^1(m) \\
\alpha_2 &> \beta_2 \text{ at } x^2(m)
\end{aligned} \tag{4.27}$$
If $m$ is large, all the vertices of the cell are close to each other, since the diameter of the cell is $\frac{\Delta}{m}$. Therefore
$$\max_{0 \leq i < j \leq 2} \left| x^i(m) - x^j(m) \right| = \frac{\Delta}{m} \to 0 \text{ as } m \to \infty. \tag{4.28}$$
As $m \to \infty$, what can be said about the vertices (say $x^0(m)$)? This vertex might move unpredictably through the simplex in some bounded infinite sequence. See Figure 4.8.10. Since $S$ is compact, by the Bolzano-Weierstrass Theorem 180 this sequence contains a subsequence that has a limit, say $x^0(m_s) \to x^*$ as $s \to \infty$. The limit point $x^* \in S$ because $S$ is closed. But because of the closeness of the vertices, (4.28) implies that all three vertices tend to $x^*$ as $m_s \to \infty$: $x^p(m_s) \to x^*$ as $s \to \infty$, $p = 0, 1, 2$. Now the continuity of $f(x)$ implies $f(x^p(m_s)) \to f(x^*) = y^*$ as $s \to \infty$, $p = 0, 1, 2$. But the barycentric coordinates of a point $x$ depend continuously on $x$. Therefore, if we let $m = m_s \to \infty$ in (4.27) we obtain the limiting inequalities
$$\begin{aligned}
\alpha_0 \geq \beta_0 \text{ at the limit } x^* &\iff \alpha_0^* \geq \beta_0^* = f_0(x^*) \\
\alpha_1 \geq \beta_1 \text{ at the limit } x^* &\iff \alpha_1^* \geq \beta_1^* = f_1(x^*) \\
\alpha_2 \geq \beta_2 \text{ at the limit } x^* &\iff \alpha_2^* \geq \beta_2^* = f_2(x^*)
\end{aligned}$$
But we know by (4.25) that these inequalities imply equalities, thus $x^* = y^* = f(x^*)$, which contradicts the assumption that $f$ has no fixed point.

Figures for Sections ?? to 4.8
Figure ??.1: Open Sets
Figure ??.2: Sup Balls and Open Neighborhoods
Figure ??.3: $(0,1)$ vs $(0,1]$
Figure ??.4: $\{(x,y) \mid 0 < x < 1, y = 2\}$
Figure ??.5: Closure and Boundary Points
Figure 4.1.1: On Cluster points and the Limit of $\langle (-1)^n \rangle$
Figure 4.1.2: On the Limit of $\langle \frac{1}{n} \rangle$
Figure 4.1.3: On the Limit of $\langle \frac{x}{n} \rangle$
Figure 4.1.4: On the Limit of $\langle x_n \rangle$
Figure 4.3.1: Construction of (?)H closed.
Figure 4.3.2: Compactness for General Metric Spaces.
Figure 4.4.1: A disconnected set
Figure 4.6.1: Pointwise continuity in $\mathbb{R}$
Figure 4.7.1: Lower Hemicontinuity
Figure 4.7.2: Upper Hemicontinuity
Figure 4.7.3: Best Response Correspondence
Figure 4.7.4: Budget Sets with $p = 0$ and $p > 0$
Figure 4.7.5: Demand Correspondence with Linear Preferences
Figure 4.8.1: Tarski's Fixed Point Theorem in $[a,b]$
Figure 4.8.2: Brouwer's Fixed Point Theorem in $[a,b]$
Figure 4.8.3: Fixed Point of a Contraction Mapping in $[a,b]$
Figure 4.8.4: Kakutani's Fixed Point Theorem in $[a,b]$
Figure 4.8.5: Existence of Nash Equilibria
Figure 4.5.1: Open Sets

4.10 Bibliography for Chapter 4

Sections ?? to ?? are based on Royden (Chapters 2 and 7) and Bartle (Sections 9, 14-16). Section 4.2 is based on Royden (Chapter 7, Section 4) and Munkres (Chapter 7, Section 1). Section 4.3 is based on Munkres (Chapter 3, Sections 5 and 7; Chapter 7, Section 3), Royden (Chapter 7, Section 7), and Bartle (Chapter 11). Section 4.6 is based on Munkres (Chapter 3). Section 4.5 is from Bartle (Sec 8).

4.11 End of Chapter Problems

1) The next results (from Definition 319 to Theorem 164) require a total ordering of a set $X$, so we restrict $X$ to be $\mathbb{R}$.

Definition 319 Let $\langle x_n \rangle$ be a bounded sequence in $\mathbb{R}$. The limit superior of $\langle x_n \rangle$, denoted $\limsup x_n$ or $\overline{\lim}\, x_n$, is given by $\inf_n \sup_{k \geq n} x_k$. The limit inferior of $\langle x_n \rangle$, denoted $\liminf x_n$ or $\underline{\lim}\, x_n$, is given by $\sup_n \inf_{k \geq n} x_k$.

That is, $l \in \mathbb{R}$ is the limit superior of $\langle x_n \rangle$ iff given $\varepsilon > 0$, there are at most a finite number of $n \in \mathbb{N}$ such that $l + \varepsilon < x_n$ but there are an infinite number such that $l - \varepsilon < x_n$. The limit superior is just the maximum cluster point and the limit inferior is just the minimum cluster point.

Example 320 Recall Example 141 where we considered the sequence $\langle (-1)^n \rangle$, which had two cluster points. There, $\liminf x_n = -1$ and $\limsup x_n = 1$. To see why the limit inferior is $-1$, consider: $n = 1$ has $\inf_{k \geq 1} x_k = -1$, $n = 2$ has $\inf_{k \geq 2} x_k = -1$; and any given $n$ has $\inf_{k \geq n} x_k = -1$. But then $\sup\{-1, -1, \ldots\}$ is just $-1$.

Theorem 321 Let $\langle x_n \rangle$ be a bounded sequence of real numbers. Then $\lim x_n$ exists iff $\liminf x_n = \limsup x_n = \lim x_n$.

Exercise 4.11.1 Prove Theorem 321.

While Theorem 321 hinges on the fact that $\mathbb{R}$ is totally ordered, a similar result holds for any totally ordered set.

Example 322 Recall Example 137 where we considered the sequence $\langle \frac{1}{n} \rangle$. It is simple to see that it has a cluster point at 0, since any open ball around 0 of size $\delta$ contains the infinite number of elements of the sequence past $N(\delta) = w(\frac{1}{\delta}) + 1$. Furthermore, $\liminf x_n = 0 = \limsup x_n$. To see why the limit superior is 0, consider: $n = 1$ has $\sup_{k \geq 1} x_k = 1$, $n = 2$ has $\sup_{k \geq 2} x_k = \frac{1}{2}$; and any given $n$ has $\sup_{k \geq n} x_k = \frac{1}{n}$. But then $\inf\{1, \frac{1}{2}, \ldots, \frac{1}{n}, \frac{1}{n+1}, \ldots\}$ is just 0.

Example 323 Consider $\left\langle (-1)^n + \frac{1}{n} \right\rangle_{n \in \mathbb{N}}$. Then $\langle x_n \rangle = \left\langle 0, \frac{3}{2}, -\frac{2}{3}, \frac{5}{4}, -\frac{4}{5}, \frac{7}{6}, \ldots \right\rangle$. See Figure 4.1.5. The cluster points of $\langle x_n \rangle$ are $-1$ and $1$, which are also the limit inferior and limit superior, respectively. Notice that the subsequence of odd numbered indices is $\langle x_{2k-1} \rangle = \left\langle (-1)^{2k-1} + \frac{1}{2k-1} \right\rangle_{k=1}^{\infty} = \left\langle 0, -\frac{2}{3}, -\frac{4}{5}, \ldots \right\rangle \to -1$ and the subsequence of even numbered indices is $\langle x_{2k} \rangle = \left\langle (-1)^{2k} + \frac{1}{2k} \right\rangle_{k=1}^{\infty} = \left\langle \frac{3}{2}, \frac{5}{4}, \frac{7}{6}, \ldots \right\rangle \to 1$.

Note that while a limit point is unique, we saw in Example 141 that a sequence can have many cluster points.
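The tail suprema and infima in Definition 319 can be inspected numerically for the sequence of Example 323. The Python sketch below is only an illustration under a stated assumption: the tails $k \geq n$ are truncated at a finite horizon, so the computed tail infimum is slightly above the true value $-1$, which is never attained. The function names are hypothetical.

```python
from fractions import Fraction

def x(n):
    """n-th term of the sequence (-1)^n + 1/n, for n = 1, 2, ..."""
    return (-1) ** n + Fraction(1, n)

def tail_sup(n, horizon):
    """Largest x_k with n <= k <= horizon (finite stand-in for sup over k >= n)."""
    return max(x(k) for k in range(n, horizon + 1))

def tail_inf(n, horizon):
    """Smallest x_k with n <= k <= horizon (finite stand-in for inf over k >= n)."""
    return min(x(k) for k in range(n, horizon + 1))

horizon = 2000
starts = (1, 10, 100, 1000)
sups = [tail_sup(n, horizon) for n in starts]
infs = [tail_inf(n, horizon) for n in starts]

# The tail suprema decrease toward 1; the truncated tail infima sit just above -1.
print([float(s) for s in sups])
print([float(i) for i in infs])
```

The tail suprema are attained at the first even index in each tail, so they decrease toward the limit superior 1, while the truncated tail infima stay just above the limit inferior $-1$ and move closer to it as the horizon grows.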
When a sequence has several cluster points, the smallest cluster point is called the limit inferior and the largest cluster point is called the limit superior.

2) We provide another useful criterion in $\mathbb{R}$ to establish convergence, which is true only because $\mathbb{R}$ is totally ordered and complete.

Theorem 324 (Monotone Convergence) Let $\langle x_n \rangle$ be a monotone increasing sequence (i.e. $x_1 \leq x_2 \leq \ldots \leq x_i \leq x_{i+1} \leq \ldots$) in the metric space $(\mathbb{R}, |\cdot|)$.$^{31}$ Then $\langle x_n \rangle$ converges iff it is bounded, and its limit is given by $\lim x_n = \sup\{x_n \mid n \in \mathbb{N}\}$.

$^{31}$ That is, $x_1 \leq x_2 \leq \ldots \leq x_i \leq x_{i+1} \leq \ldots$.

Proof. ($\Rightarrow$) Boundedness follows by Lemma 164, so all we must show is $x = \sup\{x_n\}$. Convergence implies $x - \delta < x_n < x + \delta$, $\forall n \geq N(\delta)$ by Definition 136. As a property of the supremum, we know that if $x_n < y_n$, $\forall n \in \mathbb{N}$, then $\sup x_n \leq \sup y_n$. This implies $x - \delta \leq \sup\{x_n\} \leq x + \delta$, or $|\sup\{x_n\} - x| \leq \delta$. Since $\delta > 0$ was arbitrary, $x = \sup\{x_n\}$.

($\Leftarrow$) If $\langle x_n \rangle$ is a bounded, monotone increasing sequence of real numbers, then by the Completeness Axiom 3.3 its supremum exists (call it $x' = \sup\{x_n\}$). Since $x'$ is a sup, $x' - \delta$ is not an upper bound, and $\exists K(\delta) \in \mathbb{N}$ such that $x' - \delta < x_{K(\delta)}$ for any $\delta > 0$.$^{32}$ Since $\langle x_n \rangle$ is monotone, $x' - \delta < x_n \leq x' < x' + \delta$, $\forall n \geq K(\delta)$, or $|x_n - x'| < \delta$.

$^{32}$ Existence of this index follows from property (ii) in the footnote to Definition ?? of a supremum.

Example 325 Reconsider the sequence $\langle \frac{1}{n} \rangle_{n \in \mathbb{N}}$ of Example 137. It is clear that this sequence is monotone decreasing with infimum 0, which is also its limit.

Exercise 4.11.2 Consider the sequence $f : \mathbb{N} \to \mathbb{R}$ given by $\left\langle (1 + \frac{1}{n})^n \right\rangle_{n \in \mathbb{N}}$. Show that this sequence is increasing and bounded above, so that by the Monotone Convergence Theorem 324 the sequence converges in $(\mathbb{R}, |\cdot|)$.

3) Exercise 4.11.3 Let $(X,d)$ be totally bounded. Show that $X$ is separable.

Chapter 5 Measure Spaces

Many problems in economics lend themselves to analysis in function spaces. For example, in dynamic programming we define an operator that maps functions to functions.
As in the case of metric spaces, we need some way to measure distance between the elements in the function space. Since function spaces are defined on uncountably infinite dimensional sets, the distance measure involves integration.$^{1}$ In this chapter we will focus primarily on Lebesgue integration. Since Lebesgue integration can be applied to a more general class of functions than the more standard Riemann approach, this will allow us to consider, for example, successive approximations to a broader class of functional equations in dynamic programming.

$^{1}$ In Section 4.5, we saw that in the space $\ell_p$ of (countably) infinite sequences, the distance measure involved countable sums; that is, $d(\langle x_n \rangle, \langle y_n \rangle) = \left( \sum_{n=1}^{\infty} |x_n - y_n|^p \right)^{\frac{1}{p}}$. Integration is just the uncountable analogue of summation.

To understand Lebesgue integration we focus on measure spaces. This has the added benefit of introducing us to the building blocks of probability theory. In probability theory, we start with a given underlying set $X$ and assign a probability (just a real-valued function) to subsets of $X$. For instance, if the experiment is a coin toss, then $X = \{H, T\}$ and the set of all possible subsets is given by $\mathcal{P}(X) = \{\emptyset, \{H\}, \{T\}, X\}$ described in Definition 9. Then we assign zero probability to the event where the flip of the coin results in neither an $H$ nor a $T$ (i.e. $\mu(\emptyset) = 0$), we assign probability one to the event where the flip results in either $H$ or $T$ (i.e. $\mu(X) = 1$), and we assign probability $\frac{1}{2}$ to the event where the flip of the fair coin results in $H$ (i.e. $\mu(\{H\}) = \frac{1}{2}$).

One of the important results we show in this chapter is that the collection of Lebesgue measurable sets is a $\sigma$-algebra (Theorem 341) and that the collection of Borel sets is a subset of the Lebesgue measurable sets (Theorem 346). Then we introduce the concept of measurability of a function and a correspondence and define the Lebesgue integral of measurable functions.
Then we provide a set of convergence theorems for the existence of a Lebesgue integral which are applicable under different conditions. These are the Bounded Convergence Theorem 386, Fatou's Lemma 393, the Monotone Convergence Theorem 396, the Lebesgue Dominated Convergence Theorem 404, and Levi's Theorem 407. Essentially these provide conditions under which a limit can be interchanged with an integral. Then we introduce general and signed measures. Here we have two important results, namely the Hahn Decomposition Theorem 427 of a measurable space with respect to a signed measure and the Radon-Nikodym Theorem 434, where a signed measure can be represented simply by an integral. The chapter is concluded by introducing an example of a function space (function spaces are the subject of Chapter 6). In particular, we focus on the space of integrable functions, denoted $L_1$, and prove it is complete in Theorem 443.

5.1 Lebesgue Measure

Before embarking on the general definition of a measure space, in the context of a simple set $X = \mathbb{R}$ we will introduce the notion of length (again just a real-valued function defined on a subset of $\mathbb{R}$), describe desirable properties of a measure space, and describe a simple measure related to length.

Definition 326 A set function associates an extended real number to each set in some collection of sets. In $\mathbb{R}$, the length $l(I)$ of an interval $I \subset \mathbb{R}$ is the difference of the endpoints of $I$.$^{2}$

$^{2}$ That is, $l(I) = b - a$ with $a, b \in \mathbb{R} \cup \{-\infty, \infty\}$, $a < b$, and $I = [a,b], (a,b), [a,b), (-\infty, b]$, etc.

Thus, in the case of the set function length, the domain is the collection of all intervals. We would like to extend the notion of length to more complicated sets than intervals. For instance, we could define the "length" of an open set to be the sum of the lengths of the open intervals of which it is composed. Since the collection of open sets is quite restrictive, we would like to construct a set function $f$ that assigns to each set $E$ in the collection $\mathcal{P}(\mathbb{R})$ a non-negative extended real number $fE$ called the measure of $E$ (i.e. $f : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$).

Remark 1 The "ideal" properties of the set function $f : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$ are: (i) $fE$ is defined for every set $E \subset \mathbb{R}$; (ii) for an interval $I$, $fI = l(I)$; (iii) $f$ is countably additive; that is, if $\{E_n\}_{n \in \mathbb{N}}$ is a collection of disjoint sets (for which $f$ is defined), $f(\cup_{n \in \mathbb{N}} E_n) = \sum_{n \in \mathbb{N}} fE_n$; and (iv) $f$ is translation invariant; that is, if $E$ is a set for which $f$ is defined and if $E + y$ is the set $\{x + y : x \in E\}$ obtained by replacing each point $x \in E$ by the point $x + y$, then $f(E + y) = fE$.$^{3}$

$^{3}$ For instance, translation invariance simply says the length of a unit interval starting at 0 should be the same as that of a unit interval starting at 3.

Unfortunately, it is impossible to construct a set function having all four of the properties in Remark 1. As a result at least one of these four properties must be weakened.

• Following Henri Lebesgue, it is most useful to retain the last three properties (ii)-(iv) and to weaken the property in (i) so that $fE$ need not be defined on all of $\mathcal{P}(\mathbb{R})$.

• It is also possible to weaken (iii) by replacing it with finite additivity (i.e., require that for each finite collection $\{E_n\}_{n=1}^{N}$ of disjoint sets, we have $f(\cup_{n=1}^{N} E_n) = \sum_{n=1}^{N} fE_n$).

5.1.1 Outer measure

Another possibility is to retain (i), (ii), (iv), and weaken (iii) in Remark 1 to allow countable subadditivity (i.e., $f(\cup_{n \in \mathbb{N}} E_n) \leq \sum_{n \in \mathbb{N}} fE_n$). A set function which satisfies this is called the outer measure.

Definition 327 For each set $A \subset \mathbb{R}$, let $\{I_n\}_{n \in \mathbb{N}}$ denote a countable collection of open intervals that covers $A$ (i.e. collections such that $A \subset \cup_{n \in \mathbb{N}} I_n$) and for each such collection consider $\sum_{n \in \mathbb{N}} l(I_n)$. The outer measure $m^* : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$ is given by
$$m^*(A) = \inf_{\{I_n\}_{n \in \mathbb{N}}} \left\{ \sum_{n \in \mathbb{N}} l(I_n) : A \subset \cup_{n \in \mathbb{N}} I_n \right\}.$$

Thus, the outer measure is the least overestimate of the length of a given set $A$.
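To make Definition 327 concrete: any particular cover of $A$ by open intervals gives an upper bound on $m^*(A)$, and the outer measure is the infimum of such bounds. The Python sketch below is only an illustration with hypothetical names. It takes a finite list of rationals in $[0,1]$ (an assumed sample set, enumerated in a fixed order), centers an open interval of length $\varepsilon/2^n$ on the $n$th point, and confirms that the total length of this cover stays below $\varepsilon$.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in [0, 1] with q <= max_den, in a fixed order."""
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out

def shrinking_cover(points, eps):
    """Open interval of length eps / 2**n centred on the n-th point (n = 1, 2, ...)."""
    cover = []
    for n, a in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)  # half of the interval length eps / 2**n
        cover.append((a - half, a + half))
    return cover

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
cover = shrinking_cover(points, eps)
total_length = sum(b - a for a, b in cover)

# The cover's total length is an upper bound on the outer measure of the point set.
print(len(points), float(total_length), total_length < eps)
```

Because the bound holds for every $\varepsilon > 0$ and does not depend on how many points are enumerated, the same construction suggests why sets that can be listed in a sequence have arbitrarily small covers.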
The outer measure is well defined since each element of $\mathcal{P}(\mathbb{R})$ (i.e. each subset $A \subset \mathbb{R}$) can be covered by a countable collection of open intervals, which follows from Theorem 108. We establish the properties of the outer measure in the next series of theorems.

Theorem 328 (i) $m^*(A) \geq 0$. (ii) $m^*(\emptyset) = 0$. (iii) If $A \subset B$, then $m^*(A) \leq m^*(B)$ (i.e. monotonicity). (iv) $m^*(A) = 0$ for every singleton set $A$. (v) $m^*$ is translation invariant.

Exercise 5.1.1 Prove Theorem 328. (See Theorem 2.2, p. 56 of Jain and Gupta.)

The next theorem shows that the outer measure, which is defined for any subset of $\mathbb{R}$, extends the notion of length.

Theorem 329 The outer measure of an interval is its length.

Proof. (Sketch) Let $\{I_n\}$ be an open covering of $[a,b]$. Then by the Heine-Borel Theorem 194 there is a finite subcollection of intervals $(a_i, b_i)$, $i = 1, \ldots, N$, that also covers $[a,b]$. Arrange them so that their left endpoints form an increasing sequence $a_1 < a_2 < \ldots < a_N$. See Figure 5.1.1. Since $[a,b]$ is connected, the intervals must overlap, which means that $\cup_{i=1}^N (a_i, b_i) = (a_1, b_k)$ for some $k$ with $1 \leq k \leq N$ and $[a,b] \subset (a_1, b_k)$. Thus $b - a \leq b_k - a_1 \leq \sum_{n=1}^{\infty} l(I_n)$, so $m^*([a,b]) \geq b - a$. The reverse inequality holds because, for every $\varepsilon > 0$, the single interval $(a - \varepsilon, b + \varepsilon)$ covers $[a,b]$.

Definition 330 Let $\{A_n\}$ be a countable collection of sets with $A_n \subset \mathbb{R}$. We say $m^*$ is countably subadditive if $m^*(\cup_{n \in \mathbb{N}} A_n) \leq \sum_{n \in \mathbb{N}} m^*(A_n)$.

Theorem 331 Let $\{A_n\}$ be a countable collection of sets with $A_n \subset \mathbb{R}$. Then $m^*$ is countably subadditive.

Proof. (Sketch) By the infimum property, for a given $\varepsilon > 0$ and each $n$ there is a countable collection of intervals $\{I_{n,k}\}_{k \in \mathbb{N}}$ covering $A_n$ (i.e. $A_n \subset \cup_{k \in \mathbb{N}} I_{n,k}$) such that $\sum_{k \in \mathbb{N}} l(I_{n,k}) \leq m^*(A_n) + \frac{\varepsilon}{2^n}$. Notice that $\cup_{n \in \mathbb{N}} A_n$ is covered by $\cup_{n \in \mathbb{N}} (\cup_{k \in \mathbb{N}} I_{n,k})$, which is a countable union of countable collections and hence countable. By the definition of $m^*$ as an infimum over such covers we have
\[ m^*(\cup_{n \in \mathbb{N}} A_n) \leq \sum_{n \in \mathbb{N}} \sum_{k \in \mathbb{N}} l(I_{n,k}) \leq \sum_{n \in \mathbb{N}} m^*(A_n) + \varepsilon \]
since $\sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon$. Subadditivity follows since $\varepsilon > 0$ was arbitrary and we can let $\varepsilon \to 0$.

There are also two important corollaries that follow from Theorem 331.
The first important point is that there are unbounded sets with finite outer measure.

Corollary 332 If $A$ is a countable set, then $m^*(A) = 0$.

Proof. Since $A$ is countable, it can be expressed as $\{a_1, a_2, \ldots, a_n, \ldots\}$. Given $\varepsilon > 0$, we can enclose each $a_n$ in an open interval $I_n$ with $l(I_n) = \frac{\varepsilon}{2^n}$ to get
\[ m^*(A) \leq \sum_{n \in \mathbb{N}} l(I_n) = \sum_{n \in \mathbb{N}} \frac{\varepsilon}{2^n} = \varepsilon. \]
The result follows as we let $\varepsilon \to 0$.

One important example of this is $A = \mathbb{Q}$ (i.e. the rationals are a set of outer measure zero). The contrapositive of Corollary 332, that a set with outer measure different from zero is uncountable, is of course also true.

Corollary 333 $[0,1]$ is uncountable.

Proof. Suppose, to the contrary, that $[0,1]$ is countable. Then by Corollary 332, $m^*([0,1]) = 0$, in which case $l([0,1]) = 0$ by Theorem 329, which is a contradiction.

The converse of Corollary 332, that a set with outer measure zero is countable, is not always true. To see this, consider the Cantor set $F$ constructed in Section 3.4. In particular,
\[ F = \cap_{n \in \mathbb{N}} F_n \equiv [0,1] \setminus \Big( \cup_{n \in \mathbb{N}} A_n \Big), \]
where $F_n = [0,1] \setminus A_n$ and

• $A_1 = (\frac{1}{3}, \frac{2}{3})$

• $A_2 = (\frac{1}{9}, \frac{2}{9}) \cup (\frac{3}{9}, \frac{6}{9}) \cup (\frac{7}{9}, \frac{8}{9})$

• $A_3 = (\frac{1}{27}, \frac{2}{27}) \cup (\frac{1}{9}, \frac{2}{9}) \cup (\frac{7}{27}, \frac{8}{27}) \cup (\frac{1}{3}, \frac{2}{3}) \cup (\frac{19}{27}, \frac{20}{27}) \cup (\frac{7}{9}, \frac{8}{9}) \cup (\frac{25}{27}, \frac{26}{27})$,

• etc.

But

• $m^*(A_1) = \frac{1}{3}$

• $m^*(A_2) = \frac{1}{9} + \frac{1}{3} + \frac{1}{9} = 2^1 \cdot (\frac{1}{3})^2 + 2^0 \cdot (\frac{1}{3})^1$

• $m^*(A_3) = \frac{1}{27} + \frac{1}{9} + \frac{1}{27} + \frac{1}{3} + \frac{1}{27} + \frac{1}{9} + \frac{1}{27} = 2^2 \cdot (\frac{1}{3})^3 + 2^1 \cdot (\frac{1}{3})^2 + 2^0 \cdot (\frac{1}{3})^1$

and in general
\[ m^*(A_n) = 2^{n-1} \Big(\frac{1}{3}\Big)^n + 2^{n-2} \Big(\frac{1}{3}\Big)^{n-1} + \ldots + 2^1 \Big(\frac{1}{3}\Big)^2 + 2^0 \Big(\frac{1}{3}\Big)^1 = \frac{1}{3} \Big[ \Big(\frac{2}{3}\Big)^{n-1} + \Big(\frac{2}{3}\Big)^{n-2} + \ldots + \Big(\frac{2}{3}\Big)^1 + 1 \Big] = \frac{1}{3} \cdot \frac{1 - (\frac{2}{3})^n}{1 - \frac{2}{3}} = 1 - \Big(\frac{2}{3}\Big)^n. \]
Since $F_1 \supset F_2 \supset \ldots \supset F_n \supset \ldots$ and $m^*(F_1) = \frac{2}{3} < \infty$, by Theorem 344
\[ m^*(F) = \lim_{n \to \infty} m^*(F_n) = \lim_{n \to \infty} m^*([0,1] \setminus A_n) = \lim_{n \to \infty} [m^*([0,1]) - m^*(A_n)] = 1 - \lim_{n \to \infty} \Big[ 1 - \Big(\frac{2}{3}\Big)^n \Big] = 1 - 1 = 0. \]
Hence, the Cantor set presents an example of an uncountable set with outer measure zero.
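The closed form $m^*(A_n) = 1 - (\frac{2}{3})^n$ can be checked by building the removed intervals directly. The following is a minimal sketch, assuming exact rational arithmetic; the function names `removed_intervals` and `total_length` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def removed_intervals(n):
    """Open middle thirds removed from [0, 1] through stage n (the set A_n)."""
    kept = [(Fraction(0), Fraction(1))]  # closed intervals that make up F_0 = [0, 1]
    removed = []
    for _ in range(n):
        next_kept = []
        for a, b in kept:
            third = (b - a) / 3
            removed.append((a + third, b - third))  # open middle third of [a, b]
            next_kept.append((a, a + third))        # left closed third survives
            next_kept.append((b - third, b))        # right closed third survives
        kept = next_kept
    return sorted(removed)

def total_length(intervals):
    """Sum of the lengths of disjoint intervals (a, b)."""
    return sum((b - a for a, b in intervals), Fraction(0))

for n in range(1, 6):
    print(n, total_length(removed_intervals(n)), 1 - Fraction(2, 3) ** n)
```

At each stage the two printed fractions agree, and the remaining length $(\frac{2}{3})^n$ shrinks to zero, which is the computation behind $m^*(F) = 0$.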
Sets of outer measure zero provide another notion of "small" sets. From the point of view of cardinality, $F$ is big (uncountable) while $\mathbb{Q}$ is small (countable). From the topological point of view, $F$ is small (nowhere dense) while $\mathbb{Q}$ is big (dense). From the point of view of measure, both $F$ and $\mathbb{Q}$ are small (measure zero).

5.1.2 $\mathcal{L}$-measurable sets

While the outer measure has the advantage that it is defined on all of $\mathcal{P}(\mathbb{R})$, Theorem 331 showed that it is countably subadditive but not necessarily countably additive. In order to satisfy countable additivity, we have to restrict the domain of the function $m^*$ to some suitable subset, call it $\mathcal{L}$ (for Lebesgue), of $\mathcal{P}(\mathbb{R})$. The members of $\mathcal{L}$ are called $\mathcal{L}$-measurable sets.

Definition 334 A set $E \subset \mathbb{R}$ is (Lebesgue) $\mathcal{L}$-measurable if $\forall A \subset \mathbb{R}$ we have $m^*(A) = m^*(A \cap E) + m^*(A \cap E^c)$.

The definition of $\mathcal{L}$-measurability says that the measurable sets are those (bounded or unbounded) which split every set (measurable or not) into two parts that are additive with respect to the outer measure. Since $A = (A \cap E) \cup (A \cap E^c)$ and $m^*$ is subadditive, we always have $m^*(A) \leq m^*(A \cap E) + m^*(A \cap E^c)$. Thus, in order to establish that $E$ is measurable, we need only show, for any set $A$, that
\[ m^*(A) \geq m^*(A \cap E) + m^*(A \cap E^c). \quad (5.1) \]
Inequality (5.1) is often used in practice to determine whether a given set $E$ is measurable, where $A$ is called the test set.

Since Definition 334 is symmetric in $E$ and $E^c$, we have that $E^c$ is $\mathcal{L}$-measurable whenever $E$ is. Clearly, $\emptyset$ and $\mathbb{R}$ are $\mathcal{L}$-measurable.

Lemma 335 If $m^*(E) = 0$, then $E$ is $\mathcal{L}$-measurable.

Proof. Let $A \subset \mathbb{R}$ be any set. Since $A \cap E \subset E$, monotonicity gives $m^*(A \cap E) \leq m^*(E) = 0$. Since $A \cap E^c \subset A$, we have $m^*(A) \geq m^*(A \cap E^c) = m^*(A \cap E^c) + m^*(A \cap E)$, where the equality uses $m^*(A \cap E) = 0$. This is (5.1), hence $E$ is $\mathcal{L}$-measurable.

Corollary 336 Every countable set is $\mathcal{L}$-measurable and its measure is zero.

Proof. From Lemma 335 and Corollary 332.

Exercise 5.1.2 Show that if $m^*(E) = 0$, then $m^*(E \cup A) = m^*(A)$, and that if in addition $A \subset E$, then $m^*(A) = 0$.
Lemma 337 If $E_1$ and $E_2$ are $\mathcal{L}$-measurable, so is $E_1 \cup E_2$.

Proof. Since $E_1$ and $E_2$ are $\mathcal{L}$-measurable, for any set $A$ we have
\[ \begin{aligned} m^*(A) &= m^*(A \cap E_1) + m^*(A \cap E_1^c) \\ &= m^*(A \cap E_1) + m^*([A \cap E_1^c] \cap E_2) + m^*([A \cap E_1^c] \cap E_2^c) \\ &= m^*(A \cap E_1) + m^*(A \cap E_2 \cap E_1^c) + m^*(A \cap [E_1 \cup E_2]^c) \\ &\geq m^*(A \cap [E_1 \cup E_2]) + m^*(A \cap [E_1 \cup E_2]^c) \end{aligned} \]
where the first equality follows from Definition 334 and the fact that $E_1$ is measurable, the second equality follows from the definition with test set $A \cap E_1^c$ and the fact that $E_2$ is measurable, the third equality follows from simple set operations like DeMorgan's law, and the inequality follows from the subadditivity of $m^*$ and the fact that $[A \cap E_1] \cup [A \cap E_2 \cap E_1^c] = A \cap [E_1 \cup E_2]$.^4 But this satisfies (5.1), which is sufficient for $\mathcal{L}$-measurability.

^4 That is, we know $m^*([A \cap E_1] \cup [A \cap E_2 \cap E_1^c]) = m^*(A \cap [E_1 \cup E_2])$, and subadditivity implies $m^*([A \cap E_1] \cup [A \cap E_2 \cap E_1^c]) \leq m^*(A \cap E_1) + m^*(A \cap E_2 \cap E_1^c)$.

Corollary 338 The collection $\mathcal{L}$ of all $\mathcal{L}$-measurable sets is an algebra of sets in $\mathcal{P}(\mathbb{R})$.

Proof. Follows from Definition 81, the symmetry (w.r.t. complements) in Definition 334, and Lemma 337.

Lemma 339 Let $A \subset \mathbb{R}$ be any set and $\{E_n\}_{n=1}^N$ be a finite collection of disjoint $\mathcal{L}$-measurable sets in $\mathbb{R}$. Then $m^*(A \cap [\cup_{n=1}^N E_n]) = \sum_{n=1}^N m^*(A \cap E_n)$.

Proof. The result is clearly true for $N = 1$. Consider an induction on $N$ and suppose the result is true for $N - 1$. Since the $E_n$ are disjoint, $A \cap [\cup_{n=1}^N E_n] \cap E_N = A \cap E_N$ and $A \cap [\cup_{n=1}^N E_n] \cap E_N^c = A \cap [\cup_{n=1}^{N-1} E_n]$. Then
\[ m^*(A \cap [\cup_{n=1}^N E_n]) = m^*(A \cap E_N) + m^*(A \cap [\cup_{n=1}^{N-1} E_n]) = m^*(A \cap E_N) + \sum_{n=1}^{N-1} m^*(A \cap E_n) \]
where the first equality follows from Definition 334 (applied to the measurable set $E_N$ with test set $A \cap [\cup_{n=1}^N E_n]$) and the second follows since the result is true for $N - 1$.

Corollary 340 If $\{E_n\}_{n=1}^N$ is a finite collection of disjoint $\mathcal{L}$-measurable sets in $\mathbb{R}$, then $m^*(\cup_{n=1}^N E_n) = \sum_{n=1}^N m^*(E_n)$.

Proof. Taking $A = \mathbb{R}$, the result follows from Corollary 338 and Lemma 339.

The result in Corollary 340 verifies that $m^*$ restricted to $\mathcal{L}$ is finitely additive. However, we would like to extend it to the more general case of countable additivity.
First, we must show that $\mathcal{L}$ is a $\sigma$-algebra (as discussed in Section 2.6), so that $\cup_{n=1}^{\infty} E_n \in \mathcal{L}$ for any collection $\{E_n\}$ with $E_n \in \mathcal{L}$ and countable additivity can be stated for sets in $\mathcal{L}$.

Theorem 341 The collection $\mathcal{L}$ of all $\mathcal{L}$-measurable sets is a $\sigma$-algebra of sets in $\mathcal{P}(\mathbb{R})$.

Proof. (Sketch) Let $E = \cup_{n \in \mathbb{N}} E_n$ with $E_n \in \mathcal{L}$. Since $\mathcal{L}$ is an algebra, we may take the $E_n$ to be disjoint (replace $E_n$ by $E_n \setminus \cup_{k < n} E_k$, which leaves the union unchanged and keeps each set in $\mathcal{L}$). Let $F_N = \cup_{n=1}^N E_n$. Again because $\mathcal{L}$ is an algebra, $F_N \in \mathcal{L}$, and $F_N^c \supset E^c$.^5 Hence, for any test set $A$,
\[ m^*(A) = m^*(A \cap F_N) + m^*(A \cap F_N^c) \geq m^*(A \cap F_N) + m^*(A \cap E^c) = \sum_{n=1}^N m^*(A \cap E_n) + m^*(A \cap E^c), \]
where the first equality follows by Definition 334, the inequality follows since $F_N^c \supset E^c$, and the last equality follows by Lemma 339. Since the left hand side is independent of $N$, letting $N \to \infty$ and using countable subadditivity of $m^*$ we have
\[ m^*(A) \geq \sum_{n=1}^{\infty} m^*(A \cap E_n) + m^*(A \cap E^c) \geq m^*(A \cap E) + m^*(A \cap E^c), \]
which is (5.1).

^5 Recall by DeMorgan's Law that $F_N^c = [\cup_{n=1}^N E_n]^c = \cap_{n=1}^N E_n^c$.

Definition 342 The set function $m : \mathcal{L} \to \mathbb{R}_+ \cup \{\infty\}$, obtained by restricting the function $m^*$ to the $\sigma$-algebra $\mathcal{L} \subset \mathcal{P}(\mathbb{R})$, is called the Lebesgue measure. That is, $m = m^*|_{\mathcal{L}}$.^6

The next result shows that after relaxing point (i) in Remark 1 we can satisfy property (iii) with the Lebesgue measure.

Theorem 343 If $\{E_n\}_{n \in \mathbb{N}}$ is a countable collection of disjoint $\mathcal{L}$-measurable sets in $\mathbb{R}$, then $m(\cup_{n \in \mathbb{N}} E_n) = \sum_{n \in \mathbb{N}} m(E_n)$.

Proof. Since $\cup_{n=1}^N E_n \subset \cup_{n \in \mathbb{N}} E_n$, $\forall N \in \mathbb{N}$, and both sets are $\mathcal{L}$-measurable by Corollary 338 and Theorem 341, we have $m(\cup_{n \in \mathbb{N}} E_n) \geq m(\cup_{n=1}^N E_n) = \sum_{n=1}^N m(E_n)$, where the inequality is monotonicity and the equality follows by Corollary 340. Since the left hand side of the inequality is independent of $N$, letting $N \to \infty$ we have $m(\cup_{n \in \mathbb{N}} E_n) \geq \sum_{n \in \mathbb{N}} m(E_n)$. Since the reverse inequality holds by countable subadditivity in Theorem 331, the result follows.

The next property will be useful in proving certain convergence properties in upcoming sections and can be viewed as a continuity property of the Lebesgue measure.
^6 This follows from Definition 56.

Theorem 344 Let $\langle E_n \rangle$ be an infinite decreasing sequence of $\mathcal{L}$-measurable sets (i.e. $E_{n+1} \subset E_n$, $\forall n$). Let $mE_1$ be finite. Then $m(\cap_{i=1}^{\infty} E_i) = \lim_{n \to \infty} m(E_n)$.

Proof. Since $\mathcal{L}$ is a $\sigma$-algebra, $\cap_{n=1}^{\infty} E_n \in \mathcal{L}$. The set $E_1 \setminus \cap_{n=1}^{\infty} E_n$ can be written as the union of the mutually disjoint sets $\{E_n \setminus E_{n+1}\}$ (see Figure 5.1.2):
\[ E_1 \setminus \cap_{n=1}^{\infty} E_n = (E_1 \setminus E_2) \cup (E_2 \setminus E_3) \cup \ldots \cup (E_n \setminus E_{n+1}) \cup \ldots \]
Then using countable additivity of $m$ (and the finiteness of $mE_1$, which allows the measures to be subtracted) we have
\[ m(E_1) - m(\cap_{n=1}^{\infty} E_n) = m(E_1 \setminus \cap_{n=1}^{\infty} E_n) = m(\cup_{n=1}^{\infty} [E_n \setminus E_{n+1}]) = \sum_{n=1}^{\infty} m(E_n \setminus E_{n+1}) = \sum_{n=1}^{\infty} [m(E_n) - m(E_{n+1})] = m(E_1) - \lim_{n \to \infty} m(E_n). \]
Comparing the beginning and end we have $m(\cap_{n=1}^{\infty} E_n) = \lim_{n \to \infty} m(E_n)$.

5.1.3 Lebesgue meets Borel

Now that we know that $\mathcal{L}$ is a $\sigma$-algebra, we might ask what type of sets belong to $\mathcal{L}$. For example, are open and/or closed sets in $\mathcal{L}$?

Lemma 345 The interval $(a, \infty)$ is $\mathcal{L}$-measurable.

Proof. (Sketch) Take any set $A \subset \mathbb{R}$. The open ray $(a, \infty)$ splits $A$ into two disjoint parts $A_1 = (a, \infty) \cap A$ and $A_2 = (-\infty, a] \cap A$. According to (5.1), it suffices to show $m^*(A) \geq m^*(A_1) + m^*(A_2)$. For $\varepsilon > 0$, there is a countable collection $\{I_n\}$ of open intervals which covers $A$ and satisfies $\sum_{n=1}^{\infty} l(I_n) \leq m^*(A) + \varepsilon$ by the infimum property in Definition 327. Again, $(a, \infty)$ splits each interval $I_n \in \{I_n\}$ into two disjoint intervals $I_n' = I_n \cap (a, \infty)$ and $I_n'' = I_n \cap (-\infty, a]$. Clearly $\{I_n'\}$ covers $A_1$, $\{I_n''\}$ covers $A_2$, and $\sum_n l(I_n) = \sum_n l(I_n') + \sum_n l(I_n'')$. By monotonicity and subadditivity of $m^*$ we have
\[ m^*(A_1) + m^*(A_2) \leq m^*(\cup_n I_n') + m^*(\cup_n I_n'') \leq \sum_{n=1}^{\infty} l(I_n') + \sum_{n=1}^{\infty} l(I_n'') = \sum_{n=1}^{\infty} l(I_n) \leq m^*(A) + \varepsilon. \]
But since $\varepsilon > 0$ was arbitrary, the result follows.

The next result shows, in particular, that every open or closed set in $\mathbb{R}$ is $\mathcal{L}$-measurable.

Theorem 346 Every Borel set is $\mathcal{L}$-measurable.

Proof. The result follows from Lemma 345, the fact that $\mathcal{L}$ is a $\sigma$-algebra (Theorem 341), and Theorem 124, which states that the collection of all open rays generates $\mathcal{B}$.

Thus, the Lebesgue measure $m$ is defined for Borel sets.
Hence we can work with sets we know a lot about. While it is beyond the scope of the book, we note that there are examples of sets that show that both inclusions $\mathcal{B} \subset \mathcal{L}$ and $\mathcal{L} \subset \mathcal{P}(\mathbb{R})$ are strict.^7

^7 For an example of a non-measurable set see p. 289 of Carothers (2000).

The next theorem gives a useful characterization of measurable sets. It asserts that a measurable set can be "approximated" by open and closed sets. See Figure 5.1.3.

Theorem 347 Let $E$ be a measurable subset of $\mathbb{R}$. Then for each $\varepsilon > 0$, there exist an open set $G$ and a closed set $F$ such that $F \subset E \subset G$ and $m(G \setminus F) < \varepsilon$.

Proof. Since $E$ is measurable, $m(E) = m^*(E)$. We use the infimum property for the sets $E$ and $E^c$. Given $\frac{\varepsilon}{2}$, there exist open sets $G$ and $H$ (remember that the union of open intervals is an open set) such that
\[ E \subset G \text{ and } m(G) < m(E) + \frac{\varepsilon}{2}, \qquad E^c \subset H \text{ and } m(H) < m(E^c) + \frac{\varepsilon}{2}. \]
Set $F = H^c$. See Figure 5.1.4. Then by the properties of complements, we have that $F$ is closed, $F \subset E$, and $m(E) - m(F) < \frac{\varepsilon}{2}$. Thus we have $F \subset E \subset G$ and
\[ m(G \setminus F) = m(G) - m(F) = m(G) - m(E) + m(E) - m(F) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
The first equality is due to additivity and the second is simply an identity.

5.1.4 $\mathcal{L}$-measurable mappings

Before we actually begin to integrate a mapping, we must know that a given mapping is integrable. We break this topic up into two parts: functions and correspondences.

Functions

Roughly speaking, a function is integrable if its behavior is not too irregular and if the values it takes on are not too large too often. We now introduce the notion of measurability, which gives precisely the conditions required for integrability, provided the function is not too large.

Definition 348 Let $f : E \to \mathbb{R} \cup \{-\infty, \infty\}$ where $E$ is $\mathcal{L}$-measurable. Then $f$ is $\mathcal{L}$-measurable if the set $\{x \in E : f(x) \leq \alpha\} \in \mathcal{L}$, $\forall \alpha \in \mathbb{R}$.

It is clear from the above definition that there is a close relation between measurability of a function and the measurability of the inverse image set. In particular, it can be shown that $f$ is $\mathcal{L}$-measurable if and only if
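for any closed set $G \subset \mathbb{R}$, the inverse image $f^{-1}(G)$ is a measurable set. See Figure 5.1.4.1. As $\alpha$ varies, the behavior of the set $\{x \in E : f(x) \leq \alpha\}$ describes how the values of the function $f$ are distributed. The smoother $f$ is, the smaller the variety of inverse images which satisfy the restriction on $f$.

Example 349 Consider an indicator or characteristic function $\chi_A : \mathbb{R} \to \mathbb{R}$ given by
\[ \chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases} \]
with $A \subset \mathbb{R}$. Then $\chi_A$ is $\mathcal{L}$-measurable iff $A \in \mathcal{L}$. To see this, note that
\[ \{x \in \mathbb{R} : \chi_A(x) \leq \alpha\} = \begin{cases} \emptyset & \text{if } \alpha < 0 \\ \mathbb{R} \setminus A & \text{if } 0 \leq \alpha < 1 \\ \mathbb{R} & \text{if } 1 \leq \alpha \end{cases} \]
and the sets $\emptyset$, $A^c$, $\mathbb{R}$ all belong to $\mathcal{L}$ exactly when $A \in \mathcal{L}$. See Figure 5.1.4.2a.

Example 350 Let the function $f : [0,1] \to \mathbb{R}$ be given by
\[ f(x) = \begin{cases} 1 & \text{if } x = 0 \\ \frac{1}{x} & \text{if } 0 < x < 1 \\ 2 & \text{if } x = 1 \end{cases} \]
Notice that this function is neither continuous nor monotone. To see that $f$ is $\mathcal{L}$-measurable, note that
\[ \{x \in [0,1] : f(x) \leq \alpha\} = \begin{cases} \emptyset & \alpha < 1 \\ \{0\} & \alpha = 1 \\ [\frac{1}{\alpha}, 1) \cup \{0\} & 1 < \alpha < 2 \\ [\frac{1}{\alpha}, 1] \cup \{0\} & 2 \leq \alpha \end{cases} \]
Again, all these sets are in $\mathcal{L}$. See Figure 5.1.4.2b. This example shows that an $\mathcal{L}$-measurable function need not be continuous.

The next result establishes that there are many criteria by which to establish measurability of a function.

Theorem 351 Let $f : E \to \mathbb{R} \cup \{-\infty, \infty\}$ where $E$ is $\mathcal{L}$-measurable. Then the following statements are equivalent: (i) $\{x \in E : f(x) \leq \alpha\}$ is $\mathcal{L}$-measurable $\forall \alpha \in \mathbb{R}$; (ii) $\{x \in E : f(x) > \alpha\}$ is $\mathcal{L}$-measurable $\forall \alpha \in \mathbb{R}$; (iii) $\{x \in E : f(x) \geq \alpha\}$ is $\mathcal{L}$-measurable $\forall \alpha \in \mathbb{R}$; (iv) $\{x \in E : f(x) < \alpha\}$ is $\mathcal{L}$-measurable $\forall \alpha \in \mathbb{R}$. These statements imply (v) $\{x \in E : f(x) = \alpha\}$ is $\mathcal{L}$-measurable $\forall \alpha \in \mathbb{R} \cup \{-\infty, \infty\}$.

Proof. (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (i) can be established from
\[ \begin{aligned} \{x \in E : f(x) > \alpha\} &= E \setminus \{x \in E : f(x) \leq \alpha\} \\ \{x \in E : f(x) \geq \alpha\} &= \cap_{n=1}^{\infty} \{x \in E : f(x) > \alpha - \tfrac{1}{n}\} \\ \{x \in E : f(x) < \alpha\} &= E \setminus \{x \in E : f(x) \geq \alpha\} \\ \{x \in E : f(x) \leq \alpha\} &= \cap_{n=1}^{\infty} \{x \in E : f(x) < \alpha + \tfrac{1}{n}\} \end{aligned} \]
where each operation follows since $\mathcal{L}$ is a $\sigma$-algebra (which is closed under complementation and countable intersection).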
Next, if $\alpha \in \mathbb{R}$, then $\{x \in E : f(x) = \alpha\} = \{x \in E : f(x) \leq \alpha\} \cap \{x \in E : f(x) \geq \alpha\}$. If $\alpha = \infty$, then since $\{x \in E : f(x) = \infty\} = \cap_{n=1}^{\infty} \{x \in E : f(x) \geq n\}$ we have (iii) $\Rightarrow$ (v). A similar result holds for $\alpha = -\infty$. In each case we use that $\mathcal{L}$ is a $\sigma$-algebra (which is closed under countable intersections and under differences of measurable sets).

Next we present some properties of $\mathcal{L}$-measurable functions.

Lemma 352 (i) If $f$ is an $\mathcal{L}$-measurable function on the set $E$ and $E_1 \subset E$ is an $\mathcal{L}$-measurable set, then $f$ is an $\mathcal{L}$-measurable function on $E_1$. (ii) If $f$ and $g$ are $\mathcal{L}$-measurable functions on $E$, then the set $\{x \in E : f(x) < g(x)\}$ is $\mathcal{L}$-measurable.

Proof. (i) follows since $\{x \in E_1 : f(x) > \alpha\} = \{x \in E : f(x) > \alpha\} \cap E_1$ and the intersection of two $\mathcal{L}$-measurable sets is measurable. (ii) Define $A_q = \{x \in E : f(x) < q < g(x)\}$ with $q \in \mathbb{Q}$; whenever $f(x) < g(x)$, the existence of such a rational $q$ is guaranteed by Theorem 100. Then $A_q = \{x \in E : f(x) < q\} \cap \{x \in E : g(x) > q\}$ and $\{x \in E : f(x) < g(x)\} = \cup_{q \in \mathbb{Q}} A_q$, which is a countable union of $\mathcal{L}$-measurable sets.

The next theorem establishes that certain operations performed on $\mathcal{L}$-measurable functions preserve measurability.

Theorem 353 Let $f$ and $g$ be $\mathcal{L}$-measurable functions on $E$ and $c$ be a constant. Then the following functions are $\mathcal{L}$-measurable: (i) $f \pm c$; (ii) $cf$; (iii) $f \pm g$; (iv) $|f|$; (v) $f^2$; (vi) $fg$.

Proof. (i) $\{x \in E : f(x) \pm c > \alpha\} = \{x \in E : f(x) > \alpha'\}$ with $\alpha' = \alpha \mp c$, so $f \pm c$ is $\mathcal{L}$-measurable when $f$ is.

(ii) If $c = 0$, then $cf$ is $\mathcal{L}$-measurable since any constant function is $\mathcal{L}$-measurable. Otherwise
\[ \{x \in E : cf(x) > \alpha\} = \begin{cases} \{x \in E : f(x) > \alpha'\} & \text{if } c > 0 \\ \{x \in E : f(x) < \alpha'\} & \text{if } c < 0 \end{cases} \]
with $\alpha' = \frac{\alpha}{c}$, which is $\mathcal{L}$-measurable since $f$ is $\mathcal{L}$-measurable.

(iii) $\{x \in E : f(x) + g(x) > \alpha\} = \{x \in E : f(x) > \alpha - g(x)\}$. Since $\alpha - g$ is $\mathcal{L}$-measurable by (i) and (ii), $f + g$ is $\mathcal{L}$-measurable by Lemma 352 (ii). The case $f - g$ follows by applying this to $f$ and $-g$.

(iv) Follows since
\[ \{x \in E : |f(x)| > \alpha\} = \begin{cases} E & \text{if } \alpha < 0 \\ \{x \in E : f(x) > \alpha\} \cup \{x \in E : f(x) < -\alpha\} & \text{if } \alpha \geq 0 \end{cases} \]
and both sets on the right hand side are $\mathcal{L}$-measurable since $f$ is $\mathcal{L}$-measurable.
(v) Follows from
\[ \{x \in E : (f(x))^2 > \alpha\} = \begin{cases} E & \text{if } \alpha < 0 \\ \{x \in E : |f(x)| > \sqrt{\alpha}\} & \text{if } \alpha \geq 0 \end{cases} \]
and (iv).

(vi) Follows from the identity $fg = \frac{1}{2}[(f + g)^2 - f^2 - g^2]$ and (ii), (iii), (v).

Parts (ii) and (iii) of Theorem 353 imply that scaled linear combinations of indicator functions defined on measurable sets are themselves measurable functions. This type of function, known as a simple function, will play an important role in approximating a given function. As we will see in the next section, unlike the standard (Riemann) way of approximating the integral of $f$ by calculating the area under $f$ based on partitions of the domain, in this chapter we will be approximating the integral of $f$ by calculating the area under $f$ based on partitions of the range. See Figure 5.1.4.2c.

Definition 354 A function $\varphi : E \to \mathbb{R}$ given by
\[ \varphi(x) = \sum_{i=1}^n a_i \chi_{E_i}(x) \quad (5.2) \]
is called a simple function if there is a finite collection $\{E_1, \ldots, E_n\}$ of disjoint $\mathcal{L}$-measurable sets with $\cup_{i=1}^n E_i = E$ and a finite set of real numbers $\{a_1, \ldots, a_n\}$ such that $a_i = \varphi(x)$, $\forall x \in E_i$, for $i = 1, \ldots, n$, where $\chi_{E_i}(x)$ is an indicator function introduced in Example 349. The right hand side of (5.2) is called the representation of $\varphi$.

We note that the real numbers $\{a_i\}$ and the sets $\{E_i\}$ in this representation are not uniquely determined, as the next example shows.

Example 355 Let $\{E_1, E_2, E_3\}$ be disjoint subsets of an $\mathcal{L}$-measurable set $E$. Consider the two simple functions $2\chi_{E_1} + 5\chi_{E_2} + 2\chi_{E_3}$ and $2\chi_{E_1 \cup E_3} + 5\chi_{E_2}$. Clearly these two simple functions are equal. Notice that the coefficients in the first representation are not distinct.

Since a simple function takes only a finite number of values $\{a_1, \ldots, a_n\}$ on $E$, we can construct the inverse image sets $\{A_i\}$ as $A_i = \varphi^{-1}(\{a_i\}) = \{x \in E : \varphi(x) = a_i\}$, $i = 1, \ldots, n$, $n \in \mathbb{N}$. In this example, $A_1 = \varphi^{-1}(\{2\}) = E_1 \cup E_3$ and $A_2 = \varphi^{-1}(\{5\}) = E_2$, both of which are $\mathcal{L}$-measurable, disjoint sets.
To avoid such problems with non-uniqueness, we use this construction to define the canonical (or standard) representation of $\varphi : E \to \mathbb{R}$ by $\varphi(x) = \sum_{i=1}^k a_i \chi_{A_i}(x)$, where the finite collection $\{A_1, \ldots, A_k\}$ of $\mathcal{L}$-measurable sets is disjoint with $\cup_{i=1}^k A_i = E$ and the finite set of real numbers $\{a_1, \ldots, a_k\}$ are distinct and nonzero.

The next theorem and exercise provide sufficient conditions for $\mathcal{L}$-measurability.

Theorem 356 A continuous function defined on an $\mathcal{L}$-measurable set is $\mathcal{L}$-measurable.

Proof. Let $f$ be a continuous function defined on $E$ (which is $\mathcal{L}$-measurable). Consider the set $A = \{x \in E : f(x) > \alpha\}$, which is the inverse image of the open ray $(\alpha, \infty)$. Since $f$ is continuous, Theorem 248 implies $f^{-1}((\alpha, \infty))$ is open (relative to $E$) and hence $\mathcal{L}$-measurable.

Exercise 5.1.3 Show that any monotone function $f : \mathbb{R} \to \mathbb{R}$ is $\mathcal{L}$-measurable.

Next we consider how sequences of $\mathcal{L}$-measurable functions behave.

Theorem 357 Let $\langle f_n \rangle$ be a sequence of $\mathcal{L}$-measurable functions on a common domain $E$. Then the functions $\max\{f_1, \ldots, f_n\}$, $\min\{f_1, \ldots, f_n\}$, $\sup_n f_n$, $\inf_n f_n$, $\limsup_n f_n$, and $\liminf_n f_n$ are all $\mathcal{L}$-measurable.

Proof. If $g(x) = \max\{f_1(x), \ldots, f_n(x)\}$, then $\{x \in E : g(x) > \alpha\} = \cup_{i=1}^n \{x \in E : f_i(x) > \alpha\}$, and $\mathcal{L}$-measurability of each $f_i$ implies $g$ is $\mathcal{L}$-measurable. Similarly, if $h(x) = \sup_n f_n(x)$, then $\{x \in E : h(x) > \alpha\} = \cup_{i=1}^{\infty} \{x \in E : f_i(x) > \alpha\}$. A similar argument, along with the fact that $\inf_n f_n = -\sup_n (-f_n)$ and (ii) of Theorem 353, establishes the corresponding statements for $\min$ and $\inf$. To establish the last results, note that $\liminf_n f_n = -\limsup_n (-f_n) = \sup_n (\inf_{k \geq n} f_k)$.

Using the above results, for any function $f : E \to \mathbb{R}$ we can construct the non-negative functions $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$. The function $f$ is $\mathcal{L}$-measurable iff both $f^+$ and $f^-$ are $\mathcal{L}$-measurable. It is also easy to verify that $f = f^+ - f^-$ and $|f| = f^+ + f^-$. See Figure 5.1.4.3.
Corollary 358 (i) If $\langle f_n \rangle$ is a sequence of $\mathcal{L}$-measurable functions converging pointwise to $f$ on $E$, then $f$ is $\mathcal{L}$-measurable. (ii) The set of points on which $\langle f_n \rangle$ converges is $\mathcal{L}$-measurable.

Proof. (i) Since $f_n \to f$, we have $\limsup_n f_n = \liminf_n f_n = f$ by Theorem 321, and the result thus follows from Theorem 357 above. (ii) From (i), the set $\{x \in E : \limsup_n f_n - \liminf_n f_n = 0\}$ is $\mathcal{L}$-measurable by (v) of Theorem 351.

In some cases, two functions may be "almost" the same in the sense of $\mathcal{L}$-measurability. The next definition helps us make that precise.

Definition 359 A property is said to hold almost everywhere (a.e.) if the set of points where it fails to hold is a set of measure zero.

Example 360 Let $f : [0,1] \to \{0,1\}$ be given by
\[ f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{otherwise} \end{cases} \]
known as the Dirichlet function. While this function is famous since it is everywhere discontinuous (which we will use in Section 5.2.1), here we simply use it to illustrate the concept of almost everywhere. In particular, $f(x) = 0$ a.e. since $\{x \in [0,1] : f(x) \neq 0\} = \{x \in [0,1] : x \in \mathbb{Q}\}$ and $m(\{x \in [0,1] : x \in \mathbb{Q}\}) = 0$, which follows from the countability of the rationals established in Example 77 and Corollary 336.

Theorem 361 Let $f$ and $g$ have domain $E$ and let $f$ be an $\mathcal{L}$-measurable function. If $f = g$ a.e., then $g$ is measurable.

Proof. Let $D = \{x \in E : f(x) \neq g(x)\}$. Then $mD = 0$ by assumption. Let $\alpha \in \mathbb{R}$, and consider
\[ \begin{aligned} \{x \in E : g(x) > \alpha\} &= \{x \in E \setminus D : f(x) > \alpha\} \cup \{x \in D : g(x) > \alpha\} \\ &= [\{x \in E : f(x) > \alpha\} \setminus \{x \in D : g(x) \leq \alpha\}] \cup \{x \in D : g(x) > \alpha\}. \end{aligned} \]
Since $f$ is $\mathcal{L}$-measurable, the first set is $\mathcal{L}$-measurable. Furthermore, since the other two sets are contained in $D$, which has measure 0, they are $\mathcal{L}$-measurable by Lemma 335 and Exercise 5.1.2.

Now we consider a weaker notion of continuity than that considered in Theorem 356.

Theorem 362 If a function $f$ defined on $E$ (which is $\mathcal{L}$-measurable) is continuous a.e., then $f$ is $\mathcal{L}$-measurable on $E$.

Proof. Follows from Theorem 356.
Theorem 362 thus provides a sufficient condition under which the discontinuous and non-monotone function in Example 350 is $\mathcal{L}$-measurable. Now we consider a weaker version of convergence than that considered in Corollary 358.

Definition 363 A sequence $\langle f_n \rangle$ of functions defined on $E$ is said to converge a.e. to a function $f$ if $\lim_{n \to \infty} f_n(x) = f(x)$, $\forall x \in E \setminus E_1$, where $E_1 \subset E$ with $mE_1 = 0$.

Theorem 364 If a sequence $\langle f_n \rangle$ of $\mathcal{L}$-measurable functions converges a.e. to the function $f$, then $f$ is $\mathcal{L}$-measurable.

Proof. Follows from Corollary 358.

Example 365 Let $\langle f_n \rangle$ be given by $f_n(x) = x^n$ on $[0,1]$, which converges pointwise to
\[ f(x) = \begin{cases} 0 & \text{if } x \in [0,1) \\ 1 & \text{if } x = 1 \end{cases} \]
and $f$ is $\mathcal{L}$-measurable since it is the constant (zero) function almost everywhere.

The next theorem establishes that if a sequence of functions converges pointwise, then we can isolate a set of points of arbitrarily small measure such that on the complement of that set the convergence is uniform.

Theorem 366 Let $E$ be an $\mathcal{L}$-measurable set with $mE < \infty$ and $\langle f_n \rangle$ be a sequence of $\mathcal{L}$-measurable functions defined on $E$. Let $f : E \to \mathbb{R}$ be such that $\forall x \in E$, $f_n(x) \to f(x)$. Then given $\varepsilon > 0$ and $\delta > 0$, $\exists$ an $\mathcal{L}$-measurable set $A \subset E$ with $m(A) < \delta$ and $\exists N$ such that for $n \geq N$ and $x \notin A$ we have $|f_n(x) - f(x)| < \varepsilon$.

Proof. Let $G_n = \{x \in E : |f_n(x) - f(x)| \geq \varepsilon\}$ and $E_k = \cup_{n=k}^{\infty} G_n = \{x \in E : |f_n(x) - f(x)| \geq \varepsilon \text{ for some } n \geq k\}$. Thus $E_{k+1} \subset E_k$, and for each $x \in E$ there must be some set $E_k$ such that $x \notin E_k$; otherwise we would violate the assumption $f_n(x) \to f(x)$, $\forall x \in E$. Thus $\langle E_k \rangle$ is a decreasing sequence of $\mathcal{L}$-measurable sets for which $\cap_{k=1}^{\infty} E_k = \emptyset$, so that by Theorem 344 we have $\lim_{k \to \infty} mE_k = 0$. Hence given $\delta > 0$, $\exists N$ such that $mE_N < \delta$; that is,
\[ m\{x \in E : |f_n(x) - f(x)| \geq \varepsilon \text{ for some } n \geq N\} < \delta. \]
If we write $A$ for this $E_N$, then $mA < \delta$ and $E \setminus A = \{x \in E : |f_n(x) - f(x)| < \varepsilon, \forall n \geq N\}$.

The next theorem says that for any $\mathcal{L}$-measurable function $f$ there exists a sequence of "nice" functions (more specifically simple functions) that converge pointwise to $f$.
Moreover, on the subdomain where $f$ is bounded, this convergence is uniform. This means that a bounded measurable function can be uniformly approximated by simple functions.

Theorem 367 Let $f$ be an $\mathcal{L}$-measurable function defined on a set $E$. Then there exists a sequence $\langle f_n \rangle$ of simple functions which converges pointwise to $f$ on $E$ and converges uniformly to $f$ on any set where $f$ is bounded. Furthermore, if $f \geq 0$, then $\langle f_n \rangle$ can be chosen such that $0 \leq f_n \leq f_{n+1}$, $\forall n \in \mathbb{N}$.

Proof. (Sketch) We can assume that $f \geq 0$. If not, then let $f = f^+ - f^-$, where $f^+$ and $f^-$ are non-negative, and apply the argument to each part. For $n \in \mathbb{N}$, we divide the range of $f$ (which can be unbounded) into two parts: $[0, 2^n)$ and $[2^n, \infty)$. See Figure 5.1.4.4. Then divide $[0, 2^n)$ into $2^{2n}$ equal parts. Let $F_n$ be the inverse image of $[2^n, \infty)$ and $E_{n,k}$ be the inverse image of $[k2^{-n}, (k+1)2^{-n})$ for $k = 0, 1, \ldots, 2^{2n} - 1$. Since $f$ is measurable, $F_n$ and $E_{n,k}$ are measurable. Define a simple function
\[ \varphi_n = 2^n \chi_{F_n} + \sum_{k=0}^{2^{2n}-1} k 2^{-n} \chi_{E_{n,k}}. \quad (5.3) \]
Note that $0 \leq \varphi_n \leq f$ and $0 \leq f - \varphi_n \leq 2^{-n}$ on $\cup_{k=0}^{2^{2n}-1} E_{n,k}$. For any $x \in E$ with $f(x) < \infty$, there exists $n$ large enough such that $f(x) < 2^n$. Hence $x \in \cup_k E_{n,k}$, which implies that $f(x) - \varphi_n(x) \leq 2^{-n}$ and thus $\varphi_n(x) \to f(x)$. Moreover, if $f$ is bounded, there exists an $n$ large enough such that $E = \cup_{k=0}^{2^{2n}-1} E_{n,k}$ and $f(x) - \varphi_n(x) \leq 2^{-n}$ for each $x \in E$; thus $\langle \varphi_n \rangle$ converges uniformly to $f$.

Exercise 5.1.4 Show that $\varphi_n$ increases in (5.3). Hint: $E_{n,k} = E_{n+1,2k} \cup E_{n+1,2k+1}$.

The next definition will be useful in Chapter 6.

Definition 368 Let $f$ be an $\mathcal{L}$-measurable function. Then $\inf\{\alpha \in \mathbb{R} : f \leq \alpha \text{ a.e.}\}$ is called the essential supremum of $f$, denoted $\operatorname{ess\,sup} f$, and $\sup\{\alpha \in \mathbb{R} : f \geq \alpha \text{ a.e.}\}$ is called the essential infimum of $f$, denoted $\operatorname{ess\,inf} f$.

Example 369 Let $f : [0,1] \to \{-1, 0, 1\}$ be given by
\[ f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}_{++} \\ 0 & \text{if } x \text{ is irrational} \\ -1 & \text{if } x \in \mathbb{Q}_{-} \end{cases} \]
which is a simple generalization of the Dirichlet function. Given the results in Example 360 we have $\operatorname{ess\,sup} f = \operatorname{ess\,inf} f = 0$.
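The dyadic construction (5.3) used in the proof of Theorem 367 can be explored numerically. The sketch below is illustrative only: the function `phi` is a hypothetical helper that evaluates the $n$-th simple function at a point where a non-negative $f$ takes the value `y`, so it works with values in the range rather than with the sets $F_n$ and $E_{n,k}$ themselves.

```python
import math

def phi(n, y):
    """Value of the simple function in (5.3) at a point x with f(x) = y >= 0."""
    if y >= 2 ** n:
        return float(2 ** n)        # x lies in F_n, the inverse image of [2^n, infinity)
    k = math.floor(y * 2 ** n)      # y lies in [k 2^-n, (k+1) 2^-n), so x lies in E_{n,k}
    return k / 2 ** n

for y in (0.3, 0.8, 5.0):
    print(y, [phi(n, y) for n in range(1, 6)])
```

For each fixed `y` the printed values are non-decreasing in `n`, never exceed `y`, and lie within $2^{-n}$ of `y` once $2^n$ exceeds `y`, which mirrors the monotonicity and convergence claims of the theorem.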
Correspondences

Let $\Gamma : X \twoheadrightarrow Y$ be a correspondence where $X = \mathbb{R}$ or a subset of $\mathbb{R}$ equipped with the Lebesgue measure, $\mathcal{L}(X)$ is the $\sigma$-algebra of all $\mathcal{L}$-measurable subsets of $X$, and $Y$ is a complete, separable metric space. We will introduce the concept of measurability of correspondences the same way we defined measurability of single-valued functions (i.e. through inverse images). We know a function $f : X \to Y$ is $\mathcal{L}$-measurable if $f^{-1}(V)$ is $\mathcal{L}$-measurable for every open set $V \subset Y$ or, equivalently, $f^{-1}(U)$ is $\mathcal{L}$-measurable for every closed set $U \subset Y$.

Definition 370 Consider a measurable space $(X, \mathcal{L})$ where $X \subset \mathbb{R}$ (or $X = \mathbb{R}$), $Y$ is a complete separable metric space, and $\Gamma : X \twoheadrightarrow Y$ is a closed-valued correspondence. $\Gamma$ is measurable if the inverse image of each open set is an $\mathcal{L}$-measurable set. That is, for every open subset $V \subset Y$ we have
\[ \Gamma^{-1}(V) = \{x \in X : \Gamma(x) \cap V \neq \emptyset\} \in \mathcal{L}. \]

Notice that measurability is defined only for closed-valued correspondences. Given a correspondence $\Gamma : X \twoheadrightarrow Y$, we can ask under what conditions there exists a measurable selection of $\Gamma$ (i.e. a single-valued, $\mathcal{L}$-measurable function $f : X \to Y$ such that $f(x) \in \Gamma(x)$ for all $x \in X$). The following theorem says that every $\mathcal{L}$-measurable correspondence has a measurable selection provided the spaces $X$ and $Y$ have certain properties.

Theorem 371 (Measurable Selection) Let $(X, \mathcal{L})$ be a Lebesgue measurable space, let $Y$ be a complete separable metric space, and let $\Gamma : X \twoheadrightarrow Y$ be an $\mathcal{L}$-measurable, closed-valued correspondence. Then there exists a measurable selection of $\Gamma$.

Proof. (Sketch) By induction, we will define a sequence of measurable functions $f_n : X \to Y$ such that (i) $f_n(z)$ is sufficiently close to $\Gamma(z)$ (i.e. $d(f_n(z), \Gamma(z)) < \frac{1}{2^n}$) and (ii) $f_n(z)$ and $f_{n+1}(z)$ are sufficiently close to each other (i.e. $d(f_{n+1}(z), f_n(z)) \leq \frac{1}{2^{n-1}}$ on $X$ for all $n$).
Then we are done, since from (ii) it follows that $\langle f_n(z) \rangle$ is Cauchy for each $z$, and due to completeness of $Y$ there exists a function $f : X \to Y$ such that $f_n(z) \to f(z)$ pointwise on $X$, and by Corollary 358 the pointwise limit $f$ of a sequence of measurable functions is measurable. Hence we take $f$ as a measurable selection. Condition (i) guarantees that $f(z) \in \Gamma(z)$, $\forall z \in X$ (here we use the fact that $\Gamma(z)$ is closed, so that $d(f(z), \Gamma(z)) = 0$ implies $f(z) \in \Gamma(z)$ by Exercise 4.1.3).

Now we construct a sequence $\langle f_n \rangle$ of measurable functions satisfying (i) and (ii). Let $\{y_n, n \in \mathbb{N}\}$ be a dense set in $Y$ (since $Y$ is separable such a countable set exists). Define $f_k(z) = y_p$ where $p$ is the smallest integer such that the ball with center at $y_p$ and radius $\frac{1}{k}$ has non-empty intersection with $\Gamma(z)$. See Figure 5.1.4.5. It can be shown that each $f_k$ is measurable and that $\langle f_k \rangle$ satisfies (i) and (ii).

How is measurability of a correspondence related to upper or lower hemicontinuity? We would expect that hemicontinuity implies measurability, and we now show that this is true (a result similar to that for functions in Theorem 356). In the case of lower hemicontinuity we get the result immediately.

Lemma 372 Under the assumptions of Theorem 371, if $\Gamma : X \twoheadrightarrow Y$ is lhc, then $\Gamma$ is measurable.

Proof. Since $\Gamma$ is lhc, $\Gamma^{-1}(V)$ is open for $V \subset Y$ open. Since open sets are $\mathcal{L}$-measurable, $\Gamma^{-1}(V) \in \mathcal{L}$, so that $\Gamma$ is $\mathcal{L}$-measurable.

To show that uhc implies measurability, we show that open sets can be replaced by closed sets in Definition 370.

Lemma 373 Under the assumptions of Theorem 371, $\Gamma : X \twoheadrightarrow Y$ is measurable iff $\Gamma^{-1}(U)$ is $\mathcal{L}$-measurable for every closed subset $U \subset Y$.

Proof. ($\Leftarrow$) Let $V$ be an open subset of $Y$. Define the closed sets $C_n = \{x \in Y : d(x, Y \setminus V) \geq \frac{1}{n}\}$. Then $V = \cup_{n \in \mathbb{N}} C_n$. Consequently $\Gamma(x) \cap V \neq \emptyset$ iff $\Gamma(x) \cap C_n \neq \emptyset$ for some $n$. This yields $\Gamma^{-1}(V) = \cup_{n \in \mathbb{N}} \Gamma^{-1}(C_n) \in \mathcal{L}$, because $\Gamma^{-1}(C_n) \in \mathcal{L}$ by assumption (because $C_n$ is closed) and $\cup_{n \in \mathbb{N}} \Gamma^{-1}(C_n) \in \mathcal{L}$ (because $\mathcal{L}$ is a $\sigma$-algebra). ($\Rightarrow$)
We omit this direction since it would require introducing measurability on the Cartesian product $X \times Y$. [See Aubin-Frankowska, Section 8.3, p. 319.]

Lemma 374 Under the assumptions of Theorem 371, if $\Gamma : X \twoheadrightarrow Y$ is uhc, then $\Gamma$ is measurable.

Proof. Since $\Gamma$ is uhc, $\Gamma^{-1}(U)$ is closed for $U$ closed, and closed sets are $\mathcal{L}$-measurable. Hence $\Gamma^{-1}(U) \in \mathcal{L}$. Thus $\Gamma$ is $\mathcal{L}$-measurable by Lemma 373.

5.2 Lebesgue Integration

In introductory calculus classes, you were introduced to Riemann integration. While simple, it has many defects. First, the Riemann integral of a function is defined on a closed interval and cannot be defined on an arbitrary set. Second, a function is Riemann integrable only if it is continuous, or at least continuous almost everywhere. The set of such functions, however, is relatively small. Third, given a sequence of Riemann integrable functions converging to some function, the limit of the sequence of integrals may not be the Riemann integral of the limit function. In fact, the Riemann integral of the limit function may not even exist. These defects are absent in Lebesgue integration. To see these problems we begin by briefly reviewing the Riemann integral.

5.2.1 Riemann integrals

Consider a bounded function $f : [a,b] \to \mathbb{R}$ and a partition $P = \{a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b\}$ of $[a,b]$. Let $\Upsilon$ be the set of all possible partitions. For each $P$, define the sums
\[ S(P) = \sum_{i=1}^n (x_i - x_{i-1}) H_i \quad \text{and} \quad s(P) = \sum_{i=1}^n (x_i - x_{i-1}) h_i \]
where $H_i = \sup\{f(x) : x \in (x_{i-1}, x_i]\}$ and $h_i = \inf\{f(x) : x \in (x_{i-1}, x_i]\}$, $\forall i = 1, \ldots, n$. The sums $S(P)$ and $s(P)$ are the areas under the step functions that take the values $H_i$ and $h_i$ on $(x_{i-1}, x_i]$. See Figure 5.2.1.1. Then the upper Riemann integral of $f$ over $[a,b]$ is defined by $R_u \int_a^b f(x)dx = \inf_{P \in \Upsilon} S(P)$ and the lower Riemann integral of $f$ over $[a,b]$ is defined by $R_l \int_a^b f(x)dx = \sup_{P \in \Upsilon} s(P)$. If $R_u \int_a^b f(x)dx = R_l \int_a^b f(x)dx$, then we say the Riemann integral exists and denote it $R \int_a^b f(x)dx$.
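The upper and lower sums are easy to compute in a special case. The sketch below is a minimal illustration under stated assumptions: the helper `riemann_sums` is hypothetical, the partition is uniform, and $f$ is assumed continuous and nondecreasing so that $H_i = f(x_i)$ and $h_i = f(x_{i-1})$. It is applied to $f(x) = x^2$ on $[0,1]$.

```python
from fractions import Fraction

def riemann_sums(f, a, b, n):
    """Upper and lower sums S(P), s(P) on the uniform n-piece partition of [a, b].

    Assumes f is continuous and nondecreasing, so the sup on (x_{i-1}, x_i]
    is f(x_i) and the inf is f(x_{i-1}).
    """
    xs = [a + (b - a) * Fraction(i, n) for i in range(n + 1)]
    upper = sum((xs[i] - xs[i - 1]) * f(xs[i]) for i in range(1, n + 1))
    lower = sum((xs[i] - xs[i - 1]) * f(xs[i - 1]) for i in range(1, n + 1))
    return upper, lower

for n in (4, 100, 1000):
    upper, lower = riemann_sums(lambda x: x * x, 0, 1, n)
    print(n, lower, upper, upper - lower)
```

For this $f$ the gap $S(P) - s(P)$ equals $\frac{1}{n}$, so the upper and lower sums squeeze together as the partition is refined, with $\frac{1}{3}$ trapped between them.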
We state without proof (since it would take us far afield) the following Proposition, which characterizes the "class" of Riemann integrable functions.$^{8}$

Proposition 375 A bounded function is Riemann integrable iff it is continuous almost everywhere.

We next provide explicit examples of functions that are and are not Riemann integrable.

Footnote 8: See Jain and Gupta (1986), Appendix 1.

Example 376 Consider the Riemann integral of Dirichlet's function introduced in Example 360. Then $R^u\!\int_0^1 f(x)\,dx = 1$ and $R_l\!\int_0^1 f(x)\,dx = 0$, so the Riemann integral does not exist. Intuitively, this is because in any cell of any partition $P$, however fine, there are both rational and irrational numbers, which follows from the density of both sets established in Example 154. Formally, to see that the Dirichlet function (while bounded) is not continuous anywhere (and hence does not satisfy the requirements of Proposition 375), consider the following argument. If $q \in \mathbb{Q} \cap [0,1]$, let $\langle x_n\rangle$ be a sequence of irrational numbers converging to $q$ (the existence of such a sequence follows from Theorem 102). Since $f(x_n) = 0$, $\forall n \in \mathbb{N}$, the sequence $\langle f(x_n)\rangle$ does not converge to $f(q) = 1$, so $f$ is not continuous at $q \in \mathbb{Q}$. Similarly, if $\iota$ is an irrational number, let $\langle y_n\rangle$ be a sequence of rational numbers converging to $\iota$ (the existence of such a sequence again follows from Theorem 102). Since $f(y_n) = 1$, $\forall n \in \mathbb{N}$, the sequence $\langle f(y_n)\rangle$ does not converge to $f(\iota) = 0$, so $f$ is not continuous at $\iota \in \mathbb{R}\setminus\mathbb{Q}$. See Figure 5.2.1.2.

Example 377 Next consider the Riemann integral of $f : [0,1] \to \{0,1\}$ given by
$$f(x) = \begin{cases} 1 & \text{if } x = \frac{1}{n},\ n \in \mathbb{N} \\ 0 & \text{otherwise.} \end{cases}$$
Hence this function takes on the value 1 on the rationals $\{\frac{1}{n}, n \in \mathbb{N}\}$ rather than on the bigger set $\mathbb{Q} = \{\frac{m}{n} : m, n \in \mathbb{N}\}$. We begin by noting that one can show that $\{\frac{1}{n}, n \in \mathbb{N}\}$ is not dense in $[0,1]$. As in the preceding example, it is simple to show that $f$ is discontinuous at each point of $\{\frac{1}{n}, n \in \mathbb{N}\}$. On the other hand, let $D = [0,1]\setminus\{\frac{1}{n}, n \in \mathbb{N}\}$. Then $f$ is continuous at every point of $D\setminus\{0\}$ (at $0$ it is discontinuous, since $f(\frac{1}{n}) = 1$ for all $n$ while $f(0) = 0$, but one extra point does not affect what follows). To see this, let $x \in D\setminus\{0\}$.
Then $\exists n \in \mathbb{N}$ such that $x \in (\frac{1}{n+1}, \frac{1}{n})$. Let $\delta = \frac{1}{2}\min\{|x - \frac{1}{n}|, |x - \frac{1}{n+1}|\}$. Then $\forall x' \in (x-\delta, x+\delta)$, we have $f(x') = 0$. Thus, $\forall \varepsilon > 0$, $\exists \delta$ such that $\forall x' \in (x-\delta, x+\delta)$, we have $|f(x) - f(x')| = |0 - 0| = 0 < \varepsilon$. Since $f$ is discontinuous only on a countable set of points, which has measure zero, we know the Riemann integral exists by Proposition 375 and is given by $R\!\int_0^1 f(x)\,dx = 0$. See Figure 5.2.1.3.

Example 378 Finally, let $\{q_i\}$ be an enumeration of all the rational numbers in $[0,1]$ and let $Q_n = \{q_i \in \mathbb{Q}\cap[0,1] : i = 1, 2, \dots, n\}$, $n \in \mathbb{N}$. Define, for each $n \in \mathbb{N}$, the function $f_n : [0,1] \to \{0,1\}$ by
$$f_n(x) = \begin{cases} 1 & \text{if } x \in Q_n \\ 0 & \text{otherwise.} \end{cases}$$
The function $f_n$ is discontinuous only at the $n$ points of $Q_n$ in $[0,1]$. Since $f_n$ is continuous a.e. and is bounded, the Riemann integral exists and $R\!\int_0^1 f_n(x)\,dx = 0$. Notice however that while $f_n \to f$ pointwise, where $f$ is Dirichlet's function, $R\!\int_0^1 f_n(x)\,dx$ does not converge to $R\!\int_0^1 f(x)\,dx$, since the latter doesn't even exist!

5.2.2 Lebesgue integrals

Now that we've exposed some of the problems with the Riemann integral, we take up a systematic treatment of the Lebesgue integral. As Proposition 375 suggests, the class of Riemann integrable functions is somewhat narrow. The class of Lebesgue integrable functions, on the other hand, is larger. This is because the Lebesgue integral replaces the class of step functions (used in the construction of the Riemann integral) with the larger class of simple functions that were defined in Definition 354. The essential difference between step functions and simple functions is the class of sets upon which they are defined. In particular, the collection of subsets upon which step functions are defined is a strict subset of the collection of subsets upon which simple functions are defined. We will construct Lebesgue integrals under three separate assumptions concerning boundedness of the function over which we are integrating ($f$) and the finiteness of the measure ($m$) of the sets ($E$) upon which the function ($f$) is defined.
Assumption 1: $f$ is bounded and $m(E) < \infty$

Consider the representation $\varphi : E \to \mathbb{R}$ defined in Example 355 given by $\varphi(x) = \sum_{i=1}^{k} a_i \chi_{A_i}(x)$, where the $A_i \subset E \in \mathcal{L}$ are disjoint and the $a_i \in \mathbb{R}\setminus\{0\}$ are distinct. Then we define the elementary integral of this simple function to be $\int \varphi(x)\,dx = \sum_{i=1}^{k} a_i m(A_i)$. This integral is well defined since $m(A_i) < \infty$ $\forall i$ and there are finitely many terms in the sum. In this case, we call the function $\varphi$ an integrable simple function. To economize on notation, let $\int_E \varphi \equiv \int \varphi(x)\,dx$. Sometimes it is useful to employ representations that are not canonical, and the following lemma asserts that the elementary integral is independent of its representation.

Lemma 379 Let $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$, with $E_i \cap E_j = \emptyset$ for $i \neq j$. Suppose each $E_i$ is an $\mathcal{L}$-measurable set of finite measure. Then $\int_E \varphi = \sum_{i=1}^{n} a_i m(E_i)$.

Proof. The set $A_a = \{x \in E : \varphi(x) = a\} = \cup_{a_i = a} E_i$. Hence $a\,m(A_a) = \sum_{a_i = a} a_i m(E_i)$ by additivity of $m$. Thus, $\int_E \varphi = \sum_a a\,m(A_a) = \sum_i a_i m(E_i)$.

Next we establish two basic properties of the elementary integral.

Theorem 380 Let $\varphi$ and $\psi$ be simple functions which vanish outside a set of finite measure.$^{9}$ Then (i) integration preserves linearity: $\int_E (a\varphi + b\psi) = a\int_E \varphi + b\int_E \psi$; and (ii) integration preserves monotonicity: if $\varphi \geq \psi$ a.e., then $\int_E \varphi \geq \int_E \psi$.

Exercise 5.2.1 Prove Theorem 380.

Let $f : E \to \mathbb{R}$ be any bounded function and $E$ an $\mathcal{L}$-measurable set with $m(E) < \infty$. In analogy with the Riemann integral, we define the upper Lebesgue integral of $f$ over $E$ by $L^u\!\int_E f(x)\,dx = \inf_{\psi \geq f} \int_E \psi$, and the lower Lebesgue integral of $f$ over $E$ by $L_l\!\int_E f(x)\,dx = \sup_{\varphi \leq f} \int_E \varphi$, where $\psi$ and $\varphi$ range over the set of all simple functions defined on $E$. Notice that $L^u\!\int_E f(x)\,dx$ and $L_l\!\int_E f(x)\,dx$ are well defined since $f$ is bounded and $E$ has finite measure. See Figure 5.2.2.1.

Definition 381 If $L^u\!\int_E f(x)\,dx = L_l\!\int_E f(x)\,dx$, then we say the Lebesgue integral exists and denote it $\int_E f(x)\,dx$.
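The elementary integral and the representation independence asserted in Lemma 379 can be checked on a small example. The Python fragment below is a sketch under stated assumptions rather than anything from the original text: each set $E_i$ is taken to be an interval (so that $m(E_i)$ is its length), and the names `elementary_integral` and `canonical_form` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def elementary_integral(terms):
    """Sum a_i * m(E_i) for terms (a_i, (lo_i, hi_i)) with disjoint intervals E_i."""
    return sum(Fraction(a) * (Fraction(hi) - Fraction(lo)) for a, (lo, hi) in terms)


def canonical_form(terms):
    """Group by value: map each distinct a to m(A_a), where A_a is the
    union of the intervals E_i with a_i = a."""
    measure_by_value = {}
    for a, (lo, hi) in terms:
        key = Fraction(a)
        length = Fraction(hi) - Fraction(lo)
        measure_by_value[key] = measure_by_value.get(key, Fraction(0)) + length
    return measure_by_value


# A non-canonical representation: the value 2 is taken on two disjoint
# intervals (shared endpoints have measure zero and are ignored).
terms = [(2, (0, 1)), (3, (1, 2)), (2, (2, 4))]
direct = elementary_integral(terms)
via_canonical = sum(a * m for a, m in canonical_form(terms).items())
print(direct, via_canonical)
```

Both computations give the same number here, as the lemma predicts: summing over the pieces $E_i$ or first merging the pieces that share a value does not change the integral.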
Notice that if $f$ is a simple function, then $\inf_{\psi \geq f} \int_E \psi = \int_E f$ and $\sup_{\varphi \leq f} \int_E \varphi = \int_E f$, so that $L^u\!\int_E f(x)\,dx = L_l\!\int_E f(x)\,dx$. Hence simple functions are Lebesgue integrable. The next question is: what other functions are Lebesgue integrable? The next theorem provides necessary and sufficient conditions for integrability. In particular, sufficiency shows that if one can establish that the function is $\mathcal{L}$-measurable (and that the assumptions of this subsection hold), then we know it is integrable. This is another theorem like Heine-Borel where sufficiency makes one's life simple.

Theorem 382 A bounded function $f$ defined on an $\mathcal{L}$-measurable set $E$ of finite measure is Lebesgue integrable iff $f$ is $\mathcal{L}$-measurable.

Footnote 9: We say a function $f$ vanishes outside a set of measure zero if $m(\{x \in E : f(x) \neq 0\}) = 0$, or outside a set of finite measure if $m(\{x \in E : f(x) \neq 0\}) < \infty$.

Proof. (Sketch) ($\Leftarrow$) Since $f : E \to \mathbb{R}$ is bounded, $-M \leq f(x) \leq M$. Divide $[-M, M]$ into equal parts of width $\frac{M}{n}$. Construct the sets
$$E_k = f^{-1}\left(\left[\tfrac{k-1}{n}M, \tfrac{k}{n}M\right)\right)$$
(i.e. $E_k$ is the set of all $x \in E$ such that $f(x)$ belongs to the slice $[\frac{k-1}{n}M, \frac{k}{n}M)$). See Figure 5.2.2.2. $E_k$ is measurable because it is the inverse image of an interval under a measurable function. Define two simple functions $\psi_n(x) = \frac{M}{n}\sum_{k=-n}^{n} k\,\chi_{E_k}(x)$ and $\varphi_n(x) = \frac{M}{n}\sum_{k=-n}^{n} (k-1)\chi_{E_k}(x)$. Then $\varphi_n$ approximates $f$ from below and $\psi_n$ approximates $f$ from above. Because $\varphi_n$ and $\psi_n$ are simple functions, $\int_E \varphi_n$ and $\int_E \psi_n$ are well defined. The upper (lower) Lebesgue integrals of $f$ on $E$, being the infimum (supremum), satisfy
$$\int_E \varphi_n \leq L_l\!\int_E f \leq L^u\!\int_E f \leq \int_E \psi_n.$$
As $n$ gets larger, $\varphi_n$ and $\psi_n$ get closer to each other and hence so do their integrals. Thus for $n \to \infty$, $L_l\!\int_E f = L^u\!\int_E f$, which means $\int_E f$ exists.

($\Rightarrow$) Let $f$ be integrable. Then
$$\inf_{\psi \geq f} \int \psi(x)\,dx = \sup_{\varphi \leq f} \int \varphi(x)\,dx,$$
where $\varphi$ and $\psi$ are simple functions. Then by the property of infimum and supremum, for any $n$ there are simple functions $\varphi_n$ and $\psi_n$ such that $\varphi_n(x) \leq f(x) \leq \psi_n(x)$ and
$$\int \psi_n(x)\,dx - \int \varphi_n(x)\,dx < \frac{1}{n}. \qquad (5.4)$$
Define $\psi^* = \inf_n \psi_n$ and $\varphi^* = \sup_n \varphi_n$, which are measurable by Theorem 357 and satisfy $\varphi^*(x) \leq f(x) \leq \psi^*(x)$. But the set of $x$ for which $\varphi^*(x)$ differs from $\psi^*(x)$ (i.e. $\Delta = \{x \in E : \varphi^*(x) < \psi^*(x)\}$) has measure zero due to (5.4). Thus $\varphi^* = \psi^*$ except on a set of measure zero. Thus $f$ is measurable by Theorem 361.

Notice that the assumptions on boundedness and finite measure imply $M\,m(E) < \infty$, upon which the proof rests.

Next we establish that the Lebesgue integral is a generalization of the Riemann integral.

Theorem 383 Let $f$ be a bounded function on $[a,b]$. If $f$ is Riemann integrable over $[a,b]$, then it is Lebesgue integrable and $R\!\int_a^b f(x)\,dx = \int_{[a,b]} f(x)\,dx$.

Proof. The proof rests on the fact that every step function (upon which Riemann integrals are defined) is also a simple function (upon which Lebesgue integrals are defined), while the converse is not true. Then
$$R_l\!\int_a^b f(x)\,dx \leq \sup_{\varphi \leq f} \int_{[a,b]} \varphi(x)\,dx \leq \inf_{\psi \geq f} \int_{[a,b]} \psi(x)\,dx \leq R^u\!\int_a^b f(x)\,dx,$$
where the first and third inequalities follow from the above fact and the second follows from the fact that $\varphi \leq f \leq \psi$ and (ii) of Theorem 380. Since $f$ is Riemann integrable, the two outer terms coincide, so all four terms are equal.

Of course, the converse is not true.

Example 384 In Example 376 we showed that the Dirichlet function was not Riemann integrable. However, it is Lebesgue integrable by Theorem 382 since it is $\mathcal{L}$-measurable (which is clear since it is a simple function). Hence $\int_{[0,1]} f(x)\,dx = 1\cdot m(\mathbb{Q}\cap[0,1]) + 0\cdot m([0,1]\setminus\mathbb{Q}) = 1\cdot 0 + 0\cdot 1 = 0$.

Now we establish the following properties of Lebesgue integrals, which follow as a consequence of the fact that Lebesgue integrals are defined on simple functions and elementary integrals preserve linearity and monotonicity by Theorem 380.
Theorem 385 If $f$ and $g$ are bounded $\mathcal{L}$-measurable functions defined on a set $E$ of finite measure, then: (i) $\int_E (af + bg) = a\int_E f + b\int_E g$; (ii) if $f = g$ a.e., then $\int_E f = \int_E g$; (iii) if $f \leq g$ a.e., then $\int_E f \leq \int_E g$ and hence $|\int_E f| \leq \int_E |f|$; (iv) if $c \leq f(x) \leq d$, then $c\,m(E) \leq \int_E f \leq d\,m(E)$; and (v) if $A$ and $B$ are disjoint $\mathcal{L}$-measurable sets of finite measure, then $\int_{A\cup B} f = \int_A f + \int_B f$.

Exercise 5.2.2 Prove Theorem 385.

We now prove a very important result concerning the interchange of limit and integral operations for a convergent sequence of bounded $\mathcal{L}$-measurable functions.

Theorem 386 (Bounded Convergence) Let $\langle f_n\rangle$ be a sequence of $\mathcal{L}$-measurable functions defined on a set $E$ of finite measure and suppose $|f_n(x)| \leq M$, $\forall n \in \mathbb{N}$ and $\forall x \in E$. If $f_n \to f$ a.e. on $E$, then $f$ is integrable and $\int_E f = \lim_{n\to\infty} \int_E f_n$.

Proof. (Sketch) Since $f_n \to f$, $f$ is measurable by Theorem 364. Since $\langle f_n\rangle$ is uniformly bounded, $f$ is bounded. Given $\varepsilon$, it is possible to split $E$ (by Theorem 366) into two parts: $E\setminus A$, where $f_n \to f$ uniformly, and $A$ with $m(A) < \varepsilon$. Then
$$\lim_{n\to\infty} \int_E f_n = \int_E \lim_{n\to\infty} f_n = \int_E f$$
provided
$$\left|\int_E f_n - \int_E f\right| = \left|\int_E (f_n - f)\right|$$
is sufficiently small for large $n$. Split this integral into two parts:
$$\left|\int_E (f_n - f)\right| \leq \int_E |f_n - f| \leq \int_{E\setminus A} |f_n - f| + \int_A |f_n - f|.$$
The first integral is sufficiently small because $f_n \to f$ uniformly on $E\setminus A$, and the second is sufficiently small because $|f_n - f|$ is bounded and $m(A)$ is sufficiently small.

It is important to note that, with Lebesgue integration, $\int_E f = \lim_{n\to\infty} \int_E f_n$ only requires pointwise convergence (of a uniformly bounded sequence). A similar result for Riemann integration (i.e. $R\!\int_E f = \lim_{n\to\infty} R\!\int_E f_n$) requires uniform convergence.

Example 387 Here we return to Example 378. There we saw that the bounded function $f_n$ was discontinuous at the $n$ points of the $\mathcal{L}$-measurable set $Q_n = \{q_i \in \mathbb{Q}\cap[0,1] : i = 1, 2, \dots, n\}$, $n \in \mathbb{N}$. While $R\!\int_0^1 f_n(x)\,dx = 0$ along the sequence and while $f_n \to f$, we saw that the Riemann integral $R\!\int_0^1 f(x)\,dx$ of the limit function did not exist.
On the other hand, since the bounded function $f_n$ is $\mathcal{L}$-measurable and $m([0,1]) < \infty$, we know by the Bounded Convergence Theorem that $\lim_{n\to\infty} \int_{[0,1]} f_n(x)\,dx$ exists and equals $\int_{[0,1]} f(x)\,dx = 0$ by Example 384.

Assumption 2: $f$ is nonnegative and $m(E) \leq \infty$

In many instances, economists consider functions which are unbounded (e.g. most utility functions we write down are of this variety). Hence, it would be nice to relax the above assumption about boundedness. This section does that, albeit at the cost that $f$ must be nonnegative. Here we also do not require $E$ to be of finite measure.

Definition 388 If $f : E \to \mathbb{R}_+$ on an $\mathcal{L}$-measurable set $E$ is $\mathcal{L}$-measurable, we define $\int_E f = \sup_{h \leq f} \int_E h$, where $h$ is a bounded, $\mathcal{L}$-measurable function which vanishes outside a set of finite measure.

Notice that the integral is defined using any function $h$ (not just simple functions) which satisfies the conditions of the previous subsection, and that $\sup_{h \leq f} \int_E h$ is similar to the definition of the lower Lebesgue integral in the previous subsection. That is, $h$ is bounded and $m(H) < \infty$, where $H = \{x \in E : h(x) \neq 0\}$. Then $\int_E h$ and $\sup_{h \leq f} \int_E h$ are well defined. Furthermore, $\int_E h = \int_H h + \int_{E\setminus H} h = \int_H h$.

Definition 389 A nonnegative $\mathcal{L}$-measurable function $f$ defined on an $\mathcal{L}$-measurable set $E$ is integrable (or summable) if $\int_E f < \infty$. If $\int_E f = \infty$, we say $f$ is not integrable even though it has a Lebesgue integral.

Example 390 Let the function $f : [0,1] \to \mathbb{R}_+$ be given by
$$f(x) = \begin{cases} \frac{1}{x} & \text{if } x \in (0,1] \\ 0 & \text{if } x = 0. \end{cases}$$
While $f$ is unbounded, consider the sequence of functions $h_n : [0,1] \to \mathbb{R}$ given by
$$h_n(x) = \begin{cases} \frac{1}{x} & \text{if } x \in (\frac{1}{n}, 1] \\ n & \text{if } x \in [0, \frac{1}{n}]. \end{cases}$$
In this case $h_n \leq f$ (except at $x = 0$, where $h_n(0) = n$ and $f(0) = 0$, but this is a set of measure 0) and $h_n$ is a bounded, $\mathcal{L}$-measurable function which vanishes outside a set of finite measure. See Figure 5.2.2.3. Then
$$\int_{[0,1]} h_n(x)\,dx = \int_{[0,\frac{1}{n}]} n\,dx + \int_{(\frac{1}{n},1]} \frac{1}{x}\,dx = n\cdot\left(\tfrac{1}{n} - 0\right) + \ln(1) - \ln\left(\tfrac{1}{n}\right) = 1 + \ln(n).$$
Since $\{h_n\}$ is contained in the set of all bounded $h$ such that $h \leq f$, and we take the sup over all such functions, $1 + \ln(n) \to \infty$ as $n \to \infty$ implies $\int_{[0,1]} f = \infty$, so that $f$ is not integrable on $[0,1]$.

Exercise 5.2.3 Let the function $f : [1,\infty) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$. Is $f$ bounded? Is $m([1,\infty))$ finite? Is $f$ integrable?

Lemma 391 (Chebyshev's Inequality) Let $\varphi$ be an integrable function on $A$ with $\varphi(x) \geq 0$ a.e. on $A$. Let $c > 0$. Then $c\cdot m\{x \in A : \varphi(x) \geq c\} \leq \int_A \varphi(x)\,dm$.

Proof. Let $\hat{A} = \{x \in A : \varphi(x) \geq c\}$. Then
$$\int_A \varphi(x)\,dm = \int_{\hat{A}} \varphi(x)\,dm + \int_{A\setminus\hat{A}} \varphi(x)\,dm \geq \int_{\hat{A}} \varphi(x)\,dm \geq c\,m(\hat{A}).$$
See Figure 5.2.2.4.

As in the previous subsection, there are various linearity and monotonicity properties associated with Lebesgue integrals of nonnegative $\mathcal{L}$-measurable functions.

Theorem 392 Let $f$ and $g$ be nonnegative $\mathcal{L}$-measurable functions defined on a set $E$. Then (i) $\int_E cf = c\int_E f$, $c > 0$; (ii) $\int_E (f + g) = \int_E f + \int_E g$; and (iii) if $g \geq f$ a.e., then $\int_E g \geq \int_E f$.

Proof. (ii) Let $h$ and $k$ be bounded, $\mathcal{L}$-measurable functions such that $h \leq f$, $k \leq g$, and which vanish outside sets of finite measure. Then $h + k \leq f + g$, so that $\int_E h + \int_E k = \int_E (h + k) \leq \int_E (f + g)$. Then $\sup_{h \leq f} \int_E h + \sup_{k \leq g} \int_E k \leq \int_E (f + g)$, so by Definition 388 we have $\int_E f + \int_E g \leq \int_E (f + g)$.

To establish the reverse inequality, let $l$ be a bounded $\mathcal{L}$-measurable function which vanishes outside a set of finite measure and is such that $l \leq f + g$. Define $h$ and $k$ by setting $h(x) = \min(f(x), l(x))$ and $k(x) = l(x) - h(x)$. Then $h \leq f$ (by construction) and $k \leq g$ also follows from $l = h + k \leq f + g$. Furthermore, $h$ and $k$ are bounded by the bound for $l$ and vanish where $l$ vanishes. Then $\int_E l = \int_E h + \int_E k \leq \int_E f + \int_E g$. But this implies $\sup_{l \leq f+g} \int_E l \leq \int_E f + \int_E g$, or $\int_E (f + g) \leq \int_E f + \int_E g$.

Exercise 5.2.4 Let $f$ be a nonnegative $\mathcal{L}$-measurable function. Show that $f = 0$ a.e. on $E$ iff $\int_E f = 0$.

As in the previous subsection, we now prove some important results concerning the interchange of limit and integral operations.
The bounded convergence theorem has one restrictive assumption: the sequence $\langle f_n\rangle$ is uniformly bounded. In the following lemma this assumption is dropped. Instead we assume nonnegativity of $\langle f_n\rangle$, and the result is stated in terms of an inequality rather than an equality.

Theorem 393 (Fatou's Lemma) Let $\langle f_n\rangle$ be a sequence of nonnegative $\mathcal{L}$-measurable functions with $f_n(x) \to f(x)$ a.e. on $E$. Then $\int_E f \leq \liminf_{n\to\infty} \int_E f_n$.

Proof. (Sketch) Let $\langle f_n\rangle \to f$ pointwise on $E$. The idea of the proof is to use the Bounded Convergence Theorem 386. To do so, we need a uniformly bounded sequence of functions. Hence, let $h$ be a bounded function such that $h(x) \leq f(x)$, taking non-zero values only on a subset of $E$ with finite measure. Define a new sequence $\langle h_n\rangle$ by $h_n(x) = \min(f_n(x), h(x))$. Then $\langle h_n\rangle$ is uniformly bounded, $h_n(x) \leq f_n(x)$, and $h_n \to h$ pointwise. Thus by the bounded convergence theorem
$$\int_E h = \lim_{n\to\infty} \int_E h_n \leq \liminf_{n\to\infty} \int_E f_n, \qquad (5.5)$$
where we use the lim inf since the limit of $\langle \int_E f_n\rangle$ may not exist. Since (5.5) holds for any $h$ with the given properties, it also holds for the supremum:
$$\sup_{h \leq f} \int_E h \leq \liminf_{n\to\infty} \int_E f_n. \qquad (5.6)$$
But the left hand side of (5.6) is by definition $\int_E f$.

The next example shows that the strict inequality may be obtained.

Example 394 Let the functions $f_n : [0,1] \to \mathbb{R}_+$ be given by
$$f_n(x) = \begin{cases} n & \text{if } \frac{1}{n} \leq x \leq \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
See Figure 5.2.2.5. In this case $\lim_{n\to\infty} f_n(x) = 0$ a.e., so $\int_{[0,1]} f = 0$, while $\liminf_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = \sup_n \left[\inf_{k \geq n} \int_{[0,1]} f_k(x)\,dx\right] = \sup\{\langle 1\rangle\} = 1$.$^{10}$

To see that nonnegativity matters for Fatou's lemma, consider the following example.

Example 395 Instead, let the functions $f_n : [0,1] \to \mathbb{R}$ be given by
$$f_n(x) = \begin{cases} -n & \text{if } \frac{1}{n} \leq x \leq \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
Again $\lim_{n\to\infty} f_n(x) = 0$ a.e. and $\liminf_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = \sup_n \left[\inf_{k \geq n} \int_{[0,1]} f_k(x)\,dx\right] = \sup\{\langle -1\rangle\} = -1$. Hence without nonnegativity we may have $\int_E f > \liminf_{n\to\infty} \int_E f_n$.

The conclusion of Theorem 393 is weak.
It is possible to strengthen it by imposing more structure on the sequence of functions.

Theorem 396 (Monotone Convergence) Let $\langle f_n\rangle$ be an increasing sequence of nonnegative $\mathcal{L}$-measurable functions with $f_n(x) \to f(x)$ a.e. on $E$. Then $\int_E f = \lim_{n\to\infty} \int_E f_n$.

Proof. Since $f_n \leq f$, $\forall n$, we have $\int_E f_n \leq \int_E f$ by (iii) of Theorem 392. This implies $\limsup_{n\to\infty} \int_E f_n \leq \int_E f$. The result then follows from Theorem 393.

Footnote 10: In this example, it was not necessary to actually take the liminf.

Example 397 Let $f : [0,1] \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} \frac{1}{\sqrt{x}} & \text{if } x \in (0,1] \\ 0 & \text{if } x = 0. \end{cases}$$
As in Example 390, while $f$ is unbounded, consider a sequence of functions $h_n : [0,1] \to \mathbb{R}$ given by
$$h_n(x) = \begin{cases} f(x) & \text{if } f(x) \leq n \\ n & \text{if } f(x) > n, \end{cases}$$
or in other words
$$h_n(x) = \begin{cases} \frac{1}{\sqrt{x}} & \text{if } x \in [\frac{1}{n^2}, 1] \\ n & \text{if } x \in [0, \frac{1}{n^2}). \end{cases}$$
In this case $h_n \leq f$ (except at $x = 0$, but this is a set of measure 0) and $h_n$ is a bounded, $\mathcal{L}$-measurable function which vanishes outside a set of finite measure. Furthermore, $\langle h_n\rangle$ is monotone since $h_n(x) \leq h_{n+1}(x)$, $\forall x \in [0,1]$. Then
$$\int_{[0,1]} h_n(x)\,dx = \int_{[0,\frac{1}{n^2}]} n\,dx + \int_{(\frac{1}{n^2},1]} \frac{1}{\sqrt{x}}\,dx = n\cdot\left(\tfrac{1}{n^2} - 0\right) + 2\left(1 - \tfrac{1}{n}\right) = 2 - \tfrac{1}{n}.$$
Then as $n \to \infty$, $\int_{[0,1]} h_n(x)\,dx \to 2$. By the Monotone Convergence Theorem, $\int_{[0,1]} f(x)\,dx = 2$.

Example 397 is known as an improper integral when regarded as a Riemann integral, since the integrand is unbounded.$^{11}$ On the other hand, it is perfectly proper when regarded as a Lebesgue integral. In this example, the two integrals are equal. Furthermore, we note that while Example 397 provides a case in which an unbounded nonnegative $\mathcal{L}$-measurable function is integrable, Example 390 provides an instance of a closely related function which is not integrable.

Assumption 3: $f$ is any function and $m(E) \leq \infty$ (general Lebesgue integral)

Definition 398 An $\mathcal{L}$-measurable function $f$ is integrable over $E$ if $f^+$ and $f^-$ are both integrable over $E$. In this case, $\int_E f = \int_E f^+ - \int_E f^-$.

Theorem 399 A function $f$ is integrable over $E$ iff $|f|$ is integrable over $E$.
Footnote 11: We also say that a Riemann integral is improper if its interval of integration is unbounded.

Proof. ($\Rightarrow$) If $f$ is integrable over $E$, then $f^+$ and $f^-$ are both integrable over $E$. Thus, $\int_E |f| = \int_E f^+ + \int_E f^-$ by Theorem 392. Hence $|f|$ is integrable. ($\Leftarrow$) If $\int_E |f| < \infty$, then so are $\int_E f^+$ and $\int_E f^-$.

Example 400 Consider a version of the Dirichlet function $f : [0,1] \to \{-1,1\}$ given by
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}\cap[0,1] \\ -1 & \text{otherwise.} \end{cases}$$
Observe that $|f| = 1$ and hence $|f|$ is Riemann integrable while $f$ is not.

Lemma 401 Let $A \in \mathcal{L}$, $m(A) = 0$, and $f$ be an $\mathcal{L}$-measurable function. Then $\int_A f = 0$.

Proof. We show it first for a simple function. Let $f = \sum_{i=1}^{n} \alpha_i \chi_{E_i}$, where $\{E_i\}$ is a collection of $\mathcal{L}$-measurable sets that are disjoint. Then $f\chi_A = \sum_{i=1}^{n} \alpha_i \chi_{A\cap E_i}$, where the $A\cap E_i$ are disjoint and $m(A\cap E_i) = 0$ (since $m(A\cap E_i) \leq m(A) = 0$). Thus $\int f\chi_A = \sum_{i=1}^{n} \alpha_i m(A\cap E_i) = 0$.

If $f$ is a nonnegative measurable function, then by Theorem 367 there is a non-decreasing sequence $\langle f_n\rangle$ of simple functions that converges pointwise to $f$. Then by Theorem 396
$$\int_A f = \lim_{n\to\infty} \int_A f_n = \lim_{n\to\infty} \int f_n\chi_A = 0.$$
Finally, if $f$ is an arbitrary measurable function, then $f\chi_A = f^+\chi_A - f^-\chi_A$ and $\int_A f = \int_A f^+ - \int_A f^- = 0 - 0 = 0$.

Lemma 402 Let $f$ be an $\mathcal{L}$-measurable function over $E$. If there is an integrable function $g$ such that $|f| \leq g$, then $f$ is integrable over $E$.

Proof. From $f^+ \leq g$, it follows that $\int_E f^+ \leq \int_E g$, and so $f^+$ is integrable on $E$. Similarly, $f^- \leq g$ implies integrability of $f^-$. Hence $f$ is integrable over $E$.

Theorem 403 Let $f$ and $g$ be integrable functions defined on a set $E$. Then (i) the function $cf$, where $c$ is finite, is integrable over $E$ and $\int_E cf = c\int_E f$; (ii) the function $f + g$ is integrable over $E$ and $\int_E (f + g) = \int_E f + \int_E g$; (iii) if $g = f$ a.e., then $\int_E g = \int_E f$; (iv) if $g \geq f$ a.e., then $\int_E g \geq \int_E f$; and (v) if $E_1$ and $E_2$ are disjoint $\mathcal{L}$-measurable sets in $E$, then $\int_{E_1\cup E_2} f = \int_{E_1} f + \int_{E_2} f$.

Exercise 5.2.5 Prove Theorem 403.
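The decomposition $f = f^+ - f^-$ of Definition 398 and the identity $\int_E |f| = \int_E f^+ + \int_E f^-$ used in the proof of Theorem 399 can be checked by hand on a signed simple function. The following Python sketch is illustrative only: the simple function is hypothetical, it is stored as pairs (value, measure of the set on which that value is taken), and the helper names are not from the text.

```python
from fractions import Fraction


def integral(terms):
    """Elementary integral: sum of a_i * m(E_i) over terms (a_i, m(E_i))."""
    return sum(Fraction(a) * Fraction(m) for a, m in terms)


def positive_part(terms):
    # f+ = max(f, 0): keep positive values, replace the rest by 0.
    return [(max(a, 0), m) for a, m in terms]


def negative_part(terms):
    # f- = max(-f, 0), so that f = f+ - f- and |f| = f+ + f-.
    return [(max(-a, 0), m) for a, m in terms]


# Hypothetical simple function: value 3 on a set of measure 1/2,
# value -2 on a set of measure 1, value 1 on a set of measure 1/4.
f = [(3, Fraction(1, 2)), (-2, 1), (1, Fraction(1, 4))]
int_pos = integral(positive_part(f))
int_neg = integral(negative_part(f))
int_f = integral(f)
int_abs = integral([(abs(a), m) for a, m in f])
print(int_pos, int_neg, int_f, int_abs)
```

In this example the integral of $f$ equals the difference of the two parts, the integral of $|f|$ equals their sum, and $|\int f| \leq \int |f|$, in line with the statements above.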
In considering when we could interchange limits and integrals, we saw that either we had to impose bounds on functions (Bounded Convergence Theorem 386) or consider monotone sequences of nonnegative functions (Monotone Convergence Theorem 396). In the general case, we simply must bound the sequence of functions by another (possibly unbounded) integrable function.

Theorem 404 (Lebesgue Dominated Convergence Theorem) Let $g$ be an integrable function on $E$ and let $\langle f_n\rangle$ be a sequence of $\mathcal{L}$-measurable functions such that $|f_n| \leq g$ on $E$ and $\lim_{n\to\infty} f_n = f$ a.e. on $E$. Then $\int_E f = \lim_{n\to\infty} \int_E f_n$.

Proof. (Sketch) By Lemma 402, $f$ is integrable. We want to use Fatou's Lemma 393, which requires a sequence of nonnegative functions, which is not assumed in this theorem. However, we can define two sequences, namely $h_n = f_n + g$ and $k_n = g - f_n$, for which $\langle h_n\rangle \to f + g$ and $\langle k_n\rangle \to g - f$, where both are nonnegative. Hence by Fatou's Lemma, we have $\int_E (f + g) \leq \liminf_{n\to\infty} \int_E (f_n + g)$ and $\int_E (g - f) \leq \liminf_{n\to\infty} \int_E (g - f_n)$. The first inequality implies $\int_E f \leq \liminf_{n\to\infty} \int_E f_n$ and the second implies $\int_E f \geq \limsup_{n\to\infty} \int_E f_n$ by Theorem 403. Combining these two we have
$$\liminf_{n\to\infty} \int_E f_n \geq \int_E f \geq \limsup_{n\to\infty} \int_E f_n,$$
which implies the desired result.

The above theorem requires that the sequence $\langle f_n\rangle$ be uniformly dominated by a fixed integrable function $g$. However, the proof does not need such a strong restriction. In fact, the requirements can be weakened to consider a sequence of integrable functions $\langle g_n\rangle$ which converges a.e. to an integrable function $g$, with $\int_E g_n \to \int_E g$ and $|f_n| \leq g_n$.

Example 405 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = nx^n$. See Figure 5.2.2.6. Then $\lim_{n\to\infty} f_n(x) = 0$ a.e. and $\int_{[0,1]} f_n(x)\,dx = \int_{[0,1]} nx^n\,dx = \frac{n}{n+1}x^{n+1}\big|_0^1 = \frac{1}{1+\frac{1}{n}}$, so that $\lim_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = 1$. On the other hand, $\int_{[0,1]} f(x)\,dx = 0$.

Notice in the above example that the sequence of functions has no integrable dominating function.

Example 406 Let $f_n : [0,2] \to \mathbb{R}$ be given by
$$f_n(x) = \begin{cases} \sqrt{n} & \text{if } \frac{1}{n} \leq x \leq \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
See Figure 5.2.2.7. Then $\lim_{n\to\infty} f_n(x) = f(x) = 0$ $\forall x \in [0,2]$, so that $\int_{[0,2]} f(x)\,dx = 0$. Note that $|f_n(x)| \leq g(x)$ $\forall x \in [0,2]$, where
$$g(x) = \begin{cases} \sqrt{\frac{2}{x}} & \text{if } 0 < x \leq 2 \\ 0 & \text{if } x = 0, \end{cases}$$
which is integrable over $[0,2]$. It is also simple to see that $\int_{[0,2]} f_n(x)\,dx = \int_{[\frac{1}{n},\frac{2}{n}]} \sqrt{n}\,dx = \sqrt{n}\left(\frac{2}{n} - \frac{1}{n}\right) = \frac{1}{\sqrt{n}}$, so that $\lim_{n\to\infty} \int_{[0,2]} f_n(x)\,dx = 0$.

Finally we state a convergence theorem that differs from the previous ones in the sense that we don't assume that $\langle f_n\rangle \to f$. Instead the theorem guarantees the existence of a function $f$ to which $\langle f_n\rangle$ converges a.e., given that $\langle f_n\rangle$ is a non-decreasing sequence of integrable functions whose corresponding sequence of integrals $\langle \int f_n\,dm\rangle$ is bounded.

Theorem 407 (Levi) Let $\langle f_n\rangle$ be a sequence on $A \subset \mathbb{R}$ with $f_1 \leq f_2 \leq \dots \leq f_n \leq \dots$, where each $f_n$ is integrable and $\int_A f_n\,dm \leq K$. Then there exists $f$ such that $f = \lim_{n\to\infty} f_n$ a.e. on $A$, $f$ is integrable on $A$, and $\int_A f_n\,dm \to \int_A f\,dm$.

Proof. (Sketch) Without loss of generality, assume $f_1 \geq 0$. Define $f(x) = \lim_{n\to\infty} f_n(x)$. Since $\langle f_n\rangle$ is non-decreasing, $f(x)$ is either a number or $+\infty$. Using Chebyshev's inequality (Lemma 391), it is easy to show that $m(\{x \in A : f(x) = +\infty\}) = 0$, which implies $f_n \to f$ pointwise a.e. In order to use the Lebesgue Dominated Convergence Theorem 404, we need to construct an integrable function $\varphi$ on $A$ that dominates $f_n$ (i.e. $f_n \leq f \leq \varphi$ on $A$). See Figure 5.2.2.8. Let $A_r = \{x \in A : r - 1 \leq f(x) < r\}$ and $\varphi(x) = \sum_{r=1}^{\infty} r\chi_{A_r}(x)$. Clearly $f_n \leq f \leq \varphi$. Is $\varphi$ integrable on $A = \cup_{r=1}^{\infty} A_r$ (i.e. is $\int_A \varphi = \sum_{r=1}^{\infty} r\,m(A_r) < \infty$)? For $s \in \mathbb{N}$, define $B_s = \cup_{r=1}^{s} A_r$. Since $\varphi(x) \leq f(x) + 1$ and both $f_n$ and $f$ are bounded on $B_s$, we have
$$\sum_{r=1}^{s} r\,m(A_r) = \int_{B_s} \varphi\,dm \leq \int_{B_s} f(x)\,dm + m(A) = \lim_{n\to\infty} \int_{B_s} f_n(x)\,dm + m(A) \leq K + m(A).$$
Boundedness of the partial sums of the infinite series $\sum_{r=1}^{\infty} r\,m(A_r)$ $\left(= \int_A \varphi\,dm\right)$ guarantees integrability of $\varphi$.

We can also state a "series version" of Levi's theorem.
It says that under certain conditions on the series, integration and infinite summation are interchangeable.

Corollary 408 If $\langle g_k\rangle$ is a sequence of nonnegative functions defined on $A$ such that $\sum_{k=1}^{\infty} \int_A g_k(x)\,dm < \infty$, then the infinite series $\sum_{k=1}^{\infty} g_k(x)$ converges a.e. on $A$. That is, $\sum_{k=1}^{\infty} g_k(x) \to g(x)$ a.e. and $\sum_{k=1}^{\infty} \int_A g_k(x)\,dm = \int_A \sum_{k=1}^{\infty} g_k(x)\,dm$ $\left(= \int_A g(x)\,dm\right)$.

Proof. Apply Levi's Theorem to the functions $f_n(x) = \sum_{k=1}^{n} g_k(x)$.

5.3 General Measure

In the preceding sections, we focused on $\mathbb{R}$ (or subsets thereof) as the underlying set of interest. From this set, we constructed the Lebesgue $\sigma$-algebra denoted $\mathcal{L}$. Then we defined the Lebesgue measure $m$ on elements of $\mathcal{L}$. That is, we studied the triple $(\mathbb{R}, \mathcal{L}, m)$ known as the Lebesgue measure space. These ideas can be extended to general measure spaces.

Definition 409 The pair $(X, \mathcal{X})$, where $X$ is any set and $\mathcal{X}$ is a $\sigma$-algebra of its subsets, is called a measurable space. Any set $A \in \mathcal{X}$ is called a ($\mathcal{X}$-)measurable set.

Definition 410 Let $(X, \mathcal{X})$ be a measurable space. A measure is an extended real valued function $\mu : \mathcal{X} \to \mathbb{R}\cup\{\infty\}$ such that: (i) $\mu(\emptyset) = 0$; (ii) $\mu(A) \geq 0$, $\forall A \in \mathcal{X}$; and (iii) $\mu$ is countably additive (i.e. if $\{A_n\}_{n\in\mathbb{N}}$ is a countable, disjoint sequence of subsets $A_n \in \mathcal{X}$, then $\mu(\cup_{n\in\mathbb{N}} A_n) = \sum_{n\in\mathbb{N}} \mu(A_n)$).

Definition 411 A measure space is a triple $(X, \mathcal{X}, \mu)$.

Definition 412 Let $(X, \mathcal{X})$ be a measurable space. A measure $\mu$ is called finite if $\mu(X) < \infty$. $\mu$ is called $\sigma$-finite if there is a countable collection of sets $\{E_i\}_{i=1}^{\infty}$ in $\mathcal{X}$ with $\mu(E_i) < \infty$ for all $i$ and $X = \cup_{i=1}^{\infty} E_i$.

Example 413 (i) If $X = \mathbb{R}$, then Lebesgue measure is not finite because $m((-\infty,\infty)) = \infty$, but it is $\sigma$-finite because $(-\infty,\infty) = \cup_{n=1}^{\infty} (-n,n)$ and $m((-n,n)) = 2n$. (ii) If $X = [0,100]$, then Lebesgue measure is finite because $m([0,100]) = 100$.

Exercise 5.3.1 Let $X = (-\infty,\infty)$, $\mathcal{X} = \mathcal{P}(X)$ and
$$\mu(E) = \begin{cases} \#\text{ of elements of } E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite.} \end{cases}$$
Show that $(X, \mathcal{X}, \mu)$ is a measure space and that $\mu$ is not $\sigma$-finite.
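To make the axioms of Definition 410 concrete, the sketch below checks them by brute force for the counting measure on a small finite set. It is an illustration under assumptions, not part of the text: the set $X = \{1,2,3\}$ and the helper names are hypothetical, and on a finite $\sigma$-algebra countable additivity reduces to additivity over disjoint pairs (by induction), which is all the code tests.

```python
from itertools import combinations


def power_set(points):
    """All subsets of a finite set, as frozensets."""
    items = list(points)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def counting_measure(subset):
    # mu(E) = number of elements of E (every E is finite here).
    return len(subset)


def is_measure_on_finite_algebra(sets, mu):
    """Check mu(empty) = 0, nonnegativity, and additivity on disjoint pairs."""
    if mu(frozenset()) != 0:
        return False
    if any(mu(s) < 0 for s in sets):
        return False
    return all(
        mu(a | b) == mu(a) + mu(b)
        for a in sets
        for b in sets
        if not (a & b)
    )


X = {1, 2, 3}
sigma_algebra = power_set(X)
print(len(sigma_algebra), is_measure_on_finite_algebra(sigma_algebra, counting_measure))
```

Replacing `counting_measure` by a set function such as the squared cardinality makes the additivity check fail, which shows that the third axiom carries real content even in this finite setting.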
Lebesgue measure has one important property that follows from Exercise 5.1.2. That is, if $E$ is a Lebesgue measurable set with measure 0 and if $A \subset E$, then $A$ is also Lebesgue measurable and $m(A) = 0$. In general, however, we can have a situation in which $E \in \mathcal{X}$ (is $\mathcal{X}$-measurable) with $\mu(E) = 0$ and $A \subset E$, but $A$ is not in $\mathcal{X}$.

Example 414 Let $(X, \mathcal{X}, \mu)$ be a measure space defined as follows: let $X = \{a,b,c\}$, $\mathcal{X} = \{\emptyset, \{a,b,c\}, \{a,b\}, \{c\}\}$, $\mu(\emptyset) = \mu(\{a,b\}) = 0$, and $\mu(\{c\}) = \mu(\{a,b,c\}) = 1$. Then $\{a\} \subset \{a,b\}$ but $\{a\}$ is not $\mathcal{X}$-measurable.

Definition 415 Let $(X, \mathcal{X}, \mu)$ be a measure space. $\mu$ is complete on $\mathcal{X}$ if for any $E \in \mathcal{X}$ with $\mu(E) = 0$ and any $A \subset E$, we have $A \in \mathcal{X}$ and $\mu(A) = 0$.

That is, $\mu$ is complete on $\mathcal{X}$ if any subset of a measurable set of measure zero is measurable and has measure zero. If we consider Lebesgue measure restricted to the Borel $\sigma$-algebra (i.e. $(\mathbb{R}, \mathcal{B}, m)$), then $m$ is not complete on $\mathcal{B}$. To show this would necessitate more machinery (the Cantor set can be used to illustrate the idea). However, if $\mu$ is not complete on $\mathcal{X}$, then there exists a completion of $\mathcal{X}$ denoted by $\widetilde{\mathcal{X}}$. For example, the completion of $(\mathbb{R}, \mathcal{B}, m)$ is $(\mathbb{R}, \mathcal{L}, m)$.

In exactly the same way that we built the theory of Lebesgue measure and the Lebesgue integral in Sections 5.1 to 5.2.2, the theory of general measure and integral can be constructed. The space $(\mathbb{R}, \mathcal{L}, m)$ can be replaced by $(X, \mathcal{X}, \mu)$, and instead of $\mathcal{L}$-measurability and $\mathcal{L}$-integrability we will have $\mathcal{X}$-measurability and $\mathcal{X}$-integrability.

5.3.1 Signed Measures

Although we have introduced a general measure space $(X, \mathcal{X}, \mu)$, the only non-trivial measure space that we have encountered so far is the Lebesgue measure space $(\mathbb{R}, \mathcal{L}, m)$, where Lebesgue measure $m$ was constructed through the outer measure in Section 5.1. Can we construct other non-trivial measures $\mu$ on a general measurable space $(X, \mathcal{X})$? Consider a measure space $(X, \mathcal{X}, \mu)$ and let $f$ be a nonnegative $\mathcal{X}$-measurable function. Define $\lambda : \mathcal{X} \to \mathbb{R}\cup\{\infty\}$ by $\lambda(E) = \int_E f\,d\mu$. Then the following theorem establishes that this set function $\lambda$ is a measure.
Theorem 416 If $f$ is a nonnegative $\mathcal{X}$-measurable function, then
$$\lambda(E) = \int_E f\,d\mu \qquad (5.7)$$
is a measure. Moreover, if $f$ is $\mathcal{X}$-integrable, then $\lambda$ is finite.

Proof. $\lambda(E) = \int_E f\,d\mu = \int_X f\chi_E\,d\mu \geq 0$ for all $E \in \mathcal{X}$ (where $\chi_E$ is the characteristic function of $E$). Let $\{E_i\}_{i=1}^{\infty}$ be a collection of mutually disjoint sets with $\cup_{i=1}^{\infty} E_i = E$. Then $\chi_E = \sum_{i=1}^{\infty} \chi_{E_i}$. Let $g_n = f\chi_{E_n}$ and $f_n = \sum_{i=1}^{n} g_i$. Since $g_n \geq 0$, the sequence $\langle f_n\rangle$ is non-decreasing. $f_n$ is also measurable for all $n$ because sums and products of measurable functions are measurable. $f_n$ converges pointwise to $f\chi_E$ because $\sum_{i=1}^{n} \chi_{E_i} \to \chi_E = \sum_{i=1}^{\infty} \chi_{E_i}$. Then according to the Monotone Convergence Theorem 396,
$$\lambda(E) = \int_E f\,d\mu = \int f\chi_E\,d\mu = \int f\sum_{i=1}^{\infty} \chi_{E_i}\,d\mu = \sum_{i=1}^{\infty} \int f\chi_{E_i}\,d\mu = \sum_{i=1}^{\infty} \int_{E_i} f\,d\mu = \sum_{i=1}^{\infty} \lambda(E_i).$$
Hence $\lambda$ is $\sigma$-additive. If $f$ is integrable, then $\lambda(X) = \int_X f\,d\mu < \infty$ and hence $\lambda$ is finite.

This theorem provides us with a method for constructing new measures on a measure space $(X, \mathcal{X}, \mu)$. Actually, any nonnegative $\mathcal{X}$-integrable function represents a finite measure given by (5.7). Thus, given a measure space $(X, \mathcal{X}, \mu)$, there is a whole set of measures defined on $\mathcal{X}$.

Are all measures on $(X, \mathcal{X})$ of the type given by (5.7)? In other words, let $\mathcal{E}$ be the set of all measures on $(X, \mathcal{X})$. Can any measure $\nu \in \mathcal{E}$ be represented by an integrable function $g$ such that $\nu(E) = \int_E g\,d\mu$? The answer is contained in a well-known result: the Radon-Nikodym theorem. We could pursue this problem in our current setting; namely, we could deal with measures only (i.e. with nonnegative $\sigma$-additive set functions defined on $\mathcal{X}$). We can, however, work in an even more general setting. Instead of dealing with nonnegative $\sigma$-additive set functions (measures), we can drop the assumption of nonnegativity and work with $\sigma$-additive set functions which take positive values, negative values, or both. These functions are called signed measures. This generalization is useful particularly when working with Markov processes. Let us now define the notion of signed measure rigorously.

Definition 417 Let $(X, \mathcal{X})$ be a measurable space. Let $\mu : \mathcal{X} \to \mathbb{R}\cup\{-\infty,\infty\}$ have the following properties: (i) $\mu(\emptyset) = 0$; (ii) $\mu$ takes at most one of the two values $+\infty$, $-\infty$; (iii) $\mu$ is $\sigma$-additive. Then $\mu$ is called a signed measure on $\mathcal{X}$.

In the text that follows, when we refer to a "measure" (without the prefix "signed") we mean a measure in the sense of Definition 410 (i.e. a nonnegative set function).

Example 418 Given a measure space $(X, \mathcal{X}, \mu)$, an example of a signed measure is the set function
$$\nu(E) = \int_E f\,d\mu, \qquad (5.8)$$
where $f$ is any $\mathcal{X}$-integrable function. In Theorem 416 we showed that if $f$ is a nonnegative integrable function then $\nu$ is a finite measure. Here we just assume that $f$ is integrable and do not require nonnegativity. We can also assume that $f$ is "only" a measurable function for which $\int f\,d\mu$ exists (i.e. at least one of the functions $f^+$, $f^-$ is integrable). Thus, if $f$ in (5.8) is an integrable function, then $\nu$ is a finite signed measure, and if $f$ is an $\mathcal{X}$-measurable function for which $\int f\,d\mu$ exists, then $\nu$ is a signed measure (though not necessarily finite).

Example 419 Let $(\mathbb{R}, \mathcal{L}, m)$ be the Lebesgue measure space. In the first case, let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \frac{x}{(1+x^2)^2}$. Then we have
$$f^+(x) = \begin{cases} \frac{x}{(1+x^2)^2}, & x \geq 0 \\ 0, & x < 0, \end{cases} \qquad f^-(x) = \begin{cases} 0, & x \geq 0 \\ -\frac{x}{(1+x^2)^2}, & x < 0, \end{cases}$$
$\int_{-\infty}^{\infty} f^+\,dm = \int_{-\infty}^{\infty} f^-\,dm = \frac{1}{2}$, $f$ is integrable, and $\nu(E) = \int_E f\,dm = \int_{-\infty}^{\infty} \frac{x}{(1+x^2)^2}\cdot\chi_E(x)\,dm$ is a finite signed measure.

Example 420 Let $g : \mathbb{R} \to \mathbb{R}$ be given by
$$g(x) = \begin{cases} x, & x \geq 0 \\ \frac{x}{(1+x^2)^2}, & x < 0. \end{cases}$$
Then we have
$$g^+(x) = \begin{cases} x, & x \geq 0 \\ 0, & x < 0, \end{cases} \qquad g^-(x) = \begin{cases} 0, & x \geq 0 \\ -\frac{x}{(1+x^2)^2}, & x < 0, \end{cases}$$
$\int_{-\infty}^{\infty} g^+\,dm = +\infty$, $\int_{-\infty}^{\infty} g^-\,dm = \frac{1}{2}$, and $g$ is measurable but not integrable. However, the integral exists since $\int_{-\infty}^{\infty} g\,dm = \int_{-\infty}^{\infty} g^+\,dm - \int_{-\infty}^{\infty} g^-\,dm = +\infty - \frac{1}{2} = +\infty$, and $\nu(E) = \int_E g\,dm = \int_{-\infty}^{\infty} g\cdot\chi_E\,dm$ is a signed measure (but not finite).

Example 421 Let $h : \mathbb{R} \to \mathbb{R}$ be given by $h(x) = x$.
Then we have
\[ h^+(x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases}, \qquad h^-(x) = \begin{cases} 0, & x \ge 0 \\ -x, & x < 0 \end{cases}, \]
\[ \int_{-\infty}^{\infty} h^+\,dm = +\infty, \qquad \int_{-\infty}^{\infty} h^-\,dm = +\infty. \]
Hence $\int_{-\infty}^{\infty} h\,dm = \int_{-\infty}^{\infty} h^+\,dm - \int_{-\infty}^{\infty} h^-\,dm = \infty - \infty$ is not defined, the Lebesgue integral doesn't exist, and thus this function doesn't define a signed measure.

The previous examples show that if a signed measure $\nu$ is defined by expression (5.8) using an integral, then it can be written as a difference of two measures, $\nu(E) = \nu_1(E) - \nu_2(E)$, where
\[ \nu(E) = \int_E f\,d\mu, \qquad \nu_1(E) = \int_E f^+\,d\mu \qquad \text{and} \qquad \nu_2(E) = \int_E f^-\,d\mu. \]
We now show that such a decomposition is possible for an arbitrary signed measure. This decomposition is known as the Jordan decomposition of a signed measure. First we need to prove some lemmas. However, in order to avoid introducing complicated terminology which does not help to understand the main ideas, in the remainder of this chapter we will deal only with finite signed measures. All theorems and proofs can be adapted to $\sigma$-finite signed measures.

Lemma 422 Let $\nu$ be a finite signed measure on $\mathcal{X}$. Then any collection of disjoint sets $\{E_i\}$ for which $\nu(E_i) > 0$ ($\nu(E_i) < 0$) is countable.

Proof. Let $\mathcal{E} \subset \mathcal{X}$ be a collection of disjoint sets $\{E_i\}$ for which $\nu(E_i) > 0$. For $n = 1, 2, \dots$ let $\mathcal{E}_n = \{E_i \in \mathcal{E} : \nu(E_i) > \frac{1}{n}\}$. Then $\mathcal{E} = \cup_{n=1}^{\infty} \mathcal{E}_n$. For each $n$, $\mathcal{E}_n$ is finite. If it were not, we would have a sequence $\langle E_k \rangle_{k=1}^{\infty}$ of disjoint sets from $\mathcal{E}_n$ with $\nu(E_k) > \frac{1}{n}$ for $k = 1, 2, \dots$. Then $\nu(\cup_{k=1}^{\infty} E_k) = \sum_{k=1}^{\infty} \nu(E_k) \ge \sum_{k=1}^{\infty} \frac{1}{n} = \infty$, which contradicts the assumption that $\nu$ is finite. Thus each $\mathcal{E}_n$ is finite and hence $\mathcal{E} = \cup_{n=1}^{\infty} \mathcal{E}_n$ is countable.

Let $\nu$ be a signed measure on $\mathcal{X}$ and let $\nu(E) > 0$. Let $F \subset E$. What can be said about the sign of $\nu(F)$? As the next example shows, not much can be said about the signed measure of a subset of a set whose signed measure has a given sign.

Example 423 Let $X = \{1, 2, 3, \dots\}$ and $\mathcal{X} = \mathcal{P}(X)$. For $E \in \mathcal{X}$, define $\nu(E) = \sum_{n \in E} (-1)^n \frac{1}{2^n}$. For $E = \{1, 2, 3\}$, $\nu(E) = -\frac{3}{8} < 0$. If $F = \{2\} \subset E$, then $\nu(F) = \frac{1}{4} > 0$.
But notice that each singleton subset $C$ of the set $B = \{1, 3, 5, \dots\}$ has $\nu(C) < 0$ and each singleton subset $D$ of the set $A = \{2, 4, 6, \dots\}$ has $\nu(D) > 0$. Moreover, $A$ and $B$ are disjoint and $A \cup B = X$.

Example 424 Let $X = [-1, 1]$ and let $\mathcal{X}$ be all $\mathcal{L}$-measurable subsets of $[-1, 1]$. For $E \in \mathcal{X}$, let $\nu(E) = \int_E x\,dx$. For $E = [-\frac{1}{2}, 1]$, we have $\nu(E) = \int_{-1/2}^{1} x\,dx = \frac{3}{8}$. For $F = [-\frac{1}{2}, 0] \subset [-\frac{1}{2}, 1]$, $\nu(F) = \int_{-1/2}^{0} x\,dx = -\frac{1}{8}$. Thus the signed measure of the set $E$ is positive but its subset $F$ has negative signed measure. But for each measurable subset $C$ of the set $B = [-1, 0)$, $\nu(C) \le 0$, and for each measurable subset $D$ of the set $A = [0, 1]$, $\nu(D) \ge 0$, where $A \cap B = \emptyset$ and $A \cup B = X$.

Definition 425 An $\mathcal{X}$-measurable set $E$ is positive (negative) with respect to a signed measure $\nu$ if for any $\mathcal{X}$-measurable subset $F$ of $E$, $\nu(F) \ge 0$ ($\nu(F) \le 0$).

Thus the sets $A$ in Examples 423 and 424 are positive while the sets $B$ are negative. Notice that $\nu(G) > 0$ (or $\nu(G) < 0$) doesn't mean that $G$ is positive (negative), as the sets $E$ in Examples 423 and 424 show. We will show that the existence of the sets $A$ and $B$ for the signed measure $\nu$ in these examples is not a coincidence.

Definition 426 An ordered pair $(A, B)$, where $A$ is a positive set and $B$ is a negative set with respect to a signed measure $\nu$, and $A \cap B = \emptyset$, $A \cup B = X$, is called the Hahn decomposition with respect to $\nu$ of a measurable space $(X,\mathcal{X})$.

Theorem 427 (Hahn Decomposition) Let $\nu$ be a finite signed measure on a measurable space $(X,\mathcal{X})$. Then there exists a Hahn decomposition of $(X,\mathcal{X})$.

Proof. (Sketch) Let $S$ be a family of collections of subsets of $X$ whose elements $\mathcal{A}$ are collections of disjoint measurable sets $E \subset X$ with $\nu(E) < 0$. Since "$\subset$" is a partial ordering on $S$ satisfying the assumptions of Zorn's Lemma 46 (namely, that every totally ordered subcollection $\{\mathcal{A}_i\}$ has an upper bound $\mathcal{A} = \cup_i \mathcal{A}_i$), there is a maximal element $\mathcal{E}$ of $S$. Moreover, $\mathcal{E}$ is countable (by Lemma 422). Let $B = \cup\{E \in \mathcal{E}\}$.
Then $B$ is measurable and negative (by construction all of its subsets have negative measure). Let $A = X \setminus B$. We have $A \cap B = \emptyset$, $A \cup B = X$, and $A$ is measurable. If we show that $A$ is positive, then we are done since $(A, B)$ would be a Hahn decomposition. Hence we need to show that $A$ (as the complement of a maximal negative set) is positive. The idea is that if we assume that $A$ is not positive, then we can construct a negative set with negative measure outside the set $B$. This would violate the maximality of $B$. The construction of such a set is given in the formal proof of this theorem in the appendix to the chapter; see also Figure 5.3.1.

In the special case where a signed measure $\nu$ is defined by the integral $\nu(E) = \int_E f\,d\mu$, the Hahn decomposition is given by $A = \{x : f(x) \ge 0\}$ and $B = \{x : f(x) < 0\}$, as we have seen in Example 424. It is easily seen that the Hahn decomposition is not unique. We can, for example, set $A_1 = \{x : f(x) > 0\}$, $B_1 = \{x : f(x) \le 0\}$. But the following theorem shows that the choice of a Hahn decomposition doesn't really matter.

Theorem 428 Let $(A_1, B_1)$, $(A_2, B_2)$ be two Hahn decompositions of a measurable space $(X,\mathcal{X})$ with respect to a signed measure $\nu$. Then for each $E \in \mathcal{X}$ we have $\nu(E \cap A_1) = \nu(E \cap A_2)$ and $\nu(E \cap B_1) = \nu(E \cap B_2)$.

Proof. From $E \cap (A_1 \setminus A_2) \subset E \cap A_1$ we have $\nu(E \cap (A_1 \setminus A_2)) \ge 0$, and from $E \cap (A_1 \setminus A_2) \subset E \cap B_2$ we have $\nu(E \cap (A_1 \setminus A_2)) \le 0$. Combining these two inequalities we have $\nu(E \cap (A_1 \setminus A_2)) = 0$. Analogously we can show that $\nu(E \cap (A_2 \setminus A_1)) = 0$. Hence $\nu(E \cap A_1) = \nu((E \cap (A_1 \setminus A_2)) \cup (E \cap (A_1 \cap A_2))) = 0 + \nu(E \cap A_1 \cap A_2)$. If we start with $\nu(E \cap A_2)$, we arrive by similar reasoning at $0 + \nu(E \cap A_1 \cap A_2)$. Hence $\nu(E \cap A_1) = \nu(E \cap A_2)$. Similarly we can show that $\nu(E \cap B_1) = \nu(E \cap B_2)$.

Theorem 429 Let $\nu$ be a finite signed measure on $\mathcal{X}$ and let $(A, B)$ be an arbitrary Hahn decomposition with respect to $\nu$. Then $\nu(E) = \nu^+(E) - \nu^-(E)$ for any $E \in \mathcal{X}$, where $\nu^+(E) = \nu(E \cap A)$ and $\nu^-(E) = -\nu(E \cap B)$ are both measures on $\mathcal{X}$ and don't depend on the choice of Hahn decomposition $(A, B)$.

Proof. The independence of $\nu^+$ and $\nu^-$ from the choice of Hahn decomposition follows from Theorem 428. Since $\nu$ is a signed measure, $\nu$ is $\sigma$-additive and thus $\nu^+$ and $\nu^-$ are as well. Since $A$ is a positive set and $E \cap A \subset A$, we have $\nu(E \cap A) \ge 0$. Since $B$ is a negative set and $E \cap B \subset B$, we have $\nu(E \cap B) \le 0$, so that $-\nu(E \cap B) \ge 0$. Thus $\nu^+$ and $\nu^-$ are measures on $\mathcal{X}$. Since $E = E \cap X = E \cap (A \cup B) = (E \cap A) \cup (E \cap B)$, we have $\nu(E) = \nu(E \cap A) + \nu(E \cap B) = \nu^+(E) - \nu^-(E)$.

Definition 430 $\nu = \nu^+ - \nu^-$ is called the Jordan decomposition of a signed measure $\nu$. The measure $\nu^+$ ($\nu^-$) is called the positive (negative) variation of $\nu$. $|\nu|(E) = \nu^+(E) + \nu^-(E)$ is also a measure on $\mathcal{X}$ and is called the total variation of the signed measure $\nu$.

Exercise 5.3.2 Let $(X,\mathcal{X},\mu)$ be a measure space and let $f$ be $\mathcal{X}$-integrable. If $\nu(E) = \int_E f\,d\mu$, show that $\nu^+(E) = \int_E f^+\,d\mu$ and $|\nu|(E) = \int_E |f|\,d\mu$.

Exercise 5.3.3 Show that a countable union of positive (negative) sets is a positive (negative) set.

If a signed measure $\nu$ is defined as the integral of an integrable function, $\nu(E) = \int_E f\,d\mu$, then by Lemma 401 it has the following property: if $E \in \mathcal{X}$ and $\mu(E) = 0$, then $\nu(E) = 0$. As we will soon see, this property of a signed measure is very important, and we formulate it for any signed measure (not only one given by an integral).

Definition 431 Let $\nu$ be a finite signed measure and let $\mu$ be a measure on $(X,\mathcal{X})$. If for every $A \in \mathcal{X}$, $\mu(A) = 0$ implies $\nu(A) = 0$, then we say that $\nu$ is absolutely continuous with respect to $\mu$, written $\nu \ll \mu$.

Hence by Lemma 401, $\nu(E) = \int_E f\,d\mu$ is absolutely continuous with respect to $\mu$. Now we prove two simple lemmas.

Lemma 432 Let $\nu$ be a finite signed measure and $\mu$ be a measure on $\mathcal{X}$. Then the following are equivalent: (i) $\nu \ll \mu$; (ii) $\nu^+ \ll \mu$ and $\nu^- \ll \mu$; (iii) $|\nu| \ll \mu$.

Proof. (i) $\Rightarrow$ (ii). Let $(A, B)$ be a Hahn decomposition with respect to $\nu$.
Let $E \in \mathcal{X}$ and $\mu(E) = 0$. Then $\mu(E \cap A) = 0$, and because $\nu$ is absolutely continuous with respect to $\mu$, we have $\nu(E \cap A) = \nu^+(E) = 0$. This implies $\nu^+ \ll \mu$. Similarly, $\nu^- \ll \mu$. The other two implications follow immediately from these equalities:
\[ |\nu|(E) = \nu^+(E) + \nu^-(E), \qquad \nu(E) = \nu^+(E) - \nu^-(E). \]

Lemma 433 Let $\mu, \lambda$ be finite measures on $\mathcal{X}$, $\lambda \ll \mu$, and $\lambda(E_0) \ne 0$ for at least one set $E_0 \in \mathcal{X}$. Then there exist $\varepsilon > 0$ and a set $E \in \mathcal{X}$ that is positive with respect to the signed measure $\lambda - \varepsilon\mu$ and satisfies $\lambda(E) > 0$ and $\mu(E) > 0$.

Proof. Let $(A_n, B_n)$ for $n \in \mathbb{N}$ be a Hahn decomposition of $X$ with respect to $\lambda - \frac{1}{n}\mu$. Set $A = \cup_{n=1}^{\infty} A_n$ and $B = \cap_{n=1}^{\infty} B_n$. Since $B \subset B_n$ and $B_n$ is a negative set with respect to $\lambda - \frac{1}{n}\mu$, we have $(\lambda - \frac{1}{n}\mu)(B) \le 0 \iff 0 \le \lambda(B) \le \frac{1}{n}\mu(B)$ for $n \in \mathbb{N}$. Thus $\lambda(B) = 0$. Since $\lambda(X) \ne 0$, we have $\lambda(A) = \lambda(X \setminus B) = \lambda(X) - \lambda(B) = \lambda(X) > 0$. As $\lambda \ll \mu$, we have $\mu(A) > 0$. Because $\lambda(A) > 0$ and $A = \cup_{n=1}^{\infty} A_n$, there is an index $n_0$ with $\lambda(A_{n_0}) > 0$, and then $\mu(A_{n_0}) > 0$ as well since $\lambda \ll \mu$. Finally, set $E = A_{n_0}$ and $\varepsilon = \frac{1}{n_0}$.

Now we are ready to tackle the main problem of this section, which you can think of as a representation theorem.$^{12}$ Given a measure space $(X,\mathcal{X},\mu)$, consider the set function $\nu(E) = \int_E f\,d\mu$ where $f$ is $\mathcal{X}$-measurable and $\mathcal{X}$-integrable. Under certain conditions, signed measures of this specific form represent all signed measures on $\mathcal{X}$ (i.e. there are no signed measures on $\mathcal{X}$ that cannot be represented as the integral of an $\mathcal{X}$-measurable function $f$). This is established formally in the Radon-Nikodym Theorem, which states that under certain conditions any signed measure on $\mathcal{X}$ can be represented by the integral of a measurable function. The Radon-Nikodym Theorem will be used in the Riesz Representation Theorem in the next chapter.

Theorem 434 (Radon-Nikodym) Let $(X,\mathcal{X},\mu)$ be a measure space, $\mu$ be a $\sigma$-finite measure, $\nu$ be a finite signed measure on $\mathcal{X}$, and $\nu \ll \mu$. Then there exists an $\mathcal{X}$-integrable function $f$ on $X$ such that $\nu(E) = \int_E f\,d\mu$ for any $E \in \mathcal{X}$. Moreover, $f$ is unique in the sense that if $g$ is any $\mathcal{X}$-measurable function with this property, then $g = f$ a.e. with respect to $\mu$.
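On a finite set the representing function in the Radon-Nikodym theorem can be written down directly, which may help build intuition before the proof. The following is a minimal sketch under stated assumptions: the three-point space and the point masses `mu` and `nu` are hypothetical, and the helper names (`density`, `integral`, `total`, `all_subsets`) are introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses. nu is a finite signed measure with nu << mu:
# the only mu-null point, "c", is also nu-null.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0)}
nu = {"a": Fraction(-1, 4), "b": Fraction(2, 3), "c": Fraction(0)}


def all_subsets(points):
    """Every subset of a finite set (the sigma-algebra used here)."""
    pts = list(points)
    return [set(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def total(m, E):
    """m(E) for a set function given by point masses."""
    return sum((m[x] for x in E), Fraction(0))


def density(nu, mu):
    """f(x) = nu({x}) / mu({x}) where mu({x}) > 0; any value works on mu-null points."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}


def integral(f, mu, E):
    """Integral of f over E with respect to mu (a finite sum)."""
    return sum((f[x] * mu[x] for x in E), Fraction(0))


f = density(nu, mu)
# nu(E) equals the integral of f over E for every measurable E.
ok = all(integral(f, mu, E) == total(nu, E) for E in all_subsets(mu))
print(f, ok)
```

The arbitrary choice of `f` on the `mu`-null point mirrors the uniqueness clause of the theorem: two representing functions may differ, but only on a set of `mu`-measure zero.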
$^{12}$ In general, a representation theorem provides a simple way to characterize (or represent) a set of elements using certain properties that actually extend to the entire collection of elements under given assumptions.

Proof. (Sketch) By Theorem 429, a finite signed measure $\nu$ can be decomposed into $\nu^+$ and $\nu^-$, where $\nu^-, \nu^+$ are both measures, and by (ii) of Lemma 432 they are both absolutely continuous with respect to $\mu$ (if $\nu$ is). Hence it suffices to prove the theorem under the assumption that $\nu$ is a (non-negative) measure. Also, since $\mu$ is $\sigma$-finite, $X$ can be decomposed into countably many disjoint sets $\{E_i\}$ for which $\mu(E_i) < \infty$. Hence it suffices to prove the theorem with $\mu$ finite. In summary, we take $\mu, \nu$ both finite measures with $\nu \ll \mu$.

Let $\mathcal{G}$ be the set of all non-negative, $\mathcal{X}$-measurable, integrable functions $g$ satisfying
\[ \int_E g\,d\mu \le \nu(E), \quad \forall E \in \mathcal{X}. \tag{5.9} \]
Among these functions $g$, we want to find a function $f$ which satisfies (5.9) with equality. Since $\int_X g\,d\mu \le \nu(X)$, $\forall g \in \mathcal{G}$, and $\nu$ is finite, the set of real numbers $\{\int g\,d\mu, g \in \mathcal{G}\}$ is bounded (by $\nu(X)$) and hence its supremum exists. Let $\alpha = \sup_{g \in \mathcal{G}} \int g\,d\mu$. The function $f$ is constructed (using Levi's Theorem 407) as the limit of a sequence $\langle f_n \rangle$ that attains this supremum (i.e. $\alpha = \int f\,d\mu$).$^{13}$ Because $f \in \mathcal{G}$, we know that $\int_E f\,d\mu \le \nu(E)$, $\forall E \in \mathcal{X}$. We claim that $\int_E f\,d\mu = \nu(E)$, $\forall E \in \mathcal{X}$. If this were not true, then there would exist a set $E$ such that $\int_E f\,d\mu < \nu(E)$. Then by Lemma 433, we could construct a function $g_0 = f + \varepsilon\chi_{E_0}$ belonging to $\mathcal{G}$ for which $\int g_0\,d\mu > \alpha$. But this would violate the fact that $\alpha$ is the supremum.

The assumption in the Radon-Nikodym theorem that $\mu$ is $\sigma$-finite is important, as the next exercise shows.

Exercise 5.3.4 Let $X = \mathbb{R}$ and let $\mathcal{X}$ be the collection of all subsets of $\mathbb{R}$ that are countable or that have countable complement. Define
\[ \mu(E) = \begin{cases} \text{number of elements of } E & \text{if } E \text{ is finite} \\ \infty & \text{otherwise} \end{cases} \qquad \text{and} \qquad \nu(E) = \begin{cases} 0 & \text{if } E \text{ is countable} \\ 1 & \text{if } X \setminus E \text{ is countable.} \end{cases} \]
(i) Show that $\mu, \nu$ are measures on $\mathcal{X}$ and that $\nu \ll \mu$. (ii) Show that $\mu$ is not $\sigma$-finite. (iii) Show that the Radon-Nikodym theorem doesn't hold.

$^{13}$ In particular, by the supremum property, there exists a sequence $\langle g_n \rangle$ from $\mathcal{G}$ such that $\lim_{n \to \infty} \int g_n\,d\mu = \alpha$. Define a sequence $\langle f_n \rangle$ by $f_n = \max\{g_1, \dots, g_n\}$, $f_n \in \mathcal{G}$. Since $\langle f_n \rangle$ is a non-decreasing sequence of integrable functions with $\int f_n\,d\mu \le \alpha$, by Levi's theorem there exists an integrable function $f = \lim f_n$ a.e. with $\int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu \le \nu(E)$ (because $f_n \in \mathcal{G}$), and hence $f \in \mathcal{G}$ and $\int f\,d\mu \le \alpha$. On the other hand, because $g_n \le f_n$ we have $\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu \ge \lim_{n \to \infty} \int g_n\,d\mu = \alpha$. Combining these two inequalities gives $\int f\,d\mu = \alpha$.

5.4 Examples Using Measure Theory

5.4.1 Probability Spaces

Definition 435 If $\mu(X) = 1$, then $\mu$ is a probability measure and $(X,\mathcal{X},\mu)$ is called a probability space. In this case, $X$ is called the sample space, any measurable set $A \in \mathcal{X}$ is called an event, and $\mu(A)$ is called the probability of the event $A$.

For a probability space, we say almost surely (a.s.) interchangeably with almost everywhere (a.e.). We next illustrate measure spaces through some basic properties of probability.

Definition 436 Let $(X,\mathcal{X},P)$ be a probability space. Let $\Lambda$ be an arbitrary index set and let $A_i$, $i \in \Lambda$, be events in $\mathcal{X}$. The $A_i$ are independent if and only if for all finite collections $\{A_{i_1}, A_{i_2}, \dots, A_{i_k}\}$ we have
\[ P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k}) = P(A_{i_1})P(A_{i_2}) \cdots P(A_{i_k}). \]

The next definition makes clear that a random variable is nothing other than a measurable function.

Definition 437 A random variable $Y$ on a probability space $(X,\mathcal{X},P)$ is a Borel measurable function from $X$ to $\mathbb{R}$ (i.e. $Y : (X,\mathcal{X}) \to (\mathbb{R},\mathcal{B}(\mathbb{R}))$). If $Y$ is a random variable on $(X,\mathcal{X},P)$, the probability measure induced by $Y$ is the probability measure $P_Y$ on $\mathcal{B}(\mathbb{R})$ given by
\[ P_Y(B) = P(\{x \in X : Y(x) \in B\}), \quad B \in \mathcal{B}(\mathbb{R}). \]
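The induced measure of Definition 437 is simply the probability of a preimage, which is easy to see on a finite sample space. The sketch below is an illustration under assumptions not found in the text: a fair six-sided die as the probability space and a hypothetical random variable `Y` that caps the roll at 4.

```python
from fractions import Fraction

# Hypothetical probability space: a fair die, every subset an event.
P = {x: Fraction(1, 6) for x in range(1, 7)}


def Y(x):
    """A random variable on the die: the roll capped at 4."""
    return min(x, 4)


def induced(B):
    """P_Y(B) = P({x : Y(x) in B}) for a finite set B of real values."""
    return sum((p for x, p in P.items() if Y(x) in B), Fraction(0))


print(induced({4}), induced({1, 2}), induced({1, 2, 3, 4}))
```

Here the outcomes 4, 5 and 6 all map to the value 4, so the induced measure puts mass one half on that single value even though the underlying die is uniform.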
The numbers $P_Y(B)$, $B \in \mathcal{B}(\mathbb{R})$, completely characterize the random variable $Y$ in the sense that they provide the probabilities of all events involving $Y$. This information can be captured by a single function from $\mathbb{R}$ to $\mathbb{R}$, as the next definition suggests.

Definition 438 The distribution function of a random variable $Y$ is the function $F : \mathbb{R} \to \mathbb{R}$ given by $F(y) = P\{x \in X : Y(x) \le y\}$.

Definition 439 If $Y$ is a random variable on $(X,\mathcal{X},P)$, the expectation of $Y$ is defined by $E[Y] = \int_X Y\,dP$, provided the Lebesgue integral exists.

The next result gives a good illustration of the use of simple functions and the monotone convergence theorem.

Theorem 440 Let $Y$ be a random variable on $(X,\mathcal{X},P)$ with distribution function $F$. Let $g : \mathbb{R} \to \mathbb{R}$ be a Borel measurable function. If $Z = g \circ Y$, then
\[ E[Z] = \int_{\mathbb{R}} g(y)\,dF(y) \left( = \int_{\mathbb{R}} g\,dP_Y \right). \]

Exercise 5.4.1 Prove Theorem 440 (Theorem 5.10.2, p. 223 in Ash).

One of the most remarkable results in probability is Kolmogorov's strong law of large numbers.

Theorem 441 (Strong Law of Large Numbers) If $Y_1, Y_2, \dots$ are independent and identically distributed random variables and $E[|Y_1|] < \infty$, then
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Y_i = E[Y_1] \quad \text{a.s.} \]

5.4.2 $L^1$

Let us denote the collection of $\mathcal{L}$-integrable functions $f$ defined on $X \subset \mathbb{R}$ by $L^1(X)$. For instance, $X$ can be all of $\mathbb{R}$, in which case we write $L^1(\mathbb{R})$, or any measurable subset of $\mathbb{R}$. Hence $L^1(X)$ is the collection of all $\mathcal{L}$-measurable functions $f$ defined on $X$ for which $\int_X |f| < \infty$. It is straightforward to see that $L^1(X)$ is a vector space.

Exercise 5.4.2 Show that $L^1(X)$ is a vector space. Hint: Use Theorem 403.

Can $L^1(X)$ be equipped with a norm? Let us define a function $\|\cdot\|_1 : L^1(X) \to \mathbb{R}$ given by $\|f\|_1 = \int_X |f|$. Does this function satisfy the properties of a norm given in Definition 206?

Exercise 5.4.3 Show that $\|\cdot\|_1$ satisfies properties (i) $\|f\|_1 \ge 0$, $\forall f \in L^1(X)$; (iii) $\|\alpha f\|_1 = |\alpha|\,\|f\|_1$, $\forall \alpha \in \mathbb{R}$, $f \in L^1(X)$; (iv) $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$, $\forall f, g \in L^1(X)$ of the definition of a norm, and the part of (ii) stating that $f = 0 \Rightarrow \|f\|_1 = 0$.
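Before attempting Exercise 5.4.3, it can help to see the properties hold numerically on the simplest integrable functions. The sketch below is only a sanity check, not a proof, and rests on an assumption made for illustration: functions on $[0,1]$ are restricted to step functions that are constant on equal-width cells, so that the integral of $|f|$ is a finite average. The particular values of `f`, `g` and `alpha` are hypothetical.

```python
from fractions import Fraction


def l1_norm(values):
    """||f||_1 for a step function on [0,1] equal to values[i] on the i-th of len(values) equal cells."""
    n = len(values)
    return sum((abs(v) for v in values), Fraction(0)) / n


# Two hypothetical step functions on four cells and a scalar.
f = [Fraction(1), Fraction(-2), Fraction(0), Fraction(3)]
g = [Fraction(-1), Fraction(1), Fraction(4), Fraction(-3)]
alpha = Fraction(-5, 2)

f_plus_g = [a + b for a, b in zip(f, g)]
alpha_f = [alpha * a for a in f]

print(l1_norm(f), l1_norm(g), l1_norm(f_plus_g), l1_norm(alpha_f))
```

In this instance the triangle inequality is strict because `f` and `g` have opposite signs on some cells, so cancellation occurs inside the absolute value.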
The next example makes it clear that the converse of part (ii) is not true.

Example 442 If $f$ is the Dirichlet function of Example 360, then $\|f\|_1 = 0$ but $f$ is not the zero function.

To overcome this problem, we will define a relation "$\sim$" on the set of all integrable functions. For $f, g \in L^1$, define $f \sim g$ iff $f = g$ a.e. This relation is an equivalence relation and hence by Theorem 31 $L^1$ can be partitioned into disjoint classes $\tilde{f}$ of equivalent functions (i.e. functions that are equal a.e.).

[Figure 5.4.1]

Exercise 5.4.4 Prove that $f \sim g$ iff $f = g$ a.e. is an equivalence relation using Definition 26.

By Theorem 403, for any two functions from the same equivalence class, the norm satisfies $\|f\|_1 = \|g\|_1 \equiv \|\tilde{f}\|_1$. Then the space $L^1$ consisting of equivalence classes with the $\|\cdot\|_1$ norm is a normed vector space. To keep notation and terminology simple in what follows, we will refer to the elements of $L^1$ as functions rather than equivalence classes of functions. But you should keep in mind that when we refer to a function $f$ we are actually referring to all functions that are equal a.e. to $f$.

The most important question we must ask of our new normed vector space is "Is it complete?" The next theorem provides the answer.

Theorem 443 $(L^1, \|\cdot\|_1)$ is a complete normed vector space (i.e. a Banach space).

Before proving completeness of $L^1$, we note that one strategy used in previous sections is to: first, take a Cauchy sequence $\langle f_n \rangle$ in a given function space and note that for a given $x \in X$, $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$ and $\lim_{n \to \infty} f_n(x) = f(x)$ exists for each $x$ since $\mathbb{R}$ is complete; second, prove that $f_n \to f$ with respect to the norm of the normed function space. Unfortunately, this procedure cannot be used in $L^1$, since for a Cauchy sequence $\langle f_n \rangle$ in $L^1$ a pointwise limit of $\langle f_n(x) \rangle$ may not exist for any point $x$, as the following example shows.
Example 444 Let a sequence $\langle f_n \rangle$ of functions on $[0,1]$ be given by
\[ f_1 = \chi_{[0,\frac{1}{2}]},\ f_2 = \chi_{[\frac{1}{2},1]}, \]
\[ f_3 = \chi_{[0,\frac{1}{4}]},\ f_4 = \chi_{[\frac{1}{4},\frac{2}{4}]},\ f_5 = \chi_{[\frac{2}{4},\frac{3}{4}]},\ f_6 = \chi_{[\frac{3}{4},1]}, \]
\[ f_7 = \chi_{[0,\frac{1}{8}]},\ f_8 = \chi_{[\frac{1}{8},\frac{2}{8}]},\ \dots,\ f_{13} = \chi_{[\frac{6}{8},\frac{7}{8}]},\ f_{14} = \chi_{[\frac{7}{8},1]}, \]
\[ f_{15} = \chi_{[0,\frac{1}{16}]},\ \dots \]
See Figure 5.4.2. This sequence is Cauchy in $L^1([0,1])$ but there is no point $x \in [0,1]$ for which $\lim_{n \to \infty} f_n(x)$ exists. In other words, $\langle f_n \rangle$ doesn't converge pointwise at any point $x \in [0,1]$.

Proof. (Sketch) Let $\langle f_n \rangle$ be a Cauchy sequence in $L^1$. In order to find a function to which the sequence converges, in light of Example 444, we need to take a more sophisticated approach. The fact that $\langle f_n \rangle$ is Cauchy means we can choose a subsequence $\langle f_{n_k} \rangle$ such that consecutive terms are so close to each other (i.e. $\|f_{n_{k+1}} - f_{n_k}\|_1 < \frac{1}{2^k}$) that the infinite sum $\sum_{k=1}^{\infty} \int |f_{n_{k+1}} - f_{n_k}|\,dm$ converges (i.e. the sum is finite). Then by the Corollary of Levi's Theorem 408, the infinite sum $\sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|$ also converges a.e. on $X$, and because $f_{n_{k+1}} - f_{n_k} \le |f_{n_{k+1}} - f_{n_k}|$, $\forall k$, the infinite sum $\sum_{k=1}^{\infty} (f_{n_{k+1}} - f_{n_k})$ converges a.e. as well. But the partial sums of the differences of consecutive terms telescope to the subsequence itself:
\[ f_{n_1} + (f_{n_2} - f_{n_1}) + (f_{n_3} - f_{n_2}) + \dots + (f_{n_k} - f_{n_{k-1}}) = f_{n_k}. \]
Thus the subsequence $\langle f_{n_k} \rangle$ converges a.e. on $X$. Let $f$ be the function to which $\langle f_{n_k} \rangle$ converges a.e. on $X$. We need to show that $\langle f_{n_k} \rangle \to f$ with respect to $\|\cdot\|_1$ and that $f \in L^1(X)$. To prove the former we can use Fatou's Lemma 393 (since $\langle f_{n_k} \rangle \to f$ a.e.), and to prove the latter we can use the estimate $\|f\|_1 \le \|f - f_{n_k}\|_1 + \|f_{n_k}\|_1 < \infty$. The first term in this inequality is bounded by Fatou's Lemma and the second is bounded since $f_{n_k} \in L^1$, $\forall k$. Then we have a Cauchy sequence $\langle f_n \rangle$ in $L^1(X)$ whose subsequence $\langle f_{n_k} \rangle \to f$ in $L^1(X)$. Then by Lemma 173, the whole sequence $\langle f_n \rangle \to f$ in $L^1(X)$.
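The behaviour of the sequence in Example 444 can be checked with a short computation. The sketch below assumes the indexing read off from the example (block $j$ consists of the $2^j$ dyadic intervals of length $2^{-j}$, starting at index $n = 2^j - 1$); the helper names `interval`, `f` and `norm`, and the test point $x = 1/3$, are introduced here for illustration.

```python
from fractions import Fraction


def interval(n):
    """Endpoints of the interval whose indicator is f_n in Example 444 (n >= 1)."""
    j = (n + 1).bit_length() - 1      # block index: block j holds 2**j intervals
    i = n - (2**j - 1)                # position inside the block
    return Fraction(i, 2**j), Fraction(i + 1, 2**j)


def f(n, x):
    """Value of the indicator f_n at the point x."""
    a, b = interval(n)
    return 1 if a <= x <= b else 0


def norm(n):
    """L1 norm of f_n, i.e. the length of its interval."""
    a, b = interval(n)
    return b - a


x = Fraction(1, 3)
values = [f(n, x) for n in range(1, 63)]   # blocks 1 through 5
print(values)
print([norm(n) for n in (1, 3, 7, 15, 31)])
```

The norms shrink like $2^{-j}$, so the sequence is Cauchy in $L^1$ (it converges to zero in norm), yet in every block the value at $x$ is 1 exactly once and 0 otherwise, so $f_n(x)$ keeps oscillating.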
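Approximation in $L^1$

The next theorem establishes that the simple and continuous functions are dense in $L^1(X)$.

Theorem 445 Let $f$ be an $\mathcal{L}$-integrable function on $\mathbb{R}$ and let $\varepsilon > 0$. Then (i) there is an integrable simple function $\varphi$ such that $\int |f - \varphi| < \varepsilon$ and (ii) there is a continuous function $g$ such that $g$ vanishes (i.e. $g = 0$) outside some bounded interval and such that $\int |f - g| < \frac{\varepsilon}{2}$.

Proof. (of i) Without loss of generality, we may assume that $f \ge 0$ (otherwise $f = f^+ - f^-$ where $f^+$ and $f^-$ are non-negative). If $f$ is $\mathcal{L}$-integrable, then using the supremum property in Definition 388, for $\varepsilon > 0$ there exists a bounded $\mathcal{L}$-measurable function $h$ that vanishes outside a set $E$ of finite measure (i.e. $m(E) < \infty$) such that $h \le f$ and $(\int f) - \frac{\varepsilon}{2} < \int h \iff \int (f - h) < \frac{\varepsilon}{2}$. Then by Theorem 367 there is a non-decreasing sequence (since $h$ is bounded) of simple functions $\langle h_n \rangle$ converging uniformly to $h$. Then for $\frac{\varepsilon}{2m(E)} > 0$, $\exists N$ such that $|h_N(x) - h(x)| < \frac{\varepsilon}{2m(E)}$ for all $x \in E$. Hence
\[ \int_E |f - h_N| \le \int_E |f - h| + \int_E |h_N - h| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2m(E)}\,m(E) = \varepsilon. \]

(of ii) Given $g$, there exists a continuous function $h$ such that $g(x) = h(x)$ except on a set of measure at most $\frac{\varepsilon}{3}$. Since $f$ is integrable,
\[ \int_X f\,dx = \inf_{\psi \ge f} \int_X \psi\,dx, \tag{5.10} \]
so for every $d > 0$ there exists $\psi$ with $\int |f - \psi|\,dx < d$. In equation (5.10), $\psi$ is a simple function, i.e. $\psi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x)$ with $\mu(E_i) < \infty$, $\forall i$, where $\mu$ is Lebesgue measure. If $\mu(M) < \infty$, then for $\varepsilon > 0$ there exist $F_M$ closed and $G_M$ open such that $F_M \subset M \subset G_M$ and $\mu(G_M) - \mu(F_M) < \varepsilon$. Let us define
\[ \varphi_\varepsilon(x) = \frac{\rho(x, \mathbb{R} \setminus G_M)}{\rho(x, \mathbb{R} \setminus G_M) + \rho(x, F_M)}. \]
Then $\varphi_\varepsilon(x) = 0$ if $x \in \mathbb{R} \setminus G_M$ and $\varphi_\varepsilon(x) = 1$ if $x \in F_M$. The function $\varphi_\varepsilon$ is continuous because $\rho(x, \mathbb{R} \setminus G_M)$ and $\rho(x, F_M)$ are continuous and $\rho(x, \mathbb{R} \setminus G_M) + \rho(x, F_M) \ne 0$. The function $\chi_M - \varphi_\varepsilon$ satisfies
\[ |\chi_M - \varphi_\varepsilon| \le 1 \text{ for } x \in G_M \setminus F_M, \qquad \chi_M - \varphi_\varepsilon = 0 \text{ for } x \in \mathbb{R} \setminus (G_M \setminus F_M). \]
Hence
\[ \int |\chi_M(x) - \varphi_\varepsilon(x)|\,d\mu < \varepsilon. \]
Thus $\chi_M$ is approximated by a continuous function $\varphi_\varepsilon$.

Separability of $L^1(X)$

In the next chapter we will show that if $X$ is compact, then $L^1(X)$ is separable, with the countable dense set being the set of all polynomials with rational coefficients.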
But if $X$ is an arbitrary $\mathcal{L}$-measurable set, including $X$ with $m(X) = \infty$ (e.g. $X = \mathbb{R}$), we need to find a different countable dense set. We will show that the set $M$ of all finite linear combinations of the form
\[ \sum_{i=1}^{n} c_i \chi_{I_i} \tag{5.11} \]
where the numbers $c_i$, $i = 1, \dots, n$, are rational and the $I_i$ are intervals (open, closed, and half-open) with rational endpoints, is a countable dense set in $L^1(X)$.

Countability of $M$ is obvious. We need to show that $M$ is dense in $L^1(X)$. By Theorem 445 we know that the set of all integrable simple functions is dense in $L^1(X)$. But every such function can be approximated arbitrarily closely by a function of the same type taking only rational values. Thus, given $f \in L^1(X)$ and $\varepsilon > 0$, there is an integrable simple function $\varphi = \sum_{i=1}^{n} y_i \chi_{E_i}$, where the $y_i$ are rational coefficients, the $E_i$ are mutually disjoint $\mathcal{L}$-measurable sets, and $\cup_{i=1}^{n} E_i = X$, such that $\int_X |f - \varphi|\,dm < \varepsilon$. If the function $\varphi$ were of the type (5.11) we would be done. Unfortunately it is not, because (5.11) requires the sets $E_i$ to be intervals (recall that the collection of all intervals with rational endpoints is countable, whereas the collection of $\mathcal{L}$-measurable subsets of $X$ may not be countable). Hence we need to show that every simple integrable function $\varphi$ can be approximated by functions of the form (5.11). Here we use the fact that if a set $E$ is $\mathcal{L}$-measurable, then it can be approximated by an interval (i.e. given $\varepsilon > 0$, there is an interval $I$ such that$^{14}$
\[ m((E \setminus I) \cup (I \setminus E)) < \varepsilon). \tag{5.12} \]
Now using (5.12) for the sets $\{E_i\}_{i=1}^{n}$ we can construct $\{I_i\}_{i=1}^{n}$ such that $m((E_i \setminus I_i) \cup (I_i \setminus E_i)) < \varepsilon$ for $i = 1, \dots, n$. Let $\hat{I}_i = I_i \setminus \cup_{j<i} I_j$, $i = 1, \dots, n$. Then the $\hat{I}_i$ are mutually disjoint. Define a function
\[ \psi(x) = \begin{cases} y_i & \text{if } x \in \hat{I}_i \\ 0 & \text{if } x \in X \setminus \cup_{i=1}^{n} \hat{I}_i. \end{cases} \]

$^{14}$ We proved a similar result in Theorem 347, where a measurable set $E$ is approximated by open and closed sets.

The functions $\varphi$ and $\psi$ differ from each other on a set $B$ of sufficiently small measure, namely $m(B) = m(\{x \in X : \varphi(x) \ne \psi(x)\}) < n\varepsilon$.
Hence
\[ \|\varphi - \psi\|_1 = \int_X |\varphi(x) - \psi(x)|\,dm = \int_B |\varphi(x) - \psi(x)|\,dm \le \sup_n |y_n|\,m(B) < \left(n \sup_n |y_n|\right)\varepsilon \]
can be made arbitrarily small by choosing $\varepsilon$ sufficiently small. Thus $\psi$ approximates $\varphi$ and $\psi$ is of the form (5.11). Thus we have the following theorem.

Theorem 446 $L^1(X)$ is separable.

Proof. The countable dense set in $L^1(X)$ is the set given by (5.11).

5.5 Appendix - Proofs in Chapter 5

Proof of Theorem 329. Take a closed finite interval $[a,b]$. Since $[a,b] \subset (a - \frac{\varepsilon}{2}, b + \frac{\varepsilon}{2})$, we have $m^*([a,b]) \le l((a - \frac{\varepsilon}{2}, b + \frac{\varepsilon}{2})) = b - a + \varepsilon$, $\forall \varepsilon > 0$, so that $m^*([a,b]) \le b - a$. Next we will show that $m^*([a,b]) \ge b - a$. But this is equivalent to showing that if $\{I_n\}_{n \in \mathbb{N}}$ is an open covering of $[a,b]$, then $\sum_{n \in \mathbb{N}} l(I_n) \ge b - a$. By the Heine-Borel Theorem 194 there is a finite subcollection that also covers $[a,b]$. Since the sum of the lengths of the finite subcollection can be no greater than the sum of the lengths of the original collection, it suffices to show $\sum_{n=1}^{N} l(I_n) \ge b - a$ for $N$ finite. It is possible to construct a finite sequence of open intervals $\langle (a_k, b_k) \rangle_{k=1}^{K}$ with $a_k < b_{k-1} < b_k$ such that $a \in (a_1, b_1)$ and $b \in (a_K, b_K)$.$^{15}$ Thus
\[ \sum_{n \in \mathbb{N}} l(I_n) \ge \sum_{k=1}^{K} l(a_k, b_k) = b_K - (a_K - b_{K-1}) - (a_{K-1} - b_{K-2}) - \dots - (a_2 - b_1) - a_1 \ge b_K - a_1 \ge b - a, \]
or $m^*([a,b]) \ge b - a$. Thus $m^*([a,b]) = b - a$.

$^{15}$ Since $a \in \cup_{n=1}^{N} I_n$, $\exists (a_1, b_1)$ such that $a \in (a_1, b_1)$. If $b_1 \le b$, then since $b_1 \notin (a_1, b_1)$, $\exists (a_2, b_2)$ such that $b_1 \in (a_2, b_2)$. Continue by induction.

To complete the proof we simply need to recognize that if $I$ is any finite interval, then for a given $\varepsilon > 0$ there is a closed interval $[a,b] \subset I$ such that $l(I) - \varepsilon < l([a,b])$. Hence,
\[ l(I) - \varepsilon < l([a,b]) = m^*([a,b]) \le m^*(I) \le m^*(\bar{I}) = l(\bar{I}) = l(I), \]
where the first equality follows from the first part of this theorem, the first weak inequality follows from monotonicity in Theorem 328, the second weak inequality follows from the definition of closure, and the next two equalities follow from the definition of length. Since $l(I) - \varepsilon < m^*(I) \le l(I)$ and $\varepsilon > 0$ is arbitrary, taking $\varepsilon \to 0$ gives $m^*(I) = l(I)$. If $I$ is an infinite interval, $m^*(I) = \infty$.

Proof of Theorem 331. If $m^*(A_n) = \infty$ for any $n$, then the inequality holds trivially. Assume $m^*(A_n) < \infty$, $\forall n$. Then given $\varepsilon > 0$, boundedness implies that for each $n$ we can choose intervals $\{I_k^n\}_{k \in \mathbb{N}}$ such that $A_n \subset \cup_{k \in \mathbb{N}} I_k^n$ (i.e. the intervals cover $A_n$) and $\sum_{k \in \mathbb{N}} l(I_k^n) \le m^*(A_n) + \frac{\varepsilon}{2^n}$.$^{16}$ But the collection $\{I_k^n\}_{n,k} = \cup_{n \in \mathbb{N}} \{I_k^n\}_{k \in \mathbb{N}}$ is countable, being the union of a countable number of countable collections, and covers $\cup_{n \in \mathbb{N}} A_n$ (i.e. $\cup_{n \in \mathbb{N}} A_n \subset \cup_{n \in \mathbb{N}} \cup_{k \in \mathbb{N}} I_k^n$). Hence
\[ m^*(\cup_{n \in \mathbb{N}} A_n) \le \sum_{n \in \mathbb{N}} \sum_{k \in \mathbb{N}} l(I_k^n) \le \sum_{n \in \mathbb{N}} \left( m^*(A_n) + \frac{\varepsilon}{2^n} \right) = \sum_{n \in \mathbb{N}} m^*(A_n) + \varepsilon. \]
Subadditivity follows since $\varepsilon > 0$ was arbitrary and we can let $\varepsilon \to 0$.

$^{16}$ Existence of the countable collection follows from Theorem 108, and the inequality holds by the property of the infimum that $x = \inf A \Rightarrow \forall \varepsilon > 0$, $\exists x' \in A$ such that $x' < x + \varepsilon$.

Proof of Theorem 341. Corollary 338 already established that $\mathcal{L}$ is an algebra. Hence it is sufficient to prove that if a set $E = \cup_{n \in \mathbb{N}} E_n$ where each $E_n$ is $\mathcal{L}$-measurable, then $E$ is $\mathcal{L}$-measurable. By Theorem 84, we may assume without loss of generality that the $E_n$ are mutually disjoint sets. Let $A$ be any set and $F_N = \cup_{n=1}^{N} E_n$. Since $\mathcal{L}$ is an algebra and $E_1, \dots, E_N$ are in $\mathcal{L}$, the sets $F_N$ are $\mathcal{L}$-measurable. For any set $A$, we have
\[ m^*(A) = m^*(A \cap F_N) + m^*(A \cap F_N^c) \ge m^*(A \cap F_N) + m^*(A \cap E^c) = \sum_{n=1}^{N} m^*(A \cap E_n) + m^*(A \cap E^c) \tag{5.13} \]
where the first equality follows by Definition 334, the inequality follows since $F_N^c \supset E^c$,$^{17}$ and the last equality follows by Lemma 339. Since the left hand side of (5.13) is independent of $N$, letting $N \to \infty$ we have
\[ m^*(A) \ge \sum_{n=1}^{\infty} m^*(A \cap E_n) + m^*(A \cap E^c) \ge m^*(A \cap E) + m^*(A \cap E^c). \tag{5.14} \]
Here the second inequality follows from Theorem 331. But (5.14) is simply the sufficient condition for $E$ to be $\mathcal{L}$-measurable.

Proof of Theorem 345. Let $A$ be any set, $A_1 = A \cap (a, \infty)$, and $A_2 = A \cap (-\infty, a]$.
According to (5.1), it is sufficient to show $m^*(A) \ge m^*(A_1) + m^*(A_2)$. If $m^*(A) = \infty$, the assertion is trivially true. If $m^*(A) < \infty$, then for each $\varepsilon > 0$ there is a countable collection $\{I_n\}$ of open intervals which cover $A$ and for which $\sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon$ by the infimum property in Definition 327. Let $I_n' = I_n \cap (a, \infty)$ and $I_n'' = I_n \cap (-\infty, a]$. Then $I_n' \cup I_n'' = I_n \cap \mathbb{R} = I_n$ and $I_n' \cap I_n'' = \emptyset$. Therefore, $l(I_n) = l(I_n') + l(I_n'') = m^*(I_n') + m^*(I_n'')$. Since $A_1 \subset \cup_{n=1}^{\infty} I_n'$, we have $m^*(A_1) \le m^*(\cup_{n=1}^{\infty} I_n') \le \sum_{n=1}^{\infty} m^*(I_n')$. Similarly, since $A_2 \subset \cup_{n=1}^{\infty} I_n''$, we have $m^*(A_2) \le m^*(\cup_{n=1}^{\infty} I_n'') \le \sum_{n=1}^{\infty} m^*(I_n'')$. Thus,
\[ m^*(A_1) + m^*(A_2) \le \sum_{n=1}^{\infty} \left[ m^*(I_n') + m^*(I_n'') \right] \le \sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon. \]
But since $\varepsilon > 0$ was arbitrary, the result follows.

Proof of Measurable Selection Theorem 371. By induction, we will define a sequence of measurable functions $f_n : X \to Y$ such that (i) $d(f_n(z), \Gamma(z)) < \frac{1}{2^n}$ and (ii) $d(f_{n+1}(z), f_n(z)) \le \frac{1}{2^{n-1}}$ on $X$ for all $n$. Then we are done, since from (ii) it follows that $\langle f_n \rangle$ is Cauchy and, due to completeness of $Y$, there exists a function $f : X \to Y$ such that $f_n(z) \to f(z)$ on $X$, and by Corollary 358 the pointwise limit of a sequence of measurable functions is measurable. Condition (i) guarantees that $f(z) \in \Gamma(z)$, $\forall z \in X$, so that $f$ is a measurable selection (here we use the fact that $\Gamma(z)$ is closed and $d(f(z), \Gamma(z)) = 0$ implies $f(z) \in \Gamma(z)$).

$^{17}$ Recall by DeMorgan's Law that $F_N^c = \left[ \cup_{n=1}^{N} E_n \right]^c = \cap_{n=1}^{N} E_n^c$.

Now we construct a sequence $\langle f_n \rangle$ of measurable functions satisfying (i) and (ii). Let $\{y_n, n \in \mathbb{N}\}$ be a dense set in $Y$ (since $Y$ is separable such a countable set exists). Define $f_0(z) = y_p$ where $p$ is the smallest integer such that $\Gamma(z) \cap B_1(y_p) \ne \emptyset$ ($f_0(z)$ is well defined because $\{y_n, n \in \mathbb{N}\}$ is dense in $Y$). Since $\Gamma$ is measurable,
\[ f_0^{-1}(y_p) = \left[ \Gamma^{-1}(B_1(y_p)) \right] \setminus \left[ \cup_{m<p} \Gamma^{-1}(B_1(y_m)) \right] \in \mathcal{L}. \]
Let $V$ be open in $Y$. Then $f_0^{-1}(V)$ is an at most countable union of such sets $f_0^{-1}(y_p)$. Hence $f_0^{-1}(V)$ is measurable, so that $f_0$ is measurable. Suppose we already have $f_k$ measurable. Then $z \in f_k^{-1}(y_i) \equiv D_i$ implies $f_k(z) = y_i$ and $d(f_k(z), \Gamma(z)) < \frac{1}{2^k}$ (i.e. $\Gamma(z) \cap B_{2^{-k}}(y_i) \ne \emptyset$). Therefore we can define $f_{k+1}(z) = y_p$ for $z \in D_i$, where $p$ is the smallest integer such that $\Gamma(z) \cap B_{2^{-k}}(y_i) \cap B_{2^{-k-1}}(y_p) \ne \emptyset$. Thus $f_{k+1}$ is defined on $X = \cup_{i \ge 1} D_i$, it is measurable, and we have $d(f_{k+1}(z), \Gamma(z)) < \frac{1}{2^{k+1}}$ and $d(f_{k+1}(z), f_k(z)) \le \frac{1}{2^k} + \frac{1}{2^{k+1}} \le \frac{1}{2^{k-1}}$ on $X$.

Proof of Theorem 382. (Measurability implies equality of the upper and lower integrals.) If $f$ is $\mathcal{L}$-measurable and bounded by $M$, then we can construct sets
\[ E_k = \left\{ x \in E : \frac{kM}{n} \ge f(x) > \frac{(k-1)M}{n} \right\}, \quad -n \le k \le n, \]
which are measurable, disjoint, and have union $E$. Thus $\sum_{k=-n}^{n} mE_k = mE$. Define simple functions $\psi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} k\,\chi_{E_k}(x)$ and $\varphi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} (k-1)\,\chi_{E_k}(x)$. Then $\psi_n(x) \ge f(x) \ge \varphi_n(x)$. Thus
\[ L^u \int_E f(x)\,dx = \inf_{\psi \ge f} \int_E \psi(x)\,dx \le \int_E \psi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} k\,mE_k \tag{5.15} \]
and
\[ L_l \int_E f(x)\,dx = \sup_{\varphi \le f} \int_E \varphi(x)\,dx \ge \int_E \varphi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} (k-1)\,mE_k. \tag{5.16} \]
Subtracting (5.16) from (5.15) implies
\[ 0 \le L^u \int_E f(x)\,dx - L_l \int_E f(x)\,dx \le \frac{M}{n} \sum_{k=-n}^{n} mE_k = \frac{M}{n}\,mE. \]
Since $mE < \infty$ by assumption, $\lim_{n \to \infty} \frac{M}{n}\,mE = 0$.

Proof of Bounded Convergence Theorem 386. Since $f$ is the limit of $\mathcal{L}$-measurable functions $f_n$, it is $\mathcal{L}$-measurable by Theorem 364 and hence integrable. By Theorem 366 we know that given $\varepsilon > 0$, $\exists N$ and an $\mathcal{L}$-measurable set $A \subset E$ with $mA < \frac{\varepsilon}{4M}$ such that for $n \ge N$ and $x \in E \setminus A$ we have $|f_n(x) - f(x)| < \frac{\varepsilon}{2mE}$. Furthermore, since $|f_n(x)| \le M$, $\forall n \in \mathbb{N}$ and $\forall x \in E$, we have $|f(x)| \le M$, $\forall x \in E$, and $|f_n(x) - f(x)| \le 2M$, $\forall x \in A$. Therefore,
\[ \left| \int_E f_n - \int_E f \right| = \left| \int_E (f_n - f) \right| \le \int_E |f_n - f| = \int_{E \setminus A} |f_n - f| + \int_A |f_n - f| < \frac{\varepsilon}{2mE}\,m(E \setminus A) + 2M\,mA < \varepsilon, \quad \forall n \ge N, \]
where the first inequality follows by monotonicity (i.e. (iii) of Theorem 385) and the second equality follows from (v) of Theorem 385. Hence $\int_E f = \lim_{n \to \infty} \int_E f_n$.
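The sandwich used in the proof of Theorem 382 above can be traced through on a concrete function. The following is a minimal sketch, assuming the illustrative choice $f(x) = x$ on $E = [0,1]$ with bound $M = 1$ (neither is from the text). For this $f$, the set $E_k$ is the interval $((k-1)/n, k/n]$ for $k = 1, \dots, n$ and has measure zero for $k \le 0$, so the sums in (5.15) and (5.16) can be computed exactly.

```python
from fractions import Fraction


def upper_lower(n):
    """Integrals of psi_n and phi_n from the proof, for f(x) = x on E = [0,1], M = 1."""
    M = Fraction(1)
    # m(E_k): 1/n for k = 1..n, and 0 for k = -n..0 (E_0 = {0}, the rest are empty).
    m = {k: (Fraction(1, n) if 1 <= k <= n else Fraction(0)) for k in range(-n, n + 1)}
    upper = M / n * sum((k * m[k] for k in m), Fraction(0))        # right side of (5.15)
    lower = M / n * sum(((k - 1) * m[k] for k in m), Fraction(0))  # right side of (5.16)
    return upper, lower


for n in (1, 4, 100):
    up, lo = upper_lower(n)
    print(n, up, lo, up - lo)
```

The gap between the two bounds equals $(M/n)\,mE = 1/n$, and both bounds close in on the value $1/2$, in line with the last step of the proof.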
Proof of Fatou?s Lemma 393. WLOG we may assume the convergence is everywhere since integrals over sets of measure zero are zero. Let h be aboundedL-measurable function such that h ≤ f and vanishes outside asetH = {x ∈ E : h(x) 6=0} of Tnite measure (i.e. mH < ∞). De- Tne a sequence of functions h n (x)=min{h(x),f n (x)}. Then h n is bounded (by the bound for h)andvanishesoutsideH. Moreover, lim n→∞ h n (x)= lim n→∞ min{h(x),f n (x)} =min{h(x),f(x)} = h(x)onH.Since<h n > is a uniformly bounded sequence of L-measurable functions such that h n → h,then lim n→∞ R H h n = R H h by the Bounded Convergence Theorem 386. Since h vanishes outside H, then R E h = R H h. While f n → f a.e., we do not have that the sequence < R E f n (x) > is convergent. However, Z E h =lim n→∞ Z H h n =lim n→∞ Z H h n ≤ lim n→∞ Z E f n where the second equality follows from the fact that liminf=limsup at a limit point and the inequality follows since h n (x) ≤ f n (x)byconstruction. 18 18 We chose liminf rather than limsup since this gives a tighter bound. 5.5. APPENDIX - PROOFS IN CHAPTER 5 205 Taking the supremum over all h ≤ f we have sup h≤f Z E h = Z E f ≤ lim n→∞ Z E f n where the equality uses DeTnition 388. Proof of Lebesgue Dominated Convergence Theorem 404. By Lemma 402, f n is integrable over E. Since lim n→∞ f n = f a.e. on E and |f n |≤ g,then|f|≤ g a.e. on E. Hence f is integrable over E. Now consider a sequence <h n > of functions deTned by h n = f n + g which is nonnegative by construction and integrable for each n. Therefore, by Fatou?s Lemma 393, we have R E (f+g) ≤ lim n→∞ R E (f n +g), which implies R E f ≤ lim n→∞ R E f n by (ii) of Theorem 403. Similarly, construct the sequence <k n > of functions deTned by k n = g?f n which is again nonnegative by construction and integrable for each n. Therefore, by Fatou?s Lemma 393, we have R E (g?f) ≤ lim n→∞ R E (g?f n ), which implies R E f ≥ lim n→∞ R E f n by (ii) of Theorem 403. Proof of Levi?s Theorem. 
407. Assume that $f_1 \geq 0$ (otherwise we would consider $\bar{f}_n = f_n - f_1$). Define $\Omega = \{x \in A : f_n(x) \to +\infty\}$. Then $\Omega = \cap_{r=1}^{\infty} \cup_{n=1}^{\infty} \Omega_n^{(r)}$, where $\Omega_n^{(r)} = \{x \in A : f_n(x) > r\}$. Using Chebyshev's inequality (Lemma ??), $m\left(\Omega_n^{(r)}\right) \leq \frac{K}{r}$. Since $\Omega_1^{(r)} \subset \Omega_2^{(r)} \subset \ldots \subset \Omega_n^{(r)} \subset \ldots$, this implies $m\left(\cup_{n=1}^{\infty} \Omega_n^{(r)}\right) \leq \frac{K}{r}$. But for any $r$ we have $\Omega \subset \cup_{n=1}^{\infty} \Omega_n^{(r)}$. Then $m(\Omega) \leq \frac{K}{r}$. Since $r$ was arbitrary, we have $m(\Omega) = 0$. Thus we have proved that the monotone sequence $\langle f_n \rangle$ converges to a finite limit $f$ a.e. on $A$.

Define $A_r = \{x \in A : r - 1 \leq f(x) < r\}$ for $r \in \mathbb{N}$, and let $\varphi(x) = r$ on $A_r$. If we prove that $\varphi$ is integrable on $A$, then using the Lebesgue Dominated Convergence Theorem 404 we can conclude that Levi's theorem holds. Let $B_s = \cup_{r=1}^{s} A_r$. Since on $B_s$ the functions $f_n$ and $f$ are bounded and $\varphi(x) \leq f(x) + 1$, we have
$$\int_{B_s} \varphi(x)\,dm \leq \int_{B_s} f(x)\,dm + m(A) = \lim_{n \to \infty} \int_{B_s} f_n(x)\,dm + m(A) \leq K + m(A),$$
where $\int_{B_s} \varphi(x)\,dm = \sum_{r=1}^{s} r\, m(A_r)$. Hence we have $\sum_{r=1}^{s} r\, m(A_r) \leq K + m(A)$ for any $s$. Boundedness of the partial sums of a series with nonnegative terms means that the infinite series $\sum_{r=1}^{\infty} r\, m(A_r)$ exists and equals $\int_A \varphi(x)\,dm$.

Proof of Hahn Decomposition Theorem 427. Due to Zorn's lemma, a maximal system $\mathcal{E}$ of disjoint measurable sets $E$ with $\nu(E) < 0$ exists. Moreover, $\mathcal{E}$ is countable (by Lemma 422). Put $B = \cup \{E \in \mathcal{E}\}$. Then $B$ is measurable and negative (because all of its subsets have negative signed measure).

Let $A = X \setminus B$. We have $A \cap B = \emptyset$, $A \cup B = X$, and $A$ is measurable. We have to show that $A$ is positive. By contradiction, assume that $A$ is not positive. Then there exists $E_0 \in \mathcal{X}$ such that
$$E_0 \subset A \quad \text{and} \quad \nu(E_0) < 0. \tag{5.17}$$
Let $\mathcal{E}_0$ denote the maximal collection of disjoint measurable sets $F \subset E_0$ for which $\nu(F) > 0$ (at least one such set $F$ exists because otherwise $E_0$ would be negative and would contradict maximality of $\mathcal{E}$). Due to Lemma 422 the collection $\mathcal{E}_0$ is countable. Let $F_0 = \cup \{F \in \mathcal{E}_0\}$. We have $\nu(F_0) > 0$ and $F_0 \subset E_0$.
(5.18)

It follows that $E_0 \setminus F_0$ is negative because, by construction, it doesn't contain a positive measurable set. Then from the equality $E_0 = (E_0 \setminus F_0) \cup F_0$ we have
$$\nu(E_0) = \nu(E_0 \setminus F_0) + \nu(F_0). \tag{5.19}$$
From (5.17), (5.18), and (5.19) we have $\nu(E_0 \setminus F_0) < 0$. The set $E_0 \setminus F_0$ is then negative, with negative measure, and $(E_0 \setminus F_0) \cap B = \emptyset$, which contradicts the maximality of the set $B$.

Proof of Radon-Nikodym Theorem 434. By (ii) of Lemma 432, it suffices to deal with $\nu$ which is non-negative (i.e. a measure). Also, since $\mu$ is $\sigma$-finite, the whole space $X$ can be decomposed into countably many disjoint sets $\{E_i\}$ for which $\mu(E_i) < \infty$. Hence in the proof we can assume that $\mu$ and $\nu$ are both finite measures on $X$. Let $\mathcal{G}$ be the set of all non-negative, $\mathcal{X}$-measurable, integrable functions $g$ for which
$$\int_E g\,d\mu \leq \nu(E), \quad \forall E \in \mathcal{X}.$$
Setting $E = X$ we have
$$\int_X g\,d\mu \equiv \int g\,d\mu \leq \nu(X), \quad \forall g \in \mathcal{G}.$$
Hence the set of real numbers $\left\{ \int g\,d\mu,\ g \in \mathcal{G} \right\}$ is bounded from above by $\nu(X)$, and thus there exists a real number $\alpha$ such that
$$\alpha = \sup_{g \in \mathcal{G}} \int g\,d\mu.$$
Then by the supremum property, there exists a sequence $\langle g_n \rangle_{n=1}^{\infty}$ from $\mathcal{G}$ such that
$$\lim_{n \to \infty} \int g_n\,d\mu = \alpha.$$
For $n \in \mathbb{N}$, set $f_n = \max\{g_1, \ldots, g_n\}$. Clearly $f_n \leq f_{n+1}$ for $n \in \mathbb{N}$. Next we show that $f_n \in \mathcal{G}$. It suffices to show that if $g_1, g_2 \in \mathcal{G}$, then $\max\{g_1, g_2\} \in \mathcal{G}$. Given a set $E \in \mathcal{X}$, define
$$F = \{x \in E : \max(g_1, g_2)(x) = g_1(x)\}, \qquad G = \{x \in E : \max(g_1, g_2)(x) = g_2(x) \neq g_1(x)\}.$$
Then, because $F$ and $G$ are disjoint, we have
$$\int_E \max(g_1, g_2)\,d\mu = \int_{F \cup G} \max(g_1, g_2)\,d\mu = \int_F g_1\,d\mu + \int_G g_2\,d\mu \leq \nu(F) + \nu(G) = \nu(E).$$
Thus $\max(g_1, g_2) \in \mathcal{G}$, and consequently each function $f_n$, $n \in \mathbb{N}$, belongs to $\mathcal{G}$. Since $\alpha = \sup_{g \in \mathcal{G}} \int g\,d\mu$, we have $\int f_n\,d\mu \leq \alpha$ for $n \in \mathbb{N}$. Let $f : X \to \mathbb{R}$ be defined as $f(x) = \lim_{n \to \infty} f_n(x)$. This is a well defined function because for any $x \in X$ the sequence $\langle f_n(x) \rangle$ is non-decreasing and hence $\lim f_n(x)$ exists. Then by Levi's Theorem 407, $f$ is integrable and $\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu$. We next show that $f \in \mathcal{G}$.
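For $E \in \mathcal{X}$ and $n \in \mathbb{N}$ we have $\chi_E f_n \leq \chi_E f_{n+1}$ and $\lim_{n \to \infty} \chi_E(x) f_n(x) = \chi_E(x) f(x)$ for all $x \in X$. Then
$$\int_E f\,d\mu = \int f \chi_E\,d\mu = \int \lim_{n \to \infty} f_n \chi_E\,d\mu = \lim_{n \to \infty} \int f_n \chi_E\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu \leq \nu(E). \tag{5.20}$$
This shows that $f \in \mathcal{G}$ and hence $\int f\,d\mu \leq \alpha$. Because $f_n \geq g_n$ for $n \in \mathbb{N}$, we have $\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu \geq \lim_{n \to \infty} \int g_n\,d\mu = \alpha$. Combining the last two inequalities we have $\int f\,d\mu = \alpha$.

We now show that $\int_E f\,d\mu = \nu(E)$ for all $E \in \mathcal{X}$. By contradiction, assume this equality doesn't hold. Then due to (5.20), $\int_E f\,d\mu < \nu(E)$ for some $E$, and then the set function
$$\nu_0(E) = \nu(E) - \int_E f\,d\mu \tag{5.21}$$
is a finite measure not identically equal to zero. Since $\nu_0 \ll \mu$ (because $\nu \ll \mu$ and the measure $E \mapsto \int_E f\,d\mu$ is absolutely continuous with respect to $\mu$), using Lemma 433 there exist $\varepsilon > 0$ and $E_0 \in \mathcal{X}$ such that $\mu(E_0) > 0$ and
$$\varepsilon \mu(E_0 \cap F) \leq \nu_0(E_0 \cap F), \quad \forall F \in \mathcal{X}. \tag{5.22}$$
Set $g_0 = f + \varepsilon \chi_{E_0}$. We have $\int g_0\,d\mu = \int f\,d\mu + \varepsilon \mu(E_0) > \int f\,d\mu = \alpha$. If we show that $g_0 \in \mathcal{G}$, this would contradict the fact that $\alpha = \sup_{g \in \mathcal{G}} \int g\,d\mu$. For any $F \in \mathcal{X}$, using (5.21) and (5.22), we have
$$\int_F g_0\,d\mu = \int_F \left( f + \varepsilon \chi_{E_0} \right) d\mu = \int_F f\,d\mu + \varepsilon \mu(E_0 \cap F) \leq \int_F f\,d\mu + \nu_0(E_0 \cap F)$$
$$= \int_F f\,d\mu + \nu(E_0 \cap F) - \int_{E_0 \cap F} f\,d\mu = \int_{F \setminus (E_0 \cap F)} f\,d\mu + \nu(E_0 \cap F) \leq \nu(F \setminus (E_0 \cap F)) + \nu(E_0 \cap F) = \nu(F).$$
Hence $g_0 \in \mathcal{G}$, which leads to the contradiction. The uniqueness of $f$ (except on a set of measure zero) follows from Exercise 5.2.4.

Proof of Theorem 443. Let $\langle f_n \rangle$ be a Cauchy sequence in $L^1$, so that $\|f_m - f_n\|_1 \to 0$ as $m, n \to \infty$. Then we can find a sequence of indices $\langle n_k \rangle$ with $n_1 < n_2 < \ldots < n_k < \ldots$ such that
$$\left\| f_{n_k} - f_{n_{k-1}} \right\|_1 = \int_X |f_{n_k} - f_{n_{k-1}}|\,dm < \frac{1}{2^k}, \quad k = 2, 3, \ldots \tag{5.23}$$
Define a sequence $\langle g_k \rangle$ by $g_k = f_{n_k} - f_{n_{k-1}}$ for $k = 2, 3, \ldots$ with $g_1 = f_{n_1}$. Then (5.23) is simply $\int_X |g_k|\,dm < \frac{1}{2^k}$ for $k \geq 2$, and summing over $k$ yields
$$\sum_{k=1}^{\infty} \int_X |g_k|\,dm \leq \|f_{n_1}\|_1 + \sum_{k=2}^{\infty} \frac{1}{2^k} = \|f_{n_1}\|_1 + \frac{1}{2} < \infty.$$
Thus by the Corollary of Levi's Theorem 408, $\sum_{k=1}^{\infty} |g_k|$ converges a.e. on $X$. Since absolute convergence implies convergence, $\sum_{k=1}^{\infty} g_k$ converges a.e. on $X$ (i.e. there exists a function $f$ such that $\sum_{k=1}^{\infty} g_k \to f$ a.e. on $X$).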
But
$$f = \sum_{k=1}^{\infty} g_k = \lim_{J \to \infty} \sum_{k=1}^{J} g_k = \lim_{J \to \infty} (g_1 + g_2 + \ldots + g_J) = \lim_{J \to \infty} \left[ f_{n_1} + (f_{n_2} - f_{n_1}) + (f_{n_3} - f_{n_2}) + \ldots + (f_{n_J} - f_{n_{J-1}}) \right] = \lim_{J \to \infty} f_{n_J}.$$
Hence $\langle f_{n_k} \rangle \to f$ a.e. on $X$.

Now we show that $\langle f_{n_k} \rangle \to f$ with respect to $\|\cdot\|_1$ and that $f \in L^1(X)$. Since $\langle f_{n_k} \rangle$ is Cauchy in $L^1(X)$ (a subsequence of a Cauchy sequence is Cauchy), given $\varepsilon > 0$,
$$\int |f_{n_k} - f_{n_l}|\,dm < \varepsilon \tag{5.24}$$
for sufficiently large $k$ and $l$. Hence by Fatou's Lemma 393 we can take the limit as $l \to \infty$ behind the integral in (5.24), obtaining
$$\int |f_{n_k}(x) - f(x)|\,dm = \|f - f_{n_k}\|_1 \leq \varepsilon.$$
Since
$$\|f\|_1 = \|f - f_{n_k} + f_{n_k}\|_1 \leq \|f - f_{n_k}\|_1 + \|f_{n_k}\|_1 \leq \varepsilon + \|f_{n_k}\|_1 < \infty,$$
it follows that $f \in L^1(X)$ and $\langle f_{n_k} \rangle \to f$ in $L^1(X)$. But by Lemma 173, if a Cauchy sequence contains a subsequence converging to a limit, then the sequence itself converges to the same limit. Hence $\langle f_n \rangle \to f$ in $L^1(X)$.

5.6 Bibliography for Chapter 5

This material is based on Royden (Chapters 3, 4, 11, 12) and Jain and Gupta (1986) Lebesgue Measure and Integration, New York: Wiley (Chapters 3 to 5).

Chapter 6

Function Spaces

In this chapter we will consider applications of functional analysis in economics such as dynamic programming, existence of equilibrium price functionals, and approximation of functions. In the first case, we can represent complicated sequence problems such as the optimal growth model by a simple functional equation.
In particular, a representative household's lifetime utility, conditional on an initial level of capital $k$, is denoted by the function $v(k)$ which solves
$$v(k) = \max_{k' \in [0,K]} \left[ u(f(k) - k') + \beta v(k') \right]$$
where $k'$ denotes next period's choice of capital, which lies in some compact set $X = [0,K]$; $u : \mathbb{R}_+ \to \mathbb{R}$ is an increasing, continuous function representing the household's preferences over consumption, which is just output that is not saved for next period (i.e. $f(k) - k'$); and $\beta \in (0,1)$ represents the fact that households discount the future. We can think of the above equation as defining an operator $T$ which maps continuous functions defined on a compact set (the $v(k')$ on the right hand side of the above equation) into continuous functions (the $v(k)$ on the left hand side). If we let $C(X)$ denote the set of continuous functions defined on the compact set $X$, then we have $T : C(X) \to C(X)$. In this chapter we analyse under what conditions solutions to such functional equations exist. Another simple example of such operators from mathematics is given by differential equations.

Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. In Chapter 4 we studied functions $f$ that took points in a metric space $(X, d_X)$ into points in a metric space $(Y, d_Y)$. Now let $F(X,Y)$ denote the collection of all such functions $f : X \to Y$. Let $B(X,Y)$ be a subset of $F(X,Y)$ with the property that for each pair $f, g \in B(X,Y)$, the set $\{d_Y(f(x), g(x)) : x \in X\}$ is bounded. We say $B(X,Y)$ is the collection of all bounded functions.

Definition 447 Given $B(X,Y)$, define a metric $d : B \times B \to \mathbb{R}$ by $d(f,g) = \sup\{d_Y(f(x), g(x)) : x \in X\}$. This metric is called the sup (supremum) metric. See Figure 6.1.

Exercise 6.0.1 Show that $d$ is a metric. To do so, see Definition 125.

Example 448 The existence of the metric $d$ (i.e. of the supremum) is guaranteed only if the space is bounded. It should be clear that there are many functions which do not belong to $B(X,Y)$ and hence that there are many functions upon which $d$ cannot be applied.
For one example, let $f : (0,1) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$ and $g(x) = 0$. Then $d_Y(f(x), g(x)) = \frac{1}{x}$ is unbounded on $(0,1)$, so the supremum does not exist.

We saw in Chapter 4 that a fundamental property of a metric space is its completeness. What can be said about the completeness of $(B, d)$?

Theorem 449 Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. If $(Y, d_Y)$ is complete, then the metric space $(B, d)$ of all bounded functions $f : X \to Y$ with sup metric $d$ is complete.

Exercise 6.0.2 Prove Theorem 449. Hint: To prove the theorem, the following method of constructing $f$ is useful. Let $\langle f_n \rangle$ be a Cauchy sequence in $(B, d)$. Then $\forall x \in X$, the sequence $\langle f_n(x) \rangle$ is Cauchy in $Y$, and since $(Y, d_Y)$ is complete, $\langle f_n(x) \rangle$ converges to some $f(x)$ in $Y$, $\forall x \in X$. Show that $f : X \to Y$ defined by $f(x) = \lim_{n \to \infty} f_n(x)$, $\forall x \in X$, is the function that $\langle f_n \rangle$ converges to with respect to the sup norm.

If $Y$ is a vector space, then the metric space $(Y, d_Y)$ is also a normed vector space by Theorem 207. Then $F(X,Y)$ is a vector space where $(f + g)(x) = f(x) + g(x)$ and $(\alpha f)(x) = \alpha f(x)$. Its subspace of all bounded functions $B(X,Y) \subset F(X,Y)$ is a normed vector space $(B, \|\cdot\|)$ with the norm $\|f\| = d(f, 0) = \sup\{\|f(x)\|, x \in X\}$. This norm is called the sup norm. Theorem 449 states that if $(Y, \|\cdot\|_Y)$ is complete, then $(B, \|\cdot\|)$ is also complete.

A sequence $\langle f_n \rangle$ converges to $f : X \to Y$ in $B$ with respect to the sup norm if
$$\|f_n - f\| = \sup\{\|f_n(x) - f(x)\|, x \in X\} \to 0 \text{ as } n \to \infty.$$
In Chapter 4 we introduced two types of convergence of a sequence of functions: pointwise and uniform. These two types of convergence are defined in terms of the metric (norm) of the space $Y$ (e.g. if $(Y, d_Y) = (\mathbb{R}, |\cdot|)$, then in terms of the absolute value). Now we have introduced $(B, d)$, where $d$ is the metric on $B(X,Y)$. As in any metric space, we define convergence of elements (in this case functions) with respect to its metric (in this case $d$). The question is, "Is convergence with respect to $d$ related to pointwise or uniform convergence?" The following theorem addresses this question.
Theorem 450 Let $\langle f_n \rangle$ be a sequence of functions in $B(X,Y)$. Then $\langle f_n \rangle \to f \in B(X,Y)$ with respect to the sup norm if and only if $\langle f_n \rangle \to f$ uniformly on $X$.

Exercise 6.0.3 Prove Theorem 450.

In light of Theorem 450, one might wonder if there exists a metric $d$ on $F(X,Y)$ or on a subspace such that convergence of $\langle f_n \rangle$ with respect to this metric would be equivalent to pointwise convergence. Unfortunately, no such metric exists.

Before proceeding, we list the principal results of the chapter. Here we introduce two important function spaces: the space of bounded continuous functions (denoted $C(X)$) and the space of $p$-integrable functions (denoted $L^p(X)$). We give necessary and sufficient conditions for compactness in $C(X)$ in Ascoli's Theorem 458. Then we deal with the problem of approximating continuous functions. The fundamental result is given in a very general set of theorems by Stone and Weierstrass (the lattice version is 464 and the algebraic version is 468) which provide the conditions for a set to be dense in $C(X)$. Next the Brouwer Fixed Point Theorem 302 of Chapter 4 on finite dimensional spaces is generalized to infinite dimensional Banach spaces in the Schauder Fixed Point Theorem 475. Next we introduce the $L^p(X)$ space and show that it is complete in the Riesz-Fischer Theorem 481. Among $L^p$ spaces, we show that $L^2$ is a Hilbert space (i.e. that it is a complete normed vector space with the inner product) and consider the Fourier series of a function in $L^2$. Then we introduce linear operators and functionals, as well as the notion of a dual space of a normed vector space. We construct dual spaces for the most common spaces: Euclidean, Hilbert, $\ell^p$, and most importantly for $L^p$ in the Riesz Representation Theorem 532.
Next we show that one can construct bounded linear functionals on a given space $X$ in the Hahn-Banach Theorem 539, which is used to prove certain separation results such as the fact that two disjoint convex sets can be separated by a linear functional in Theorem 549. Such results are used extensively in economics; for instance, they are employed to establish the Second Welfare Theorem. The chapter ends with nonlinear operators. First we introduce the weak topology on a normed vector space and develop a variational method of optimizing nonlinear functions in Theorem 572. Then we consider another method of finding the optimum of a nonlinear functional by dynamic programming.

6.1 The set of bounded continuous functions

Let $C(X,Y)$ denote the set of all continuous functions $f : X \to Y$. In order to define a normed vector space, we need to equip this set with a norm. We first consider the sup norm. Since a continuous function can be unbounded (e.g. $f : (0,1] \to \mathbb{R}$ given by $f(x) = \frac{1}{x}$), the sup norm may not be well defined on the whole set $C(X,Y)$. Hence we will restrict attention to a subset of $C(X,Y)$ that contains only bounded continuous functions, which we denote $BC(X,Y)$. Next we consider important properties of this space $(BC(X,Y), \|\cdot\|_\infty)$, where $\|\cdot\|_\infty$ is the sup norm.

6.1.1 Completeness

Even if $(Y, d_Y)$ is complete, we cannot directly use Theorem 449 to prove that $(BC(X,Y), \|\cdot\|_\infty)$ is complete, because $(BC(X,Y), \|\cdot\|_\infty)$ is merely a subspace of the complete space $(B(X,Y), \|\cdot\|_\infty)$, and a subspace of a complete space need not be complete. But if we show that $BC(X,Y)$ is closed in $B(X,Y)$, then the fact that a closed subspace of a complete space is complete (Theorem 177 in Chapter 4) would imply that $(BC(X,Y), \|\cdot\|_\infty)$ is complete.

Lemma 451 $BC(X,Y)$ is closed in $B(X,Y)$; that is, if a sequence $\langle f_n \rangle$ of functions from $BC(X,Y)$ converges to a function $f : X \to Y$, then $f$ is continuous.

Proof. (Sketch) We need to show that $f$ is continuous at any $x_0 \in X$ (i.e.
if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that $\forall x \in X$ with $d_X(x, x_0) < \delta$ we have $d_Y(f(x), f(x_0)) < \varepsilon$). By the triangle inequality
$$d_Y(f(x), f(x_0)) \leq d_Y(f(x), f_n(x)) + d_Y(f_n(x), f_n(x_0)) + d_Y(f_n(x_0), f(x_0)).$$
The first and third terms on the right hand side are arbitrarily small (i.e. $\frac{\varepsilon}{4}$) since $f_n \to f$ with respect to the sup norm, and the second term is arbitrarily small ($\frac{\varepsilon}{2}$) since $f_n$ is continuous.

The next theorem establishes that $(BC(X,Y), \|\cdot\|_\infty)$ is a Banach space.

Theorem 452 The normed vector space $(BC(X,Y), \|\cdot\|_\infty)$ is complete.

Proof. Follows from Lemma 451 and Theorem 177.

That $(BC(X,Y), \|\cdot\|_\infty)$ is a Banach space follows from Definition 208.

In the remainder of this section, we will assume that $X$ is a compact set and $(Y, d_Y) = (\mathbb{R}, |\cdot|)$. In this case $f \in C(X, \mathbb{R})$ is bounded by Theorem 261. Hence, instead of $(BC(X, \mathbb{R}), \|\cdot\|_\infty)$ we will simply use the notation $C(X)$. Just remember, whenever you see $C(X)$ we are assuming that $X$ is compact, $Y$ is $\mathbb{R}$, and we are considering the sup norm.

While uniform convergence implies pointwise convergence, we know that the converse does not hold (e.g. $f_n : [0,1] \to \mathbb{R}$ given by $f_n(x) = x^n$). In $C(X)$, however, there is a sufficient condition for uniform convergence (and hence for convergence with respect to the sup norm) in terms of pointwise convergence.

Lemma 453 (Dini's Theorem) Let $\langle f_n \rangle$ be a monotone sequence in $C(X)$ (e.g. $f_{n+1} \leq f_n$, $\forall n$). If the sequence $\langle f_n \rangle$ converges pointwise to a continuous function $f \in C(X)$, it also converges uniformly to $f$.

Proof. (Sketch) Let $\langle f_n \rangle$ be decreasing, $f_n \to f$ pointwise, and define $\bar{f}_n = f_n - f$. Then $\langle \bar{f}_n \rangle$ is a decreasing sequence of non-negative functions with $\bar{f}_n \to 0$ pointwise. Given $\varepsilon > 0$, for each $x \in X$, pointwise convergence guarantees we can find an index $N(\varepsilon, x)$ for which $0 \leq \bar{f}_{N(\varepsilon,x)}(x) < \varepsilon$. Due to continuity of $\bar{f}_{N(\varepsilon,x)}$ there is a $\delta(x)$ neighborhood around $x$ such that $0 \leq \bar{f}_{N(\varepsilon,x)}(x') < \varepsilon$ for each $x'$ in this neighborhood, and due to monotonicity of $\langle \bar{f}_n \rangle$ we have $0 \leq \bar{f}_n(x') < \varepsilon$ for $n \geq N(\varepsilon, x)$.
Since $X$ is compact, there are finitely many points $x_i \in X$ whose neighborhoods $B_{\delta(x_i)}(x_i)$ cover $X$. From the finitely many corresponding indices $N(\varepsilon, x_i)$ we can take $N(\varepsilon) = \max_i N(\varepsilon, x_i)$, which is the index required for uniform convergence.

It is clear from the above example $f_n(x) = x^n$ that the requirement that $f$ be continuous is essential. In the above case, $f$ is clearly not continuous since
$$f(x) = \begin{cases} 0 & \text{if } x \in [0,1) \\ 1 & \text{if } x = 1. \end{cases}$$

6.1.2 Compactness

While Theorem 193 established (necessary and) sufficient conditions (completeness and total boundedness) for compactness in a general metric space (and hence a general normed vector space), total boundedness was difficult to establish. As in the case of the Heine-Borel Theorem 194 (which provided simple sufficient conditions for a set in $\mathbb{R}^n$ to be compact), here we develop the notion of equicontinuity, which will be included as a sufficient condition for compactness.

Definition 454 Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. Let $D$ be a subset of the function space $BC(X,Y)$. If $x_0 \in X$, the set $D$ of functions is equicontinuous at $x_0$ if $\forall \varepsilon > 0$, $\exists \delta(x_0, \varepsilon)$ such that $\forall x \in X$, $d_X(x, x_0) < \delta$ implies $d_Y(f(x), f(x_0)) < \varepsilon$, $\forall f \in D$. If the set $D$ is equicontinuous at $x_0$ for each $x_0 \in X$, then it is equicontinuous on $X$.

Notice that the primary difference between the definition of equicontinuity and that of continuity in Definition 244 is that here $d_Y(f(x), f(x_0)) < \varepsilon$ must hold for all $f \in D$, while for continuity this condition must hold only for the given function $f$.

Example 455 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = x^n$ and $D = \{f_n\}$. At what points is $D$ equicontinuous and at what points does it fail to be equicontinuous? It fails at $x = 1$. To see this, let $x_0 = 1$. Given $\varepsilon \in (0,1)$ and any $\delta > 0$, for each $x \in [0,1)$ with $d_X(x, 1) < \delta$ there exists $N \in \mathbb{N}$ such that $|f_N(x) - f_N(1)| = 1 - x^N \geq \varepsilon$.
Take the logs of both sides of $1 - \varepsilon \geq x^n$ and notice that $\ln(x) < 0$ on $(0,1)$ to yield $n \geq \frac{\ln(1-\varepsilon)}{\ln(x)}$ (by the Archimedean property such an $n$ exists), so that we can take $N = w\left( \frac{\ln(1-\varepsilon)}{\ln(x)} \right) + 1$.

In general, $\delta$ in Definition 454 depends on both $\varepsilon$ and $x$. If, however, the choice of $\delta$ is independent of $x$, we say that the set of functions $D$ is uniformly equicontinuous on $X$; that is, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon)$ such that $\forall x, x' \in X$, $d_X(x, x') < \delta$ implies $d_Y(f(x), f(x')) < \varepsilon$, $\forall f \in D$. If $X$ is compact, then these two notions are equivalent, as the following lemma shows.[1]

Lemma 456 Let $X$ be compact. A subset $D \subset C(X)$ is equicontinuous iff it is uniformly equicontinuous.

Proof. (Sketch) ($\Rightarrow$) Since $D \subset C(X)$ is equicontinuous at $x \in X$, given $\varepsilon$ we can find $\delta(\varepsilon, x)$. Then the collection $\{B_{\delta(\varepsilon,x)}(x), x \in X\}$ covers $X$, and because $X$ is compact, there exists a finite subcollection covering $X$ and a corresponding finite set $\{\delta(\varepsilon, x_i), i = 1, \ldots, k\}$. Then there exists a smallest $\delta(\varepsilon)$ that doesn't depend on $x$.

Equicontinuity is related to total boundedness when both $X$ and $Y$ are compact, as the following lemma shows.

Lemma 457 Let $X$ be compact and $Y \subset \mathbb{R}$ be compact. Let $D$ be a subset of $C(X,Y)$. Then $D$ is equicontinuous iff $D$ is totally bounded in the sup norm.

Proof. (Sketch) ($\Leftarrow$) Let $D$ be totally bounded. Given $\varepsilon > 0$, choose $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ such that $2\varepsilon_1 + \varepsilon_2 < \varepsilon$. Then for the given $\varepsilon_1$, there are finitely many functions $\{f_i, i = 1, \ldots, k\}$ such that the $\varepsilon_1$-balls around them cover $D$. Since any finite collection of continuous functions is equicontinuous (see Exercise 6.1.1), given $x_0$ and $\varepsilon_2$, there exists $\delta$ such that if $d_X(x, x_0) < \delta$, then $d_Y(f_i(x), f_i(x_0)) < \varepsilon_2$ for $i = 1, \ldots, k$. We make a similar "estimate" for any $f \in D$. Because there is an $f_i$ which is "$\varepsilon_1$-close" to $f$, we split $d_Y(f(x), f(x_0))$ into three parts (using the triangle inequality):
$$d_Y(f(x), f(x_0)) \leq d_Y(f(x), f_i(x)) + d_Y(f_i(x), f_i(x_0)) + d_Y(f_i(x_0), f(x_0)) \leq \varepsilon_1 + \varepsilon_2 + \varepsilon_1 < \varepsilon.$$
The first and third terms are sufficiently small because $f_i$ is "$\varepsilon_1$-close" to $f$. The second term is sufficiently small because $\{f_i, i = 1, \ldots, k\}$ is equicontinuous.[2]

[1] This lemma is similar to the result that a continuous function on a compact set is uniformly continuous.

[2] Notice that we haven't used compactness of $X$ nor $Y$ in this direction. Thus total boundedness always implies equicontinuity.

($\Rightarrow$) Since $X$ is compact and $D$ is equicontinuous, by Lemma 456 $D$ is uniformly equicontinuous. Then given $\varepsilon_1$ there exist $\delta(\varepsilon_1)$ and finitely many points $\{x_i \in X, i = 1, \ldots, k\}$ such that $\{B_{\delta(\varepsilon_1)}(x_i)\}$ covers $X$ and $d_Y(f(x_i), f(x)) < \varepsilon_1$ for $x \in B_{\delta(\varepsilon_1)}(x_i)$, for all $f \in D$. Since $Y$ is compact, $Y$ is totally bounded by Theorem 198. Then given $\varepsilon_2$ there exist finitely many points $\{y_i \in Y, i = 1, \ldots, m\}$ such that $\{B_{\varepsilon_2}(y_i), i = 1, \ldots, m\}$ covers $Y$. Let $J$ be the set of all functions $\alpha$ such that $\alpha : \{1, \ldots, k\} \to \{1, \ldots, m\}$. The set $J$ is finite (it contains $m^k$ elements). For $\alpha \in J$, choose (if one exists) $f \in D$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for each $i$ and label it $f_\alpha$ (for example, the index $\alpha$ of the function $f_\alpha$ in Figure 6.1.1 is $\alpha(1,2,3,4) = (2,3,1,1)$ because $f(x_1) \in B_{\varepsilon_2}(y_2)$, $f(x_2) \in B_{\varepsilon_2}(y_3)$, $f(x_3) \in B_{\varepsilon_2}(y_1)$, $f(x_4) \in B_{\varepsilon_2}(y_1)$). Then the collection of open balls $\{B_\varepsilon(f_\alpha), \alpha \in J\}$ with $\varepsilon \geq 2\varepsilon_1 + 2\varepsilon_2$ is a finite $\varepsilon$-covering of $D$. To see this, let $f \in D$. Then $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for $i = 1, \ldots, k$ and some index $\alpha$. Choose this index $\alpha$ and the corresponding $f_\alpha$. Then one must show that $f \in B_\varepsilon(f_\alpha)$. Let $x \in X$. Then there exists $i$ such that $x \in B_{\delta(\varepsilon_1)}(x_i)$, where
$$d_Y(f(x), f_\alpha(x)) \leq d_Y(f(x), f(x_i)) + d_Y(f(x_i), f_\alpha(x_i)) + d_Y(f_\alpha(x_i), f_\alpha(x)) \leq \varepsilon_1 + 2\varepsilon_2 + \varepsilon_1 \leq \varepsilon.$$
The first and third terms are sufficiently small because $D$ is uniformly equicontinuous, and the second term is sufficiently small because $f(x_i), f_\alpha(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$, so that they are less than $2\varepsilon_2$ apart.

Exercise 6.1.1 Show that a set which contains finitely many continuous functions is equicontinuous.
Hint: Since the collection of $f_i$ is finite, there are finitely many $\delta_i$, one associated with each function, and hence the minimum of those $\delta_i$ is well defined. Thus $d_Y\left( f(x), f_{j(i)}(x) \right) < \varepsilon$ holds for any $x \in X$. Hence there are finitely many open balls $\{B_\varepsilon(f_{j(i)}), i = 1, \ldots, k\}$ covering $D$.

Before stating the main theorem of this subsection, we point out something about boundedness in $C(X)$. In a normed vector space $(X, \|\cdot\|)$ a subset $A$ is said to be bounded if it is contained in a ball (i.e. $\exists M$ such that $\|f\| \leq M$, $\forall f \in A$). Since in $C(X)$ we have $\|f\| = \sup_x |f(x)|$, boundedness of a set $D \subset C(X)$ is equivalent to: $\exists M$ such that $|f(x)| \leq M$, $\forall f \in D$, $\forall x \in X$. This is sometimes called uniform boundedness of a set of functions $D$. However, in terms of the normed vector space $C(X)$ it is just the normal definition of boundedness.

Analogous to the Heine-Borel theorem in $\mathbb{R}$, we now state necessary and sufficient conditions for compactness in $C(X)$.

Theorem 458 (Ascoli) Let $X$ be a compact space. A subset $D$ of $C(X)$ is compact iff it is closed, bounded, and equicontinuous.

Proof. Step 1: If $D$ is bounded, then $|f(x)| \leq M$, $\forall f \in D$, $\forall x \in X$. Then every function in $D$ takes values in the ball $B_M(0) \subset \mathbb{R}$. Let $Y$ be the closure of $B_M(0)$, that is, $Y = [-M, M]$. Then $Y$ is a closed, bounded subset of $\mathbb{R}$ and hence compact. Then $D \subset C(X,Y)$ with both $X$ and $Y$ compact.

Step 2: ($\Rightarrow$) Suppose $D \subset C(X,Y)$ is compact. Then by Lemma 189, $D$ is closed. By Lemma 187, $D$ is bounded. By Theorem 198, $D$ is totally bounded. But with $X$ and $Y$ compact (Step 1), total boundedness is equivalent to equicontinuity.

Step 3: ($\Leftarrow$) Suppose $D$ is closed, bounded and equicontinuous (or totally bounded). By Theorem 177 a closed subset of a complete normed vector space $C(X)$ is complete. By Theorem 198, completeness and total boundedness are equivalent to compactness.

Example 459 Is the unit ball in $C(X)$ a compact set? Without loss of generality we can take $X = [0,1]$. The unit ball in $C([0,1])$ is $B_1(0) = \{f \in C([0,1]) : \|f\| \leq 1\}$. $B_1(0)$ is clearly bounded and closed. Is it equicontinuous?
In Example 455 we showed that $\{x^n, n \in \mathbb{N}\}$ was not equicontinuous. But since $\|x^n\| = \sup_{x \in [0,1]} |x^n| = 1$ for each $n \in \mathbb{N}$, we have $\{x^n, n \in \mathbb{N}\} \subset B_1(0)$. Thus $B_1(0)$ contains a subset which is not equicontinuous, so that $B_1(0)$ is not equicontinuous. Then by Ascoli's Theorem 458, $B_1(0)$ is not compact.

In the previous example, how can the unit ball be closed if it contains a sequence $\langle x^n \rangle$ converging to a function that doesn't belong to $B_1(0)$? It is because $\langle x^n \rangle$ is not convergent in $C([0,1])$ (i.e. with respect to the sup norm); it converges only pointwise.

6.1.3 Approximation

For many applications it is convenient to approximate continuous functions by functions of an elementary nature (e.g. functions which are piecewise linear or polynomials).

Definition 460 Let $f \in F(X,Y)$ with norm $\|\cdot\|_F$. Given $\varepsilon > 0$, we say $g$ ($\varepsilon$-)approximates $f$ on $X$ with respect to $\|\cdot\|_F$ if $\|f - g\|_F < \varepsilon$. If $f \in C(X)$, then since we are using the sup norm, this is equivalent to $\sup_{x \in X} \{|f(x) - g(x)|\} < \varepsilon$, in which case it is clear that the approximation is uniform. See Figure 6.1.2.

The concept of approximation can be stated in terms of dense sets. Let $H$ be a subset of $C(X)$. Recall from Definition 153 that $H$ is dense in $C(X)$ if the closure of $H$, denoted $\bar{H}$, satisfies $\bar{H} = C(X)$. But by Theorem 148, $f \in \bar{H}$ iff for any $\varepsilon > 0$ there exists $g \in H$ such that $\|f - g\| < \varepsilon$. In other words, any function $f \in C(X)$ can be approximated by a function $g \in H \subset C(X)$ if $\bar{H} = C(X)$.

An alternative way to see this is to suppose we are trying to approximate a continuous function $f : X \to \mathbb{R}$ with $X$ compact, and suppose we know that the set of all polynomials is dense (we will prove this later). Then we could think about starting with a large degree of approximation error (say $\varepsilon_1 = 1$) and ask what polynomial function (call it $P_{1,n}(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n$) bounds the error within $\varepsilon_1$ (i.e. $\|P_{1,n}(x) - f(x)\| < \varepsilon_1$). If this error is too large, we could choose a smaller one, say $\varepsilon_2 = \frac{1}{2} < \varepsilon_1$, and look for another polynomial function $P_{2,n}(x)$ such that $\|P_{2,n}(x) - f(x)\| < \varepsilon_2$.
We could let $H = \{P_1, P_2, \ldots\} \subset C(X)$. Approximation is essentially constructing a sequence of polynomials $\langle P_n \rangle$ that converges to $f$ with respect to the sup norm (i.e. uniformly).

The most general approximation theorem is known as the Stone-Weierstrass Theorem, which provides conditions under which a vector subspace of $C(X)$ is dense in $C(X)$. There are two versions of this result: one uses lattices and the other is algebraic.

We begin by noting that the space $C(X)$ has a lattice structure. If $f, g \in C(X)$, so are the "meet" and "join" functions $f \wedge g$ and $f \vee g$ defined as $(f \wedge g)(x) = \min[f(x), g(x)]$ and $(f \vee g)(x) = \max[f(x), g(x)]$. To see that $f \wedge g$ and $f \vee g$ are continuous, note that
$$(f \wedge g)(x) = \frac{1}{2}(f + g) - \frac{1}{2}|f - g| = \begin{cases} \frac{1}{2}(f + g) - \frac{1}{2}(f - g) & \text{if } f \geq g \\ \frac{1}{2}(f + g) + \frac{1}{2}(f - g) & \text{if } f < g \end{cases}$$
and
$$(f \vee g)(x) = \frac{1}{2}(f + g) + \frac{1}{2}|f - g|.$$
But linear combinations of continuous functions are continuous by Theorem 251. Recall from Definition 42 that a subset $H$ of $C(X)$ is a lattice if for every pair of functions $f, g \in H$, we also have $f \wedge g$ and $f \vee g$ in $H$.

Definition 461 A subset $H$ of $C(X)$ is called separating (or $H$ separates points) if for any two distinct points $x, y \in X$, $\exists h \in H$ with $h(x) \neq h(y)$.

Example 462 $H_1 = \{\text{all constant functions } f : X \to \mathbb{R}\}$ is a lattice but is not separating. To see this, let $f(x) = \kappa$, $\kappa \in \mathbb{R}$, be a constant function. Then $H_1$ is a totally ordered set (since for any two distinct elements $\kappa_1$ and $\kappa_2$ in $\mathbb{R}$ we have, say, $\kappa_1 < \kappa_2$). Furthermore, these two elements have a maximum and a minimum. However, this set is not separating since $f(x) = f(y) = \kappa$ for $x \neq y$.

The lattice version of the Stone-Weierstrass theorem is the consequence of the following lemma.

Lemma 463 Suppose $X$ has at least two elements. Let $H$ be a subset of $C(X)$ satisfying: (i) $H$ is a lattice; (ii) Given $x_1, x_2 \in X$, $x_1 \neq x_2$, and $\alpha_1, \alpha_2 \in \mathbb{R}$, there exists $h \in H$ such that $h(x_1) = \alpha_1$ and $h(x_2) = \alpha_2$. Then $H$ is dense in $C(X)$.

Proof. (Sketch) Take $f \in C(X)$ and $\varepsilon > 0$.
We want to find an element of $H$ that is within $\varepsilon$ of $f$.

First, fix $x \in X$. By assumption (ii), $\forall y \neq x$, $\exists \eta_y \in H$ such that $\eta_y(x) = f(x)$ and $\eta_y(y) = f(y)$. For $y \neq x$, set $O_y = \{x' \in X : \eta_y(x') > f(x') - \varepsilon\}$. This set is open since $\eta_y$ and $f$ are continuous, and as $y$ varies, $\cup_{y \neq x} O_y$ is an open covering of $X$. Since $X$ is compact, there exist finitely many sets $O_y$ such that $X = \cup_{j=1}^{N} O_{y_j}$ with $y_j \neq x$, $\forall j$. Then let $v_x = \max\{\eta_{y_1}, \ldots, \eta_{y_N}\}$. Since $H$ is a lattice, $v_x \in H$ with the same properties as the $\eta_y$'s; namely, $v_x(x) = f(x)$ and $v_x(x') > f(x') - \varepsilon$, $\forall x' \in X$.

Second, let $x$ vary. For each $x \in X$, let $\Omega_x = \{x' \in X : v_x(x') < f(x') + \varepsilon\}$. By exactly the same argument as in the first step, there exist finitely many sets $\Omega_{x_1}, \ldots, \Omega_{x_J}$ covering $X$. Set $v = \min\{v_{x_1}, \ldots, v_{x_J}\}$. Then $v \in H$ and $f(x') - \varepsilon < v(x') < f(x') + \varepsilon$, $\forall x' \in X$. This means $\|f - v\| \leq \varepsilon$.

Assumption (ii) in Lemma 463 appears hard to verify. But as we will show, if $H$ separates points in $X$ and if $H$ contains all constant functions, then $H$ satisfies assumption (ii) of Lemma 463.

Theorem 464 (Stone-Weierstrass L) If: (i) $H$ is a separating vector subspace of $C(X)$; (ii) $H$ is a lattice; (iii) $H$ contains all constant functions. Then $H$ is dense in $C(X)$.

Proof. To apply the previous lemma, we must show that assumptions (i) and (iii) of the Theorem imply assumption (ii) of Lemma 463.

Let $x_1, x_2 \in X$ with $x_1 \neq x_2$. Since $H$ is separating, $\exists h \in H$ such that $h(x_1) \neq h(x_2)$. Let $\alpha_1, \alpha_2 \in \mathbb{R}$; then the system of linear equations
$$\alpha_1 = \mu + \lambda h(x_1), \qquad \alpha_2 = \mu + \lambda h(x_2)$$
has a unique solution $(\mu, \lambda) \in \mathbb{R}^2$ since
$$\operatorname{rank} \begin{bmatrix} h(x_1) & 1 \\ h(x_2) & 1 \end{bmatrix} = 2$$
because $h(x_1) \neq h(x_2)$. Set $g(x) = \mu + \lambda h(x)$. Since $H$ is a vector subspace containing constant functions, $g \in H$. Moreover, we see that $g(x_1) = \alpha_1$ and $g(x_2) = \alpha_2$, so that assumption (ii) of Lemma 463 is satisfied.[3]

The Stone-Weierstrass theorem is very general and covers many classes of elementary functions that can be used to approximate continuous functions.
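To make the interpolation step in the proof of Theorem 464 concrete, the two-equation system for $(\mu, \lambda)$ can be solved by elimination. The closed form below is only the standard solution of a $2 \times 2$ linear system written out in the notation of the proof; the numerical instance (with $X = [0,1]$ and $h(x) = x$) is an illustrative choice, not taken from the text.

```latex
\begin{align*}
\lambda &= \frac{\alpha_1 - \alpha_2}{h(x_1) - h(x_2)}, &
\mu &= \alpha_1 - \lambda h(x_1)
     = \frac{\alpha_2 h(x_1) - \alpha_1 h(x_2)}{h(x_1) - h(x_2)}, \\[4pt]
g(x_1) &= \mu + \lambda h(x_1) = \alpha_1, &
g(x_2) &= \mu + \lambda h(x_2)
        = \alpha_1 - \lambda \bigl( h(x_1) - h(x_2) \bigr) = \alpha_2 .
\end{align*}
% Illustrative instance: h(x) = x, x_1 = 0, x_2 = 1, \alpha_1 = 2, \alpha_2 = 5
% gives \lambda = (2 - 5)/(0 - 1) = 3 and \mu = 2, so g(x) = 2 + 3x,
% with g(0) = 2 and g(1) = 5 as required.
```

The denominator $h(x_1) - h(x_2)$ is nonzero precisely because $H$ separates points, which is where assumption (i) of the theorem enters.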
We now state the algebraic version of the Stone-Weierstrass theorem. Following this, we will apply whichever version is more suitable to some concrete examples.

Definition 465 We call a vector subspace $H \subset C(X)$ an algebra of functions (not to be confused with an algebra of sets) if it is closed under multiplication. Hence $H \subset C(X)$ is an algebra of functions if: (i) $\forall f, g \in H$ and $\alpha, \beta \in \mathbb{R}$, we have $\alpha f + \beta g \in H$; (ii) $\forall f, g \in H$, we have $f \cdot g \in H$ (where $f \cdot g$ is defined as $(f \cdot g)(x) = f(x) \cdot g(x)$, $\forall x \in X$).

Before stating the algebraic version of the Stone-Weierstrass Theorem, we prove the following set of lemmas.

Lemma 466 A vector subspace $H \subset C(X)$ is a lattice iff for every element $h \in H$, the function $|h| \in H$ as well.

Proof. ($\Rightarrow$) Let $h \in H$. Then $|h| = \max(h, 0) - \min(h, 0)$, and since $H$ is a lattice as well as a vector subspace, the r.h.s. is from $H$. Thus $|h| \in H$.

($\Leftarrow$) We can write $\max(f, g) = \frac{1}{2}\left[ (f + g) + |f - g| \right]$ and $\min(f, g) = \frac{1}{2}\left[ (f + g) - |f - g| \right]$. The right hand sides belong to $H$ since $f, g \in H$, the absolute value is from $H$, and $H$ is a vector space.

[3] Instead of assuming that $H$ contains all constants it is sufficient to assume that $H$ contains just the constant $c = 1$. Since $H$ is a vector space, it contains all scalar multiples of 1.

Note that Lemma 466 provides a very convenient way of checking that a subset of functions is a lattice. In the next lemma we construct a sequence of polynomials $\langle P_n \rangle$ converging uniformly to $|x|$ on $[-1,1]$.

Lemma 467 There exists a sequence of polynomials $\langle P_n \rangle$ that converges uniformly to $f(x) = |x|$ on $[-1,1]$.

Proof. (Sketch) We construct the sequence $\langle P_n(x) \rangle$ on $[-1,1]$ by induction: for $n = 1$, $P_1(x) = 0$; given $P_n(x)$, we define
$$P_{n+1}(x) = P_n(x) + \frac{1}{2}\left( x^2 - P_n^2(x) \right), \quad \forall n \in \mathbb{N}.$$
Then show that (i) $P_n(x) \leq P_{n+1}(x)$, $\forall x \in [-1,1]$ (i.e. $\langle P_n \rangle$ is nondecreasing) and (ii) $P_n(x) \to |x|$ pointwise on $[-1,1]$. Since the limit function $|x|$ is continuous, by Dini's Theorem 453 $\langle P_n(x) \rangle$ converges to $|x|$ uniformly on $[-1,1]$.
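The recursion in the proof of Lemma 467 is easy to try numerically. The following is a minimal sketch, assuming that evaluating the polynomials on a finite grid of $[-1,1]$ is an acceptable proxy for the sup norm; the function name `sup_errors` and the grid size are illustrative choices rather than anything from the text.

```python
def sup_errors(steps, grid_size=401):
    """Sup-norm distance between |x| and P_n on a grid of [-1, 1].

    Returns a list whose entry j is the grid maximum of |x| - P_{j+1}(x),
    where P_1 = 0 and P_{n+1} = P_n + (x**2 - P_n**2) / 2.
    """
    grid = [-1.0 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    values = [0.0] * grid_size  # P_1(x) = 0 at every grid point
    errors = [max(abs(x) - v for x, v in zip(grid, values))]
    for _ in range(steps):
        # One step of the recursion, applied pointwise on the grid.
        values = [v + 0.5 * (x * x - v * v) for x, v in zip(grid, values)]
        errors.append(max(abs(x) - v for x, v in zip(grid, values)))
    return errors


errors = sup_errors(200)
print(errors[0], errors[10], errors[50], errors[200])
```

The printed errors shrink steadily, in line with the uniform convergence that Dini's Theorem delivers. The decay is slow (roughly of order $1/n$), which reflects the kink of $|x|$ at the origin.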
Theorem 468 (Stone-Weierstrass A) Every separating algebra of functions $H \subset C(X)$ containing all the constant functions is dense in $C(X)$.

Proof. If $H$ is a separating subalgebra of $C(X)$ containing constant functions, then so is its closure $\overline{H}$. Therefore it suffices to show that $\overline{H}$ is a lattice and apply Theorem 464: $\overline{H}$ is then dense in $C(X)$ and, being closed, equals $C(X)$.
Let $f \in \overline{H}$ be nonzero. By Lemma 467, $\exists \langle P_n \rangle$ of polynomials that converges uniformly on $[-1, 1]$ to $|x|$. Since $-1 \le \frac{f}{\|f\|} \le 1$, the sequence of functions $\left\langle P_n\!\left(\frac{f}{\|f\|}\right) \right\rangle$ converges uniformly to $\left|\frac{f}{\|f\|}\right| = \frac{|f|}{\|f\|}$. But
\[ P_n\!\left(\frac{f}{\|f\|}\right) \to \frac{|f|}{\|f\|} \iff \|f\| \, P_n\!\left(\frac{f}{\|f\|}\right) \to |f|. \]
Since $\overline{H}$ is an algebra, all terms in this sequence are in $\overline{H}$ (because an algebra is closed under linear combination and multiplication). Since $\overline{H}$ is closed, $|f| \in \overline{H}$. By Lemma 466, $\overline{H}$ is a lattice.

Exercise 6.1.2 Prove that if $H$ is a separating subalgebra of $C(X)$, then $\overline{H}$ is as well.

Both versions of the Stone-Weierstrass Theorem are very general statements about the density of a subset $H$ in $C(X)$ or, equivalently, about approximation in $C(X)$. As the next examples show, they cover many of the classical approximation theorems for continuous functions.

Example 469 Let $H_2$ be the set of Lipschitz functions $h : X \to \mathbb{R}$, that is, functions for which there is a constant $C \in \mathbb{R}$ with $|h(x) - h(y)| \le C d_X(x, y)$, $\forall x, y \in X$. First, we must establish that $H_2$ is a vector subspace of $C(X)$ containing constant functions. Constant functions are Lipschitz with $C = 0$, so it remains to establish that $H_2$ is closed under addition and scalar multiplication. To see this, suppose $f, g \in H_2$ so that $|f(x) - f(y)| \le C_1 d_X(x, y)$ and $|g(x) - g(y)| \le C_2 d_X(x, y)$. Then
\[ |(f + g)(x) - (f + g)(y)| = |f(x) - f(y) + g(x) - g(y)| \le |f(x) - f(y)| + |g(x) - g(y)| \le (C_1 + C_2) d_X(x, y) \]
so that $H_2$ is closed under addition. Similarly, for $\alpha \in \mathbb{R}$,
\[ |\alpha f(x) - \alpha f(y)| = |\alpha| |f(x) - f(y)| \le |\alpha| C_1 d_X(x, y) \]
so that $H_2$ is closed under scalar multiplication. Second, we must establish that $H_2$ is a lattice. To see this, notice $\big| |h(x)| - |h(y)| \big| \le |h(x) - h(y)|$ by the triangle inequality, so $|h| \in H_2$ whenever $h \in H_2$ and Lemma 466 applies. Finally, we must establish that $H_2$ is separating.
To see this, for $x \neq y$, the function $h(z) = d_X(x, z)$ is Lipschitz with constant 1 and satisfies $h(x) = d_X(x, x) = 0$ and $h(y) = d_X(x, y) > 0$. Thus $H_2$ is dense in $C(X)$ by Theorem 464.

Example 470 Let $H_3$ be the set of continuous piecewise linear functions $h : [a, b] \to \mathbb{R}$ given by $h(x) = b_k + a_k x$ for $c_{k-1} \le x \le c_k$, $k = 1, \ldots, n$, with $a = c_0 < c_1 < \ldots < c_n = b$ and where $a_k c_k + b_k = a_{k+1} c_k + b_{k+1}$, $k = 1, \ldots, n-1$, keeps $h$ continuous. It is easy to show that $H_3$: is a vector subspace of $C([a, b])$ containing constant functions; is a lattice because $|g| \in H_3$ whenever $g \in H_3$; and is separating since $g(x) = x \in H_3$. Thus $H_3$ is dense in $C([a, b])$.

Example 471 Let $H_4$ be the set of all polynomials $h : X \to \mathbb{R}$ where $X$ is a compact subset of $\mathbb{R}^n$. It is easy to show that $H_4$ is a subalgebra of $C(X)$ containing the constants and is separating. Thus $H_4$ is dense in $C(X)$. To see that $H_4$ is a subalgebra, note that if we multiply two polynomials, the product is still a polynomial.

A special case of Example 471 is $X = [a, b]$, known as the Weierstrass Approximation Theorem. Notice that all the previous examples guarantee the existence of a dense set in $C(X)$ but don't present a constructive method of approximating a continuous function. The next example shows how to find a sequence of polynomials $b_n(x; f)$ converging uniformly to $f(x)$ on $[0, 1]$.$^4$

Example 472 Let $H_5$ be the set of Bernstein polynomials $b_n(f) : [0, 1] \to \mathbb{R}$ for a function $f : [0, 1] \to \mathbb{R}$ where
\[ b_n(x; f) = \sum_{k=0}^{n} f\!\left(\frac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k} \]
with $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ and $n! = n \cdot (n-1) \cdot \ldots \cdot 2 \cdot 1$.

Example 473 Let $H_6$ be the set of all continuous functions differentiable to order $p = \infty$ on $X \subset \mathbb{R}^n$ (denoted $C^\infty(X)$). It is easy to show that $C^\infty(X)$ is a separating algebra containing constant functions. Thus $C^\infty(X)$ is dense in $C(X)$.

6.1.4 Separability of C(X)

To see that $C(X)$ is separable we must show that there exists a countable subset $S \subset C(X)$ that is dense in $C(X)$ (i.e. $\overline{S} = C(X)$).
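Before the separability argument continues, the constructive approximation of Example 472 can be illustrated numerically. The sketch below evaluates the Bernstein polynomial directly from its defining sum; the test function $|x - 1/2|$, the grid size, and the function names are assumptions chosen here for illustration.

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate b_n(x; f) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k) for x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))


def sup_distance(f, n, grid_size=501):
    """Grid approximation of sup over [0, 1] of |f(x) - b_n(x; f)|."""
    grid = [j / (grid_size - 1) for j in range(grid_size)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


# Assumed test function: continuous but not differentiable at 1/2.
kink = lambda x: abs(x - 0.5)
for n in (10, 100):
    print(n, sup_distance(kink, n))
```

The error decreases as $n$ grows, though slowly near the kink; the text cites Carothers for the proof that the convergence is uniform for every continuous $f$.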
Consider the set $S$ of all polynomials defined on $X$ with rational coefficients. From Example 471, we know that the set of all polynomials is dense in $C(X)$. But any polynomial can be uniformly approximated by polynomials with rational coefficients since $\mathbb{Q}$ is dense in $\mathbb{R}$.

Corollary 474 If $X$ is compact, the set $S$ of all polynomials on $X$ with rational coefficients (which is a countable set) is dense in $C(X)$. Hence, $C(X)$ is separable.

6.1.5 Fixed point theorems

In Chapter 4 we proved Brouwer's fixed point Theorem 302 for continuous functions defined on a compact subset of $\mathbb{R}^n$. But this theorem holds true in a more general setting. In particular, we don't need to restrict it to a finite dimensional vector space; it can be extended to infinite dimensional vector spaces (i.e. function spaces).

$^4$ See Carothers p. 164 for a proof of uniform convergence.

Theorem 475 (Schauder) Let $K$ be a non-empty, compact, convex subset of a normed vector space and let $f : K \to K$ be a continuous function. Then $f$ has a fixed point.

Proof. (Sketch) Since $K$ is compact, $K$ is totally bounded. Hence, given $\varepsilon > 0$, there exists a finite set $\{y_i, i = 1, \ldots, n\}$ such that the collection $\{B_\varepsilon(y_i), i = 1, \ldots, n\}$ covers $K$. Let $K_\varepsilon = co(\{y_1, \ldots, y_n\})$. Since $K$ is convex and $y_i \in K$ for all $i = 1, \ldots, n$, then $K_\varepsilon \subset K$ (by Exercise 4.5.4). Note that $K_\varepsilon$ is finite dimensional and since it is also closed and bounded, it is compact (by Heine-Borel). Define the "projection" function $P_\varepsilon : K \to K_\varepsilon$ by $P_\varepsilon(y) = \sum_{i=1}^{n} \theta_i(y) y_i$ such that the functions $\theta_i : K \to \mathbb{R}$ are continuous for $i = 1, \ldots, n$, $\theta_i(y) \ge 0$, and $\sum_{i=1}^{n} \theta_i = 1$. The construction of $\theta_i$ is given in the proof in the appendix to this chapter. By construction, $P_\varepsilon(y)$ is an $\varepsilon$-approximation of $y$ (i.e. $\|P_\varepsilon(y) - y\| < \varepsilon$, $\forall y \in K$).
Now for the function $f : K \to K$ define $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ by $f_\varepsilon(x) = P_\varepsilon(f(x))$, for all $x \in K_\varepsilon$. The function $f_\varepsilon$ satisfies all the assumptions of Brouwer's Fixed Point Theorem 302. Hence there exists $x_\varepsilon \in K_\varepsilon$ such that $x_\varepsilon = f_\varepsilon(x_\varepsilon)$.
Set $f(x_\varepsilon) = y_\varepsilon$ and choose a sequence $\langle \varepsilon_i \rangle$ converging to zero. We must show that the approximating sequences $\langle x_\varepsilon \rangle$ and $\langle y_\varepsilon \rangle$ converge to the same point. By construction $\langle y_{\varepsilon_i} \rangle$ is a sequence in $K$ and since $K$ is compact, there exists a convergent subsequence
\[ \langle y_{g(\varepsilon_i)} \rangle \to y \in K. \tag{6.1} \]
All that's left is to show that $\langle x_\varepsilon \rangle = \langle f_\varepsilon(x_\varepsilon) \rangle$ also converges to $y$ along this subsequence:
\[ \|x_\varepsilon - y\| = \|y_\varepsilon + x_\varepsilon - y_\varepsilon - y\| = \|y_\varepsilon + f_\varepsilon(x_\varepsilon) - y_\varepsilon - y\| = \|y_\varepsilon + P_\varepsilon(y_\varepsilon) - y_\varepsilon - y\| \le \|P_\varepsilon(y_\varepsilon) - y_\varepsilon\| + \|y_\varepsilon - y\|. \]
The first term is sufficiently small because $P_\varepsilon(y)$ approximates $y$ and the second term is sufficiently small since $\langle y_\varepsilon \rangle \to y$. Hence $\langle x_\varepsilon \rangle \to y$. Since $f$ is continuous, $f(x_\varepsilon) \to f(y)$. Combining this and (6.1) we have $f(y) = y$, or that $y$ is a fixed point of $f$.

Schauder's Fixed Point Theorem requires compactness of a subset $K$ of the function space $C(X)$. We will now state it in a slightly different form that is more suitable for applications in function spaces (i.e. the assumptions of the following theorem are easier to verify).

Theorem 476 Let $F \subset C(X)$ be a nonempty, closed, bounded, and convex set with $X$ compact. If the mapping $T : F \to F$ is continuous and if the family $T(F)$ is equicontinuous, then $T$ has a fixed point in $F$.

Proof. $T(F) \subset F$ by assumption. Set $H = co\left(\overline{T(F)}\right)$ (i.e. $H$ is the convex hull of the closure of $T(F)$). By definition $H$ is closed and convex. If we show that $H$ is equicontinuous we are done, since then $H$ is compact by Ascoli's Theorem 458, $T$ is continuous, and by Schauder's Theorem 475 $T : H \to H$ has a fixed point.
We need to show that if $T(F)$ is equicontinuous, then $co\left(\overline{T(F)}\right)$ is equicontinuous. Let $f \in co\left(\overline{T(F)}\right)$. Then $f = \sum_{i=1}^{k} \lambda_i f^i$ such that $f^i \in \overline{T(F)}$, $\lambda_i > 0$, and $\sum_{i=1}^{k} \lambda_i = 1$ (which obviously implies $\lambda_i \le 1$ for $i = 1, \ldots, k$). $f^i \in \overline{T(F)}$ implies that $\exists \langle f^i_n \rangle_{n=1}^{\infty} \to f^i$ where $f^i_n \in T(F)$. We have
\[ \|f(x) - f(y)\| = \left\| \sum_{i=1}^{k} \lambda_i f^i(x) - \sum_{i=1}^{k} \lambda_i f^i(y) \right\| \le \sum_{i=1}^{k} \left\| f^i(x) - f^i(y) \right\| \le \sum_{i=1}^{k} \left[ \left\| f^i(x) - f^i_n(x) \right\| + \left\| f^i_n(x) - f^i_n(y) \right\| + \left\| f^i_n(y) - f^i(y) \right\| \right]. \]
This expression is arbitrarily small for $x, y$ close because it is the sum of finitely ($k$) many expressions which are arbitrarily small. In particular, the first and third terms are arbitrarily small because $\langle f^i_n \rangle \to f^i$ and the second term is sufficiently small because $f^i_n \in T(F)$ and $T(F)$ is equicontinuous.

6.2 Classical Banach spaces: $L^p$

In the previous section we analysed the space of all bounded continuous functions $f : X \to \mathbb{R}$ equipped with the (sup) norm $\|f\|_\infty = \sup\{|f(x)|, x \in X\}$. There we showed that $(BC(X, \mathbb{R}), \|\cdot\|_\infty)$ is complete.

There are some potential problems using this normed vector space. Convergence with respect to the sup norm in the set $BC(X, \mathbb{R})$ is uniform convergence (by Theorem 450), which is quite restrictive. For example, the sequence $\langle f_n(x) \rangle = \langle x^n \rangle$ on $X = [0, 1]$ is not convergent in the space $C([0, 1])$. That is, in Example 455 we showed that $\langle x^n \rangle$ does not converge uniformly. We also mentioned that a metric (and hence a norm) that would induce pointwise convergence does not exist.

Does there exist a norm on the set $C([0, 1])$ for which $\langle x^n \rangle$ would be convergent? Since $x \in [0, 1]$, $\langle x^n \rangle$ is bounded and $x^n \to 0$ pointwise a.e. (i.e. except at $x = 1$). The sequence $\left\langle \int_{[0,1]} x^n \right\rangle$ also converges (to 0) since
\[ \lim_{n \to \infty} \|x^n - 0\|_1 = \lim_{n \to \infty} \int_{[0,1]} x^n = \int_{[0,1]} \lim_{n \to \infty} x^n = \int_{[0,1]} 0 = 0 \]
where the second equality follows from the Bounded Convergence Theorem 386. Thus $\langle x^n \rangle$ on $[0, 1]$ converges with respect to the norm $\|\cdot\|_1$ to $f = 0$.

While we have defined a norm on $C(X)$ that does not require strong convergence restrictions on a given sequence, we must establish whether $C(X)$ equipped with $\|\cdot\|_1$ is complete. The next example shows this is not the case.

Example 477 Take $C([-1, 1])$ with $f_n : [-1, 1] \to \mathbb{R}$ given by
\[ f_n(x) = \begin{cases} 1 & \text{if } x \in [-1, 0] \\ 1 - nx & \text{if } x \in (0, \tfrac{1}{n}) \\ 0 & \text{if } x \in [\tfrac{1}{n}, 1]. \end{cases} \]
See Figure 6.2.1. The sequence $\langle f_n \rangle$ is Cauchy. To see this, we must show $\|f_n - f_m\|_1 \to 0$, with $n \ge m$ and $m$ sufficiently large. Indeed, for $n \ge m$ we have $f_m \ge f_n$ and $\int_0^{1/n} (1 - nx)\,dx = \frac{1}{2n}$, so $\|f_n - f_m\|_1 = \frac{1}{2m} - \frac{1}{2n} \le \frac{1}{2m}$. Is the sequence convergent in $C([-1, 1])$ with respect to the norm $\|\cdot\|_1$? Suppose $f$ were its limit. Then
\[ \|f_n - f\|_1 = \int_{[-1,1]} |f_n(x) - f(x)|\,dx = \int_{[-1,0]} |1 - f(x)|\,dx + \int_{(0, \frac{1}{n})} |1 - nx - f(x)|\,dx + \int_{[\frac{1}{n}, 1]} |0 - f(x)|\,dx \]
would have to vanish as $n \to \infty$. Since all the integrands on the right hand side are nonnegative, so is each integral. Hence $\|f_n - f\|_1 \to 0$ would imply each integral on the right hand side approaches zero as $n \to \infty$. Consequently
\[ \lim_{n \to \infty} \int_{[-1,0]} |1 - f(x)|\,dx = 0 \quad \text{and} \quad \lim_{n \to \infty} \int_{[\frac{1}{n}, 1]} |0 - f(x)|\,dx = 0 \]
implies
\[ f(x) = \begin{cases} 1 & \text{if } x \in [-1, 0] \\ 0 & \text{if } x \in (0, 1]. \end{cases} \]
But then $f$ is not continuous on $[-1, 1]$ and hence $f \notin C([-1, 1])$ so that $\langle f_n \rangle$ is not convergent. This proves that $(C([-1, 1]), \|\cdot\|_1)$ is not a complete normed vector space.

To summarize, we have seen that the function space $C(X)$ can be equipped with two norms: the sup norm and $\|\cdot\|_1$. In the former case, $(C(X), \|\cdot\|_\infty)$ is complete but in the latter case, $(C(X), \|\cdot\|_1)$ is not complete. Now we will introduce the space of all $\mathcal{L}$-measurable functions $f : X \to \mathbb{R}$ that are $p$-integrable. We will show that this space, known as the $L^p$ space, is the completion of $C(X)$ with respect to the $\|\cdot\|_p$ norm (just as, for instance, $\mathbb{R}$ was the completion of $\mathbb{Q}$).

Consider, then, the measure space $(\mathbb{R}, \mathcal{L}, m)$ where $\mathcal{L}$ is the $\sigma$-algebra of all Lebesgue measurable sets and $m$ is the Lebesgue measure. While we will work here with $(\mathbb{R}, \mathcal{L}, m)$, the analysis can be extended to more general measure spaces $(X, \mathcal{X}, \mu)$.

Definition 478 For any $p \in [1, \infty)$, we define $L^p(X)$ with $X \subset \mathbb{R}$ to be the space of all $\mathcal{L}$-measurable functions $f : X \to \mathbb{R}$ such that $\int_X |f(x)|^p\,dx < \infty$, and $L^\infty(X)$ to be the space of all essentially bounded $\mathcal{L}$-measurable functions (i.e. functions which are bounded almost everywhere; see Figure 6.2.2). Furthermore, define the function $\|\cdot\|_p : L^p(X) \to \mathbb{R}$ as
\[ \|f\|_p = \begin{cases} \left( \int_X |f|^p \right)^{\frac{1}{p}} & p \in [1, \infty) \\ \operatorname{ess\,sup} |f| & p = \infty. \end{cases} \]
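The contrast between the two norms in Example 477 can be checked numerically. The sketch below is only an illustration under stated assumptions: integrals are approximated by a midpoint rule and the supremum by a grid maximum, and the function names and cell count are choices made here.

```python
def f(n, x):
    """The function f_n of Example 477 on [-1, 1]."""
    if x <= 0.0:
        return 1.0
    if x < 1.0 / n:
        return 1.0 - n * x
    return 0.0


def l1_distance(n, m, cells=20000):
    """Midpoint-rule approximation of ||f_n - f_m||_1 on [-1, 1]."""
    width = 2.0 / cells
    total = 0.0
    for j in range(cells):
        x = -1.0 + (j + 0.5) * width
        total += abs(f(n, x) - f(m, x))
    return total * width


def sup_distance(n, m, cells=20000):
    """Grid approximation of ||f_n - f_m||_inf on [-1, 1]."""
    width = 2.0 / cells
    return max(abs(f(n, -1.0 + (j + 0.5) * width) - f(m, -1.0 + (j + 0.5) * width))
               for j in range(cells))


# The 1-norm distance between f_{2m} and f_m shrinks with m,
# while the sup-norm distance stays near 1/2.
for m in (10, 100):
    print(m, l1_distance(2 * m, m), sup_distance(2 * m, m))
```

This is consistent with the sequence being Cauchy in $\|\cdot\|_1$ but not in the sup norm, so the lack of a continuous limit does not contradict the completeness of $(C(X), \|\cdot\|_\infty)$.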
We shall establish that $\|\cdot\|_p$ defines a norm on $L^p(X)$. For $p \in [1, \infty)$, this norm is called the $L^p$-norm or simply the $p$-norm or Lebesgue norm. To show that $\|\cdot\|_p$ satisfies the triangle inequality property required of a norm, we use the same procedures as we used in $\ell^p$ spaces in Chapter 4.

Theorem 479 (Riesz-Hölder Inequality) Let $p, q$ be nonnegative conjugate real numbers (i.e. $\frac{1}{p} + \frac{1}{q} = 1$). If $f \in L^p(X)$ and $g \in L^q(X)$, then $fg \in L^1$ and $\int_X |fg| \le \|f\|_p \|g\|_q$, with equality iff $\alpha |f|^p = \beta |g|^q$ a.e. where $\alpha, \beta$ are nonzero constants.

Proof. When $p = 1$, then $q = \infty$. Take $f \in L^1$ and $g \in L^\infty$. Then $g$ is bounded a.e. so that $|g| \le M$ a.e. with $M = \|g\|_\infty$. Now $|fg| \le M|f|$ a.e. so that $fg \in L^1$. Integrating we have
\[ \int_X |fg| \le M \int_X |f| = \|f\|_1 \|g\|_\infty. \]
Next assume $p, q \in (1, \infty)$. If either $f = 0$ a.e. or $g = 0$ a.e., we have equality. So suppose neither $f$ nor $g$ vanishes a.e. Substituting $a = \frac{|f(x)|}{\|f\|_p}$ and $b = \frac{|g(x)|}{\|g\|_q}$ in Lemma 235, we have
\[ \frac{|f(x) g(x)|}{\|f\|_p \|g\|_q} \le \frac{1}{p} \frac{|f(x)|^p}{\|f\|_p^p} + \frac{1}{q} \frac{|g(x)|^q}{\|g\|_q^q}. \]
Then $fg \in L^1$ and by integrating we get
\[ \frac{\int_X |f(x) g(x)|}{\|f\|_p \|g\|_q} \le \frac{1}{p} \frac{\int_X |f(x)|^p}{\|f\|_p^p} + \frac{1}{q} \frac{\int_X |g(x)|^q}{\|g\|_q^q} = \frac{1}{p} + \frac{1}{q} = 1 \]
or $\int_X |fg| \le \|f\|_p \|g\|_q$. By Lemma 235 equality holds when $a^p = b^q$, which means $\|g\|_q^q |f|^p = \|f\|_p^p |g|^q$.

Now we need to show that $\|\cdot\|_p$ satisfies the triangle inequality property of a norm.

Theorem 480 (Riesz-Minkowski) For $p \in [1, \infty]$ and $f, g \in L^p$, $\|f + g\|_p \le \|f\|_p + \|g\|_p$.

Proof. For $p = 1$ and $p = \infty$, it follows trivially from $|f + g| \le |f| + |g|$. Let $p \in (1, \infty)$ and let $h = |f + g|^{p-1}$. Since $p - 1 = \frac{p}{q}$, it follows that $h \in L^q$ and $\|h\|_q^q = \int |f + g|^p = \|f + g\|_p^p$. Now
\[ \|f + g\|_p^p = \int |f + g|^{1 + \frac{p}{q}} = \int |f + g| \, |f + g|^{\frac{p}{q}} \le \int |f| h + \int |g| h \le \left( \|f\|_p + \|g\|_p \right) \|h\|_q = \left( \|f\|_p + \|g\|_p \right) \|f + g\|_p^{\frac{p}{q}} \]
where the second inequality follows from Theorem 479. Since $p - \frac{p}{q} = 1$, dividing both sides by $\|f + g\|_p^{\frac{p}{q}} \neq 0$, we have $\|f + g\|_p \le \|f\|_p + \|g\|_p$.
Finally, if $\|f + g\|_p = 0$, then the inequality holds trivially.

As in $L^1$ considered in Section 5.4, we stress that the function $\|\cdot\|_p$ satisfies all properties of a norm except the zero property (i.e. $\|f\|_p = 0$ does not imply $f = 0$ everywhere). Using equivalence classes of functions rather than functions themselves, it can be shown as in the previous section that $\|f\|_p$ is a norm on $L^p$.

Again, the most important question we must ask of our new normed vector space is "Is it complete?" The next theorem provides the answer.

Theorem 481 (Riesz-Fischer) For $p \in [1, \infty]$, $(L^p, \|\cdot\|_p)$ is a complete normed vector space (i.e. a Banach space).

Proof. (Sketch) The proof for $p = 1$ was already given in Theorem 443. The proof for $p \in (1, \infty)$ is virtually identical. Finally, let $p = \infty$ and let $\langle f_n \rangle$ be a Cauchy sequence in $L^\infty$. For $x \in X$,
\[ |f_k(x) - f_n(x)| \le \|f_k - f_n\|_\infty \tag{6.2} \]
except on a set $A_{k,n} \subset X$ with $m A_{k,n} = 0$ by Definition 368 of the essential supremum. If $A = \cup_{k,n} A_{k,n}$, then $mA = 0$ and $|f_k(x) - f_n(x)| \le \|f_k - f_n\|_\infty$, $\forall k, n \in \mathbb{N}$ with $k > n$ and $\forall x \in X \setminus A$. Since $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$, there exists a bounded function $f(x)$ to which $\langle f_n(x) \rangle$ converges $\forall x \in X \setminus A$. Moreover this convergence is uniform outside $A$ as (6.2) indicates.

Now we would like to establish how $L^p$ spaces are related to one another and also how they are related to the set of continuous functions $C(X, Y)$. Before doing that, however, we present an example which shows that continuity does not guarantee that a function is an element of $L^p$.

Example 482 Let $f : (0, 1) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$. The function $f$ is continuous on $(0, 1)$ but is not $p$-integrable for any $p \ge 1$. Hence $f \in C((0, 1), \mathbb{R})$ but $f \notin L^p((0, 1))$.

Lemma 483 $BC(X, \mathbb{R}) \subset L^\infty(X)$.

Proof. If a function is bounded and continuous, it must be essentially bounded.

Note that from Definition 478, if the function is continuous, then the ess sup is just the sup.
That is, if $f \in BC(X)$, then $\|f\|_\infty$ is the extension (in the sense of Definition 56) from $BC(X)$ to $L^\infty(X)$ of the sup norm. This also justifies why we used the prior notation $\|\cdot\|_\infty$ for the sup norm. Thus convergence in $L^\infty(X)$ is equivalent to uniform convergence outside a set of measure zero. Note that if $X$ is compact, then $C(X) \subset L^\infty(X)$.

If we make additional assumptions about the domain $X$, however, there are inclusion relations among the $L^p(X)$ spaces and their associated norms.

Theorem 484 If $m(X) < \infty$, then for $1 < p < q < \infty$ we have $L^\infty(X) \subset L^q(X) \subset L^p(X) \subset L^1(X)$ and $\|f\|_1 \le c_1 \|f\|_p \le c_2 \|f\|_q \le c_3 \|f\|_\infty$ where $c_i$ are constants which are independent of $f$.

Proof. $L^\infty(X) \subset L^q(X)$ for $1 < q < \infty$ and $m(X) < \infty$, since if $f$ is bounded a.e. (i.e. $f \in L^\infty$) and measurable, then since $m(X)$ is finite we have that $|f|^q$ is integrable by Theorem 382.
Assume $1 < p < q < \infty$. Let $f \in L^q$. Then $|f|^p \in L^{\frac{q}{p}}$. Set $\lambda = \frac{q}{p}$. Since $q > p$, we have $\lambda > 1$. Choose $\mu$ such that $\frac{1}{\lambda} + \frac{1}{\mu} = 1$. Then
\[ \int |f|^p = \int |f|^p \cdot 1 \le \left( \int |f|^{p\lambda} \right)^{\frac{1}{\lambda}} \cdot \left( \int 1 \right)^{\frac{1}{\mu}} = \left( \int |f|^q \right)^{\frac{p}{q}} \cdot (m(X))^{\frac{1}{\mu}} < \infty \]
where the first inequality follows from Hölder's Inequality (Theorem 479). Taking the $p$-th root of both sides of the above inequality we obtain $\|f\|_p \le [m(X)]^{\frac{1}{p\mu}} \|f\|_q$. Hence $f \in L^p(X)$.

Note, for instance, that the proof gives a constructive way to obtain the constant relating $\|f\|_p$ and $\|f\|_q$, namely $[m(X)]^{\frac{1}{p\mu}}$ with $\frac{1}{p\mu} = \frac{1}{p} - \frac{1}{q}$. Thus stating that for $p < q$, $L^q(X) \subset L^p(X)$ means that if $f$ is $q$-integrable, then $f$ is also $p$-integrable and $\|f\|_p \le c \|f\|_q$ where $c$ is some constant. This inequality implies that if a sequence $\langle f_n \rangle$ converges in $L^q(X)$, then $\langle f_n \rangle \subset L^p(X)$ and converges also in $L^p(X)$. Note also that in this theorem we compare normed vector spaces with different norms.

Putting Lemma 483 and Theorem 484 together we have the following result.

Corollary 485 If $m(X) < \infty$, $BC(X) \subset L^p(X)$.

Example 486 The inclusions in Theorem 484 are strict. For instance, let $1 \le q < \infty$ and take $f : (0, 1) \to \mathbb{R}$ given by $f(x) = \frac{1}{x^{1/q}}$.
Then $f \in L^p((0, 1))$ for $p < q$ but $f \notin L^q((0, 1))$. In particular, take $f(x) = \frac{1}{\sqrt{x}}$. Then $\int_{(0,1)} \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\big|_0^1 = 2$ so $f \in L^1((0, 1))$ but $\int_{(0,1)} \left( \frac{1}{\sqrt{x}} \right)^2 dx = \int_{(0,1)} \frac{1}{x}\,dx = \ln(x)\,\big|_0^1 = \infty$ so $f \notin L^2((0, 1))$.

Example 487 The assumption that $m(X) < \infty$ is important. Take $f : [1, \infty) \to \mathbb{R}$ given by $f(x) = \frac{1}{x^{1/p}}$ with $1 \le p < \infty$. Then $f \in L^q([1, \infty))$ if $q > p$ but $f \notin L^p([1, \infty))$. In particular, $f(x) = \frac{1}{x} \in L^2([1, \infty))$ but $f \notin L^1([1, \infty))$.

Comparing Theorem 239 (in $\ell^p$) and Theorem 484 (in $L^p$), one may wonder why the order of $\ell^p$ spaces is exactly opposite that of $L^p(X)$ spaces with $m(X) < \infty$. That is, for $1 < p < q < \infty$,
\[ \ell^1 \subset \ell^p \subset \ell^q \subset \ell^\infty, \qquad L^1 \supset L^p \supset L^q \supset L^\infty. \]
$\ell^p$ spaces are spaces of sequences and we know that a sequence is just a function $f : \mathbb{N} \to \mathbb{R}$; that is, a function defined on an unbounded set. If $\langle x_i \rangle \in \ell^p$, then $\sum_{i=1}^{\infty} |x_i|^p < \infty$. This infinite sum can only be finite if $|x_i|^p$ decreases "rapidly enough" to zero. Now if $p < q$, then $|x_i|^q$ decreases "more rapidly" than $|x_i|^p$ (i.e. $|x_i|^q < |x_i|^p$ once $|x_i| < 1$). Hence if $\langle x_i \rangle \in \ell^p$, then $\langle x_i \rangle \in \ell^q$. In the case of $L^p(X)$ with $m(X) < \infty$, while $X$ has finite measure, $f : X \to \mathbb{R}$ may not be bounded. If $f \in L^q(X)$, then $\int_X |f|^q < \infty$. For $p < q$, we have $|f|^p \le |f|^q$ wherever $|f| \ge 1$ and $|f|^p \le 1$ elsewhere, so $\int_X |f|^p \le m(X) + \int_X |f|^q < \infty$ and hence $f \in L^p(X)$.

6.2.1 Additional Topics in $L^p(X)$

Approximation in $L^p(X)$

For $L^p(X)$, $p \in (1, \infty)$, we have the following result which is similar to Theorem 445 in $L^1(X)$.

Theorem 488 Let $1 < p < \infty$, $X \subset \mathbb{R}$, $f \in L^p(X)$ and $\varepsilon > 0$. Then (i) there is an integrable simple function $\varphi$ such that $\|f - \varphi\|_p < \varepsilon$; and (ii) there is a continuous function $g$ such that $g$ vanishes ($g = 0$) outside some bounded interval and such that $\|f - g\|_p < \varepsilon$.

Exercise 6.2.1 Prove Theorem 488. Hint: See Carothers p. 350.

Note that here $X$ can be equal to $\mathbb{R}$ so that the theorem also covers sets of infinite measure.

Corollary 489 The set of all integrable simple functions is dense in $L^p(X)$.
The set of all continuous functions vanishing outside a bounded interval is dense in $L^p(X)$.

Now consider the case where $p = \infty$. Let $f \in L^\infty(X)$, in which case $f$ is bounded a.e. on $X$ (i.e. there is a set $E$ such that $m(E) = 0$ and $f$ is bounded on $X \setminus E$). Then by Theorem 367, there exists a sequence of simple functions $\langle \varphi_n \rangle$ converging uniformly to $f$ on $X \setminus E$. In other words, $\varphi_n \to f$ uniformly a.e. on $X$. Thus, $\varphi_n \to f$ in $L^\infty(X)$.

Corollary 490 The simple functions are dense in $L^\infty(X)$.

If $m(X) < \infty$, then any simple function is integrable. Thus we have:

Corollary 491 If $m(X) < \infty$, then the integrable simple functions are dense in $L^\infty(X)$.

Notice that the condition $m(X) < \infty$ is critical here. For example, $f = 1 \in L^\infty(\mathbb{R})$ cannot be approximated by an integrable simple function.

Separability of $L^p(X)$

If $X$ is compact, then the set of all polynomials with rational coefficients $P_{\mathbb{Q}}(X)$ is dense in $C(X)$ (by Corollary 474) and because $C(X)$ is dense in $L^p(X)$ (by Corollary 489), $P_{\mathbb{Q}}(X)$ is also dense in $L^p(X)$. Thus $L^p(X)$ with $X$ compact is separable. If $X$ is not compact, then as we showed in $L^1(X)$, the set $M$ of all finite linear combinations of the form $\sum_{i=1}^{n} c_i \chi_{I_i}$, where $c_i$ are rational numbers and $I_i$ are intervals with rational endpoints, is a countable dense set in $L^p(X)$.

Theorem 492

Corollary 493 $L^p(X)$ is separable for $1 < p < \infty$.

Corollary 494 $L^\infty(X)$ is not separable for any $X$ (either compact or not).

Proof. Take two bounded functions $\chi_{[a,c]}$ and $\chi_{[a,d]}$. Since $\|\chi_{[a,c]} - \chi_{[a,d]}\|_\infty = 1$ for $c \neq d$, then $B_{\frac{1}{2}}\left(\chi_{[a,c]}\right) \cap B_{\frac{1}{2}}\left(\chi_{[a,d]}\right) = \emptyset$ for $c \neq d$, where $B_{\frac{1}{2}}\left(\chi_{[a,c]}\right) = \left\{ f \in L^\infty : \|f - \chi_{[a,c]}\|_\infty < \frac{1}{2} \right\}$. Let $F$ be an arbitrary set which is dense in $L^\infty([a, b])$. Then for each $c$ with $a < c < b$ there is a function $f_c \in F$ such that $\|\chi_{[a,c]} - f_c\|_\infty < \frac{1}{2}$, since $\chi_{[a,c]} \in L^\infty([a, b])$ and $F$ is dense in $L^\infty([a, b])$. Because $f_c \neq f_d$ for $c \neq d$ and there are uncountably many real numbers between $a$ and $b$, $F$ must be uncountable.
6.2.2 Hilbert Spaces ($L^2(X)$)

As we mentioned in Chapter 4, a Hilbert space is a Banach space equipped with an inner product. Hence, a Hilbert space is a special type of Banach space which possesses an additional structure: an inner product. This additional structure allows us, apart from measuring length of vectors (norms), to measure angles between vectors. In particular it enables us to introduce the notion of orthogonality for two vectors.

Definition 495 We say that two vectors $x$ and $y$ of $H$ are orthogonal (perpendicular) if their inner product $\langle x, y \rangle = 0$, and we denote it $x \perp y$. The set $N \subset H$ is called an orthogonal set (or orthogonal system) if any two different elements $\varphi$ and $\psi$ of $N$ are orthogonal, that is $\langle \varphi, \psi \rangle = 0$. An orthogonal set $N$ is called orthonormal if it is orthogonal and $\|\varphi\| = 1$ for each $\varphi$ in $N$.

Example 496 $\mathbb{R}^n$ with the inner product defined by $\langle x, y \rangle = x_1 y_1 + \ldots + x_n y_n = \sum_{i=1}^{n} x_i y_i$ is a Hilbert space. The set $N = \{e_i, i = 1, \ldots, n\}$ where $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ is orthonormal.

Example 497 $\ell^2$ with the inner product defined by $\langle x, y \rangle = \sum_{i=1}^{\infty} x_i y_i$ where $x = \langle x_i \rangle$, $y = \langle y_i \rangle$ is a Hilbert space. The set $N = \{e_i, i \in \mathbb{N}\}$ is orthonormal.

Example 498 $L^2([0, 2\pi])$ with inner product defined by $\langle f, g \rangle = \int_0^{2\pi} f(x) g(x)\,dx$ where $f, g \in L^2([0, 2\pi])$ is a Hilbert space. The set $N = \left\{ \frac{1}{\sqrt{2\pi}}, \frac{\cos nx}{\sqrt{\pi}}, \frac{\sin nx}{\sqrt{\pi}}, n \in \mathbb{N} \right\}$ is orthonormal.

Exercise 6.2.2 Show that $N$ in Example 498 is an orthonormal system.

Notice that the distance between any two distinct elements of an orthonormal system is $\sqrt{2}$. That is,
\[ \|\varphi - \psi\|^2 = \langle \varphi - \psi, \varphi - \psi \rangle = \langle \varphi, \varphi \rangle - \langle \varphi, \psi \rangle - \langle \psi, \varphi \rangle + \langle \psi, \psi \rangle = \|\varphi\|^2 + \|\psi\|^2 = 1 + 1 = 2. \]

Lemma 499 If $H$ is a separable Hilbert space, then each orthonormal set is countable.

Proof. Let $U = \{e_\alpha, \alpha \in A\}$, where $A$ is an index set, be an uncountable orthonormal set in $H$. Then the collection of balls around each element $e_\alpha$ with radius $\frac{1}{2}$ (i.e. $\{B_{\frac{1}{2}}(e_\alpha), \alpha \in A\}$) would be an uncountable collection of disjoint balls and hence $H$ could not be separable.
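The orthonormality claimed in Example 498 and the $\sqrt{2}$ distance between distinct elements can be checked numerically. This is a rough sketch, assuming the sine terms are normalised by $1/\sqrt{\pi}$ like the cosine terms and approximating the inner product by a midpoint rule; all function names are hypothetical.

```python
import math


def inner(f, g, cells=1024):
    """Midpoint-rule approximation of <f, g> = integral of f*g over [0, 2*pi]."""
    h = 2.0 * math.pi / cells
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(cells))


def trig_system(max_n):
    """The functions 1/sqrt(2 pi), cos(nx)/sqrt(pi), sin(nx)/sqrt(pi), n <= max_n."""
    system = [lambda x: 1.0 / math.sqrt(2.0 * math.pi)]
    for n in range(1, max_n + 1):
        # n=n freezes the loop variable inside each lambda
        system.append(lambda x, n=n: math.cos(n * x) / math.sqrt(math.pi))
        system.append(lambda x, n=n: math.sin(n * x) / math.sqrt(math.pi))
    return system


def gram_deviation(max_n):
    """Largest entrywise gap between the Gram matrix and the identity."""
    system = trig_system(max_n)
    worst = 0.0
    for i, f in enumerate(system):
        for j, g in enumerate(system):
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(inner(f, g) - target))
    return worst


def distance(f, g):
    """Norm of f - g induced by the inner product."""
    diff = lambda x: f(x) - g(x)
    return math.sqrt(inner(diff, diff))


system = trig_system(3)
print(gram_deviation(3), distance(system[1], system[2]))
```

A Gram matrix close to the identity is what orthonormality means in coordinates; the numerical check is of course no substitute for Exercise 6.2.2.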
Definition 500 An orthonormal set $\{e_\alpha, \alpha \in A\}$ is said to be complete if it is maximal. In other words it is not possible to adjoin an additional element $e \in H$ with $e \neq 0$ to $\{e_\alpha, \alpha \in A\}$ such that $\{e, e_\alpha, \alpha \in A\}$ is an orthonormal set in $H$.

The existence of a complete orthonormal set in any Hilbert space $H$ is guaranteed by Zorn's lemma because the collection $\{N\}$ of all orthonormal sets in $H$ is partially ordered by set inclusion and every chain in it has an upper bound (its union). Thus we have the following.

Theorem 501 Every separable Hilbert space contains a countable complete orthonormal system.

The following theorem can be used to check if the orthonormal set is complete.

Theorem 502 $\{e_\alpha, \alpha \in A\}$ is a complete orthonormal set in $H$ iff $x \perp e_\alpha$, $\forall \alpha \in A$ implies $x = 0$.

Proof. By contradiction. ($\Rightarrow$) Let $\{e_\alpha, \alpha \in A\}$ be complete and suppose $\exists x \neq 0$ in $H$ such that $x \perp e_\alpha$, $\forall \alpha \in A$. Define $e = \frac{x}{\|x\|}$ so that $\|e\| = 1$. Hence $\{e, e_\alpha, \alpha \in A\}$ is orthonormal, which contradicts the assumption that $\{e_\alpha, \alpha \in A\}$ is complete (maximal).
($\Leftarrow$) Assume that $x \perp e_\alpha$, $\forall \alpha \in A \Rightarrow x = 0$ and that $\{e_\alpha, \alpha \in A\}$ is not complete. Then $\exists e \in H$ such that $\{e, e_\alpha, \alpha \in A\}$ is an orthonormal system and $e \notin \{e_\alpha, \alpha \in A\}$. Since $e \perp e_\alpha$, $\forall \alpha \in A$ and $e \neq 0$ (because $\|e\| = 1$), the assumption is contradicted.

Exercise 6.2.3 Show that the orthonormal systems in $\mathbb{R}^n$, $\ell^2$, and $L^2([0, 2\pi])$ defined in Examples 496 to 498 are complete.

Consider now a separable Hilbert space $H$ and let $\{e_i\}$ be an orthonormal system in $H$. We know that $\{e_i\}$ is either a finite or countably infinite set. We define the Fourier coefficients with respect to $\{e_i\}$ of an element $x \in H$ to be $a_i = \langle x, e_i \rangle$.

Theorem 503 (Bessel's Inequality) Let $\{e_i\}$ be an orthonormal system in $H$ and let $x \in H$. Then
\[ \sum_{i=1}^{I} a_i^2 \le \|x\|^2 \]
where $a_i = \langle x, e_i \rangle$ are the Fourier coefficients of $x$ and $I = N$ if $\{e_i\}$ is finite with $N$ elements or $I = \infty$ otherwise.

Proof.
\[ 0 \le \left\| x - \sum_{i=1}^{n} a_i e_i \right\|^2 = \|x\|^2 - 2 \sum_{i=1}^{n} a_i \langle x, e_i \rangle + \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \langle e_i, e_j \rangle = \|x\|^2 - \sum_{i=1}^{n} a_i^2. \]
Thus $\sum_{i=1}^{n} a_i^2 \le \|x\|^2$ and since $n$ was arbitrary, we have $\sum_{i=1}^{\infty} a_i^2 \le \|x\|^2$.

Now let $a_i$ be the Fourier coefficients of $x$ with respect to $\{e_i\}$, so that $\sum_{i=1}^{\infty} a_i^2 < \infty$ (i.e. $\sum_{i=1}^{\infty} a_i^2$ converges). Then consider a sequence $\langle z_n \rangle$ defined by
\[ z_n = \sum_{i=1}^{n} a_i e_i. \]
For $m > n$,
\[ z_m - z_n = \sum_{i=n+1}^{m} a_i e_i, \]
and we have $\|z_m - z_n\|^2 = \sum_{i=n+1}^{m} \sum_{j=n+1}^{m} a_i a_j \langle e_i, e_j \rangle = \sum_{i=n+1}^{m} a_i^2$. This term can be made sufficiently small for $m, n$ large enough because $\sum_{i=1}^{\infty} a_i^2$ is convergent. Hence $\langle z_n \rangle$ is a Cauchy sequence. Because $H$ is a Hilbert space (thus complete), there exists $y \in H$ such that $y = \sum_{i=1}^{\infty} a_i e_i$. Since the inner product is continuous (by the Cauchy-Schwarz inequality), we have
\[ \langle y, e_i \rangle = \lim_{n \to \infty} \langle z_n, e_i \rangle = \left\langle \sum_{j=1}^{\infty} a_j e_j, e_i \right\rangle = a_i. \]
Thus $a_i$ are Fourier coefficients of $y$, as well as of $x$ (which we started with). When does $x$ equal $y$? In other words, when are the elements with the same Fourier coefficients equal? Let $a_i$ be the Fourier coefficients of two elements $x$ and $y$ (i.e. $a_i = \langle y, e_i \rangle = \langle x, e_i \rangle$). But this is equivalent to
\[ 0 = \langle y - x, e_i \rangle, \quad \forall i = 1, 2, \ldots \]
By Theorem 502, this forces $x = y$ precisely when the orthonormal system $\{e_i\}$ is complete. Hence we proved the following:

Theorem 504 (Parseval Equality) If $\{e_i\}$ is a complete orthonormal system in a Hilbert space $H$ then for each $x \in H$, $x = \sum_{i=1}^{\infty} a_i e_i$ (or $x = \sum_{i=1}^{N} a_i e_i$ if the system is finite) where $a_i = \langle x, e_i \rangle$. Moreover $\|x\|^2 = \sum_{i=1}^{\infty} a_i^2$.

To summarize the previous findings, let $\{e_i\}$ be a complete orthonormal system of a Hilbert space $H$ and let $a_i = \langle x, e_i \rangle$, $i = 1, 2, \ldots$ be the Fourier coefficients of $x$ with respect to $\{e_i\}$. Then the Fourier series $\sum_{i=1}^{\infty} a_i e_i$ converges to $x$ (with respect to the norm of $H$). That is,
\[ \lim_{n \to \infty} \sum_{i=1}^{n} a_i e_i = \lim_{n \to \infty} \sum_{i=1}^{n} \langle x, e_i \rangle e_i = x \]
or equivalently
\[ \left\| \sum_{i=1}^{n} \langle x, e_i \rangle e_i - x \right\| \to 0. \]
This implies that if $x, y \in H$ have the same Fourier coefficients, then $\|x - y\|_H = 0$ which means $x = y$. Depending on the space we deal with this may mean that $x = y$ a.e.
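Bessel's inequality and the Parseval equality can be watched at work in $L^2([0, 2\pi])$. The sketch below is an illustration under assumptions made here: the element is $x(t) = t$, the orthonormal system is the trigonometric one with sines and cosines both scaled by $1/\sqrt{\pi}$, and inner products are approximated by a midpoint rule.

```python
import math


def inner(f, g, cells=2000):
    """Midpoint-rule approximation of <f, g> = integral of f*g over [0, 2*pi]."""
    h = 2.0 * math.pi / cells
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(cells))


def trig_system(max_n):
    """The functions 1/sqrt(2 pi), cos(nx)/sqrt(pi), sin(nx)/sqrt(pi), n <= max_n."""
    system = [lambda t: 1.0 / math.sqrt(2.0 * math.pi)]
    for n in range(1, max_n + 1):
        system.append(lambda t, n=n: math.cos(n * t) / math.sqrt(math.pi))
        system.append(lambda t, n=n: math.sin(n * t) / math.sqrt(math.pi))
    return system


def bessel_partial_sum(x, max_n):
    """Sum of squared Fourier coefficients <x, e_i>^2 over the truncated system."""
    return sum(inner(x, e) ** 2 for e in trig_system(max_n))


x = lambda t: t                 # assumed example element of L^2([0, 2*pi])
norm_sq = inner(x, x)           # approximates ||x||^2
for max_n in (1, 5, 20):
    print(max_n, bessel_partial_sum(x, max_n), norm_sq)
```

The partial sums increase with the size of the system and stay below $\|x\|^2$, as Bessel's inequality requires; the remaining gap shrinks as more terms are included, which is what the Parseval equality predicts for a complete system.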
Example 505 Add $L^2([0, 2\pi])$

6.3 Linear operators

In the previous two sections on $C(X)$ and $L^p(X)$ we studied normed vector (linear) spaces whose elements were functions. In this section, we study functions that operate between two normed vector spaces. We call these functions operators (to distinguish them from the functions that are elements of the normed vector spaces). We will focus primarily on operators that preserve the algebraic structure of vector (linear) spaces. These functions are called linear operators. Because normed vector spaces are also metric spaces, we will also address the issue of how linearity relates to continuity.

Definition 506 Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed vector spaces. A function $T : X \to Y$ is called a linear operator if $T(\alpha x + \beta x') = \alpha T x + \beta T x'$, $\forall x, x' \in X$ and $\alpha, \beta \in \mathbb{R}$.

Example 507 Consider the Banach space $(C([0, 1]), \|\cdot\|_\infty)$. Assume that a function $g : [0, 1] \times [0, 1] \to \mathbb{R}$ is continuous. Define $T : C([0, 1]) \to C([0, 1])$ by $(Tx)(t) = \int_{[0,1]} g(t, s) x(s)\,ds$. For instance, $g(t, s)$ could be a joint density function and $x(s) = s$. Then $Tx(t)$ is the mean of $s$ conditional on $t$. It is easy to show that $T$ is linear (due to the linearity of the integral).

We would like to characterize continuous linear operators. First we prove an important fact about continuity of linear operators.

Theorem 508 Let $X, Y$ be normed vector spaces and $T : X \to Y$ be a linear operator. Then $T$ is continuous on $X$ iff $T$ is continuous at any one element in $X$.

Proof. ($\Rightarrow$) By definition.
($\Leftarrow$) Let $T$ be continuous at $x_0 \in X$ and $x \in X$ be arbitrary. Let $\langle x_n \rangle \subset X$ and $x_n \to x$. Then $\langle x_n - x + x_0 \rangle \to x_0$. Therefore, by Theorem 248, $T(x_n - x + x_0) \to T x_0$ (because $T$ is continuous at $x_0$). But if $T(x_n - x + x_0) = T x_n - T x + T x_0 \to T x_0$ (where the equality follows from linearity of $T$), then $T x_n - T x \to 0 \Leftrightarrow T x_n \to T x$. Hence $T$ is continuous at $x$.

Here we stress that all one needs to establish continuity is that $T$ is continuous at one point.
The result is a simple consequence of the linearity of the operator $T$ (just as we proved in earlier chapters that a linear function is continuous). But one should not be confused; it is not the case that all linear operators are continuous, since a linear operator may fail to be continuous at every point in $X$ (see Example 514).

Just as we considered restricting the space of all functions $F(X, Y)$ from a metric space $X$ to a metric space $Y$ to the subset $B(X, Y)$ of all bounded functions in the introduction to this chapter, now we introduce a bounded linear operator and define a new norm.

Definition 509 Let $X, Y$ be normed vector spaces and $T : X \to Y$ be a linear operator. $T$ is said to be bounded on $X$ if $\exists K \in \mathbb{R}_{++}$ such that $\|Tx\|_Y \le K \cdot \|x\|_X$, $\forall x \in X$.

We note that this type of boundedness is different from that in Definition 163. In that case, we would say $\exists M$ such that $\|f(x)\|_Y \le M$, $\forall x \in X$. The next example shows how different they are.

Example 510 Let $(X, \|\cdot\|_X) = (\mathbb{R}, |\cdot|) = (Y, \|\cdot\|_Y)$ and $T : X \to Y$ be given by $Tx = 2x$. $T$ as a linear function is not bounded on $\mathbb{R}$ with respect to $\|\cdot\|_Y$ since $2x$ can be arbitrarily large. But $T$ as a linear operator is bounded in the sense of Definition 509 since $\|Tx\|_Y \le K \cdot \|x\|_X$ holds with $K = 2$: $|2x| \le 2|x|$, $\forall x \in X$.

In the remainder of the book, when we say that a linear operator is bounded, we mean it in the sense of Definition 509. The following result shows that a bounded linear operator is equivalent to a continuous operator.

Theorem 511 Let $T : X \to Y$ be a linear operator. Then $T$ is continuous iff $T$ is bounded.

Proof. ($\Leftarrow$) Assume that $T$ is bounded and let $\|x_n\|_X \to 0$. Then $\exists K$ such that $\|T x_n\|_Y \le K \|x_n\|_X \to 0$ as $n \to \infty$. But this implies $\|T x_n\|_Y \to 0$ so that $T$ is continuous at zero and hence continuous on $X$ by Theorem 508.
($\Rightarrow$) By contraposition. In particular, we will prove that if $T$ is not bounded, then $T$ is not continuous. If $T$ is not bounded, then $\forall n \in \mathbb{N}$, $\exists x_n \in X$ with $x_n \neq 0$ such that $\|T x_n\|_Y > n \|x_n\|_X$. But this implies
\[ \left\| T\!\left( \frac{x_n}{n \|x_n\|_X} \right) \right\|_Y > 1. \]
Setting $y_n = \frac{x_n}{n \cdot \|x_n\|_X}$, we know $\|y_n\|_X = \frac{1}{n} \to 0$ as $n \to \infty$. But $\|T y_n\|_Y > 1$, $\forall n \in \mathbb{N}$. Thus $T y_n$ cannot converge to 0 and $T$ is not continuous at 0 (and hence not continuous).

Example 512 Consider the linear operator $T : C([0, 1]) \to C([0, 1])$ defined in Example 507 by $(Tx)(t) = \int_{[0,1]} g(t, s) x(s)\,ds$. Since $g : [0, 1] \times [0, 1] \to \mathbb{R}$ is a continuous function on a compact domain, it is bounded, or $|g(x_1, x_2)| \le M_1$, $\forall (x_1, x_2) \in [0, 1] \times [0, 1]$. Also, $x : [0, 1] \to \mathbb{R}$ is bounded by virtue of being in $C([0, 1])$, or $|x(t)| \le M_2 = \|x\|_\infty$, $\forall t \in [0, 1]$. Thus,
\[ |(Tx)(t)| = \left| \int_{[0,1]} g(t, s) x(s)\,ds \right| \le M_1 \int_{[0,1]} |x(s)|\,ds \le M_1 M_2, \]
so that $\|Tx\|_\infty \le M_1 \|x\|_\infty$ and $T$ is bounded in the sense of Definition 509.

Definition 513 Let $L(X, Y)$ be the set of all linear operators $T : X \to Y$ where $X, Y$ are normed vector spaces. Let $BL(X, Y)$ be the set of all bounded linear operators in $L(X, Y)$.

The next example shows that $BL(X, Y)$ is a proper subset of $L(X, Y)$. Coupled with Theorem 511 it also shows that not all linear operators are continuous.

Example 514 Consider the normed vector space of all polynomials $P : [0, 1] \to \mathbb{R}$ with the sup norm $\|\cdot\|_\infty$. Define $T : P([0, 1]) \to P([0, 1])$ by $(Tx)(t) = \frac{dx(t)}{dt}$, $t \in [0, 1]$. $T$ is called the differentiation operator. It is easy to check that $T$ is linear (since the derivative of a sum is equal to the sum of the derivatives). But $T$ is not bounded. To see why, let $x_n(t) = t^n$, $\forall n \in \mathbb{N}$. Then $\|x_n\|_\infty = \sup\{|t^n|, t \in [0, 1]\} = 1$ and $(T x_n)(t) = \frac{dx_n(t)}{dt} = n \cdot t^{n-1}$. Therefore, $\|T x_n\|_\infty = \sup\{n |t|^{n-1}, t \in [0, 1]\} = n$, $\forall n \in \mathbb{N}$. Then $T$ is not bounded since there is no fixed number $K$ such that $\frac{\|T x_n\|_\infty}{\|x_n\|_\infty} = n \le K$ for all $n$. The sequence of functions $x_n(t) = t^n$ converges pointwise to
\[ x_0(t) = \begin{cases} 0 & t \in [0, 1) \\ 1 & t = 1 \end{cases} \]
but the sequence of their derivatives $x_n'(t) = n t^{n-1}$ doesn't converge to the derivative of $x_0(t)$ (which actually doesn't exist at $t = 1$).

In the introduction to this Chapter we defined the sup norm on $B(X, Y) \subset F(X, Y)$. What would be the consequences of equipping $BL(X, Y)$ with the sup norm? More specifically, how large would $(BL(X, Y), \|\cdot\|_\infty)$ be?
The next example shows it would be very, very small.

Example 515 Take $X = Y = \mathbb{R}$ (see footnote 5). All linear functions $T : \mathbb{R} \to \mathbb{R}$ are of the form $Tx = ax$, but these are not bounded with respect to the sup norm unless $a = 0$. Hence the only element that would belong to the set $BL(X,Y)$ of all bounded linear operators equipped with the sup norm would be $Tx = 0$, $\forall x \in \mathbb{R}$.

Definition 516 Let $T \in BL(X,Y)$. Then $T$ is bounded by assumption, so $\exists K$ such that $\|Tx\|_Y \le K \cdot \|x\|_X$, $\forall x \in X$. We call the least such $K$ the (operator) norm of $T$ and denote it $\|T\|$, where
$$\|T\| = \inf\{K : K > 0 \text{ and } \|Tx\|_Y \le K\|x\|_X, \ \forall x \in X\}. \tag{6.3}$$

Exercise 6.3.1 Prove that the function $\|T\|$ in (6.3) is a norm on $BL(X,Y)$.

What is the relation between the sup norm and this new operator norm? In the introduction to this chapter we defined the sup norm on the set $B(X,Y) \subset F(X,Y)$ of all bounded (linear and nonlinear) functions $f : X \to Y$. Now we have defined the operator norm on the set $BL(X,Y)$ of all bounded linear operators (functions) $T : X \to Y$. We show in the next example that these two norms are very different.

Example 517 In Example 510 we had $(X,\|\cdot\|_X) = (\mathbb{R},|\cdot|) = (Y,\|\cdot\|_Y)$ and $T : X \to Y$ given by $Tx = 2x$. $T$ is not bounded on $\mathbb{R}$ with respect to the sup norm since $\|2x\|_\infty = \sup\{|2x|, x \in \mathbb{R}\} = \infty$. However, $T$ is bounded in the sense of Definition 509 and its operator norm is finite: $\|T\| = \inf\{K : K > 0 \text{ and } |2x| \le K|x|, \ \forall x \in \mathbb{R}\} = 2$.

In the remainder of this section and the next, when we refer to the norm of a linear operator we mean the norm given in (6.3) if not specified otherwise. In the following theorem, we show that the norm of a linear operator can be expressed in many different ways.

Theorem 518 The norm of a bounded linear operator $T : X \to Y$ can be expressed as:
(i) $\|T\| = \inf\{K : K > 0 \text{ and } \|Tx\|_Y \le K\|x\|_X, \ \forall x \in X\}$;
(ii) $\|T\| = \sup\{\|Tx\|_Y, \ x \in X, \ \|x\|_X \le 1\}$;
(iii) $\|T\| = \sup\{\|Tx\|_Y, \ x \in X, \ \|x\|_X = 1\}$;
(iv) $\|T\| = \sup\left\{\frac{\|Tx\|_Y}{\|x\|_X}, \ x \in X, \ x \ne 0\right\}$.
Footnote 5. We cannot take $[-3,3] \subset \mathbb{R}$ since $X$ is supposed to be a vector subspace but $[-3,3]$ is not because, for instance, it is not closed under scalar multiplication (e.g. if we take the scalar 4 we have $[-12,12] \not\subset [-3,3]$).

Proof. Denote the right hand sides of expressions (i), (ii), (iii), and (iv) as $M_1, M_2, M_3, M_4$. We want to show that $M_1 = M_2 = M_3 = M_4$. From (i), we have $\|Tx\|_Y \le M_1\|x\|_X$, $\forall x \in X$. Now if $\|x\|_X \le 1$, then $\|Tx\|_Y \le M_1$. Since $M_2$ is the supremum of such a set, $M_2 \le M_1$. Since $\{\|Tx\|_Y, x \in X, \|x\|_X = 1\} \subset \{\|Tx\|_Y, x \in X, \|x\|_X \le 1\}$, we have $M_3 \le M_2$. Next, since $\frac{\|Tx\|_Y}{\|x\|_X} = \left\| T\left( \frac{x}{\|x\|_X} \right) \right\|_Y$ for $x \ne 0$, if we let $z = \frac{x}{\|x\|_X}$, then $\|z\|_X = \left\| \frac{x}{\|x\|_X} \right\|_X = \frac{\|x\|_X}{\|x\|_X} = 1$ and hence $M_3 = M_4$. From the definition of $M_4$, it follows that if $\|x\|_X \ne 0$, then $\frac{\|Tx\|_Y}{\|x\|_X} \le M_4$ or $\|Tx\|_Y \le M_4\|x\|_X$. Since $M_1$ is the infimum of all constants with this property, we have $M_1 \le M_4$. Thus we have $M_1 \le M_4 = M_3 \le M_2 \le M_1$, which implies the desired result.

Corollary 519 Let $X,Y$ be normed vector spaces and let $T : X \to Y$ be a bounded linear operator. Then $\|Tx\|_Y \le \|T\| \cdot \|x\|_X$, $\forall x \in X$.

The next theorem establishes the most important result of this section; namely that $BL(X,Y)$ is a complete normed vector space provided that $(Y,\|\cdot\|_Y)$ is complete. We cannot use the previous result on completeness in function spaces (Theorem 449) because $BL(X,Y)$ is equipped with a different norm. However, the proof is similar to that used to establish that $B(X,Y)$ is complete whenever $(Y,\|\cdot\|_Y)$ is complete.

Theorem 520 The space $BL(X,Y)$ of all bounded linear operators from a normed vector space $X$ to a complete normed vector space $Y$ is itself a complete normed vector space.

Proof. (Sketch) Let $\langle T_n \rangle$ be a Cauchy sequence in $BL(X,Y)$. For fixed $x \in X$, $\langle T_n(x) \rangle$ is Cauchy in $Y$, since $\|T_n x - T_m x\|_Y \le \|T_n - T_m\| \cdot \|x\|_X$ by Corollary 519. Since $Y$ is complete, $\langle T_n(x) \rangle$ converges to an element in $Y$; call it $Tx$. Thus we can define an operator $T : X \to Y$ by $Tx = \lim_{n\to\infty} T_n(x)$. It is easy to show that $T$ is linear and bounded and that $\langle T_n \rangle \to T$ in $BL(X,Y)$.
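The contrast between the bounded operator of Example 510 and the unbounded differentiation operator of Example 514 can be checked numerically through the ratio in formula (iv) of Theorem 518. The Python sketch below is an illustration only. It assumes that the sup norm on $[0,1]$ may be approximated by a maximum over a finite grid (which happens to be exact here, because $t^n$ and $nt^{n-1}$ attain their maxima at the grid point $t = 1$), and the helper names `sup_norm`, `monomial`, `monomial_derivative` and `growth_ratio` are hypothetical, not notation from the text.

```python
# Illustrative sketch: ratios ||Tx|| / ||x|| as in formula (iv) of Theorem 518.
# The grid approximation of the sup norm and all helper names are assumptions.

GRID = [k / 1000 for k in range(1001)]  # grid on [0, 1] containing both endpoints


def sup_norm(f, grid=GRID):
    """Approximate sup{|f(t)| : t in [0, 1]} by the maximum over the grid."""
    return max(abs(f(t)) for t in grid)


def monomial(n):
    """Return the function x_n(t) = t**n."""
    return lambda t: t ** n


def monomial_derivative(n):
    """Return (T x_n)(t) = n * t**(n - 1), the derivative of t**n."""
    return lambda t: n * t ** (n - 1)


def growth_ratio(n):
    """Ratio ||T x_n||_inf / ||x_n||_inf for the differentiation operator."""
    return sup_norm(monomial_derivative(n)) / sup_norm(monomial(n))


# Differentiation operator (Example 514): the ratio grows with n.
diff_ratios = [growth_ratio(n) for n in (1, 2, 5, 10, 50)]

# Doubling operator Tx = 2x on R (Example 510): the ratio is constant.
doubling_ratios = [abs(2 * x) / abs(x) for x in (-3.0, 0.5, 1.0e6)]

print(diff_ratios)
print(doubling_ratios)
```

Under these assumptions the first list reproduces $\|Tx_n\|_\infty / \|x_n\|_\infty = n$, which no fixed $K$ can dominate, while the second list is constant and equal to the operator norm $\|T\| = 2$ found in Example 517.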
6.4 Linear Functionals

In this section we study the special case of linear operators that map elements (in this case functions) from a normed vector space to $\mathbb{R}$.

Definition 521 Let $(X,\|\cdot\|_X)$ be a normed vector space. A linear operator $F : X \to \mathbb{R}$ is called a linear functional. That is, a linear functional is a real-valued function $F$ on $X$ such that $F(\alpha x + \beta x') = \alpha F(x) + \beta F(x')$, $\forall x,x' \in X$ and $\alpha,\beta \in \mathbb{R}$.

We note that if $X$ is a finite dimensional vector space (e.g. $\mathbb{R}^n$), then $F$ is usually called a function. The functional nomenclature is typically used when $X$ is an infinite dimensional vector space (e.g. $\ell_p$, $C(X)$, $L_p$).

Definition 522 $F : X \to \mathbb{R}$ is said to be bounded on $X$ if $\exists K \in \mathbb{R}_{++}$ such that $|F(x)| \le K \cdot \|x\|_X$, $\forall x \in X$.

Since a bounded linear functional is a special case of a bounded linear operator, everything we proved in the previous Section 6.3 is also valid for linear functionals. We summarize it in the following theorem.

Theorem 523 Let $F$ be a linear functional on a normed vector space $X$. Then: (i) $F$ is continuous iff $F$ is continuous at some point in $X$; (ii) $F$ is continuous iff $F$ is bounded; (iii) the set of all bounded linear functionals is a complete vector space with the norm of $F$ defined by $\|F\| = \sup\{|F(x)|, x \in X, \|x\|_X \le 1\}$ or by any other equivalent formula from Theorem 518.

Proof. Follows the proofs in the previous section. Part (iii) uses the fact that $(\mathbb{R},|\cdot|)$ is complete (so that the set of all bounded linear functionals is always complete).

We note that the set of all bounded linear functionals on $X$ has a special name.

Definition 524 Given a normed vector space $X$, the set of all bounded linear functionals on $X$ is called the dual of $X$, denoted $X^*$.

The next set of examples illustrates functionals on finite and infinite dimensional vector spaces.

Example 525 Let $\mathbb{R}^n$ be $n$-dimensional Euclidean space with the Euclidean norm. Let $a = (a_1,\ldots,a_n)$ be a fixed non-zero vector in $\mathbb{R}^n$. Define the "inner (or dot) product"
functional $F_1 : \mathbb{R}^n \to \mathbb{R}$ by $F_1(x) = \langle a,x \rangle = a_1x_1 + \ldots + a_nx_n$ (this notation was introduced in Definition 209). It is clear that $F_1$ is linear since
$$F_1(\alpha x + \beta x') = \langle a, \alpha x + \beta x' \rangle = \alpha\langle a,x \rangle + \beta\langle a,x' \rangle = \alpha F_1(x) + \beta F_1(x').$$
It is also easily established that $F_1$ is bounded since by the Cauchy-Schwarz inequality we have $|F_1(x)| = |\langle a,x \rangle| \le \|a\|_X\|x\|_X$, $\forall x \in \mathbb{R}^n$. Finally, since $\|F_1\| = \sup\{|F_1(x)|, x \in X, \|x\|_X \le 1\} \le \|a\|_X$ and $\|F_1\| \ge \frac{|F_1(a)|}{\|a\|_X} = \frac{\|a\|_X^2}{\|a\|_X} = \|a\|_X$, we have $\|F_1\| = \|a\|_X$. Figure 6.4.1 illustrates such functionals in $\mathbb{R}^2$.

Example 526 Consider the Banach space $(\ell_1,\|\cdot\|_1)$. Define the linear functional $F_2 : \ell_1 \to \mathbb{R}$ by $F_2(x) = \sum_{i=1}^\infty x_i$ where $x = \langle x_i \rangle_{i=1}^\infty$. Then $|F_2(x)| \le \sum_{i=1}^\infty |x_i| = \|x\|_1$, $\forall x \in \ell_1$. This implies that $F_2$ is bounded and that $\|F_2\| \le 1$. Also for $x = e_1 = (1,0,\ldots) \in \ell_1$ we have $\|F_2\| \ge \frac{|F_2(e_1)|}{\|e_1\|_1} = \frac{1}{1} = 1$. Combining these two inequalities yields $\|F_2\| = 1$.

Example 527 Let $X = (C([a,b]),\|\cdot\|_\infty)$. Define the functional $F_3 : X \to \mathbb{R}$ by $F_3(x) = \int_{[a,b]} x(\omega)\,d\omega$, $x \in X$. Up to the normalizing factor $\frac{1}{b-a}$, we can interpret this as the expectation of $x(\omega)$ when $\omega$ is drawn from a uniform distribution on support $[a,b]$. It is clear that $F_3$ is linear. To see that $F_3$ is bounded, note that
$$|F_3(x)| = \left| \int_{[a,b]} x(\omega)\,d\omega \right| \le \int_{[a,b]} |x(\omega)|\,d\omega \le \sup_{\omega\in[a,b]} |x(\omega)| \cdot (b-a) = (b-a) \cdot \|x\|_\infty, \quad \forall x \in X.$$
On the other hand, if $x = x_0$ where $x_0(\omega) = 1$, $\forall \omega \in [a,b]$, then $\|x_0\|_\infty = 1$ and $|F_3(x_0)| = \int_{[a,b]} 1\,d\omega = b-a$. Hence $\|F_3\| \ge \frac{|F_3(x_0)|}{\|x_0\|_\infty} = b-a$ and combining these inequalities $\|F_3\| = b-a$.

Example 528 Reconsider Example 526 with a different norm. In particular, let $X = (\ell_1,\|\cdot\|_\infty)$ and let the linear functional $F_4 : \ell_1 \to \mathbb{R}$ be given by $F_4(x) = \sum_{i=1}^\infty x_i$ where $x = \langle x_i \rangle_{i=1}^\infty \in \ell_1$. In this case, $F_4$ is unbounded. To see this, define the element $x_n \in \ell_1$ as a sequence of 1's in the first $n$ places and zeros otherwise (i.e. $x_n = \langle 1,\ldots,1,0,0,\ldots \rangle$ where the last 1 occurs in the $n$th place).
Then $\|x_n\|_\infty = \sup\{|x_i|, i \in \mathbb{N}\} = 1$ and $F_4(x_n) = n$. Thus, $|F_4(x_n)| = n \cdot \|x_n\|_\infty$, with $\|x_n\|_\infty = 1$, $\forall n \in \mathbb{N}$. Therefore, $\|F_4\| = \infty$.

6.4.1 Dual spaces

As you may have noticed from Examples 525 to 528, it is quite simple to determine whether "something" is a bounded linear functional. But now we move on to tackle the converse. Given a normed vector space $X$, is it possible to represent (or characterize) all bounded linear functionals on $X$? In other words, we want to determine the dual of $X$. Here we simply consider the dual of some of the most common normed vector spaces.

The dual of the Euclidean space $\mathbb{R}^n$

In Example 525 of this section, we showed that a functional $F_1 : X \to \mathbb{R}$ defined by $F_1(x) = \langle a,x \rangle$, where $x \in X = \mathbb{R}^n$ and $a \in \mathbb{R}^n$, is a bounded linear functional with $\|F_1\| = \|a\|_X$. The functional $F_1$ is represented by the point $a$; that is, if we vary $a$, we vary $F_1$. Let $\mathcal{F}$ be the set of all such $F_1$. In the case where $X = \mathbb{R}^2$, the graphs of the $F_1$ are just planes through the origin and $\mathcal{F}$ is the set of all such planes. Obviously, $\mathcal{F} \subset X^*$. Now we show that there are no others. That is, $\mathcal{F} \supset X^*$ so that $\mathcal{F} = X^*$.

Theorem 529 The dual space of $\mathbb{R}^n$ is $\mathbb{R}^n$ itself. That is, each bounded linear functional $G$ on $\mathbb{R}^n$ can be represented by an element $b \in \mathbb{R}^n$ such that $G(x) = \langle b,x \rangle$ for all $x \in \mathbb{R}^n$.

Proof. (Sketch) Let $G \in (\mathbb{R}^n)^*$ (i.e. $G$ is a bounded linear functional on $\mathbb{R}^n$). Let $\{e_1,\ldots,e_n\}$ be the natural basis in $\mathbb{R}^n$. Define $b_i = G(e_i)$ for $i = 1,\ldots,n$. Then the point $b = (b_1,\ldots,b_n) \in \mathbb{R}^n$ represents $G$. That is, for $x = \sum_{i=1}^n x_ie_i \in \mathbb{R}^n$, linearity gives $G(x) = \sum_{i=1}^n x_iG(e_i) = \sum_{i=1}^n x_ib_i$, so
$$G(x) = \langle x,b \rangle. \tag{6.4}$$
By the Cauchy-Schwarz inequality we have $\|G\| \le \|b\|_X$ and by plugging $x = b$ in (6.4) we obtain $\|G\| \ge \|b\|_X$ so that $\|G\| = \|b\|_X$.

This equality establishes that the operator $T : X^* \to X$ defined by $T(G) = (G(e_1),\ldots,G(e_n)) = b$ is an isometry (see Definition 171). This means that $T$ preserves distances and hence it preserves topological properties of the spaces $(X,\|\cdot\|_X)$ and $(X^*,\|\cdot\|)$. It is easy to verify that $T$ is a bijection and that $T$ is a linear operator.
Hence $T$ preserves the algebraic (in this case linear) structure of these two spaces. In this case we say that $T$ is an isomorphism.

Putting these two together we have that $T$ is an isometric isomorphism between $(X,\|\cdot\|_X)$ and $(X^*,\|\cdot\|)$ and hence these two spaces are indistinguishable from the point of view of the number of elements, as well as the algebraic and topological structure. Hence they are effectively the same space with differently named elements.

The dual of a separable Hilbert space

Since Euclidean space is a separable, complete inner product space, one might like to know if there is a similar result to Theorem 529 for any separable Hilbert space. The answer is yes.

Theorem 530 The dual of a separable Hilbert space $H$ is $H$ itself. That is, for every bounded linear functional $F$ on a separable, complete inner product space $H$, there is a unique element $y \in H$ such that: (i) $F(x) = \langle x,y \rangle$, $\forall x \in H$; and (ii) $\|F\| = \|y\|$.

Proof. (Sketch) By Theorem 501 a separable Hilbert space contains a countable, complete orthonormal basis $\{e_i, i \in \mathbb{N}\}$. Let $F$ be a bounded linear functional on $H$. Set $b_i = F(e_i)$, $i = 1,2,\ldots$ It is easy to show that $\sum_{i=1}^\infty b_i^2 \le \|F\|^2 < \infty$. Hence by Parseval's Theorem 504 there exists a $b \in H$ such that $b = \sum_{i=1}^\infty b_ie_i$, where the $b_i$ are the Fourier coefficients of $b$. Moreover, $\|b\|_H \le \|F\|$. Let $x = \sum_{i=1}^\infty x_ie_i$, where the $x_i$ are the Fourier coefficients of $x$. Then by Parseval's equality the partial sums $x^{(n)} = \sum_{i=1}^n x_ie_i$ converge to $x = \sum_{i=1}^\infty x_ie_i$ as $n \to \infty$. Furthermore, because $F$ is continuous and linear,
$$F(x) = \lim_{n\to\infty} F\left( \sum_{i=1}^n x_ie_i \right) = \lim_{n\to\infty} \sum_{i=1}^n x_iF(e_i) = \lim_{n\to\infty} \sum_{i=1}^n x_ib_i = \sum_{i=1}^\infty x_ib_i = \langle x,b \rangle,$$
so that $\|F\| \le \|b\|_H$ by the Cauchy-Schwarz inequality. Together with $\|b\|_H \le \|F\|$ this gives $\|F\| = \|b\|_H$.

A similar result can be proven for a nonseparable, complete inner product space. Thus we can conclude that the dual space of any Hilbert space is a Hilbert space itself (i.e. $H^* = H$).

Since $\ell_2$ and $L_2([a,b])$ are separable Hilbert spaces, we have $\ell_2^* = \ell_2$ and $L_2^*([a,b]) = L_2([a,b])$.
In the case of $L_2([a,b])$, we can claim that for any bounded linear functional $F : L_2([a,b]) \to \mathbb{R}$ there exists a unique function $g \in L_2([a,b])$ such that $F(f) = \int_a^b gf\,dx$, $\forall f \in L_2([a,b])$. This will be shown in Theorem 532.

The dual space of $\ell_p$

While the previous section applied to inner product spaces, what about the dual space of a complete normed vector space that is not a Hilbert space? In this section, we consider the dual of $\ell_p$ for $p \ne 2$.

Let $p \in [1,\infty)$ and let $z \in \ell_q$ where $p,q$ are conjugate. Then $F : \ell_p \to \mathbb{R}$ given by
$$F(x) = \sum_{i=1}^\infty x_iz_i \quad \text{for } x = \langle x_i \rangle_{i=1}^\infty \in \ell_p \tag{6.5}$$
is a bounded linear functional on $\ell_p$. This follows immediately from Hölder's inequality (Theorem 479). We now show that all bounded linear functionals on $\ell_p$ are of the form (6.5).

Theorem 531 Let $p \in [1,\infty)$ and $q$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. If $F \in \ell_p^*$, there exists an element $z = \langle z_i \rangle \in \ell_q$ such that
$$F(x) = \sum_{i=1}^\infty x_iz_i \quad \text{for all } x = \langle x_i \rangle_{i=1}^\infty \in \ell_p$$
and $\|F\| = \|z\|_q$.

Proof. (Sketch) Let $F$ be a bounded linear functional on $\ell_p$. Let $\{e_i, i \in \mathbb{N}\}$ be the set of vectors having the $i$-th entry equal to one and all other entries equal to zero. Set $F(e_i) = z_i$, $i \in \mathbb{N}$. Given $x = \langle x_1,x_2,\ldots \rangle \in \ell_p$, let $s_n$ be the vector consisting of the first $n$ coordinates of $x$ (i.e. $s_n = \sum_{i=1}^n x_ie_i$). Then $s_n \in \ell_p$ and $\|x - s_n\|_p^p = \sum_{i=n+1}^\infty |x_i|^p \to 0$ as $n \to \infty$. Due to linearity and continuity of $F$, $F(x) = \sum_{i=1}^\infty x_iF(e_i) = \sum_{i=1}^\infty x_iz_i$ and $\|F\| \le \|z\|_q$. By plugging in $x = \langle x_i \rangle$ where
$$x_i = \begin{cases} |z_i|^{q-2}z_i & \text{when } z_i \ne 0 \\ 0 & \text{when } z_i = 0 \end{cases}$$
(for $1 < p < \infty$, applied to truncations to the first $n$ coordinates) we get that $\|z\|_q \le \|F\|$. This shows that $\|F\| = \|z\|_q$.

Theorem 531 establishes that $\ell_p^* = \ell_q$ where $p$ and $q$ are conjugate. Thus, the dual space of $\ell_1$ is $\ell_\infty$. However, the reverse is not true. That is, the dual space of $\ell_\infty$ is not $\ell_1$, or $\ell_\infty^* \ne \ell_1$. We show this in the next section.

The dual space of $L_p$

An important theorem, known as the Riesz representation theorem, establishes a result similar to Theorem 531 for $L_p$.
Let $X \subset \mathbb{R}$, $1 \le p < \infty$ and $q$ conjugate to $p$ (i.e. $\frac{1}{p} + \frac{1}{q} = 1$). Let $g \in L_q(X)$. Define a functional $F : L_p \to \mathbb{R}$ by
$$F(f) = \int_X fg\,dm \quad \text{for all } f \in L_p(X).$$
It is easy to see that $F$ is a bounded linear functional on $L_p(X)$. Linearity follows from linearity of the integral and boundedness follows from the Hölder inequality. Then we have the result that each bounded linear functional on $L_p(X)$ can be obtained in this manner (i.e. $L_p^* = L_q$).

Theorem 532 (Riesz Representation) Let $F$ be a bounded linear functional on $L_p(X)$ and $1 \le p < \infty$. Then there is a function $g \in L_q(X)$ such that
$$F(f) = \int_X fg\,dm$$
and $\|F\| = \|g\|_q$.

Proof. (Sketch) Let $F$ be a bounded linear functional on $L_p$. In all the previous cases, in finding the element $b$ that represents a given functional we used the same procedure; we set $b_i = F(e_i)$ where $\{e_i\}$ is a basis. In $L_p$, we use indicator functions.

First assume that $m(X) < \infty$ (later we relax this assumption). For any $E \subset X$ which is $\mathcal{L}$-measurable (i.e. $E \in \mathcal{L}$), $\chi_E \in L_p(X)$. Thus given $F$ we define a set function $\nu : \mathcal{L} \to \mathbb{R}$ by $\nu(E) = F(\chi_E)$ for $E \in \mathcal{L}$. $\nu$ is a finite signed measure which is absolutely continuous with respect to $m$. Then by the Radon-Nikodym Theorem 434 there is an $\mathcal{L}$-integrable function $g$ that represents $\nu$ (i.e. $\nu(E) = \int_E g\,dm = \int_X \chi_Eg\,dm$). By linearity of $F$ we have
$$F(\varphi) = \int_X g\varphi\,dm$$
for all simple functions $\varphi \in L_p(X)$, and $|F(\varphi)| \le \|F\|\|\varphi\|_p$. Then it can be shown that $g \in L_q(X)$. Because the set of simple functions is dense in $L_p(X)$, $F(f) = \int_X gf\,dm$ for all $f \in L_p(X)$.

If $m(X) = \infty$, since $m$ is $\sigma$-finite, there is an increasing sequence of $\mathcal{L}$-measurable sets $\langle X_n \rangle$ with finite measure whose union is $X$. Thus we apply the result proven in the first part of the proof to define $\langle g_n \rangle$ on $X_n$. Then one shows that $g_n \to g$ and $F(f) = \lim_{n\to\infty} \int fg_n\,dm = \int fg\,dm$ for all $f \in L_p$.

Here we note that the dual of $L_\infty(X)$ is not $L_1(X)$.
That is, not all bounded functionals on $L_\infty([a,b])$ can be represented by $F(f) = \int_X fg$, where $g \in L_1(X)$. The proof of this result is easier to see after some future results on separation, so we wait until then.

6.4.2 Second Dual Space

In the previous section we showed that the dual space $X^*$ of all bounded linear functionals defined on a normed linear space $X$ is a normed linear space itself. Then it is possible to speak of the space $(X^*)^*$ of bounded linear functionals defined on $X^*$, which is called the second dual space $X^{**}$ of $X$. Of course $X^{**}$ is also a normed vector space. Let us try to define some elements of $X^{**}$. Given a fixed element $x_0$ in $X$ we can define a functional $\psi_{x_0} : X^* \to \mathbb{R}$ by $\psi_{x_0}(f) = f(x_0)$ where $f$ runs through all of $X^*$. Notice that $\psi_{x_0}$ assigns to each element $f \in X^*$ its value at a certain fixed element of $X$. We have
$$\psi_{x_0}(\alpha f_1 + \beta f_2) = (\alpha f_1 + \beta f_2)(x_0) = \alpha f_1(x_0) + \beta f_2(x_0) = \alpha\psi_{x_0}(f_1) + \beta\psi_{x_0}(f_2)$$
(by the definition of addition and scalar multiplication of functionals in $X^*$) and $|\psi_{x_0}(f)| = |f(x_0)| \le \|f\|\|x_0\|$ (since $f$ is bounded). Hence $\psi_{x_0}$ is a bounded linear functional on $X^*$.

Besides the notation $f(x)$, which indicates the value of the functional $f$ at a point $x$, we will find it useful to employ the symmetric notation $f(x) \equiv \langle f,x \rangle$. It is not a coincidence that a value of a functional is denoted the same way as the scalar product, because any bounded linear functional $g$ defined on a Hilbert space can be represented by a scalar product (i.e. $\exists y \in H$ such that $g(x) = \langle y,x \rangle$, $\forall x \in H$ by Theorem 530).

For fixed $f \in X^*$ we can consider $\langle f,x \rangle$ as a functional on $X$ and for fixed $x \in X$ as a functional on $X^*$ (i.e. as an element of $X^{**}$). Let us define a new norm $\|\cdot\|_2$ on $X$ by the following:
$$\|x\|_2 = \sup\left\{ \frac{|\langle f,x \rangle|}{\|f\|}, \ f \in X^*, \ f \ne 0 \right\}.$$
How is this norm related to the original norm $\|\cdot\|_X$ on $X$? We shall show that $\|x\|_X = \|x\|_2$.

Let $f$ be an arbitrary non-zero element in $X^*$. Then
$$|\langle f,x \rangle| \le \|f\|\|x\|_X \ \Rightarrow \ \|x\|_X \ge \frac{|\langle f,x \rangle|}{\|f\|}.$$
Since this inequality is true for any $f$,
$$\|x\|_X \ge \sup\left\{ \frac{|\langle f,x \rangle|}{\|f\|}, \ f \in X^*, \ f \ne 0 \right\} = \|x\|_2. \tag{6.6}$$
Now to the converse inequality. By Theorem 540 of the next section (see also Theorem 543), for any element $x \in X$, $x \ne 0$, there is a bounded linear functional $f_0 \ne 0$ such that
$$|\langle f_0,x \rangle| = \|f_0\|\|x\|_X \ \Rightarrow \ \frac{|\langle f_0,x \rangle|}{\|f_0\|} = \|x\|_X.$$
Consequently
$$\|x\|_2 = \sup\left\{ \frac{|\langle f,x \rangle|}{\|f\|}, \ f \in X^*, \ f \ne 0 \right\} \ge \|x\|_X. \tag{6.7}$$
Combining inequalities (6.6) and (6.7), we have that $\|x\|_2 = \|x\|_X$.

Since $\langle f,x \rangle$ for fixed $x \in X$ is a linear functional on $X^*$, by (iv) of Theorem 518 the expression $\sup\left\{ \frac{|\langle f,x \rangle|}{\|f\|}, f \in X^*, f \ne 0 \right\}$ is the norm of this functional. But this expression is identical to $\|x\|_2$. If we now define a mapping $J : X \to X^{**}$ by $J(x)(f) = \langle f,x \rangle$, $f \in X^*$, then by virtue of the identity $\|x\|_X = \|x\|_2 = \|J(x)\|$, the space $X$ is isometric with some subset $F$ of $X^{**}$. See Figure 6.4.2.1. Thus $X$ and $F \subset X^{**}$ are isometrically isomorphic (i.e. they are indistinguishable so we may write $X = F$ and $X \subset X^{**}$). There is a class of normed vector spaces $X$ for which the mapping $J : X \to X^{**}$ is onto (i.e. $X = X^{**}$).

Definition 533 The space $X$ is said to be reflexive if $X = X^{**}$.

As we will see later this property plays a very important role in optimization theory. Let us check some known vector spaces for reflexivity.

Example 534 The Euclidean space $\mathbb{R}^n$ is reflexive. Why? We showed in the previous section (Theorem 529) that even the first dual of $\mathbb{R}^n$ is $\mathbb{R}^n$ (i.e. $(\mathbb{R}^n)^* = \mathbb{R}^n$). Hence $(\mathbb{R}^n)^{**} = ((\mathbb{R}^n)^*)^* = (\mathbb{R}^n)^* = \mathbb{R}^n$.

Example 535 Infinite dimensional Hilbert spaces ($\ell_2$, $L_2$) are reflexive. By Theorem 530 we have that $H^* = H$, which means that any Hilbert space is reflexive. That is, $\ell_2^{**} = \ell_2$, $L_2^{**} = L_2$.

What about $\ell_p$, $L_p$ when $p \ne 2$?

Example 536 If $1 < p < \infty$, then by Theorem 531, $\ell_p^* = \ell_q$ where $\frac{1}{p} + \frac{1}{q} = 1$. Thus $\ell_p^{**} = (\ell_p^*)^* = \ell_q^* = \ell_p$ because $p,q$ are mutually conjugate. Similarly $L_p^{**} = L_p$. Thus if $1 < p < \infty$, then $\ell_p$, $L_p$ are reflexive. If $p = 1$, by Theorem 531 $\ell_1^* = \ell_\infty$. But $\ell_\infty^* \supsetneq \ell_1$. Hence $(\ell_1^*)^* = \ell_\infty^* \supsetneq \ell_1$ so that $\ell_1$ is not reflexive. $\ell_\infty$ is also not reflexive. Similarly $L_1$, $L_\infty$ are not reflexive. We will show this in the next section.

Example 537 It can be shown that the space $C([a,b])$ of all continuous functions on $[a,b]$ is not reflexive.

Exercise 6.4.1 Show that $X$ is reflexive iff $X^*$ is reflexive.

6.5 Separation Results

In this section we state and prove probably the most important theorem in functional analysis: the Hahn-Banach theorem. It has numerous applications. We will concentrate on a geometric application and will formulate it as a separation result for convex sets. Using this theorem we prove the existence of a competitive equilibrium allocation in a general setting. First we define a new notion.

Definition 538 Let $X$ be a normed vector space. A functional $P : X \to \mathbb{R}$ is called sublinear if: (i) $P(x + x') \le P(x) + P(x')$, $\forall x,x' \in X$; and (ii) $P(\alpha x) = \alpha P(x)$, $\forall x \in X$ and $\alpha \in \mathbb{R}_{++}$.

Exercise 6.5.1 Let $X$ be a normed vector space. Show that the norm $\|\cdot\|_X : X \to \mathbb{R}$ is a sublinear functional.

The Hahn-Banach theorem provides a method of constructing bounded linear functionals on $X$ with certain properties. One first defines a bounded linear functional on a subspace of a normed vector space where it is easy to verify the desired properties. Then the theorem guarantees that this functional can be extended to the whole space while retaining the desired properties.

Theorem 539 (Hahn-Banach) Let $X$ be a vector space and $P : X \to \mathbb{R}$ be a sublinear functional on $X$. Let $M$ be a subspace of $X$ and let $f : M \to \mathbb{R}$ be a linear functional on $M$ satisfying
$$f(x) \le P(x), \ \forall x \in M. \tag{6.8}$$
Then there exists a linear functional $F : X \to \mathbb{R}$ on the whole of $X$ such that
$$F(x) = f(x), \ \forall x \in M \tag{6.9}$$
and
$$F(x) \le P(x), \ \forall x \in X. \tag{6.10}$$

Proof. (Sketch) Choose $x_1 \in X \setminus M$. Define a linear subspace $M_1 = \{x : x = \alpha x_1 + y, \ \alpha \in \mathbb{R}, \ y \in M\}$. Let $F$ be an extension of $f$ to $M_1$. Since $F$ is linear, $F(\alpha x_1 + y) = \alpha F(x_1) + F(y) = \alpha F(x_1) + f(y)$. Thus $F$ is completely determined by $F(x_1)$.
Next we derive lower and upper bounds for $F(x_1)$ in order for $F$ to satisfy (6.9) and (6.10) for $x \in M_1$. Thus $F$ is an extension of $f$ from $M$ to $M_1$ where $M \subset M_1$. This process can be repeated and Zorn's lemma guarantees that $F$ can be extended to the whole space $X$. In order to apply Zorn's lemma we define a partial order on the set $S = \{$all linear functionals $g : D \to \mathbb{R}$, where $D$ is a subspace of $X$ with $M \subset D$, such that $g(x) = f(x)$, $\forall x \in M$, and $g(x) \le P(x)$, $\forall x \in D\}$ in the following way. Let $g_1,g_2 \in S$ with domains $D(g_1)$, $D(g_2)$. Then $g_1 \preceq g_2$ if $D(g_1) \subset D(g_2)$ and $g_1(x) = g_2(x)$, $\forall x \in D(g_1)$. Then we must check that every totally ordered subset of $S$ has an upper bound (in which case the assumptions of Zorn's lemma are satisfied).

At first sight the Hahn-Banach Theorem 539 doesn't look like a "big deal". Its significance in functional analysis, however, becomes apparent through its wide range of applications, many of them involving a clever choice of the sublinear functional $P$. We will state just three propositions.

The first result says that a bounded linear functional defined on a vector subspace can be extended to the whole vector space.

Theorem 540 Let $M$ be a vector subspace of a normed vector space $X$. Let $f$ be a bounded linear functional on $M$. Then there exists a bounded linear functional $F$ on $X$ such that $F(x) = f(x)$, $\forall x \in M$ and $\|F\| = \|f\|$.

Proof. The function $P(x) = \|f\|\|x\|$ is a sublinear functional on $X$ and $|f(x)| \le \|f\|\|x\| = P(x)$ on $M$ (see Exercise 6.5.1). Then by the Hahn-Banach Theorem, $\exists F$ (an extension of $f$ to $X$) with the property that $F(x) \le P(x)$, $\forall x \in X$. Applying this to $-x$ as well gives $-F(x) = F(-x) \le P(-x) = P(x)$, so $|F(x)| \le P(x) = \|f\|\|x\|$, $\forall x \in X$. This means that $F$ is bounded on $X$ and also $\|F\| \le \|f\|$. Because $F$ is an extension of $f$, $\|f\| \le \|F\|$. Hence $\|F\| = \|f\|$.

Exercise 6.5.2 Carefully compare the assumptions of the Hahn-Banach Theorem 539 and Theorem 540.

Theorem 540 can be used to show that the dual of $L_\infty([a,b])$ is not $L_1([a,b])$.

Lemma 541 Not all bounded functionals on $L_\infty([a,b])$ can be represented by $F(f) = \int_{[a,b]} fg$, where $g \in L_1([a,b])$. That is, $(L_\infty([a,b]))^* \ne L_1([a,b])$.

Proof.
$C([a,b])$ is a vector subspace of $L_\infty([a,b])$. Let $F_1 : C([a,b]) \to \mathbb{R}$ be the linear functional which assigns to each $f \in C([a,b])$ the value $f(a)$ (i.e. $F_1(f) = f(a)$). Since
$$\|F_1\| = \sup\left\{ \frac{|F_1(f)|}{\|f\|_{C([a,b])}}, \ \|f\| \ne 0 \right\} = \sup\left\{ \frac{|f(a)|}{\sup\{|f(x)|, x \in [a,b]\}} \right\} \le 1,$$
$F_1$ is bounded and by Theorem 540 $F_1$ can be extended to a bounded linear functional $F$ on the whole set $L_\infty([a,b])$. Let us assume, by contradiction, that there is $g \in L_1([a,b])$ such that $F$ can be represented by $F(f) = \int_a^b fg\,dx$ for all $f \in L_\infty([a,b])$, and in particular for all $f \in C([a,b])$. Let $\langle h_n \rangle$ be a sequence of continuous functions on $[a,b]$ which are bounded by 1, have $h_n(a) = 1$, and are such that $h_n(x) \to 0$ for all $x \ne a$. For example, set $h_n(x) = \left[ \frac{1}{b-a}(b-x) \right]^n$. Then for each $g \in L_1$, $\int_a^b h_ng \to 0$ (by the Bounded Convergence Theorem 386). Since $F(h_n) = \int_a^b gh_n$ by assumption, we have $F(h_n) \to 0$. But $F(h_n) = h_n(a) = 1$ for all $n$, which is a contradiction.

Corollary 542 $L_1(X)$ and $L_\infty(X)$ are not reflexive.

Proof. We know by Theorem 532 that $L_1^* = L_\infty$ and by Lemma 541 that $L_\infty^* \ne L_1$. Combining these two results we have $(L_1^*)^* = L_\infty^* \ne L_1$. Furthermore, since $L_1$ is not reflexive, neither is $L_\infty$ by Exercise 6.4.1.

The second result states that given a normed vector space $X$, its dual $X^*$ has "sufficiently" many elements (i.e. at least as many elements as $X$ itself).

Theorem 543 Let $X$ be a normed vector space and let $x_0 \ne 0$ be any element of $X$. Then there exists a bounded linear functional $F$ on $X$ such that $\|F\| = 1$ and $F(x_0) = \|x_0\|$.

Proof. Let $M$ be the subspace consisting of all multiples of $x_0$ (i.e. $M = \{\alpha x_0, \alpha \in \mathbb{R}\}$). Define $f : M \to \mathbb{R}$ by $f(\alpha x_0) = \alpha\|x_0\|$. Then $f$ is a linear functional on $M$. Define $P : X \to \mathbb{R}$ by $P(y) = \|y\|$. $P$ is a sublinear functional on $X$ satisfying $f(x) \le P(x)$ for $x \in M$. Then by the Hahn-Banach theorem there exists a linear functional $F : X \to \mathbb{R}$ that is an extension of $f$ with $F(x) \le P(x) = \|x\|$, $\forall x \in X$. For $-x$, we have $F(-x) \le \|-x\| = \|x\|$, in which case $|F(x)| \le \|x\|$, $\forall x \in X$.
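Thus $F$ is bounded with $\|F\| = \sup\left\{ \frac{|F(x)|}{\|x\|}, x \in X, x \ne 0 \right\} \le 1$. Also, since $F$ is an extension of $f$, $F(x_0) = F(1 \cdot x_0) = 1 \cdot \|x_0\| = \|x_0\|$, so the ratio equals 1 at $x = x_0$ and therefore $\|F\| = 1$.

The third proposition is a geometric version of the Hahn-Banach theorem. It is a separation result for convex sets. Before stating it we have to introduce a few geometric concepts.

Definition 544 Let $K \subset X$ be convex. A point $x \in K$ is an internal point of a convex set $K$ if given any $y \in X$, $\exists \varepsilon > 0$ such that $x + \delta y \in K$ for all $\delta$ satisfying $|\delta| < \varepsilon$.

Geometrically, the statement that $x$ is an internal point of $K$ means that the intersection of $K$ with any line $L$ through $x$ contains a segment with $x$ as a midpoint. See Figure 6.5.1.

Definition 545 Let $0$ (the zero vector) be an internal point of a convex set $K$. Then the support function $P : X \to \mathbb{R}_+$ of $K$ (with respect to $0$), also known as the Minkowski functional or gauge of $K$, is given by
$$P(x) = \inf\left\{ \lambda : \frac{x}{\lambda} \in K, \ \lambda > 0 \right\}.$$

The support function has a simple geometric interpretation. Let $x \in X$. Draw the ray from $0$ through $x$. There is a point $y$ on this ray that is a boundary point of $K$. Then the scalar $\lambda$ for which $\lambda y = x$ is $P(x)$, so that $P(x)y = x$. See Figure 6.5.2. We have the following properties for this support function.

Lemma 546 If $K$ is a convex set containing $0$ as an internal point then the support function $P$ has the following properties: (i) $P(\alpha x) = \alpha P(x)$ for $\alpha \ge 0$; (ii) $P(x+y) \le P(x) + P(y)$; (iii) $\{x : P(x) < 1\} \subset K \subset \{x : P(x) \le 1\}$.

Proof. (i) Let $\alpha > 0$ (the case $\alpha = 0$ is immediate since $P(0) = 0$). Then
$$P(\alpha x) = \inf\left\{ \lambda : \frac{\alpha x}{\lambda} \in K, \lambda > 0 \right\} = \inf\left\{ \alpha\frac{\lambda}{\alpha} : \frac{x}{\lambda/\alpha} \in K, \frac{\lambda}{\alpha} > 0 \right\} = \inf\left\{ \alpha\beta : \frac{x}{\beta} \in K, \beta > 0 \right\} = \alpha\inf\left\{ \beta : \frac{x}{\beta} \in K, \beta > 0 \right\} = \alpha P(x)$$
where $\beta = \frac{\lambda}{\alpha} > 0$.

(ii) Let $\alpha = \inf\left\{ \lambda : \frac{x}{\lambda} \in K, \lambda > 0 \right\} = P(x)$ and $\beta = \inf\left\{ \mu : \frac{y}{\mu} \in K, \mu > 0 \right\} = P(y)$. Take any $\lambda$ and $\mu$ such that $\frac{x}{\lambda} \in K$ and $\frac{y}{\mu} \in K$. Then $\alpha \le \lambda$ and $\beta \le \mu$. Since $K$ is convex,
$$\left( \frac{\lambda}{\lambda+\mu} \right)\left( \frac{x}{\lambda} \right) + \left( \frac{\mu}{\lambda+\mu} \right)\left( \frac{y}{\mu} \right) \in K$$
because $\frac{\lambda}{\lambda+\mu} + \frac{\mu}{\lambda+\mu} = 1$. Then $\frac{x+y}{\lambda+\mu} \in K$. Thus $P(x+y) \le \lambda + \mu$ (because $P(x+y)$ is the infimum of such scalars). Since this holds for all admissible $\lambda$ and $\mu$, taking the infimum over $\lambda$ and over $\mu$ yields $P(x+y) \le \alpha + \beta = P(x) + P(y)$.

(iii) It follows from the definition of $P$.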
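Definition 545 and properties (i) and (ii) of Lemma 546 can be explored numerically. The sketch below is a hypothetical illustration, not a construction from the text: it assumes $K \subset \mathbb{R}^2$ is described by a membership test and approximates $P(x)$ by bisection, which is legitimate because for a convex $K$ containing $0$ the set $\{\lambda > 0 : x/\lambda \in K\}$ is an upward-closed interval. The function names and the two sample sets (the closed unit disc and a square) are assumptions chosen for the example.

```python
# Illustrative sketch: the support function P(x) = inf{lam > 0 : x/lam in K}
# of Definition 545, approximated by bisection from a membership test for K.
# Function names, sample sets and the bisection scheme are assumptions.
import math


def support_function(x, in_K, iterations=60):
    """Approximate P(x) for a convex set K having 0 as an internal point.

    in_K(y) must return True exactly when the point y (a tuple) lies in K.
    """
    hi = 1.0
    # Grow hi until x/hi lies in K; this terminates because 0 is internal.
    while not in_K(tuple(c / hi for c in x)):
        hi *= 2.0
    lo = 0.0
    # Invariant: x/hi is in K, while x/lo is not in K (or lo == 0).
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if in_K(tuple(c / mid for c in x)):
            hi = mid
        else:
            lo = mid
    return hi


def in_unit_disc(y):
    """Membership test for the closed Euclidean unit ball in R^2."""
    return y[0] ** 2 + y[1] ** 2 <= 1.0


def in_square(y):
    """Membership test for the closed square [-1, 1] x [-1, 1]."""
    return max(abs(y[0]), abs(y[1])) <= 1.0


x, y = (2.0, 2.0), (0.5, -1.5)
x_plus_y = (x[0] + y[0], x[1] + y[1])

p_disc_x = support_function(x, in_unit_disc)            # Euclidean norm of x
p_disc_y = support_function(y, in_unit_disc)
p_disc_sum = support_function(x_plus_y, in_unit_disc)   # P(x + y)
p_disc_3x = support_function((6.0, 6.0), in_unit_disc)  # P(3x)
p_square_x = support_function(x, in_square)             # max-norm of x

print(p_disc_x, math.sqrt(8.0))
print(p_disc_3x, 3.0 * p_disc_x)
print(p_disc_sum, p_disc_x + p_disc_y)
print(p_square_x)
```

For the unit disc the computed value at $(2,2)$ agrees with the Euclidean norm $\sqrt{8}$, and the printed pairs illustrate positive homogeneity and subadditivity; for the square the same point has support-function value equal to its max-norm, which shows how $P$ depends on the shape of $K$.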
Example 547 Let $X = \mathbb{R}^2$ with the Euclidean norm. Let $K = \{(x_1,x_2) \in \mathbb{R}^2 : \|(x_1,x_2)\| \le 1\}$. Obviously $K$ (the unit ball) is convex. Consider a point $x^1 = (2,2)$ outside the ball. Then $P((2,2)) = \inf\left\{ \lambda : \left( \frac{2}{\lambda},\frac{2}{\lambda} \right) \in K, \lambda > 0 \right\}$. But $\left( \frac{2}{\lambda},\frac{2}{\lambda} \right) \in K$ iff $\left\| \left( \frac{2}{\lambda},\frac{2}{\lambda} \right) \right\| \le 1 \Leftrightarrow \frac{4}{\lambda^2} + \frac{4}{\lambda^2} \le 1$ or $\lambda \ge \sqrt{8}$, so $P((2,2)) = \sqrt{8} > 1$. Now consider a point $x^2 = \left( \frac{1}{2},\frac{1}{2} \right)$ inside the ball. Then $P\left( \left( \frac{1}{2},\frac{1}{2} \right) \right) = \sqrt{\frac{1}{2}} < 1$. See Figure 6.5.3.

Definition 548 Two convex sets $K_1,K_2$ are separated by a linear functional $F$ if $\exists \alpha \in \mathbb{R}$ such that $F(x) \le \alpha$, $\forall x \in K_1$ and $F(x) \ge \alpha$, $\forall x \in K_2$.

Theorem 549 (Separation) Let $K_1,K_2$ be two convex sets of a normed vector space $X$. Assume that $K_1$ has at least one internal point and that $K_2$ contains no internal point of $K_1$. Then there is a nontrivial linear functional separating $K_1$ and $K_2$.

Proof. (Sketch) Let $K_1$ and $K_2$ be two convex subsets of $X$ and, without loss of generality, let $0$ be an internal point of $K_1$ and let $x_0 \in K_2$. Define $K = x_0 + K_1 - K_2$. See Figure 6.5.4. $0$ is an internal point of $K$ and $x_0$ is not an internal point of $K$ (this latter fact follows since $K_2$ contains no internal points of $K_1$). Thus by (iii) of Lemma 546, $P(x) \le 1$ for all $x \in K$ and $P(x_0) \ge 1$, where $P$ is the support function of $K$. Let $M$ be the vector subspace spanned by $x_0$ (i.e. $M = \{x : x = \alpha x_0, \alpha \in \mathbb{R}\}$). Define $f : M \to \mathbb{R}$ by $f(x) = f(\alpha x_0) = \alpha P(x_0)$. $f$ is a linear functional that satisfies $f(x) \le P(x)$, $\forall x \in M$ (for $\alpha \ge 0$ by (i) of Lemma 546, and for $\alpha < 0$ because $f(\alpha x_0) < 0 \le P(\alpha x_0)$). Hence by the Hahn-Banach Theorem 539 there exists an extension of $f$ (i.e. a linear functional $F : X \to \mathbb{R}$ satisfying $F(x) \le P(x)$, $\forall x \in X$). This functional $F$ separates $K_1$ and $K_2$. Why? Take $x \in K$ with $x = x_0 + y - z$ where $y \in K_1$ and $z \in K_2$. Then $F(x) \le P(x) \le 1$ for $x \in K$. Since $F$ is linear,
$$F(x_0) + F(y) - F(z) \le 1 \ \Leftrightarrow \ F(y) + (F(x_0) - 1) \le F(z). \tag{6.11}$$
Since $x_0 \in M$,
$$F(x_0) = f(x_0) = P(x_0) \ge 1 \ \Leftrightarrow \ F(x_0) - 1 \ge 0. \tag{6.12}$$
Combining (6.11) and (6.12) we have $F(y) \le F(z)$ for any $y \in K_1$ and $z \in K_2$. Hence $\sup_{y\in K_1} F(y) \le \inf_{z\in K_2} F(z)$.
Thus $F$ separates $K_1,K_2$ and $F$ is a non-zero functional (since $F(x_0) = P(x_0) \ge 1$).

There are several corollaries and modifications of this important separation theorem.

Corollary 550 (Separation of a point from a closed set) If $K$ is a nonempty, closed, convex set and $x_0 \notin K$, then there exists a continuous linear functional $F$, not identically zero, such that $F(x_0) < \inf_{x\in K} F(x)$.

Proof. By translating by $-x_0$, we reduce the Corollary to the case where $x_0 = 0$. Since $x_0 \notin K$ and $K$ is closed, by Exercise 4.1.3 we have $0 < d = \inf_{x\in K} \|x_0 - x\|$. Let $B_{d/2}(x_0) = B_{d/2}(0)$ be the open ball around $x_0 = 0$ with radius $\frac{d}{2}$. By the Separation Theorem 549 there exists a nontrivial linear functional $f$ such that
$$\sup_{x\in B_{d/2}(0)} f(x) \le \inf_{y\in K} f(y) = \alpha.$$
Thus $f(x) \le \alpha$ for $x \in B_{d/2}(0)$. If $x \in B_{d/2}(0)$, then $-x \in B_{d/2}(0)$, which implies that $f(-x) = -f(x) \le \alpha$. Hence $|f(x)| \le \alpha$ for all $x \in B_{d/2}(0)$. This implies continuity at $0$ and by Theorem 508 continuity everywhere. To show strict inequality, take $x$ with $f(x) > 0$ (possible since $f$ is not identically zero) and $\lambda > 0$ such that $\lambda x \in B_{d/2}(0)$ (this is possible since $0$ is an internal point of $B_{d/2}(0)$). We have $0 < \lambda f(x) = f(\lambda x) \le \alpha$. Thus we have $f(0) = 0 < \alpha = \inf_{y\in K} f(y)$. See Figure 6.5.5.

Corollary 551 (Strict Separation) Suppose that a nonempty, closed, convex set $K_1$ and a nonempty, compact, convex set $K_2$ are disjoint. Then there exists a continuous linear functional $F$, not identically zero, that strictly separates them (i.e. $\sup_{x\in K_1} F(x) < \inf_{x\in K_2} F(x)$).

Proof. If $K_1,K_2$ are convex, then $K_1 - K_2$ is convex. Since $K_1$ is closed and $K_2$ is compact, $K_1 - K_2$ is closed. Since $K_1 \cap K_2 = \emptyset$, $0 \notin K_1 - K_2$. Now apply Corollary 550 with $x_0 = 0$ and $K = K_1 - K_2$ (replacing $F$ by $-F$ if necessary to obtain the stated direction of the inequality). See Figure 6.5.6.

It doesn't suffice to assume both sets $K_1$ and $K_2$ are closed. One of them has to be compact. For an example of this, see Aliprantis and Border Example 5.51.
This doesn't contradict the Separation Theorem as it might seem, because Theorem 549 requires the additional assumption of the existence of an internal point of at least one of the sets.

Exercise 6.5.3 Show that if $K_1$ is closed and $K_2$ is compact, then $K_1 - K_2$ is closed.

6.5.1 Existence of equilibrium

Let $S$ be a finite dimensional Euclidean space with norm $\|x\| = \left( \sum_{i=1}^n |x_i|^2 \right)^{1/2}$. There are $I$ consumers, indexed by $i = 1,\ldots,I$. Consumer $i$ chooses among commodity points in a set $X_i \subset S$ and maximizes utility given by $u_i : X_i \to \mathbb{R}$. There are $J$ firms, indexed by $j = 1,\ldots,J$. Firm $j$ chooses among points in a set $Y_j \subset S$ describing its technological possibilities and maximizes profits.

We say that an $(I+J)$-tuple $(\{x_i\}_{i=1}^I,\{y_j\}_{j=1}^J)$ describing the consumption $x_i$ of each consumer and the production $y_j$ of each producer is an allocation for this economy. An allocation is feasible if: $x_i \in X_i$, $\forall i$; $y_j \in Y_j$, $\forall j$; and $\sum_{i=1}^I x_i - \sum_{j=1}^J y_j \le 0$ (where there is free disposal). An allocation is Pareto Optimal if it is feasible and if there is no other feasible allocation $(\{x_i'\}_{i=1}^I,\{y_j'\}_{j=1}^J)$ such that $u_i(x_i') \ge u_i(x_i)$, $\forall i$ and $u_i(x_i') > u_i(x_i)$ for some $i$. An allocation $(\{x_i^*\}_{i=1}^I,\{y_j^*\}_{j=1}^J)$ together with a continuous linear functional $\varphi : S \to \mathbb{R}$ is a competitive equilibrium if: (a) $(\{x_i^*\}_{i=1}^I,\{y_j^*\}_{j=1}^J)$ is feasible; (b) for each $i$, $x \in X_i$ and $\varphi(x) \le \varphi(x_i^*)$ implies $u_i(x) \le u_i(x_i^*)$; and (c) for each $j$, $y \in Y_j$ implies $\varphi(y) \le \varphi(y_j^*)$.

Theorem 552 (Second Welfare Theorem) Let: (A1) $X_i$ is convex for each $i$; (A2) if $x,x' \in X_i$, $u_i(x) > u_i(x')$ and $\alpha \in (0,1)$, then $u_i(\alpha x + (1-\alpha)x') > u_i(x')$ for each $i$; (A3) $u_i : X_i \to \mathbb{R}$ is continuous for each $i$; (A4) the set $Y = \sum_{j=1}^J Y_j$ is convex (see footnote 7). Under (A1)-(A4), let $(\{x_i^*\}_{i=1}^I,\{y_j^*\}_{j=1}^J)$ be a Pareto Optimal allocation. Assume that for some $h \in \{1,\ldots,I\}$, $\exists \hat{x}_h$ such that $u_h(\hat{x}_h) > u_h(x_h^*)$.
Then there exists a continuous linear functional $\phi : S \to \mathbb{R}$, not identically zero on $S$, such that
$$\forall i,\ x \in X_i \text{ and } u_i(x) \ge u_i(x_i^*) \Rightarrow \phi(x) \ge \phi(x_i^*) \tag{6.13}$$
and
$$\forall j,\ y \in Y_j \Rightarrow \phi(y) \le \phi(y_j^*). \tag{6.14}$$
If
$$\forall i,\ \exists x_i' \text{ such that } \phi(x_i') < \phi(x_i^*), \tag{6.15}$$
then $\left[(\{x_i^*\}_{i=1}^I, \{y_j^*\}_{j=1}^J), \phi\right]$ is a competitive equilibrium.

Proof. (Sketch) Since $S$ is finite dimensional and the aggregate technological possibilities set is convex (A4), for the existence of $\phi$ it is sufficient to show that the set of allocations preferred to $\{x_i^*\}_{i=1}^I$, given by $A = \sum_{i=1}^I A_i$ where $A_i = \{x \in X_i : u_i(x) \ge u_i(x_i^*)\}$, $\forall i$, is convex and that $A$ does not contain any interior points of $Y$. Then apply Theorem 549. To complete the proof, it is sufficient to show that (b) holds in the definition of a competitive equilibrium, which follows from contraposition of (6.13).

$^{7}$ The assumption that $S$ is finite dimensional is also important, but can be weakened in the infinite dimensional case to assume that $Y$ has an interior point.

You should recognize that $\phi(x) = \langle p, x \rangle$ can be considered an inner product representation of prices.

6.6 Optimization of Nonlinear Operators

In this chapter we have dealt with linear operators and functionals. While we showed very deep results in linear functional analysis (the Riesz Representation Theorem and the Hahn-Banach Theorem, to name just two), there are many problems in economics that involve nonlinear operators. For instance, the operator in most dynamic programming problems, such as the growth example suggested in the introduction to this chapter, does not satisfy the linearity property of an operator. In particular, an operator $T : X \to Y$ as simple as $T(x) = a + bx$ does not possess the linearity property, since $T(\alpha x + \beta x') = a + b(\alpha x + \beta x') \ne \alpha T x + \beta T x'$ whenever $a \ne 0$ and $\alpha + \beta \ne 1$. Such a function does possess a monotonicity property when $b \ge 0$ (i.e. if $x \le x'$, then $Tx \le Tx'$).
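The failure of linearity for the affine map can be checked with a few numbers. The following is only an illustrative sketch, not part of the original text: the values $a = 2$, $b = 3$ and the helper names `T` and `linearity_gap` are assumptions chosen for the example. Algebraically, the gap $T(\alpha x + \beta x') - (\alpha Tx + \beta Tx')$ equals $a(1 - \alpha - \beta)$.

```python
def T(x, a=2.0, b=3.0):
    """Affine map T(x) = a + b*x; a and b are illustrative values (assumption)."""
    return a + b * x


def linearity_gap(x, x2, alpha, beta, a=2.0, b=3.0):
    """Return T(alpha*x + beta*x2) - (alpha*T(x) + beta*T(x2)).

    Algebraically this equals a * (1 - alpha - beta).
    """
    lhs = T(alpha * x + beta * x2, a, b)
    rhs = alpha * T(x, a, b) + beta * T(x2, a, b)
    return lhs - rhs


# With a nonzero intercept the gap is nonzero, so T is not linear.
gap = linearity_gap(1.0, 4.0, 0.5, 1.5)

# With a = 0 the map is linear and the gap vanishes.
gap_zero_intercept = linearity_gap(1.0, 4.0, 0.5, 1.5, a=0.0)

# With b >= 0 the map is monotone: x <= x' implies T(x) <= T(x').
monotone = all(T(x) <= T(x + 1.0) for x in (-3.0, 0.0, 2.5))
```

With these illustrative numbers the gap is $a(1 - \alpha - \beta) = 2(1 - 2) = -2$, while monotonicity holds because $b > 0$.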
Nonlinear functional analysis is a very broad area covering topics such as fixed points of nonlinear operators (which we touched on in a subsection of Section 6.1), nonlinear monotone operators, variational methods, and optimization of nonlinear operators. In this section, we show how variational methods and fixed point theory (in the form of dynamic programming) can be used to prove the existence of an optimum of a nonlinear operator.

6.6.1 Variational methods on infinite dimensional vector spaces

Most books of economic analysis dealing with optimization focus on finding necessary conditions for an optimum of a function defined on a given set which is a subset of a finite dimensional Euclidean space $\mathbb{R}^n$. These conditions are called first order conditions (in the case of inequality or mixed constraints they are called Kuhn-Tucker conditions). Our main focus here is the optimization of functions defined on an infinite dimensional vector space (i.e. optimization of functionals).

While we have already encountered linear functionals in Section 6.4, in this section we will consider a broader class of functionals than linear ones; we will consider continuous functionals which are concave (or convex, as the case may warrant). Our main concern is existence theory (i.e. given an optimization problem consisting in maximizing (minimizing) a concave (convex) functional over some feasible set, usually defined by constraints, we want to know whether an optimal solution can be found). Hence we will deal with sufficient conditions. In the second part of the section we also touch upon the problem of finding this optimal solution, which means stating the necessary conditions for an optimum.

Example 553 The types of problems we can consider are: the existence of a Pareto-optimal allocation of an economy with an infinite commodity space; the existence of an optimal solution of an infinite horizon growth model.
Sufficient Conditions for an Optimal Solution

In this subsection we address the fundamental question "Does a functional have a maximum (or minimum) on a given set?" In Chapter 4, we proved a very important result: the Extreme Value Theorem 262 stated that a continuous function defined on a compact subset of a metric space attains its minimum and maximum. Does this theorem apply to functionals (functions whose domain is a subset of an infinite-dimensional vector space)? Clearly the answer is yes, since a normed vector space is a metric space and dimensionality is not mentioned in the theorem at all. Consider the following example.

Example 554 Let a functional $f$ be defined on $C([0,1])$ by
$$f(x) = \int_0^{1/2} x(t)\,dt - \int_{1/2}^1 x(t)\,dt.$$
We want to solve the optimization problem $\max f(x)$ subject to $\|x\| \le 1$. To establish continuity of $f$, we need only establish boundedness, since $f$ is a linear functional and Theorem 511 establishes that boundedness is sufficient (and also necessary) for continuity. Hence
$$\begin{aligned}
|f(x)| &= \left| \int_0^{1/2} x(t)\,dt - \int_{1/2}^1 x(t)\,dt \right| \le \int_0^{1/2} |x(t)|\,dt + \int_{1/2}^1 |x(t)|\,dt = \int_0^1 |x(t)|\,dt \\
&\le \int_0^1 \sup_{t \in [0,1]} |x(t)|\,dt = \sup_{t \in [0,1]} |x(t)| \int_0^1 dt = \sup_{t \in [0,1]} |x(t)| = \|x\|.
\end{aligned}$$
Suppose now we want to find the maximum of this continuous functional on the closed unit ball in $C([0,1])$. But the maximum cannot be attained. Why? Our problem means maximizing the shaded area. See Figure 6.6.2. The steeper the middle part of $x(t)$ is, the larger is the area. But this line cannot be vertical because $x(t)$ would not be a function. The steepest line clearly does not exist.

Exercise 6.6.1 Show the non-existence of a maximum in Example 554 rigorously. Hint: Use the geometric insight provided above.

What went wrong with establishing a maximum in Example 554? We have a continuous functional on a closed unit ball that does not attain its maximum. Recall, however, from Example 459 that a closed unit ball is not a compact set in $C([0,1])$.
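The geometric argument in Example 554 can be illustrated numerically. The sketch below is not from the original text and rests on an assumed family of test functions: the continuous ramps $x_n(t) = \max\{-1, \min\{1, n(1 - 2t)\}\}$, which lie in the closed unit ball of $C([0,1])$. The names `x_n`, `f_numeric` and `f_exact` are hypothetical. A direct computation gives $f(x_n) = 1 - \frac{1}{2n}$, so the values approach the bound $\|x\| \le 1$ as the ramp steepens but never reach it along this family. This illustrates, but does not prove, the non-attainment asked for in Exercise 6.6.1.

```python
from fractions import Fraction


def x_n(n, t):
    """Continuous ramp: +1 on the left, -1 on the right, slope -2n through t = 1/2."""
    return max(-1.0, min(1.0, n * (1.0 - 2.0 * t)))


def f_numeric(n, cells=20_000):
    """Midpoint-rule estimate of f(x_n) = int_0^{1/2} x_n dt - int_{1/2}^1 x_n dt."""
    h = 1.0 / cells
    total = 0.0
    for j in range(cells):
        t = (j + 0.5) * h
        sign = 1.0 if t < 0.5 else -1.0  # the second integral enters with a minus sign
        total += sign * x_n(n, t) * h
    return total


def f_exact(n):
    """Closed form for this particular family: f(x_n) = 1 - 1/(2n)."""
    return 1 - Fraction(1, 2 * n)


# The values increase toward 1 but stay strictly below it for every n.
exact_values = [f_exact(n) for n in (1, 10, 100, 1000)]
numeric_check = f_numeric(10)
```

The exact values are computed with rational arithmetic so that the strict inequality $f(x_n) < 1$ is not blurred by rounding; the midpoint rule is only a cross-check of the closed form.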
In fact there is a theorem saying that a closed unit ball in a normed vector space $X$ is compact if and only if $X$ is finite dimensional.$^{8}$ If a "nice" set like a closed unit ball is not compact, then compactness must be an extremely restrictive assumption in infinite dimensional vector spaces. And it really is. Compact sets in infinite dimensional vector spaces do not contain interior points. Thus the Extreme Value Theorem is practically unusable in optimizing functionals.

$^{8}$ See Rudin ????

If Example 554 were formulated in $L^\infty(0,1)$, the maximum would be attained by the discontinuous function
$$x(t) = \begin{cases} 1 & \text{for } 0 \le t < \frac{1}{2} \\ -1 & \text{for } \frac{1}{2} \le t \le 1. \end{cases}$$
$L^\infty(0,1)$ is also infinite dimensional. Is this a contradiction to what we claimed above? No. There are many optimization problems in infinite dimensional vector spaces that attain optima, but an optimum cannot be guaranteed by the assumptions of continuity and compactness. To this end, we will introduce a new type of convergence in a vector space $X$. In terms of this new convergence we will define a new type of continuity and compactness so that the collection of these "new types" of compact sets is much broader than the collection of original compact sets. In particular, we will identify a class of such vector spaces in which the closed unit ball is "weakly" compact.

Semicontinuous and concave functionals

Before introducing this "new type" of convergence, we define certain properties of functionals. The concept of convexity and concavity for functionals is analogous to the one for functions.

Definition 555 Let $K$ be a convex subset of a normed vector space $X$. A functional $f : K \to \mathbb{R}$ is called: (i) Concave if for any $u, v \in K$ and for any $\alpha \in [0,1]$, $f(\alpha u + (1 - \alpha)v) \ge \alpha f(u) + (1 - \alpha) f(v)$; (ii) Convex if for any $u, v \in K$ and for any $\alpha \in [0,1]$, $f(\alpha u + (1 - \alpha)v) \le \alpha f(u) + (1 - \alpha) f(v)$.

Exercise 6.6.2 Show that $f$ is concave iff $-f$ is convex.

Exercise 6.6.3 Verify that the functional $f(x) = \int_0^1 \left(x^2(t) + |x(t)|\right) dt$ defined on $L^2[0,1]$ is convex.
Next we introduce the concept of semicontinuity of functionals (or functions, as the case may be). Why don't we simply use continuity? Recall we used the assumption of continuity in the Extreme Value Theorem to guarantee the existence of both a maximum and a minimum. Here we will show that the assumption can be weakened at the cost of guaranteeing either a maximum or a minimum.

Definition 556 A functional $f$ defined on a normed vector space $X$ is said to be: (i) upper semicontinuous (usc) at $x_0$ if given $\varepsilon > 0$, there is a $\delta > 0$ such that $f(x) - f(x_0) < \varepsilon$ for $\|x - x_0\| < \delta$; (ii) lower semicontinuous (lsc) at $x_0$ if given $\varepsilon > 0$, there is a $\delta > 0$ such that $f(x_0) - f(x) < \varepsilon$ for $\|x - x_0\| < \delta$.

Exercise 6.6.4 Show that $f$ is usc iff $-f$ is lsc.

Exercise 6.6.5 Show that $f$ is continuous at $x_0$ if and only if $f$ is both usc and lsc at $x_0$.

Exercise 6.6.6 A sequence version of the definition of semicontinuity is the following: (i) $f$ is usc at $x_0$ if for any sequence $\langle x_n \rangle$ converging to $x_0$, $\limsup_{n \to \infty} f(x_n) \le f(x_0)$; (ii) $f$ is lsc at $x_0$ if for any such sequence $\liminf_{n \to \infty} f(x_n) \ge f(x_0)$. Show that the sequence definition is equivalent to that in Definition 556.

Now the Extreme Value Theorem can be reformulated:

Theorem 557 An upper (lower) semicontinuous functional $f$ on a compact subset $K$ of a normed vector space $X$ achieves a maximum (minimum) on $K$.

Proof. Let $M = \sup_{x \in K} f(x)$ ($M$ may be $\infty$). There is a sequence $\langle x_n \rangle$ from $K$ such that $f(x_n) \to M$. Since $K$ is compact there is a convergent subsequence $\langle x_{n_k} \rangle \to x_0 \in K$. Clearly, $f(x_{n_k}) \to M$ and since $f$ is usc, $f(x_0) \ge \limsup_{k \to \infty} f(x_{n_k}) = \lim_{k \to \infty} f(x_{n_k}) = M$. Because $f$ is real valued, $f(x_0)$ is finite (so $M$ is finite), and because $M$ is the supremum on $K$ and $x_0 \in K$, $f(x_0) \le M$. Hence $f(x_0) = M$ and $f$ attains a maximum at $x_0 \in K$.

Hereafter, we will formulate our optimization problem in terms of maximization (i.e. given a functional $f$ defined on a subset $K$ of a normed vector space, find $\max_{x \in K} f(x)$). In this case the underlying assumptions for $f$ are upper semicontinuity and concavity.
The problem of finding $\min_{x \in K} g(x)$, where $g$ is lower semicontinuous and convex, can be transformed into a maximization problem by the substitution $f = -g$, because if $g$ is lsc and convex, then $-g$ is usc and concave (see Exercises 6.6.2 and 6.6.4).

Weak convergence

We assume that $X$ is a complete normed vector space (i.e. a Banach space).

Definition 558 Let $\langle x_n \rangle$ be a sequence of elements in $X$. We say that $\langle x_n \rangle$ converges weakly to $x_0 \in X$ if for every continuous linear functional $f \in X^*$ we have $\langle f, x_n \rangle \to \langle f, x_0 \rangle$. We denote weak convergence with "$\rightharpoonup$" (rather than the standard "$\to$") and use the notation $\langle x_n \rangle \rightharpoonup x_0$.

Since $\langle f, x_n \rangle$ is the value of the functional $f$ at the point $x_n$, $\langle \langle f, x_n \rangle \rangle_{n=1}^\infty$ is a sequence of real numbers. It is easy to prove that weak convergence has the usual properties, namely the following.

Exercise 6.6.7 Show that: (i) if $\langle x_n \rangle \rightharpoonup x_0$ and $\langle y_n \rangle \rightharpoonup y_0$, then $\langle x_n + y_n \rangle \rightharpoonup x_0 + y_0$; (ii) if $\langle \lambda_n \rangle$ is a sequence of real numbers with $\lambda_n \to \lambda_0$ and $\langle x_n \rangle \rightharpoonup x_0$, then $\langle \lambda_n x_n \rangle \rightharpoonup \lambda_0 x_0$; (iii) if $x_n \to x_0$, then $\langle x_n \rangle \rightharpoonup x_0$; (iv) a weakly convergent sequence has a unique limit. Hint for (iv): Apply the corollary to the Hahn-Banach Theorem to $\langle f, x - y \rangle = 0$ for each $f \in X^*$.

Since there are now two types of convergence defined on $X$, the original one (i.e. with respect to its norm) is sometimes called strong convergence as opposed to (the newly introduced) weak convergence. Property (iii) of the exercise states that if a sequence converges strongly, then it also converges weakly. This statement cannot be reversed in general. That means there are sequences that converge weakly but not strongly. We will see this in the following set of examples, where we demonstrate weak convergence in some Banach spaces.

Example 559 In the finite dimensional vector space $\mathbb{R}^n$, strong and weak convergence coincide (i.e. for $x_n \in \mathbb{R}^n$, $\langle x_n \rangle \to x_0$ iff $\langle x_n \rangle \rightharpoonup x_0$). To see this, let $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, \dots, 0)$, ..., $e_n = (0, 0, \dots, 1)$, and $\langle x_n \rangle \rightharpoonup x_0$.
Then by definition $\langle f, x_n \rangle \to \langle f, x_0 \rangle$ where $f$ is a continuous linear functional on $\mathbb{R}^n$. By Theorem 529 we know that each functional $f$ is represented by a scalar product, i.e. given $f$ there exists an element $b$ of $\mathbb{R}^n$ such that $\langle f, x \rangle = \langle b, x \rangle$, $\forall x \in \mathbb{R}^n$. (As a reminder, $\langle f, x \rangle$ denotes the value of the functional $f$ at $x$ (i.e. $f(x)$) and $\langle b, x \rangle$ is the scalar product of $b$ and $x$.) If we substitute $e_i$ for $b$ we have $\langle e_i, x_n \rangle = 0 \cdot x_n^1 + \dots + 1 \cdot x_n^i + \dots + 0 \cdot x_n^n = x_n^i \to x_0^i$, $\forall i = 1, 2, \dots, n$ (i.e. the $i$th component of the vector $x_n$ tends to the $i$th component of $x_0$). Thus weak convergence in $\mathbb{R}^n$ means convergence by components. But Theorem 223 says that then $\langle x_n \rangle \to x_0$ with respect to the norm, that is, strongly.

Example 560 Let $X = \ell^2$ and $\langle x_n \rangle \rightharpoonup x_0$. Then, as in Example 559, $\langle x_n, e_i \rangle = x_n^i \to \langle x_0, e_i \rangle = x_0^i$, $\forall i = 1, 2, \dots$ Thus weak convergence in $\ell^2$ implies that the $i$-th component of $\langle x_n \rangle$ converges to the $i$-th component of $x_0$. But as Example 234 shows, this does not imply strong convergence in $\ell^2$.

Example 561 Let $X = C([a,b])$. It can be shown that weak convergence of a sequence of continuous functions $\langle x_n \rangle \rightharpoonup x_0$ means that: (i) $\langle x_n \rangle$ is uniformly bounded (i.e. $\exists B$ such that $|x_n(t)| \le B$ for all $n = 1, 2, \dots$ and all $t \in [a,b]$); and (ii) $\langle x_n \rangle \to x_0$ pointwise on $[a,b]$ (i.e. $\forall t \in [a,b]$, $\langle x_n(t) \rangle \to x_0(t)$ as a sequence of real numbers). Thus weak convergence in $C([a,b])$ is pointwise convergence (we can say convergence by components) whereas strong convergence (convergence with respect to the sup norm) is uniform. As Examples 166 and 167 show, these two do not always coincide.

Using weak convergence allows us to define weak closedness, weak compactness, and weak continuity (or semicontinuity). We do it the same way we did in Chapter 4, where all these notions were defined in terms of sequences.

Definition 562 A subset $K \subset X$ is weakly closed if for any sequence $\langle x_n \rangle$ of elements from $K$ that converges weakly to $x_0$ (i.e.
$\langle x_n \rangle \rightharpoonup x_0$), then $x_0 \in K$.

What is the relation between strong and weak closedness? While one would expect that if a set is strongly closed then it is weakly closed, actually the reverse is true.

Theorem 563 If $K$ is weakly closed then $K$ is (strongly) closed.

Proof. Let $\langle x_n \rangle \subset K$ and $\langle x_n \rangle \to x_0$ (strongly). Then $\langle x_n \rangle \rightharpoonup x_0$, and because $K$ is weakly closed, $x_0 \in K$. Hence $K$ is (strongly) closed.

To see that Theorem 563 cannot be reversed, we present the following example.$^{9}$

Example 564 Let $M \subset \ell^2$ where $M = \{ \langle e_i \rangle_{i=1}^\infty,\ e_i = (0, 0, \dots, 1, 0, \dots),\ i = 1, 2, \dots \}$. $M$ is closed (why?) but it is not weakly closed because $\langle e_i \rangle \rightharpoonup 0$ and $0 \notin M$.

$^{9}$ We cannot give an example in $\mathbb{R}^n$ because in a finite dimensional space weak closedness and closedness, of course, coincide.

Theorem 563 and Example 564 say that weak closedness is a stronger assumption than strong closedness. Thus one should be careful in drawing conclusions.

Definition 565 A set $K \subset X$ is weakly compact if every infinite sequence from $K$ contains a subsequence that converges weakly to a point of $K$.

This mirrors Definition 192 of sequential compactness, which in metric spaces is equivalent to the standard definition of compactness by Theorem 193. After our experience with closedness, one may wonder how weak and strong compactness are related. If weak compactness were a stronger assumption than strong compactness (as in the case of closedness), there would be fewer weakly compact sets than compact sets. Then the whole idea of building the weak topology would be useless, because the main purpose of introducing the weak topology is to make closed unit balls (weakly) compact. Fortunately, this is not the case.

Theorem 566 If $K \subset X$ is (strongly) compact then it is weakly compact.

Proof. Let $\langle x_n \rangle$ be a sequence in $K$. Because $K$ is compact, there is a convergent subsequence $\langle x_{n_k} \rangle$ and $x_0 \in K$ such that $\langle x_{n_k} \rangle \to x_0$. But strong convergence implies weak convergence, so that $\langle x_{n_k} \rangle \rightharpoonup x_0$.
Theorem 566 cannot be reversed, as the next example shows.

Example 567 Let $K \subset \ell^2$ where $K = \{ \langle e_i \rangle_{i=0}^\infty,\ e_0 = (0, 0, \dots, 0, \dots),\ e_i = (0, \dots, 1, 0, \dots) \}$ (note that $K = M \cup \{0\}$ where $M$ is from Example 564). $K$ is weakly compact because any sequence from $K$ contains a weakly convergent subsequence. (To see this, let $\langle x_n \rangle$ be a sequence from $K$. For each $i = 1, 2, \dots$ the component sequence $\langle x_n^i \rangle$ lies in $\{0,1\}$, and $\{0,1\}$ is compact in $\mathbb{R}$. Hence, by a diagonal argument, there are $x_0^i$ and a subsequence $\langle x_{n_k} \rangle$ such that $x_{n_k}^i \to x_0^i$ for every $i$. Then $x_0 = (x_0^1, x_0^2, \dots, x_0^i, \dots)$ is the point such that $\langle x_{n_k} \rangle \rightharpoonup x_0$.) But $K$ is not compact because the distance between any two distinct elements of $K \setminus \{0\}$ is $\sqrt{2}$. Hence the sequence $\langle e_i \rangle$ has no convergent subsequence (with respect to the norm $\|\cdot\|_2$).

Theorem 568 If $M$ is weakly compact then $M$ is weakly closed.

Exercise 6.6.8 Prove Theorem 568.

Definition 569 Let $M \subset X$ and $f$ be a functional defined on $M$. We say that $f$ is weakly upper semicontinuous (usc) on $M$ if for any $x_0 \in M$ and any $\langle x_n \rangle_{n=1}^\infty \subset M$ such that $x_n \rightharpoonup x_0$, we have $f(x_0) \ge \limsup_{n \to \infty} f(x_n)$.

One can define weak lower semicontinuity and weak continuity of functionals in an analogous manner.

Again, there is a question about whether the assumption of weak upper semicontinuity of a functional is more restrictive than (strong) upper semicontinuity.

Theorem 570 If $f$ is a weakly usc functional on $M$, then $f$ is usc.

Proof. Let $x_0 \in M$ and $\langle x_n \rangle \subset M$ such that $\langle x_n \rangle \to x_0$. Then $x_n \rightharpoonup x_0$, and because $f$ is weakly usc, $\limsup_{n \to \infty} f(x_n) \le f(x_0)$, so that $f$ is usc.

The converse is not true, as the next example shows.

Example 571 Let $X = L^2[0,1]$ and $f(a) = 1 + \int_0^1 a^2(x)\,dx$. This functional is continuous (and hence usc) but is not weakly usc.

Exercise 6.6.9 Show that the functional in Example 571 is continuous but not weakly usc.

Now we are ready to prove an important theorem which is analogous to the Extreme Value Theorem 262 but uses the concept of the weak topology.
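The gap between weak and strong convergence in Examples 564 and 567 can be made concrete with a small computation. The sketch below is not part of the original text and works under two loudly stated assumptions: vectors of $\ell^2$ are truncated to a finite number of coordinates (`dim`), and weak convergence is probed with a single test element $y = (1, \frac{1}{2}, \frac{1}{3}, \dots)$ rather than with every functional. The helper names `unit_vector`, `inner` and `norm` are hypothetical. The pairings $\langle y, e_i \rangle = 1/i$ shrink toward 0, while the distance between distinct unit vectors stays at $\sqrt{2}$.

```python
import math


def unit_vector(i, dim):
    """Truncated e_i: 1 in coordinate i (1-based), 0 elsewhere."""
    return [1.0 if j == i else 0.0 for j in range(1, dim + 1)]


def inner(u, v):
    """Scalar product of two truncated sequences."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm, the truncated version of the l^2 norm."""
    return math.sqrt(inner(u, u))


dim = 50                                      # truncation level (assumption)
y = [1.0 / j for j in range(1, dim + 1)]      # one square-summable test element

# <y, e_i> picks out the i-th coordinate of y, which tends to 0 as i grows.
pairings = [inner(y, unit_vector(i, dim)) for i in (1, 5, 25, 50)]

# Distinct unit vectors stay sqrt(2) apart, so <e_i> has no norm-convergent subsequence.
difference = [a - b for a, b in zip(unit_vector(3, dim), unit_vector(7, dim))]
distance = norm(difference)
```

A single test element cannot establish weak convergence; the computation only shows the mechanism by which $\langle f, e_i \rangle \to 0$ for a fixed square-summable $f$ while $\|e_i - e_j\| = \sqrt{2}$ for $i \ne j$.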
A weak topology on $X$ is a topology built in terms of weak convergence instead of (strong) convergence.

Theorem 572 Let $K$ be a non-empty weakly compact subset of a Banach space $X$. Let $f$ be a weakly upper semicontinuous functional on $K$. Then $f$ attains its maximum on $K$. That is, $\exists x_0 \in K$ such that $f(x_0) = \sup_{x \in K} f(x)$.

Proof. By the supremum property, $\exists \langle x_n \rangle \subset K$ such that $\lim_{n \to \infty} f(x_n) = \sup_{x \in K} f(x)$. Since $K$ is weakly compact, there exists a subsequence $\langle x_{n_k} \rangle$ and $x_0 \in K$ such that $x_{n_k} \rightharpoonup x_0$. Because $f$ is weakly usc,
$$f(x_0) \ge \limsup_{k \to \infty} f(x_{n_k}) = \lim_{n \to \infty} f(x_n) = \sup_{x \in K} f(x).$$
Obviously $f(x_0) \le \sup_{x \in K} f(x)$ because $x_0 \in K$. Combining these two inequalities, we have $f(x_0) = \sup_{x \in K} f(x)$.

Comparing the assumptions of Theorem 572 with the Extreme Value Theorem 262, we see that in Theorem 572 one assumption is weaker (weak compactness in place of compactness) and one is stronger (weak semicontinuity in place of semicontinuity). The problem with this theorem is that its assumptions are essentially non-verifiable. How should one check weak compactness and weak semicontinuity in an infinite dimensional space? Our next step is to find sufficient and at the same time verifiable assumptions that would guarantee weak compactness of a set $K$ and weak upper semicontinuity of a functional $f$.

Let us start with weak compactness. We already know that (strong) compactness is sufficient for weak compactness, but we also know that it is too restrictive in infinite dimensional vector spaces. In order for a set $K$ to be weakly compact it has to be weakly closed (see Theorem 568). First we examine the conditions for a set $K$ to be weakly closed. Theorem 563 says that it must be closed, but that is not sufficient (see Example 564). There are, however, quite simple assumptions that guarantee weak closedness of a set $K$.

Theorem 573 If $K \subset X$ is closed and convex then it is weakly closed.

Proof. Let $\langle x_n \rangle \subset K$ and $\langle x_n \rangle \rightharpoonup x_0$. Then we need to show that $x_0 \in K$.
Assume the contrary, that is, $x_0 \notin K$. Then by Corollary 550 of the Separation Theorem, there exists a non-zero continuous linear functional $f$ such that $\langle f, x_0 \rangle < \inf_{x \in K} \langle f, x \rangle$. Let $\langle f, x_0 \rangle = c$ and $\inf_{x \in K} \langle f, x \rangle = d$, in which case $c < d$. Because $f$ is a continuous linear functional and $x_n \in K$, we have $d \le \lim_{n \to \infty} \langle f, x_n \rangle = \langle f, x_0 \rangle = c < d$. Hence $d < d$, which is the desired contradiction.

Theorem 573 says that closedness and convexity are sufficient assumptions for weak closedness. However, we are looking for sufficient assumptions for weak compactness. To make further progress, we have to restrict attention to certain classes of normed vector spaces. In Section 6.4.2 we defined a reflexive space as a space for which $X^{**} = X$ (see Definition 533). We showed that, for example, $\mathbb{R}^n$, $\ell^p$, $L^p$ for $1 < p < \infty$ are reflexive whereas $\ell^1$, $\ell^\infty$, $L^1$, $L^\infty$, $C([a,b])$ are not reflexive. From here on we will consider only reflexive normed vector spaces. Our next result is basically a Heine-Borel theorem for infinite dimensional spaces.

Theorem 574 (Eberlein-Šmuljan) A Banach space $X$ is reflexive iff any bounded weakly closed set $K \subset X$ is weakly compact.

Proof. Can be found in Aliprantis (1985, Theorem 10.13, p. 156, Positive Operators).

Thus in a reflexive Banach space, weak closedness and boundedness are sufficient assumptions for weak compactness, and in any Banach space closedness and convexity are sufficient assumptions for weak closedness. Putting all these together, we have sufficient assumptions for a set $K$ to be weakly compact in a reflexive Banach space $X$.

Theorem 575 Let $X$ be a reflexive Banach space and $K \subset X$. If $K$ is closed, bounded and convex, then $K$ is weakly compact.

Proof. Combine Theorems 573 and 574.

Notice that all the assumptions of the theorem are verifiable.

Corollary 576 In a reflexive space $X$, the closed unit ball is a weakly compact set.

Proof. $B_1(0) = \{x \in X : \|x\| \le 1\}$ is a closed, bounded, and convex subset of a reflexive space $X$. Hence it is weakly compact by Theorem 575.
Let us turn now to the assumption of a "weakly upper semicontinuous functional" and try to break it into verifiable parts. We first prove a lemma that gives us a necessary and sufficient condition for a functional $f$ to be weakly upper semicontinuous.

Lemma 577 Let $X$ be a Banach space and $K \subset X$ be weakly closed. Let a functional $f$ be defined on $K$. Then $f$ is weakly upper semicontinuous on $K$ iff $\forall a \in \mathbb{R}$, $E(a) = \{\upsilon \in K : f(\upsilon) \ge a\}$ is weakly closed.

Proof. ($\Rightarrow$) Let $f$ be weakly usc on $K$, $a \in \mathbb{R}$, and $\langle x_n \rangle_{n=1}^\infty \subset E(a)$ such that $x_n \rightharpoonup x_0$; since $K$ is weakly closed, $x_0 \in K$. Then $f(x_0) \ge \limsup_{n \to \infty} f(x_n) \ge a$ (because $x_n \in E(a)$, $\forall n$). Hence $x_0 \in E(a)$ and thus $E(a)$ is weakly closed.

($\Leftarrow$) By contradiction. Let $E(a)$ be weakly closed for every $a \in \mathbb{R}$, but $f$ not weakly usc. Then there exist $x_0 \in K$ and $\langle x_n \rangle_{n=1}^\infty \subset K$ such that $\langle x_n \rangle \rightharpoonup x_0$ and $\limsup_{n \to \infty} f(x_n) > f(x_0)$. Choose $a \in \mathbb{R}$ such that $\limsup_{n \to \infty} f(x_n) > a > f(x_0)$. Then there exists a subsequence $\langle x_{n_k} \rangle$ of $\langle x_n \rangle$ such that $x_{n_k} \in E(a)$, $k = 1, 2, \dots$ Because $E(a)$ is weakly closed and $\langle x_{n_k} \rangle \rightharpoonup x_0$, we have $x_0 \in E(a)$. Thus $f(x_0) \ge a > f(x_0)$, which is a contradiction.

Thus $f$ is weakly usc if $E(a)$ is weakly closed for every $a$. But by Theorem 573 we know that if a set is closed and convex, then it is weakly closed. When is $E(a)$ closed? Since $E(a)$ is just the inverse image of the interval $[a, \infty)$ (i.e. $E(a) = \{\upsilon \in K : f(\upsilon) \ge a\} = f^{-1}([a, \infty))$), if $f$ is continuous then the inverse image of a closed set is closed (by a modification of Theorem 200). Thus if $f$ is continuous, then $E(a)$ is closed. When is $E(a)$ convex?

Exercise 6.6.10 Show that if $f$ is concave, then $E(a)$ is convex.

Combining these two results, we have sufficient conditions for weak upper semicontinuity.

Theorem 578 A continuous, concave functional $f$ defined on a closed, convex set $K \subset X$ is weakly upper semicontinuous.

Now when we combine Theorems 575 and 578 with Theorem 572 we get a theorem that guarantees the existence of a maximum and all of whose assumptions are "easily" verifiable.
Theorem 579 Let $K$ be a non-empty, convex, closed and bounded subset of a reflexive Banach space. Let $f$ be a continuous and concave functional defined on $K$. Then $f$ attains its maximum on $K$ (i.e. $\exists x^* \in K$ such that $f(x^*) = \sup_{x \in K} f(x)$).

First we note that minimization requires convexity of the functional $f$ instead of concavity, while all other assumptions are the same. Second, we want to stress that this is a nonlinear optimization problem. This means the functional $f$ does not have to be linear (which is quite restrictive). The functional $f$ simply has to be continuous and concave in the case of maximization and continuous and convex in the case of minimization.

Let us summarize what we have done in this section. In infinite dimensional vector spaces the original Extreme Value Theorem 262 (requiring (semi)continuity of a functional and compactness of a set), which guarantees the existence of an optimum, cannot be used since the assumption of compactness is too stringent (compact sets do not contain interior points). By introducing the weak topology on $X$ we define weak semicontinuity (an assumption that is stronger than semicontinuity) and weak compactness (an assumption that is weaker than compactness). We also must enlist the extra assumptions of concavity (or convexity) of a functional and reflexivity of the space $X$. Then we showed that with these modified assumptions an analogue of the Extreme Value Theorem holds, and this version "covers" more optimization problems (e.g. optimizing over unit balls).

6.6.2 Dynamic Programming

An important and frequently used example of operators is dynamic programming. In infinite horizon problems, dynamic programming turns the problem of finding an infinite sequence (or plan) describing the evolution of a vector of (endogenous) state variables into simply choosing a single vector value for the state variables and finding the solution to a functional equation.
More specifically, suppose the primitives of the problem are as follows. Let $X$ denote the set of possible values of the (endogenous) state variables, with typical element $x$. We will assume that $X \subset \mathbb{R}^n$ is compact and convex. Let $\Gamma : X \rightrightarrows X$ be the constraint correspondence describing feasible values for the endogenous state variable. We will assume $\Gamma$ is nonempty-valued, compact-valued, and continuous. Let $G = \{(x, y) \in X \times X : y \in \Gamma(x)\}$ denote the graph of $\Gamma$. Let $r : G \to \mathbb{R}$ denote the per-period objective or return function, which we assume is continuous. Finally, let $\beta \in (0,1)$ denote the discount factor. Thus, the "givens" for the problem are $X, \Gamma, r, \beta$. In this section we establish under what conditions solutions to the functional equation
$$v(x_0) = \max_{y \in \Gamma(x_0)} \left[ r(x_0, y) + \beta v(y) \right] \tag{FE}$$
"solve" the sequence problem that is our ultimate objective,
$$\max_{\langle x_{t+1} \rangle_{t=0}^\infty} \sum_{t=0}^\infty \beta^t r(x_t, x_{t+1}) \quad \text{s.t. } x_{t+1} \in \Gamma(x_t),\ \forall t, \text{ and } x_0 \text{ given.} \tag{SP}$$

First we should establish what we mean by "solve". To begin with, we need to know that (SP) is well defined. That is, we must establish conditions under which the feasible set is nonempty and the objective function is well defined for all points in the feasible set. To accomplish this, we need to introduce some more notation. Call the sequence $\langle x_t \rangle$ a plan. Given $x_0 \in X$, let
$$F(x_0) = \{ \langle x_t \rangle_{t=0}^\infty : x_{t+1} \in \Gamma(x_t),\ t = 0, 1, \dots \}$$
be the set of feasible plans from $x_0$, with typical element $\chi = (x_0, x_1, \dots) \in F(x_0)$. Let $\psi_k : F(x_0) \to \mathbb{R}$ be given by
$$\psi_k(\chi) = \sum_{t=0}^k \beta^t r(x_t, x_{t+1}),$$
which is simply the discounted partial sum of returns from any feasible plan $\chi$. Finally, let $\psi : F(x_0) \to \mathbb{R}$ be given by $\psi(\chi) = \lim_{k \to \infty} \psi_k(\chi)$. While $\psi_k$ is obviously well defined, $\psi$ may not be, since there may be $\chi$ such that $\psi(\chi) = \pm\infty$.$^{10}$

$^{10}$ More generally, Stokey and Lucas (1989) consider $\psi$ in the extended reals.

The assumption that $\Gamma(x) \ne \emptyset$, $\forall x \in X$, ensures that $F(x_0)$ is nonempty for all $x_0 \in X$.
The assumptions that $X$ is compact, that $\Gamma$ is compact-valued and continuous, and that $r$ is continuous guarantee $|r(x_t, x_{t+1})| \le M < \infty$, so that since $\beta \in (0,1)$ we have $|\psi(\chi)| \le \frac{M}{1 - \beta} < \infty$, $\forall \chi \in F(x_0)$, $\forall x_0$. Hence (SP) is well defined and we can define the function $v^* : X \to \mathbb{R}$ given by
$$v^*(x_0) = \max_{\chi \in F(x_0)} \psi(\chi), \tag{SP$'$}$$
which is just (SP). Thus by "solve" we mean that $v^*(x_0)$ defined in (SP$'$) is equal to $v(x_0)$ defined in (FE).

Before providing conditions under which a solution to (FE) implies a "solution" to (SP), we note the following consequences of the maximum function defined in (SP$'$). In particular, by Definition 96 we have
$$v^*(x_0) \ge \psi(\chi),\ \forall \chi \in F(x_0) \tag{6.16}$$
and
$$\forall \varepsilon > 0,\ v^*(x_0) < \psi(\chi) + \varepsilon, \text{ for some } \chi \in F(x_0). \tag{6.17}$$
Similarly, $v$ satisfies (FE) if
$$v(x) \ge r(x, y) + \beta v(y),\ \forall y \in \Gamma(x) \tag{6.18}$$
and
$$\forall \varepsilon > 0,\ v(x) < r(x, y) + \beta v(y) + \varepsilon, \text{ for some } y \in \Gamma(x). \tag{6.19}$$

Now we are ready to prove our main result: if we have a solution to (FE), then we have a solution to (SP).

Theorem 580 If $v$ is a solution to (FE) and satisfies
$$\lim_{k \to \infty} \beta^k v(x_k) = 0,\ \forall \langle x_t \rangle \in F(x_0),\ \forall x_0 \in X, \tag{6.20}$$
then $v = v^*$.

Proof. It suffices to show that if (6.18) and (6.19) hold, then (6.16) and (6.17) are satisfied. Inequality (6.18) implies that $\forall \chi \in F(x_0)$,
$$\begin{aligned}
v(x_0) &\ge r(x_0, x_1) + \beta v(x_1) \\
&\ge r(x_0, x_1) + \beta [r(x_1, x_2) + \beta v(x_2)] \\
&\ge \psi_k(\chi) + \beta^{k+1} v(x_{k+1}),\quad k = 1, 2, \dots
\end{aligned}$$
Taking the limit as $k \to \infty$ and using (6.20), we have (6.16).

Fix $\varepsilon > 0$ and choose $\langle \delta_t \rangle_{t=1}^\infty \subset \mathbb{R}_+$ such that $\sum_{t=1}^\infty \beta^{t-1} \delta_t \le \varepsilon$. Inequality (6.19) implies there exist $x_1 \in \Gamma(x_0)$, $x_2 \in \Gamma(x_1)$, ... so that $v(x_t) \le r(x_t, x_{t+1}) + \beta v(x_{t+1}) + \delta_{t+1}$, $t = 0, 1, \dots$ Then
$$\begin{aligned}
v(x_0) &\le r(x_0, x_1) + \beta v(x_1) + \delta_1 \\
&\le r(x_0, x_1) + \beta [r(x_1, x_2) + \beta v(x_2) + \delta_2] + \delta_1 \\
&\le \psi_k(\chi) + \beta^{k+1} v(x_{k+1}) + \sum_{t=1}^{k+1} \beta^{t-1} \delta_t,\quad k = 1, 2, \dots
\end{aligned}$$
Taking the limit as $k \to \infty$ and using (6.20), we have (6.17).

Next we establish that a feasible plan which satisfies (FE) is an optimal plan in the sense of (SP).

Theorem 581 Let $\chi^*$
$\in F(x_0)$ be a feasible plan from $x_0$ which satisfies the functional equation
$$v^*(x_t^*) = r(x_t^*, x_{t+1}^*) + \beta v^*(x_{t+1}^*) \tag{6.21}$$
with
$$\limsup_{t \to \infty} \beta^t v^*(x_t^*) \le 0. \tag{6.22}$$
Then $\chi^*$ attains the maximum in (SP) for initial state $x_0$.

Proof. It follows by an induction on (6.21) that
$$\begin{aligned}
v^*(x_0) &= r(x_0, x_1^*) + \beta v^*(x_1^*) \\
&= r(x_0, x_1^*) + \beta [r(x_1^*, x_2^*) + \beta v^*(x_2^*)] \\
&= \psi_k(\chi^*) + \beta^{k+1} v^*(x_{k+1}^*),\quad k = 1, 2, \dots
\end{aligned}$$
Then as $k \to \infty$, using (6.22), we have $v^*(x_0) \le \psi(\chi^*)$. Since $\chi^* \in F(x_0)$, we have $v^*(x_0) \ge \psi(\chi^*)$ by (6.16). Thus $\chi^*$ attains the maximum.

Now that we have established that a solution $v$ to (FE) is a solution to the (SP) problem we are interested in, we set out to establish the existence of a solution to (FE). Since $r$ is a real valued, bounded, and continuous function, it makes sense to look for solutions in the space of continuous bounded functions $C(X)$ with the sup norm $\|v\| = \sup\{|v(x)|, x \in X\}$ studied in Section 6.1. Furthermore, given a solution $v \in C(X)$, we can define the policy correspondence $\gamma : X \rightrightarrows X$ by
$$\gamma(x) = \{y \in \Gamma(x) : v(x) = r(x, y) + \beta v(y)\}. \tag{6.23}$$
This generates a plan since, given $x_0$, we can choose $x_1 \in \gamma(x_0)$, $x_2 \in \gamma(x_1)$, ...

To this end, we define an operator $T : C(X) \to C(X)$ given by
$$(Tf)(x) = \max_{y \in \Gamma(x)} [r(x, y) + \beta f(y)]. \tag{6.24}$$
In this case (FE) becomes $v = Tv$. That is, all we must establish is that $T$ has a unique fixed point in $C(X)$. Before actually doing that, we provide a simple set of sufficient conditions to establish that a given operator is a contraction.

Lemma 582 (Blackwell's sufficient conditions for a contraction) Let $X \subset \mathbb{R}^n$ and $B(X, \mathbb{R})$ be the space of bounded functions $f : X \to \mathbb{R}$ with the sup norm. Let $T : B(X, \mathbb{R}) \to B(X, \mathbb{R})$ be an operator satisfying: (i) (monotonicity) $f, \tilde{f} \in B(X, \mathbb{R})$ and $f(x) \le \tilde{f}(x)$, $\forall x \in X$, implies $(Tf)(x) \le (T\tilde{f})(x)$, $\forall x \in X$; (ii) (discounting) $\exists \rho \in (0,1)$ such that $[T(f + a)](x) \le (Tf)(x) + \rho a$, for all $f \in B(X, \mathbb{R})$, $a \ge 0$, $x \in X$.$^{11}$ Then $T$ is a contraction with modulus $\rho$.

$^{11}$ Note $(f + a)(x) = f(x) + a$.

Proof. For any $f, \tilde{f} \in B(X, \mathbb{R})$, $f \le \tilde{f} + \|f - \tilde{f}\|$, where we write $f \le \tilde{f}$ if $f(x) \le \tilde{f}(x)$, $\forall x \in X$. Then
$$Tf \le T\left(\tilde{f} + \|f - \tilde{f}\|\right) \le T\tilde{f} + \rho \|f - \tilde{f}\|,$$
where the first inequality follows from (i) and the second from (ii). Reversing the roles of $f$ and $\tilde{f}$ gives $T\tilde{f} \le Tf + \rho \|\tilde{f} - f\|$. Combining both inequalities gives $\|Tf - T\tilde{f}\| \le \rho \|f - \tilde{f}\|$.

Now we have the second main theorem of this section.

Theorem 583 In $(C(X), \|\cdot\|)$, $T$ given in (6.24) has the following properties: $T(C(X)) \subset C(X)$; $Tv = v$ for a unique $v \in C(X)$; and $\forall v_0 \in C(X)$,
$$\|v - T^n v_0\| \le \frac{\beta^n}{1 - \beta} \|Tv_0 - v_0\|,\quad n = 0, 1, 2, \dots \tag{6.25}$$
Furthermore, given $v$, the optimal policy correspondence $\gamma : X \rightrightarrows X$ defined in (6.23) is compact-valued and u.h.c.

Proof. For each $f \in C(X)$ and $x \in X$, the problem in (6.24) is to maximize a continuous function $[r(x, \cdot) + \beta f(\cdot)]$ on a compact set $\Gamma(x)$. Hence, by the Extreme Value Theorem 262, the maximum is attained. Since both $r$ and $f$ are bounded, $Tf$ is also bounded. Since $r$ and $f$ are continuous and $\Gamma$ is compact-valued and continuous, it follows from the Theorem of the Maximum 295 that $Tf$ is continuous. Hence $T(C(X)) \subset C(X)$.

It is clear that $T$ satisfies Blackwell's sufficient conditions for a contraction (Lemma 582) since: (i) for $f, \tilde{f} \in C(X)$ with $f(x) \le \tilde{f}(x)$, $\forall x \in X$, by $T$ given in (6.24) we have
$$(Tf)(x) \le (T\tilde{f})(x) \iff \max_{y \in \Gamma(x)} [r(x, y) + \beta f(y)] \le \max_{y \in \Gamma(x)} \left[ r(x, y) + \beta \tilde{f}(y) \right];$$
and (ii)
$$T(f + a)(x) = \max_{y \in \Gamma(x)} [r(x, y) + \beta (f(y) + a)] = \max_{y \in \Gamma(x)} [r(x, y) + \beta f(y)] + \beta a = (Tf)(x) + \beta a.$$
Since $C(X)$ is a complete normed vector space by Theorem 452 and $T$ is a contraction, $T$ has a unique fixed point $v \in C(X)$ by the Contraction Mapping Theorem 306, which satisfies (6.25).

The properties of $\gamma$ follow from the Theorem of the Maximum 295.

If we want to say more about $v$ (and $\gamma$), we need to impose more structure on the primitives. The next theorem illustrates this.
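Before moving on, the contraction property in Theorem 583 and the error bound (6.25) can be illustrated by value function iteration on a toy problem. Everything specific in the sketch below is an assumption made for illustration and does not come from the text: the state space is the finite grid $\{0, 1, \dots, 10\}$ (compact but not convex, so it only approximates the setting above), the constraint correspondence is $\Gamma(x) = \{0, \dots, x\}$, the return is $r(x, y) = \sqrt{x - y}$, and $\beta = 0.9$. The names `bellman`, `feasible`, `reward` and `sup_dist` are hypothetical.

```python
import math

BETA = 0.9                      # discount factor (illustrative value)
STATES = list(range(11))        # finite state space X = {0, 1, ..., 10} (assumption)


def feasible(x):
    """Constraint correspondence Gamma(x) = {0, ..., x}; nonempty for every x."""
    return range(x + 1)


def reward(x, y):
    """Per-period return r(x, y) = sqrt(x - y), bounded on the graph of Gamma."""
    return math.sqrt(x - y)


def bellman(f):
    """Apply (Tf)(x) = max over y in Gamma(x) of r(x, y) + BETA * f(y)."""
    return [max(reward(x, y) + BETA * f[y] for y in feasible(x)) for x in STATES]


def sup_dist(f, g):
    """Sup-norm distance between two functions on the grid."""
    return max(abs(a - b) for a, b in zip(f, g))


v0 = [0.0] * len(STATES)
iterates = [v0]
for _ in range(300):
    iterates.append(bellman(iterates[-1]))
v = iterates[-1]                # numerical stand-in for the fixed point of T

# Right-hand side of (6.25) uses ||T v0 - v0||.
first_step = sup_dist(iterates[1], v0)
bound_holds = all(
    sup_dist(v, iterates[n]) <= BETA ** n / (1 - BETA) * first_step + 1e-9
    for n in range(100)
)
```

The fixed point is only approximated by the 300th iterate, so the comparison with (6.25) carries a small numerical tolerance. On a finite grid the maximum in the Bellman operator is over finitely many points, which avoids any numerical optimization error.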
Theorem 584 For each $y$, let $r(\cdot,y)$ be strictly increasing in each of its first $n$ arguments and let $\Gamma$ be monotone in the sense that $x \le x'$ implies $\Gamma(x) \subset \Gamma(x')$. Then $v$ given by the solution to (FE) is strictly increasing.

Proof. Let $\hat C(X) \subset C(X)$ be the set of bounded, continuous, nondecreasing functions and $\tilde C(X) \subset \hat C(X)$ be the set of bounded, continuous, strictly increasing functions. Since $\hat C(X)$ is a closed subset of the Banach space $C(X)$, Corollary 307 and Theorem 583 imply it is sufficient to show $T(\hat C(X)) \subset \tilde C(X)$, which is guaranteed by the assumptions on $r$ and $\Gamma$.

Existence of solutions with unbounded returns

As we mentioned in the introduction, one application of dynamic optimization in infinite dimensional spaces is the growth model. In that case the endogenous state variable is capital, denoted $k_t$ at any point in time $t = 0,1,2,\dots$, with $k_t \in \mathbb{R}_+$ and $k_0 > 0$ given. It is typically not the case that we assume $k_t$ lies in a compact set, which is very different from the assumptions of the previous section. That is, the previous section relied heavily on the fact that the return function was bounded (so that we could work in the space of bounded functions).

To address this problem, here we will consider a specific example. There is a linear production technology where output $y_t = A k_t$, $A > 0$. Capital depreciates over a period at rate $\delta > 0$. Assume that $\tilde A = A + (1-\delta) > 0$. A household is risk neutral (i.e. $u(c_t) = c_t$ where $c_t$ denotes consumption at time $t$) and discounts the future at rate $\beta$. Assume that $\beta^{-1} > \tilde A$. Since utility is strictly increasing in consumption, there is no free disposal and the budget constraint implies that $c_t = A k_t + (1-\delta)k_t - k_{t+1}$. Hence the household's reward function is given by $r(k_t,k_{t+1}) \equiv u(c_t) = \tilde A k_t - k_{t+1}$. The problem of the household is to choose a sequence of capital stocks to maximize the present discounted value of future rewards, which is just
\[ v^*(k_0) = \sup_{\langle k_{t+1} \rangle} \sum_{t=0}^{\infty} \beta^t \left[\tilde A k_t - k_{t+1}\right] \tag{SP} \]
s.t. $0 \le k_{t+1} \le \tilde A k_t$ and $k_0 > 0$ given. We will attack this problem in several steps.

1. Show that the constraint correspondence $\Gamma : \mathbb{R}_+ \to \mathbb{R}_+$ given by $\Gamma(k_t) = \{k_{t+1} \in \mathbb{R}_+ : k_{t+1} \in [0, \tilde A k_t]\}$ is nonempty, compact-valued, continuous, and for any $k_t \in \mathbb{R}_+$, $k_{t+1} \in \Gamma(k_t)$ implies $\lambda k_{t+1} \in \Gamma(\lambda k_t)$ for $\lambda \ge 0$. Furthermore, show that for some $\alpha \in (0,\beta^{-1})$,
\[ k_{t+1} \le \alpha k_t, \quad \forall k_t \in \mathbb{R}_+ \text{ and } \forall k_{t+1} \in \Gamma(k_t). \tag{6.26} \]
Nonempty: $k_{t+1} = 0 \in \Gamma(k_t)$. Compact: By Heine-Borel, it is sufficient to check that $\Gamma(k_t)$ is closed and bounded. Boundedness follows since, given $k_t$, $k_{t+1} \in [0, \tilde A k_t]$. To see $\Gamma(k_t)$ is closed, suppose $\langle x_n \rangle$ is a sequence such that $x_n \in \Gamma(k_t)$ and $x_n \to x$. But $x_n \in \Gamma(k_t)$ implies $x_n \in [0, \tilde A k_t]$ and $x_n \to x$ implies $x \in [0, \tilde A k_t]$. Thus $x \in \Gamma(k_t)$ so that $\Gamma(k_t)$ is closed. Continuous: One way to establish this is to show $\Gamma$ is uhc and lhc. On the other hand, it is clear that since the upper endpoint is linear in $k_t$, it is continuous in $k_t$ and hence $\Gamma(k_t)$ is continuous. Homogeneity: If $k_{t+1} \in \Gamma(k_t)$, then $0 \le k_{t+1} \le \tilde A k_t$. Multiplying by $\lambda$ implies $0 \le \lambda k_{t+1} \le \lambda \tilde A k_t = \tilde A \lambda k_t$, or $\lambda k_{t+1} \in \Gamma(\lambda k_t)$. Existence of $\alpha$: Since $k_{t+1} \in \Gamma(k_t)$, then $k_{t+1} \le \tilde A k_t$. Hence just take $\alpha = \tilde A$. By assumption, $\tilde A < \beta^{-1}$ so $\alpha \in (0,\beta^{-1})$.

2. Show that the conditions you proved in part 1 imply that for any $k_0$, $k_t \le \alpha^t k_0$, $\forall t$ and for all feasible plans $\langle k_{t+1} \rangle \in F(k_0) = \{\langle k_{t+1} \rangle : k_{t+1} \in \Gamma(k_t), t = 0,1,\dots\}$, the set of plans feasible from $k_0$. To see this, from part 1,
\[ k_{t+1} \le \tilde A k_t = \alpha k_t \le \alpha(\alpha k_{t-1}) = \alpha^2 k_{t-1} \le \dots \le \alpha^{t+1} k_0. \]
Notice that $k_t$ can be growing over time, though at a rate less than $\beta^{-1}$.

3. Let $G = \{(k_t,k_{t+1}) \in \mathbb{R}_+ \times \mathbb{R}_+ : k_{t+1} \in \Gamma(k_t)\}$. Show that $r : G \to \mathbb{R}_+$ is continuous and homogeneous of degree one. Show that $\tilde A k_t - k_{t+1} \ge 0$, $\forall t$, and that $\exists B \in (0,\infty)$ such that
$\tilde A k_t - k_{t+1} \le B(k_t + k_{t+1}), \quad \forall (k_t,k_{t+1}) \in G.$
(6.27)

Given that $(k_t,k_{t+1}) \in \mathbb{R}_+ \times \mathbb{R}_+$, (6.27) assures a uniform bound on the ratio of the return function $r$ and the norm of its arguments. Continuous: We must show that $\forall \varepsilon > 0$, $\exists \delta(k_t,k_{t+1},\varepsilon) > 0$ such that if
\[ \sqrt{(k'_t - k_t)^2 + (k'_{t+1} - k_{t+1})^2} < \delta, \tag{6.28} \]
then
\[ \left| \left(\tilde A k'_t - k'_{t+1}\right) - \left(\tilde A k_t - k_{t+1}\right) \right| < \varepsilon. \tag{6.29} \]
But
\[ \left| \left(\tilde A k'_t - k'_{t+1}\right) - \left(\tilde A k_t - k_{t+1}\right) \right| \le \tilde A \left|k'_t - k_t\right| + \left|k'_{t+1} - k_{t+1}\right| \]
by the triangle inequality. If (6.28) is satisfied, then
\[ \tilde A \left|k'_t - k_t\right| \le \tilde A \sqrt{(k'_t - k_t)^2 + (k'_{t+1} - k_{t+1})^2} < \tilde A \delta \]
and
\[ \left|k'_{t+1} - k_{t+1}\right| \le \sqrt{(k'_t - k_t)^2 + (k'_{t+1} - k_{t+1})^2} < \delta. \]
Hence let $\delta = \min\left\{\frac{\varepsilon}{2}, \frac{\varepsilon}{2\tilde A}\right\}$, so that each of the two terms is less than $\varepsilon/2$. Homogeneity: $r(\lambda k_t, \lambda k_{t+1}) = \tilde A \lambda k_t - \lambda k_{t+1} = \lambda\left[\tilde A k_t - k_{t+1}\right] = \lambda r(k_t,k_{t+1})$. Nonnegative returns: Since $k_{t+1} \in [0, \tilde A k_t]$, we know $\tilde A k_t - k_{t+1} \ge 0$, $\forall t$. Boundedness: Inequality (6.27) is established since
\[ \tilde A k_t - k_{t+1} \le \tilde A k_t + k_{t+1} \le \max\{\tilde A, 1\}(k_t + k_{t+1}), \quad \forall (k_t,k_{t+1}) \in G, \]
where $B = \max\{\tilde A, 1\}$.

4. Show that the conditions you proved in the previous parts imply that for any $k_0$ and $\forall \langle k_{t+1} \rangle \in F(k_0)$,
\[ \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t \left[\tilde A k_t - k_{t+1}\right] \]
exists. This, along with the prior conditions you have proven, establishes that a solution to (SP) satisfies the functional equation
\[ v(k_t) = \sup_{k_{t+1} \in \Gamma(k_t)} \left\{ \left[\tilde A k_t - k_{t+1}\right] + \beta v(k_{t+1}) \right\}. \tag{FE} \]
We start by noting
\[ \sum_{t=0}^{n} \beta^t \left[\tilde A k_t - k_{t+1}\right] \le \sum_{t=0}^{n} \beta^t B\left[k_t + k_{t+1}\right] \le B \sum_{t=0}^{n} \beta^t \left[\alpha^t k_0 + \alpha^{t+1} k_0\right] = B k_0 (1+\alpha) \sum_{t=0}^{n} (\alpha\beta)^t \le \frac{B k_0 (1+\alpha)}{1 - \alpha\beta} \]
where the first inequality follows from part 3, the second from part 2, and the third since $\alpha\beta < 1$ from part 1. Since $\sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}]$ is nondecreasing in $n$ (each term is nonnegative by part 3) and bounded, the limit exists.

5. Show that $v^*$ defined in (SP) is homogeneous of degree one (i.e. $v^*(\theta k_0) = \theta v^*(k_0)$) and that for some $\eta \in (0,\infty)$, $|v^*(k_0)| \le \eta k_0$, for any $k_0$.
To see this, as in part 2, consider $\langle k_{t+1} \rangle \in F(k_0)$ and let $u(\langle k_{t+1} \rangle) = \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}]$. Then $v^*(k_0) = \sup_{\langle k_{t+1} \rangle \in F(k_0)} u(\langle k_{t+1} \rangle)$. For $\theta > 0$, $\theta k_0 \in \mathbb{R}_+$ since $\mathbb{R}_+$ is a convex cone. Furthermore, $k_1 \in \Gamma(k_0) \Rightarrow \theta k_1 \in \Gamma(\theta k_0)$ as established in part 1. Continuing in this fashion we can show that $\forall \langle k_{t+1} \rangle \in F(k_0)$, $\langle \theta k_{t+1} \rangle \in F(\theta k_0)$. Homogeneity: For $\theta > 0$,
\[ v^*(\theta k_0) = \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} u(\langle \theta k_{t+1} \rangle) = \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} \left\{ \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t \left[\tilde A \theta k_t - \theta k_{t+1}\right] \right\} = \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} \left\{ \lim_{n\to\infty} \theta \sum_{t=0}^{n} \beta^t \left[\tilde A k_t - k_{t+1}\right] \right\} = \theta v^*(k_0). \]
Boundedness: Let $D = B(1+\alpha)/(1-\alpha\beta) > 0$. Then $\forall \langle k_{t+1} \rangle \in F(k_0)$, it was shown in part 4 that
\[ |u(\langle k_{t+1} \rangle)| = \left| \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t \left[\tilde A k_t - k_{t+1}\right] \right| \le D k_0. \]
Thus,
\[ |v^*(k_0)| = \left| \sup_{\langle k_{t+1} \rangle \in F(k_0)} u(\langle k_{t+1} \rangle) \right| \le \sup_{\langle k_{t+1} \rangle \in F(k_0)} |u(\langle k_{t+1} \rangle)| \le D k_0. \]

6. As a converse to the above results, next consider seeking solutions to (FE) in the space of functions (denoted $H(\mathbb{R}_+)$) that are continuous, homogeneous of degree 1, and bounded in the sense that if $f \in H(\mathbb{R}_+)$, then $\sup_{k_t > 0} \frac{|f(k_t)|}{k_t} < \infty$. This notion of boundedness is consistent with Definition 522. Endow the space with the operator norm, which by Theorem 518 can be of the form
\[ \|f\| = \sup\left\{ \frac{|f(k_t)|}{k_t}, \; k_t \in \mathbb{R}_+, \; k_t \ne 0 \right\}. \tag{6.30} \]
We must verify that $\|\cdot\|$ on the vector space $H(\mathbb{R}_+)$ satisfies the properties of a norm and that $(H(\mathbb{R}_+), \|\cdot\|)$ is complete. Normed Vector Space: We must show: (a) $\|f\| \ge 0$ with equality iff $f = 0$; (b) $\|af\| = |a| \cdot \|f\|$; and (c) $\|f + f'\| \le \|f\| + \|f'\|$. Complete: Consider any Cauchy sequence $\langle f_n \rangle$ in $H(\mathbb{R}_+)$. For all $x \in \mathbb{R}_+$, $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$ and thus has a limit. Define $f : \mathbb{R}_+ \to \mathbb{R}$ by $f(x) = \lim_{n\to\infty} f_n(x)$, $\forall x \in \mathbb{R}_+$. We must show that $f \in H(\mathbb{R}_+)$ by verifying that (i) $f$ is homogeneous of degree 1; (ii) $f$ is bounded in the sense that $\sup_{x > 0} \frac{|f(x)|}{x} < \infty$; and (iii) $f$ is continuous.
Starting with (i), for $\lambda > 0$ and $x \in \mathbb{R}_+$, since $f_n \in H(\mathbb{R}_+)$ $\forall n$ we have $f(\lambda x) = \lim_{n\to\infty} f_n(\lambda x) = \lim_{n\to\infty} \lambda f_n(x) = \lambda f(x)$. Next, for (ii) we know that $f$ is bounded since $\langle \|f_n\| \rangle$ is a convergent sequence in $\mathbb{R}$ and thus bounded (i.e. $\|f\| \le \|f_N\| + 1$ for some $N \in \mathbb{N}$). Finally, for (iii) note that $\langle f_n \rangle \to f$ uniformly by the Cauchy criterion and that $H(\mathbb{R}_+) \subset C(\mathbb{R}_+)$. But uniform convergence of continuous functions implies the limit function is continuous.

7. Show that for any $v \in H(\mathbb{R}_+)$, $\lim_{t\to\infty} \beta^t v(k_t) = 0$, which establishes the conditions necessary to prove that a solution to (FE) implies a solution to (SP). It follows directly from parts 2 and 6 that for any $f \in H(\mathbb{R}_+)$, $|f(k_t)| \le k_t \cdot \|f\| \le \alpha^t k_0 \|f\|$, so that $|\beta^t v(k_t)| \le (\alpha\beta)^t k_0 \|v\|$ and, since $\alpha\beta < 1$, $\lim_{t\to\infty} \beta^t v(k_t) = 0$.

8. Define an operator $T$ on $H(\mathbb{R}_+)$ by
\[ (Tf)(k) = \sup_{k' \in \Gamma(k)} \left[\tilde A k - k' + \beta f(k')\right]. \tag{6.31} \]
Show that $T$ maps functions in $H(\mathbb{R}_+)$ to functions in $H(\mathbb{R}_+)$. Continuity: Since $r$ is continuous by part 3, $f$ is continuous because $f \in H(\mathbb{R}_+)$, and $\Gamma$ is compact-valued and continuous by part 1, we know $Tf$ is continuous by the Theorem of the Maximum. Boundedness: Since $r$ is bounded in the sense of part 3 and $f \in H(\mathbb{R}_+)$, $Tf$ is bounded. Homogeneity: For $\theta > 0$,
\[ (Tf)(\theta k) = \sup_{\theta k' \in \Gamma(\theta k)} \left[\tilde A \theta k - \theta k' + \beta f(\theta k')\right] = \theta \sup_{\theta k' \in \Gamma(\theta k)} \left[\tilde A k - k' + \beta f(k')\right] = \theta \sup_{k' \in \Gamma(k)} \left[\tilde A k - k' + \beta f(k')\right] = \theta (Tf)(k) \]
since $\theta k' \in \Gamma(\theta k) \iff k' \in \Gamma(k)$ by part 1.

9. Show that $T$ satisfies Blackwell's sufficient conditions for a contraction (and hence there exists a unique fixed point of (FE), $v = Tv$).

6.7 Appendix - Proofs for Chapter 6

Proof of Dini's Theorem 453. Let $\langle f_n \rangle$ be decreasing, $f_n \to f$ pointwise, and define $\bar f_n = f_n - f$. Then $\langle \bar f_n \rangle$ is a decreasing sequence of non-negative functions with $\bar f_n \to 0$ pointwise. For a given $x \in X$ and $\varepsilon > 0$, $\exists N(\varepsilon,x)$ such that $0 \le \bar f_{N(\varepsilon,x)}(x) < \varepsilon$. Since $\bar f_{N(\varepsilon,x)}$ is continuous, $\exists \delta(x)$ such that $0 \le \bar f_{N(\varepsilon,x)}(x') < \varepsilon$ for all $x' \in B_{\delta(x)}(x)$. Since $\bar f_n$ is decreasing, $0 \le \bar f_n(x') < \varepsilon$ for all $n \ge N(\varepsilon,x)$ whenever $x' \in B_{\delta(x)}(x)$.
Since the collection $\{B_{\delta(x)}(x), x \in X\}$ is an open covering of $X$ and $X$ is compact, there exists a finite subcovering of $X$ (i.e. $X = \cup_{i=1}^{k} B_{\delta(x_i)}(x_i)$). Define $N(\varepsilon) = \max_{i=1,\dots,k} \{N(\varepsilon,x_i)\}$, which is well defined since $N(\varepsilon)$ is just the maximum of a finite set. For a given $\varepsilon$, we found $N(\varepsilon)$ such that $0 \le \bar f_n(x) < \varepsilon$ for all $n \ge N(\varepsilon)$ and for all $x \in X$ (i.e. $\bar f_n \to 0$ uniformly so that $f_n \to f$ uniformly).

Proof of Lemma 456. ($\Leftarrow$) This direction is apparent. ($\Rightarrow$) Let $D \subset C(X)$ be equicontinuous. Then given $x \in X$ and $\varepsilon > 0$, $\exists \delta(x, \frac{\varepsilon}{2}) > 0$ such that $|h(x') - h(x)| < \frac{\varepsilon}{2}$ for all $x'$ such that $d_X(x,x') < \delta(x, \frac{\varepsilon}{2})$ and for all $h \in D$. The collection of open balls $\{B_{\frac{1}{2}\delta(x, \frac{\varepsilon}{2})}(x), x \in X\}$ is an open covering of $X$ and, since $X$ is compact, there exist finitely many $x_1,\dots,x_k$ such that $\{B_{\frac{1}{2}\delta(x_i, \frac{\varepsilon}{2})}(x_i), i = 1,\dots,k\}$ covers $X$. Let $\delta \equiv \frac{1}{2} \min\{\delta(x_i, \frac{\varepsilon}{2}), i = 1,\dots,k\}$. For $x \in X$, $\exists i$ such that $x \in B_{\frac{1}{2}\delta(x_i, \frac{\varepsilon}{2})}(x_i)$. Let $y \in X$ be such that $d_X(x,y) \le \delta$. Then
\[ d_X(y,x_i) \le d_X(y,x) + d_X(x,x_i) < \delta + \tfrac{1}{2}\delta(x_i, \tfrac{\varepsilon}{2}) \le \delta(x_i, \tfrac{\varepsilon}{2}). \]
Therefore for any $h \in D$,
\[ |h(x) - h(y)| \le |h(x) - h(x_i)| + |h(x_i) - h(y)| < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon \]
since $D$ is equicontinuous at $x_i$. Hence $D$ is uniformly equicontinuous.

Proof of Lemma 457. ($\Leftarrow$) Suppose that $D$ is totally bounded. Let $\varepsilon > 0$ be given and choose positive numbers $\varepsilon_1$ and $\varepsilon_2$ such that $2\varepsilon_1 + \varepsilon_2 \le \varepsilon$. Total boundedness of $D$ implies that there exist finitely many functions $f_1,\dots,f_n$ such that the collection of open balls $\{B_{\varepsilon_1}(f_i), i = 1,\dots,n\}$ covers $D$. Fix $x_0$. Because $\{f_i, i = 1,\dots,n\}$ is equicontinuous at $x_0$ (since a finite set of continuous functions is equicontinuous), there exists $\delta > 0$ such that $d_Y(f_i(x), f_i(x_0)) < \varepsilon_2$ for all $x$ such that $d_X(x,x_0) < \delta$ and for all $i = 1,\dots,n$. To prove that $D$ is equicontinuous at $x_0$ we need to show that $d_Y(f(x), f(x_0)) < \varepsilon$ for all $x$ such that $d_X(x,x_0) < \delta$ and all $f \in D$. Let $f \in D$. Because $\{B_{\varepsilon_1}(f_i), i = 1,\dots,n\}$ covers $D$, $\exists f_i$ such that $f \in B_{\varepsilon_1}(f_i)$.
By the triangle inequality
\[ d_Y(f(x), f(x_0)) \le d_Y(f(x), f_i(x)) + d_Y(f_i(x), f_i(x_0)) + d_Y(f_i(x_0), f(x_0)) < \varepsilon_1 + \varepsilon_2 + \varepsilon_1 \le \varepsilon \]
holds true for all $x$ such that $d_X(x,x_0) < \delta$ and for all $f \in D$. Notice that this direction does not require compactness of either $X$ or $Y$; hence total boundedness always implies equicontinuity.

($\Rightarrow$) Suppose $D$ is equicontinuous. Given $\varepsilon > 0$ we wish to cover $D$ by finitely many open $\varepsilon$-balls. Choose positive numbers $\varepsilon_1$ and $\varepsilon_2$ such that $2\varepsilon_1 + 2\varepsilon_2 < \varepsilon$. Using equicontinuity of $D$ at $x \in X$, given $\varepsilon_1$, $\exists \delta(\varepsilon_1,x)$ such that for $x' \in B_{\delta(\varepsilon_1,x)}(x)$, $d_Y(f(x'), f(x)) < \varepsilon_1$ for all $f \in D$. The collection $\{B_{\delta(\varepsilon_1,x)}(x), x \in X\}$ is an open covering of $X$. Since $X$ is compact there exist finitely many $x_1,\dots,x_k$ such that $\{B_{\delta(\varepsilon_1,x_i)}(x_i), i = 1,\dots,k\}$ covers $X$, and $d_Y(f(x), f(x_i)) < \varepsilon_1$ holds for $x \in B_{\delta(\varepsilon_1,x_i)}(x_i)$ and all $f \in D$. Now cover $Y$ by finitely many open balls $\{B_{\varepsilon_2}(y_j), j = 1,\dots,m\}$ (which is possible since $Y$ is compact and thus totally bounded). Let $J$ be the set of all functions $\alpha : \{1,\dots,k\} \to \{1,\dots,m\}$. The set $J$ is finite. Given $\alpha \in J$, if there exists a function $f \in D$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for each $i = 1,\dots,k$, choose one such function and label it $f_\alpha$. We claim that the finite collection of open balls $\{B_\varepsilon(f_\alpha), \alpha \in J\}$ covers $D$.

Let $f \in D$. For each $i = 1,\dots,k$, choose an integer $\alpha(i)$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$; this is possible because $\{B_{\varepsilon_2}(y_j), j = 1,\dots,m\}$ covers all of $Y$. For this index $\alpha$ the function $f_\alpha$ is defined, and the $\varepsilon$-ball around $f_\alpha$ contains $f$ (i.e. $f \in B_\varepsilon(f_\alpha)$). To see this, let $x \in X$ and choose $i$ such that $x \in B_{\delta(\varepsilon_1,x_i)}(x_i)$. Then
\[ d_Y(f(x), f_\alpha(x)) \le d_Y(f(x), f(x_i)) + d_Y(f(x_i), f_\alpha(x_i)) + d_Y(f_\alpha(x_i), f_\alpha(x)) < \varepsilon_1 + 2\varepsilon_2 + \varepsilon_1 < \varepsilon, \]
where the middle term is less than $2\varepsilon_2$ because both $f(x_i)$ and $f_\alpha(x_i)$ lie in $B_{\varepsilon_2}(y_{\alpha(i)})$.

Proof of Lemma 467. By construction. Define $P_n(x)$ on $[-1,1]$ by induction: $P_1(x) = 0$ and $P_{n+1}(x) = P_n(x) + \frac{1}{2}\left(x^2 - P_n^2(x)\right)$, $\forall n \in \mathbb{N}$.
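The recursion $P_1(x) = 0$, $P_{n+1}(x) = P_n(x) + \frac{1}{2}(x^2 - P_n^2(x))$ is easy to explore numerically. The sketch below is an illustration only: the helper names `p_next`, `p_n`, `sup_error` and the evaluation grid `POINTS` are hypothetical, and a maximum over grid points merely approximates the supremum over $[-1,1]$. The rate $2/n$ used in the comments is not stated in the text; it follows from the identity $|x| - P_{n+1}(x) = (|x| - P_n(x))\left(1 - \frac{1}{2}[|x| + P_n(x)]\right)$ together with $P_n \ge 0$.

```python
def p_next(p, x):
    """One step of the recursion P_{n+1}(x) = P_n(x) + (x^2 - P_n(x)^2) / 2."""
    return p + 0.5 * (x * x - p * p)


def p_n(n, x):
    """Evaluate P_n(x), starting from P_1(x) = 0."""
    p = 0.0
    for _ in range(n - 1):
        p = p_next(p, x)
    return p


def sup_error(n, points):
    """Largest gap |x| - P_n(x) over a finite set of sample points."""
    return max(abs(x) - p_n(n, x) for x in points)


# Hypothetical evaluation grid on [-1, 1]; a grid only approximates the sup.
POINTS = [-1.0 + 2.0 * i / 200 for i in range(201)]

for n in (1, 2, 10, 50, 200):
    # The gap should shrink with n and stay below 2 / n.
    print(n, sup_error(n, POINTS), 2.0 / n)
```

The printed gaps shrink as $n$ grows, which is the uniform convergence $P_n \to |x|$ that the lemma asserts; convergence is slowest near $x = 0$, where $|x|$ has its kink.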
To prove that $P_n(x) \to |x|$ on $[-1,1]$ uniformly, we will use Dini's Lemma 453. In that case we must check: (i) $P_n(x) \le P_{n+1}(x)$, $\forall x \in [-1,1]$; and (ii) $P_n(x)$ converges to $|x|$ pointwise on $[-1,1]$.

We check that $0 \le P_n(x) \le P_{n+1}(x) \le |x|$, $\forall x \in [-1,1]$, by induction. To show $\langle P_n \rangle$ is non-decreasing, suppose it holds for $n \ge 1$ ($n = 1$ is clear). Then
\[ P_{n+2}(x) = P_{n+1}(x) + \tfrac{1}{2}\left(x^2 - P_{n+1}^2(x)\right) \ge P_{n+1}(x) \]
because $0 \le P_{n+1}(x) \le |x| \Rightarrow P_{n+1}^2(x) \le |x|^2$. To show $P_{n+2} \le |x|$, use the identity
\[ P_{n+2}(x) = |x| - \left(|x| - P_{n+1}(x)\right)\left(1 - \tfrac{1}{2}\left[|x| + P_{n+1}(x)\right]\right). \]
Since $|x| - P_{n+1}(x) \ge 0$ by assumption, $|x| + P_{n+1}(x) \le 2|x|$ and hence $1 - \frac{1}{2}[|x| + P_{n+1}(x)] \ge 1 - |x| \ge 0$.

Thus the sequence $\langle P_n(x) \rangle$ is increasing and bounded $\forall x \in [-1,1]$ and therefore it converges to a function $f(x)$. Taking the limit of $P_{n+1}(x) = P_n(x) + \frac{1}{2}(x^2 - P_n^2(x))$ yields $f = f + \frac{1}{2}(x^2 - f^2)$, which implies $f^2(x) = x^2$ or, since $f \ge 0$, $f(x) = |x|$ (which we know is continuous). By Dini's Lemma 453, $\langle P_n(x) \rangle$ converges to $|x|$ uniformly on $[-1,1]$.

Proof of Schauder's Fixed Point Theorem 475. Since $K$ is compact, $K$ is totally bounded. Hence, given any $\varepsilon > 0$, there exists a finite set $\{y_i, i = 1,\dots,n\}$ such that the collection $\{B_\varepsilon(y_i), i = 1,\dots,n\}$ covers $K$. We now define the convex hull $K_\varepsilon = \{\theta_1 y_1 + \dots + \theta_n y_n : \sum \theta_i = 1, \text{ all } \theta_i \ge 0\}$. This is a subset of $K$ since $K$ is convex and $K$ contains all the points $y_i$. We will now map all of $K$ into $K_\varepsilon$ by a continuous function $P_\varepsilon(y)$ that approximates $y$ (i.e. $\|P_\varepsilon(y) - y\| < \varepsilon$, $\forall y \in K$). To construct this function $P_\varepsilon(y)$, we must construct $n$ continuous functions $\theta_i = \theta_i(y) \ge 0$, with $\sum_{i=1}^{n} \theta_i = 1$. First, for $i = 1,\dots,n$, we define
\[ \varphi_i(y) = \begin{cases} 0 & \text{if } \|y_i - y\| \ge \varepsilon \\ \varepsilon - \|y_i - y\| & \text{if } \|y_i - y\| < \varepsilon \end{cases}, \quad i = 1,\dots,n. \]
Each of these $n$ functions $\varphi_i(y)$ is continuous, and the fact that the balls $B_\varepsilon(y_i)$ cover $K$ guarantees $\varphi_i(y) > 0$ for some $i = 1,\dots,n$. Now construct continuous functions
\[ \theta_i(y) = \frac{\varphi_i(y)}{\sum_{j=1}^{n} \varphi_j(y)}, \quad i = 1,\dots,n, \; y \in K. \]
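These functions are well-defined since $\sum_{i=1}^{n} \varphi_i(y) > 0$. The functions $\theta_i(y)$ satisfy $\theta_i \ge 0$, $\sum \theta_i = 1$. Finally we construct the continuous function
\[ P_\varepsilon(y) = \theta_1(y) y_1 + \dots + \theta_n(y) y_n. \]
This function maps $K$ into $K_\varepsilon$. From the construction of $\varphi_i$, $\theta_i(y) = 0$ unless $\|y_i - y\| < \varepsilon$. Therefore $P_\varepsilon(y)$ is a convex combination of just those points $y_i$ for which $\|y_i - y\| < \varepsilon$. Hence
\[ \|P_\varepsilon(y) - y\| = \left\| \sum \theta_i(y) y_i - y \right\| = \left\| \sum \theta_i(y)(y_i - y) \right\| \le \sum \theta_i(y) \|y_i - y\| < \varepsilon. \]
This establishes that $P_\varepsilon(y)$ approximates $y$.

Now we map the convex set $K_\varepsilon$ continuously into itself by the function $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ where $f_\varepsilon(x) \equiv P_\varepsilon(f(x))$ for all $x \in K_\varepsilon$. Since $K_\varepsilon$ is a convex, compact subset of the finite-dimensional vector subspace spanned by the $n$ points $y_1,\dots,y_n$ and $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ is continuous, there exists a fixed point $x_\varepsilon = f_\varepsilon(x_\varepsilon)$ in $K_\varepsilon$ due to Brouwer's fixed point theorem 302.

Now we take the limit as $\varepsilon \to 0$. Set $y_\varepsilon = f(x_\varepsilon)$. Since $K$ is compact, we may let $\varepsilon \to 0$ through some sequence $\varepsilon_1, \varepsilon_2, \dots$ for which $\langle y_\varepsilon \rangle$ converges to a limit in $K$:
\[ f(x_\varepsilon) = y_\varepsilon \to y \text{ as } \varepsilon_k \to 0. \tag{6.32} \]
We now write
\[ x_\varepsilon = f_\varepsilon(x_\varepsilon) \equiv P_\varepsilon(f(x_\varepsilon)) = P_\varepsilon(y_\varepsilon), \qquad x_\varepsilon = y_\varepsilon + \left[P_\varepsilon(y_\varepsilon) - y_\varepsilon\right]. \]
Then
\[ \|x_\varepsilon - y\| = \|y_\varepsilon + P_\varepsilon(y_\varepsilon) - y_\varepsilon - y\| = \|P_\varepsilon(y_\varepsilon) - y\| \le \|P_\varepsilon(y_\varepsilon) - y_\varepsilon\| + \|y_\varepsilon - y\|. \]
The first term vanishes since $P_\varepsilon(y)$ approximates $y$ and the second term vanishes since $y_\varepsilon$ converges to $y$ as $\varepsilon_k \to 0$. Hence $x_\varepsilon \to y$ as $\varepsilon = \varepsilon_k \to 0$. Now since $f$ is continuous, $f(x_\varepsilon) \to f(y)$. Combining this and (6.32) we have $f(y) = y$ for some $y \in K$. Hence $y$ is a fixed point of $f$.

Proof of Riesz-Fischer Theorem 481. The case for $p = \infty$ is in the text. Second, let $p \in [1,\infty)$. Let $\langle f_n \rangle$ be a Cauchy sequence in $L_p$. In order to find a function to which the sequence converges, in light of Example 444, we need to take a more sophisticated approach than for $p = \infty$.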
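Since $\langle f_n \rangle$ is Cauchy we can recursively construct a strictly increasing sequence $\langle n_j \rangle$ in $\mathbb{N}$ such that $\|f_k - f_n\|_p < 2^{-j}$, $\forall k,n \ge n_j$ and $\forall j \in \mathbb{N}$. Then
\[ \left\| \sum_{j=1}^{J} \left|f_{n_{j+1}} - f_{n_j}\right| \right\|_p \le \sum_{j=1}^{J} \left\|f_{n_{j+1}} - f_{n_j}\right\|_p < \sum_{j=1}^{J} 2^{-j} < 1 \]
for each $J \in \mathbb{N}$ by the Minkowski inequality in Theorem 480. Therefore,
\[ \int_X \left( \sum_{j=1}^{\infty} \left|f_{n_{j+1}} - f_{n_j}\right| \right)^p = \int_X \left( \lim_{J\to\infty} \sum_{j=1}^{J} \left|f_{n_{j+1}} - f_{n_j}\right| \right)^p = \int_X \lim_{J\to\infty} \left( \sum_{j=1}^{J} \left|f_{n_{j+1}} - f_{n_j}\right| \right)^p = \lim_{J\to\infty} \int_X \left( \sum_{j=1}^{J} \left|f_{n_{j+1}} - f_{n_j}\right| \right)^p = \lim_{J\to\infty} \left( \left\| \sum_{j=1}^{J} \left|f_{n_{j+1}} - f_{n_j}\right| \right\|_p \right)^p \le 1 \]
where the third equality follows from the Monotone Convergence Theorem 396. This implies that the sum $\sum_{j=1}^{\infty} |f_{n_{j+1}} - f_{n_j}|$ is finite a.e. This means there exists a set $A$ such that $mA = 0$ and $\sum_{j=1}^{\infty} |f_{n_{j+1}} - f_{n_j}|$ converges on $X \backslash A$. Since $f_{n_{j+1}} - f_{n_j} \le |f_{n_{j+1}} - f_{n_j}|$, we have that $\sum_{j=1}^{\infty} (f_{n_{j+1}} - f_{n_j})$ converges on $X \backslash A$. Because the partial sums telescope, $f_{n_1} + \sum_{j=1}^{J} (f_{n_{j+1}} - f_{n_j}) = f_{n_{J+1}}$, the subsequence $\langle f_{n_j} \rangle$ converges on $X \backslash A$. Let $f(x) = \lim_{j\to\infty} f_{n_j}(x)$, $x \in X \backslash A$, be this limit. Then by Theorem 364 (the pointwise limit of measurable functions is measurable), $f$ is measurable.

Finally, we need to show that $f$ is also $p$-integrable. To do so, suppose that $\varepsilon > 0$ is given and let $N_0$ be an integer such that $\|f_k - f_n\|_p < \varepsilon$ for $k,n \ge N_0$. Then
\[ \|f_n - f\|_p = \left( \int_X |f_n - f|^p \right)^{\frac{1}{p}} = \left( \int_X \lim_{j\to\infty} \left|f_n - f_{n_j}\right|^p \right)^{\frac{1}{p}} \le \left( \liminf_{j\to\infty} \int_X \left|f_n - f_{n_j}\right|^p \right)^{\frac{1}{p}} \le \varepsilon, \quad \forall n \ge N_0, \]
where the second equality follows since $f_{n_j} \to f$ a.e. and the first inequality follows by Fatou's Lemma 393. Since $\|f\|_p = \|f - f_n + f_n\|_p \le \|f - f_n\|_p + \|f_n\|_p \le \varepsilon + \|f_n\|_p < \infty$, then $f \in L_p$ and $f_n \to f$ in $L_p$.

Proof of Theorem 520. Let $\langle T_n \rangle$ be a Cauchy sequence in $BL(X,Y)$. For a fixed $x \in X$ we have $\|T_n x - T_m x\|_Y \le \|T_n - T_m\| \|x\|_X$ so that $\langle T_n(x) \rangle$ is a Cauchy sequence in $Y$. Since $Y$ is complete, $\langle T_n(x) \rangle$ converges to an element $y \in Y$. Call this element $Tx$ ($Tx = \lim_{n\to\infty} T_n(x)$, $\forall x \in X$). Thus we can define $T : X \to Y$ by $Tx = \lim_{n\to\infty} T_n(x)$.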
We must show that $T$ is bounded and that $\langle T_n \rangle \to T$ as $n \to \infty$. Since $\langle T_n \rangle$ is Cauchy, given $\varepsilon > 0$ $\exists N$ such that $\forall m,n \ge N$ we have $\|T_n - T_m\| < \varepsilon$. Hence $\|T_n - T_N\| < \varepsilon$, $\forall n \ge N$, or $\|T_n\| < \|T_N\| + \varepsilon$, $\forall n \ge N$. Thus,
\[ \|Tx\|_Y = \lim_{n\to\infty} \|T_n x\|_Y \le \limsup_{n\to\infty} \left( \|T_n\| \|x\|_X \right) \le \left( \|T_N\| + \varepsilon \right) \|x\|_X. \]
Thus $T$ is bounded. For each $x \in X$ we have
\[ \|T_n x - Tx\|_Y = \lim_{m\to\infty} \|T_n x - T_m x\|_Y \le \limsup_{m\to\infty} \|T_n - T_m\| \|x\|_X \le \varepsilon \|x\|_X, \quad \forall n \ge N, \]
where the first inequality follows from Corollary 519. Thus $\|T_n - T\| = \sup\{\|(T_n - T)x\|_Y, \|x\|_X = 1\} \le \varepsilon$ for all $n \ge N$. Thus $T_n \to T$ in $BL(X,Y)$.

Proof of Theorem 529. Let $G : X \to \mathbb{R}$ be any bounded linear functional on $\mathbb{R}^n$ from $X^*$ (i.e. $G \in X^*$). Let $\{e_1,\dots,e_n\}$ be the natural basis in $\mathbb{R}^n$ (see footnote 12) and define $b_i = G(e_i)$ for $i = 1,\dots,n$. For $x = (x_1,\dots,x_n) \in \mathbb{R}^n$ we have
\[ G(x) = G(x_1 e_1 + \dots + x_n e_n) = x_1 G(e_1) + \dots + x_n G(e_n) = x_1 b_1 + \dots + x_n b_n = \langle x, b \rangle. \]
Footnote 12: Recall that the natural (or canonical) basis in $\mathbb{R}^n$ is defined to be the set of vectors $\{e_1,\dots,e_n\}$ where $e_i = (0,\dots,1,\dots,0)$ with 1 in the $i$th place.

Clearly the functional $G$ is represented by the point $b = (b_1,\dots,b_n) \in \mathbb{R}^n$. Next we show that $\|G\| = \|b\|_X$. First,
\[ |G(x)| \le \sum_{i=1}^{n} |x_i G(e_i)| \le \left( \sum_{i=1}^{n} x_i^2 \right)^{\frac{1}{2}} \left( \sum_{i=1}^{n} G(e_i)^2 \right)^{\frac{1}{2}} = \|x\|_X \|b\|_X \]
where the first inequality follows from the triangle inequality and the second inequality follows from the Cauchy-Schwarz inequality (Theorem 210). Hence $\|G\| \le \|b\|_X$ by (iv) of Theorem 518. Second, choose $x_0 = (b_1,\dots,b_n)$. Then we have
\[ \|G\| \ge \frac{|G(x_0)|}{\|x_0\|_X} = \frac{|G(b)|}{\|b\|_X} = \frac{\langle b, b \rangle}{\|b\|_X} = \frac{\|b\|_X^2}{\|b\|_X} = \|b\|_X \]
where the inequality follows from Corollary 519. Combining these two inequalities we have $\|G\| = \|b\|_X$.

It is easy to show that each functional $G \in X^*$ is uniquely represented by the point $b \in \mathbb{R}^n$. To see this, it is sufficient to prove that an operator $T : X^*$
$\to X$ defined by $T(G) = (G(e_1),\dots,G(e_n)) = b$ is a bounded, linear bijection such that $G(x) = \langle b, x \rangle$. To see that $T$ is bounded (and hence continuous by Theorem 511) note
\[ \|T\| = \sup\left\{ \frac{\|TG\|_X}{\|G\|_{X^*}}, \; \|G\|_{X^*} \ne 0 \right\} = \sup\left\{ \frac{\|b\|_X}{\|b\|_X}, \; \|b\|_X \ne 0 \right\} = 1. \]
To see that $T$ is linear, let $G$ and $G'$ be represented by $b$ and $b'$. Since $\langle (\alpha b + \beta b'), x \rangle = \alpha \langle b, x \rangle + \beta \langle b', x \rangle$, the functional $\alpha G + \beta G'$ is represented by $\alpha b + \beta b'$, so we know $T(\alpha G + \beta G') = \alpha T(G) + \beta T(G')$. To see that $T$ is a bijection, first we establish it is an injection (i.e. one-to-one). Let $G_1 \ne G_2$. Then $\exists x \in X$ such that $G_1(x) \ne G_2(x)$. Since $x = x_1 e_1 + \dots + x_n e_n$ uniquely, we have
\[ G_1(x) = G_1(x_1 e_1 + \dots + x_n e_n) = x_1 G_1(e_1) + \dots + x_n G_1(e_n) = x_1 b^1_1 + \dots + x_n b^1_n \]
and similarly $G_2(x) = x_1 b^2_1 + \dots + x_n b^2_n$. Then $G_1(x) \ne G_2(x) \Rightarrow b^1_1 \ne b^2_1$ or ... or $b^1_n \ne b^2_n$, so that $TG_1 \ne TG_2$. To see $T$ is a surjection (onto), we must show that for any $d \in X$, $\exists G \in X^*$ such that $T(G) = d$. But $G(x) = \langle d, x \rangle \in X^*$ and $TG = d$ (see footnote 13).

Footnote 13: For example, if $X = \mathbb{R}^2$ and we take $d = (3,4) \in X$, then $G_{(3,4)}(x) = 3x_1 + 4x_2 \in X^*$.

Proof of Theorem 530. Let $\{e_i, i \in \mathbb{N}\}$ be a complete orthonormal system in $H$. Set $b_i = F(e_i)$, $\forall i \in \mathbb{N}$. Then we have
\[ \sum_{i=1}^{n} b_i^2 = F\left( \sum_{i=1}^{n} b_i e_i \right) \le \|F\| \left\| \sum_{i=1}^{n} b_i e_i \right\| = \|F\| \left( \sum_{i=1}^{n} b_i^2 \right)^{\frac{1}{2}}. \]
Dividing by $\left( \sum_{i=1}^{n} b_i^2 \right)^{\frac{1}{2}}$ (when it is nonzero) and taking the square of both sides, we have $\sum_{i=1}^{n} b_i^2 \le \|F\|^2$ for arbitrary $n$. Then $\sum_{i=1}^{\infty} b_i^2 \le \|F\|^2 < \infty$, which means that the series $\sum_{i=1}^{\infty} b_i^2$ is convergent. Then there exists an element $b \in H$ whose Fourier coefficients are $b_i$, $i \in \mathbb{N}$. Since $\{e_i, i \in \mathbb{N}\}$ is a complete orthonormal system (by Parseval's equality) we have $b = \sum_{i=1}^{\infty} b_i e_i$ and also $\|b\| \le \|F\|$. Let $x$ be any element of $H$ and let $\{x_i; i \in \mathbb{N}\}$ be its Fourier coefficients. Then $\sum_{i=1}^{n} x_i e_i \to x$ by Parseval's Theorem 504. Since $F$ is linear and continuous,
$F(x) = \lim_{n\to\infty} F\left( \sum_{i=1}^{n} x_i e_i \right)$
$= \lim_{n\to\infty} \sum_{i=1}^{n} x_i F(e_i) = \lim_{n\to\infty} \sum_{i=1}^{n} x_i b_i = \sum_{i=1}^{\infty} x_i b_i = \langle x, b \rangle.$
By the Cauchy-Schwarz inequality $|F(x)| \le \|x\| \|b\|$, $\forall x \in H$, so that $\|F\| \le \|b\|$. Combining the two inequalities, we have $\|F\| = \|b\|$.

Proof of Theorem 531. Suppose $x = (x_1,\dots,x_n,\dots) \in \ell_p$ and $F \in \ell_p^*$. Consider $\{e_i, i \in \mathbb{N}\}$ where $e_i$ is the vector having the $i$-th entry equal to one and all other entries equal to zero. Let $s_n = \sum_{i=1}^{n} x_i e_i$. Then $s_n \in \ell_p$ and $\|x - s_n\|_p^p = \sum_{i=n+1}^{\infty} |x_i|^p \to 0$ as $n \to \infty$. Thus $F(s_n) = F\left( \sum_{i=1}^{n} x_i e_i \right) = \sum_{i=1}^{n} x_i F(e_i)$ and $|F(x) - F(s_n)| = |F(x - s_n)| \le \|F\| \|x - s_n\|_p \to 0$ as $n \to \infty$. Hence $F(x) = \lim_{n\to\infty} F(s_n) = \sum_{i=1}^{\infty} x_i F(e_i)$. Set $z_i = F(e_i)$, $i \in \mathbb{N}$, and $z = \langle z_1,\dots,z_n,\dots \rangle$. We must show that $z \in \ell_q$. For this purpose choose a particular $x = \langle x_i \rangle$:
\[ x_i = \begin{cases} |z_i|^{q-2} z_i & \text{when } z_i \ne 0 \\ 0 & \text{when } z_i = 0. \end{cases} \]
For this case $\|s_n\|_p^p = \sum_{i=1}^{n} |x_i|^p = \sum_{i=1}^{n} |z_i|^{p(q-1)} = \sum_{i=1}^{n} |z_i|^q$, since $p(q-1) = q$. Moreover $F(s_n) = \sum_{i=1}^{n} x_i z_i = \sum_{i=1}^{n} |z_i|^q$ and $|F(s_n)| \le \|F\| \|s_n\|_p \le \|F\| \left( \sum_{i=1}^{n} |z_i|^q \right)^{\frac{1}{p}}$. Hence
\[ \sum_{i=1}^{n} |z_i|^q \le \|F\| \left( \sum_{i=1}^{n} |z_i|^q \right)^{\frac{1}{p}} \Rightarrow \left( \sum_{i=1}^{n} |z_i|^q \right)^{\frac{1}{q}} \le \|F\| \]
holds true for arbitrary $n$. Thus $z = \langle z_i \rangle \in \ell_q$ and $\|z\|_q \le \|F\|$. On the other hand, by Hölder's inequality we have $|F(x)| = \left| \sum_{i=1}^{\infty} x_i z_i \right| \le \|x\|_p \|z\|_q$ and thus $\|F\| \le \|z\|_q$. This shows that $\|F\| = \|z\|_q$.

Proof of Riesz Representation Theorem 532. Let us first consider $m(X) < \infty$. In that case $\chi_E \in L_p(X)$ for any $E \subset X$ which is $\mathcal{L}$-measurable (i.e. $E \in \mathcal{L}$). Then define a set function $\nu : \mathcal{L} \to \mathbb{R}$ by $\nu(E) = F(\chi_E)$ for $E \in \mathcal{L}$. $\nu$ is a finite signed measure which is absolutely continuous with respect to $m$. (Show it.) Then by the Radon-Nikodym Theorem 434 there is an integrable function $g$ such that $\nu(E) = \int_E g \, dm$ for all $E \in \mathcal{L}$. Thus we have $F(\chi_E) = \int_E g \, dm$ for all $E \in \mathcal{L}$. If $\varphi$ is a simple function (i.e. it is a finite linear combination of characteristic functions), then by linearity of $F$ we have
\[ F(\varphi) = F\left( \sum_{i=1}^{n} c_i \chi_{E_i} \right) = \sum_{i=1}^{n} c_i F\left(\chi_{E_i}\right) = \sum_{i=1}^{n} c_i \int_{E_i} g \, dm = \sum_{i=1}^{n} c_i \int_X \chi_{E_i} g \, dm = \int_X \left( \sum_{i=1}^{n} c_i \chi_{E_i} \right) g \, dm = \int_X g \varphi \, dm. \]
Since $|F(\varphi)| \le \|F\| \|\varphi\|_p$ we have that $g \in L_q(X)$. (Show it.) Hence there is a function $g \in L_q(X)$ such that $F(\varphi) = \int g \varphi \, dm$ for all simple functions $\varphi$. Since the subset of all simple functions is dense in $L_p(X)$, then $F(f) = \int_X g f \, dm$ for all $f \in L_p(X)$. (Show it.) Also show that $\|F\| = \|g\|_q$. The function $g$ is determined uniquely, for if $g_1$ and $g_2$ determine the same functional $F$, then $g_1 - g_2$ must determine the zero functional; hence $\|g_1 - g_2\|_q = 0$, which implies $g_1 = g_2$ a.e.

Let $m(X) = \infty$. Since $m$ is $\sigma$-finite, there is an increasing sequence of $\mathcal{L}$-measurable sets $\langle X_n \rangle$ with finite measure whose union is $X$. By the first part of the proof, for each $n$ there is a function $g_n \in L_q$ such that $g_n$ vanishes outside $X_n$ and $F(f) = \int f g_n \, dm$ for all $f \in L_p$ that vanish outside $X_n$. Moreover $\|g_n\|_q \le \|F\|$. Since any function $g_n$ is unique on $X_n$ (except on sets of measure zero), $g_{n+1} = g_n$ on $X_n$. Set $g(x) = g_n(x)$ for $x \in X_n$. Then $g$ is a well defined $\mathcal{L}$-measurable function and $|g_n|$ increases pointwise to $|g|$. Thus by the Monotone Convergence Theorem 396
\[ \int |g|^q \, dm = \lim_{n\to\infty} \int |g_n|^q \, dm \le \|F\|^q \]
so that $g \in L_q$. For $f \in L_p$ define
\[ f_n(x) = \begin{cases} f(x) & \text{for } x \in X_n \\ 0 & \text{for } x \in X \backslash X_n. \end{cases} \]
Then $f_n \to f$ pointwise and in $L_p$. By the Hölder Inequality 479, $|fg|$ is integrable and $|f_n g| \le |fg|$, so that by the Lebesgue Dominated Convergence Theorem 404
\[ \int f g \, dm = \lim_{n\to\infty} \int f_n g \, dm = \lim_{n\to\infty} \int f_n g_n \, dm = \lim_{n\to\infty} F(f_n) = F(f). \]

Proof of Hahn-Banach Theorem 539. If $M = X$, then there is nothing to prove. Thus assume $M \subsetneq X$. Then $\exists x_1 \in X$ which is not in $M$. Let
\[ M_1 = \{w \in X : w = \alpha x_1 + x, \; \alpha \in \mathbb{R}, \; x \in M\}. \tag{6.33} \]
One can prove (see Exercise 6.7.1 below) that $M_1$ defined in (6.33) is a subspace of $X$ and that the representation in (6.33) is unique.
Next, extend $f$ to $M_1$ and call this extension $F$. In order for $F : M_1 \to \mathbb{R}$ to be a linear functional, it must satisfy
\[ F(\alpha x_1 + x) = \alpha F(x_1) + F(x) = \alpha F(x_1) + f(x) \tag{6.34} \]
where the second equality follows since $x \in M$. Hence $F$ is completely determined by the choice of $F(x_1)$. Moreover, we must have $\alpha F(x_1) + f(x) \le p(\alpha x_1 + x)$ for all scalars $\alpha$ and $x \in M$. If $\alpha > 0$, this means
\[ F(x_1) \le \frac{1}{\alpha}\left[p(\alpha x_1 + x) - f(x)\right] = p\left(x_1 + \frac{x}{\alpha}\right) - f\left(\frac{x}{\alpha}\right) = p(x_1 + z) - f(z) \]
where $z = \frac{x}{\alpha}$. If $\alpha < 0$, we have
\[ F(x_1) \ge \frac{1}{\alpha}\left[p(\alpha x_1 + x) - f(x)\right] = f(y) - p(-x_1 + y) \]
where $y = -\frac{x}{\alpha}$. Combining these two inequalities we have
\[ f(y) - p(y - x_1) \le F(x_1) \le p(x_1 + z) - f(z), \quad \forall y,z \in M. \tag{6.35} \]
Conversely, if we can pick $F(x_1)$ to satisfy (6.35), then $F$ defined by (6.34) will satisfy (6.10) on $M_1$. For if $F(x_1)$ satisfies (6.35) then for $\alpha > 0$ we have
\[ \alpha F(x_1) + f(x) = \alpha\left[F(x_1) + f\left(\frac{x}{\alpha}\right)\right] \le \alpha p\left(x_1 + \frac{x}{\alpha}\right) = p(\alpha x_1 + x) \]
while for $\alpha < 0$ we have
\[ \alpha F(x_1) + f(x) = -\alpha\left[-F(x_1) + f\left(-\frac{x}{\alpha}\right)\right] \le -\alpha p\left(-\frac{x}{\alpha} - x_1\right) = p(\alpha x_1 + x). \]
So we have now reduced the problem to finding a value $F(x_1)$ to satisfy (6.35). In order for such a value to exist, we must have
\[ f(y) - p(y - x_1) \le p(x_1 + z) - f(z), \quad \forall y,z \in M. \tag{6.36} \]
In other words we need
\[ f(y + z) \le p(x_1 + z) + p(y - x_1). \]
But this is true by (6.8), since $y + z = (x_1 + z) + (y - x_1)$. Hence (6.36) holds. If we fix $y$ and let $z$ run through all elements of $M$, we have
\[ f(y) - p(y - x_1) \le \inf_{z \in M}\{p(x_1 + z) - f(z)\} \equiv C. \]
Since this is true for any $y \in M$, we have
\[ c \equiv \sup_{y \in M}\{f(y) - p(y - x_1)\} \le C. \]
We now pick $F(x_1)$ to satisfy $c \le F(x_1) \le C$. Note that the extension is unique only when $c = C$.

Thus we have extended $f$ from $M$ to $M_1$. If $M_1 = X$ we are done. Otherwise, there is an element $x_2 \in X$ not in $M_1$. Let $M_2$ be the space spanned by $M_1$ and $x_2$ ($M_2 = \{\alpha x_2 + x, \; \alpha \in \mathbb{R}, \; x \in M_1\}$). By repeating the process we can extend $f$ to $M_2$.
If we prove that the collection of all linear bounded functionals defined on subspaces of $X$ satisfies the assumptions of Zorn's lemma we are done (because Zorn's lemma guarantees the existence of a maximal element, which we will prove is the desired functional).

Consider the collection $\mathcal{L}$ of all linear functionals $g : D(g) \to \mathbb{R}$ defined on a vector subspace $D(g)$ of $X$ such that: (i) $D(g) \supset M$; (ii) $g(x) = f(x)$, $\forall x \in M$; (iii) $g(x) \le p(x)$, $\forall x \in D(g)$. Note that $\mathcal{L}$ is not empty since $F$ belongs there.

Introduce a partial ordering "$\preceq$" in $\mathcal{L}$ as follows. If $D(g_1) \subset D(g_2)$ and $g_1(x) = g_2(x)$ $\forall x \in D(g_1)$, then $g_1 \preceq g_2$. One can prove (see Exercise 6.7.2 below) that "$\preceq$" defined above is a partial ordering in $\mathcal{L}$.

We have to check now that every totally ordered subset of $\mathcal{L}$ has an upper bound in $\mathcal{L}$. Let $W$ be a totally ordered subset of $\mathcal{L}$. Define the functional $h$ by
\[ D(h) = \cup_{g \in W} D(g), \qquad h(x) = g(x), \; g \in W, \; x \in D(g). \]
Clearly $h \in \mathcal{L}$ and it is an upper bound for $W$. Note that the definition of $h$ is not ambiguous because if $g_1, g_2$ are any two elements of $W$, then either $g_1 \preceq g_2$ or $g_2 \preceq g_1$, and in either case if $x \in D(g_1) \cap D(g_2)$, then $g_1(x) = g_2(x)$.

Hence this shows that the assumptions of Zorn's lemma are met and therefore a maximal element $F$ of $\mathcal{L}$ exists. We must show that $F$ is the desired functional. That means that $D(F) = X$. Suppose by contradiction that $D(F) \subsetneq X$. Then $\exists x_0 \in X \backslash D(F)$ and, by repeating the process that we used at the beginning, we would construct an extension $h$ of $F$ such that $F \preceq h$ and $h \ne F$. This would violate the maximality of $F$.

Exercise 6.7.1 Prove that $M_1$ defined in (6.33) is a subspace of $X$ and that the representation in (6.33) is unique.

Exercise 6.7.2 Prove that the relation "$\preceq$" defined in the proof of Theorem 539 is a partial ordering in $\mathcal{L}$.

Proof of Separation Theorem 549. Suppose without loss of generality that 0 is an internal point of $K_1$. Then $K_1 - K_2 = \{x - y : x \in K_1, y \in K_2\}$ is convex by Theorem 216. Let $x_0 \in K_2$.
Since 0 is an internal point of $K_1$, $0 - x_0 = -x_0$ must be an internal point of $K_1 - K_2$. Let $K = x_0 + K_1 - K_2$. Then $K$ is convex and $0 \in K$ is its internal point. See Figure 6.5.4.

We claim that $x_0$ is not an internal point of $K$. Suppose it was. Then 0 would be an internal point of $K_1 - K_2$. Then for any $y \ne 0$ and some positive number $\alpha$, the point $\alpha y$ would belong to $K_1 - K_2$ (i.e. $\alpha y = k_1 - k_2$ for some $k_1 \in K_1$ and $k_2 \in K_2$). This implies
\[ \frac{\alpha y + k_2}{1 + \alpha} = \frac{k_1}{1 + \alpha}. \]
If $y$ is a point of $K_2$ then the left-hand side represents a point in $K_2$ because $\frac{\alpha}{1+\alpha} y + \frac{1}{1+\alpha} k_2$ is a convex combination of two points of the convex set $K_2$. Furthermore, the right-hand side is an internal point of $K_1$ because $k_1 \in K_1$, 0 is an internal point of $K_1$, and $\frac{1}{1+\alpha} < 1$. This contradicts the assumption that $K_2$ contains no internal points of $K_1$.

Thus if $P(x)$ is the support function of $K$ and $x_0$ is not an internal point of $K$, we know by (iii) of Lemma 546 that $P(x_0) \ge 1$.

Let $M$ be the one-dimensional linear subspace spanned by $x_0$ (i.e. $M = \{x : x = \alpha x_0, \alpha \in \mathbb{R}\}$). Define a linear functional $f : M \to \mathbb{R}$ by $f(\alpha x_0) = \alpha P(x_0)$. We must check that $f$ satisfies the assumptions of the Hahn-Banach Theorem 539. Is $f(\alpha x_0) \le P(\alpha x_0)$ for all $\alpha$? If $\alpha \le 0$, then $f(\alpha x_0) \le 0$ and hence $f(\alpha x_0) \le P(\alpha x_0)$ since $P$ is non-negative. If $\alpha > 0$, then $f(\alpha x_0) = \alpha P(x_0) = P(\alpha x_0)$ by property (i) of $P$ in Lemma 546. Now by the Hahn-Banach Theorem $f(x)$ can be extended to a linear functional $F : X \to \mathbb{R}$ satisfying $F(x) \le P(x)$ for all $x \in X$. Thus for every point of $K$ we have $F \le 1$. Now take any $x \in K_1$ and $y \in K_2$. Then $x - y + x_0 \in K$, so we have $F(x - y + x_0) \le 1$ and, by linearity, $F(x) - F(y) + F(x_0) \le 1$. Since $F(x_0) = P(x_0) \ge 1$, we have $F(x) \le F(y)$ for any $x \in K_1$ and $y \in K_2$. Then we have
\[ \sup_{x \in K_1} F(x) \le \inf_{y \in K_2} F(y) \]
and hence $F$ separates $K_1$ and $K_2$, and $F$ is a non-zero functional (since $F(x_0) \ge 1$).

Proof of Second Welfare Theorem 552.
Since $S$ is finite dimensional (A5) and the aggregate technological possibilities set is convex (A4), for the existence of $\phi$ we must show that the set of allocations preferred to $\{x_i^*\}_{i=1}^{I}$, given by $A = \sum_{i=1}^{I} A_i$, is convex, where $A_i = \{x \in X_i : u_i(x) \ge u_i(x_i^*)\}$, $\forall i$. Assumptions (A1) to (A3) are sufficient to guarantee that each $A_i$ is convex and so $A$ is convex.

Finally, we show that $A$ does not contain any interior points of $Y$. Suppose to the contrary that $y \in \operatorname{int} Y$ and $y \in A$. Thus, for some $\{x_i\}_{i=1}^{I}$ with $x_i \in A_i$ for all $i$, we have $y = \sum_{i=1}^{I} x_i$. By assumption, there is some $h \in \{1, \dots, I\}$ and some $\hat{x}_h$ such that $u_h(\hat{x}_h) > u_h(x_h^*)$. Let $x_h^{\alpha} = \alpha \hat{x}_h + (1 - \alpha) x_h$, $\alpha \in (0,1)$. By A1 and A2, $x_h^{\alpha} \in X_h$ and $u_h(x_h^{\alpha}) > u_h(x_h^*)$. Let $y^{\alpha} = \sum_{i \ne h} x_i + x_h^{\alpha}$. Since $y \in \operatorname{int} Y$, it follows that for some sufficiently small $\varepsilon$, $y^{\varepsilon} \in Y$. In this case the allocation $\left(\{x_i\}_{i \ne h} \cup \{x_h^{\varepsilon}\},\ y^{\varepsilon}\right)$ is feasible and satisfies $x_i \in X_i$, $\forall i$, $u_i(x_i) \ge u_i(x_i^*)$, $\forall i \ne h$, and $u_h(x_h^{\varepsilon}) > u_h(x_h^*)$, which contradicts the Pareto Optimality of $\left(\{x_i^*\}_{i=1}^{I}, \{y_j^*\}_{j=1}^{J}\right)$. Therefore the conditions for Theorem 549 are met.

To complete the proof, it is sufficient to show that (b) holds in the definition of a competitive equilibrium. By (6.15), suppose that $x_i \in X_i$ and $\phi(x_i) < \phi(x_i^*)$. Hence it follows by contraposition of (6.13) that $u_i(x_i^{\alpha}) < u_i(x_i^*)$ $\forall \alpha \in (0,1)$. By A3, $\lim_{\alpha \to 0} u_i(x_i^{\alpha}) = u_i(x_i) < u_i(x_i^*)$.

6.8 Bibliography for Chapter 6

This material is based on Royden (Chapters ) and Munkres (Chapters ).

Chapter 7 Topological Spaces

This chapter is a brief overview of topological spaces; it does not go into details nor prove theorems. Let's start with an example about fixed points. In Chapter 4, Theorem 257 says that if $f : I \to I$ is a continuous mapping from a closed interval $I$ into itself, then there exists a point $x_0 \in I$ such that $f(x_0) = x_0$.
Is the theorem still true if the line segment $I$ is distorted (i.e. if it is an arc or an arbitrary curve or a circle)? See Figure 7.1. Since every concept behind the theorem is a topological one, the theorem remains true as long as the change of the object is a homeomorphism. We will explain the notions of topological properties and homeomorphisms later, but at this stage we say that topological properties of an object are those that are invariant with respect to distortions like bending, magnifying, and reducing (all these transformations are homeomorphisms), but are not invariant, for example, to tearing or welding. Thus the theorem remains true for an arc or an arbitrary curve but not for the circle. The first two objects have two ends but the circle does not have any. Thus there is an "inside" and "outside" of the circle but not of the arc or arbitrary curve. It is easy to see that in the case of a circle, the fixed point theorem doesn't hold. Consider a rotation of the circle through an angle: it is a continuous mapping of the circle into itself with no point remaining fixed. See Figure 7.2.

Definition 585 A set $X$ together with a collection $\mathcal{O}$ (for open sets) of subsets of $X$ is called a topological space if $\mathcal{O}$ satisfies the following conditions: (i) $\emptyset \in \mathcal{O}$, $X \in \mathcal{O}$; (ii) $\left(\bigcup_{i \in \Upsilon} A_i\right) \in \mathcal{O}$ for $A_i \in \mathcal{O}$ (an arbitrary union of elements of $\mathcal{O}$ belongs to $\mathcal{O}$); (iii) $\left(\bigcap_{i=1}^{n} A_i\right) \in \mathcal{O}$ for $A_i \in \mathcal{O}$ (a finite intersection of elements of $\mathcal{O}$ belongs to $\mathcal{O}$). $\mathcal{O}$ is called a topology on $X$ and its elements are called open sets.

Recall the following facts. A set $B$ is called closed if $X \setminus B$ is open. Also $\emptyset$ and $X$ are both open and closed. By using De Morgan's rules we can show that (i) $\bigcup_{i=1}^{n} B_i$ is closed for $B_i$ closed and (ii) $\bigcap_{i \in \Upsilon} B_i$ is closed for $B_i$ closed. The intersection of all closed sets containing a set $C$ is called the closure of $C$, written $\bar{C}$. Hence the closure of $C$ is the smallest closed set containing $C$, and $C \subset \bar{C}$.

Exercise 7.0.1 Show that $C$ is closed iff $C = \bar{C}$.
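On a finite set, the conditions of Definition 585 and the definition of closure can be checked by brute force. The following is a minimal sketch, not part of the original text; the names `is_topology` and `closure` and the three-point example are illustrative assumptions.

```python
from itertools import combinations

def is_topology(X, O):
    """Check conditions (i)-(iii) of Definition 585 for a finite set X."""
    X = frozenset(X)
    O = {frozenset(A) for A in O}
    # (i) the empty set and X must be open; every member must be a subset of X
    if frozenset() not in O or X not in O:
        return False
    if any(not A <= X for A in O):
        return False
    # (ii), (iii): for a finite collection, closure under pairwise unions and
    # pairwise intersections extends by induction to all unions/intersections.
    return all((A | B) in O and (A & B) in O for A, B in combinations(O, 2))

def closure(X, O, C):
    """Intersection of all closed sets (complements of open sets) containing C."""
    X, C = frozenset(X), frozenset(C)
    result = X
    for A in O:
        closed = X - frozenset(A)
        if C <= closed:
            result &= closed
    return result

# Hypothetical example: a nested family of open sets on a three-point set.
X = {1, 2, 3}
O = [set(), {1}, {1, 2}, X]
not_O = [set(), {1}, {2}, X]   # fails (ii): {1} | {2} = {1, 2} is missing

print(is_topology(X, O), is_topology(X, not_O))
print(sorted(closure(X, O, {2})))
```

Here the closed sets of `O` are $X$, $\{2,3\}$, $\{3\}$ and $\emptyset$, so the smallest closed set containing $\{2\}$ is $\{2,3\}$.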
The union of all open sets contained in a set $D$ is called the interior of $D$ (written $\operatorname{int} D$) and it is the largest open set contained in $D$.

Exercise 7.0.2 Show that $D$ is open iff $\operatorname{int} D = D$.

Example 586 Let $(\mathbb{R}, |\cdot|)$ be a metric space. The collection $\mathcal{O}$ of all open sets (see Def. 104) satisfies all three properties of Theorem 106 and hence $|\cdot|$ defines a topology $\mathcal{O}$ in $\mathbb{R}$.

In a metric space the topology is determined by the metric. You should realize that two equivalent metrics determine the same topology (see the notes after Theorem 221). Hence any metric space is also a topological space. What about the converse? Consider a topological space $X$ with a topology $\mathcal{O}$. Does a metric $d$ on $X$ exist that would generate the topology $\mathcal{O}$?

Definition 587 If there exists a metric $d$ on $X$ that generates the topology $\mathcal{O}$ we say that this topological space is metrizable.

Using this definition we can rephrase the question: is any topological space metrizable? The answer is no, as we will see.

Example 588 Given a set $X$, let $\mathcal{O}$ be the collection of all subsets of $X$ (i.e. $\mathcal{O}$ is the power set of $X$). This is the largest possible topology on $X$. We call it the discrete topology. The discrete topology is not very interesting. All sets are open (and closed), and any mapping from $X$ is continuous. This topological space is metrizable: put $d(x,x) = 0$ and $d(x,y) = 1$ for $x \ne y$ (i.e. the discrete metric).

Example 589 Let $X$ have at least two elements and let $\mathcal{O}$ contain only $\emptyset$ and $X$. This is the smallest possible topology on $X$, called the trivial topology on $X$. This topological space is not metrizable for the following reason. In any metric space, a set containing just one element is closed. In this topological space the closure of a one-element set $\{x\}$ is the whole space $X$ (since this is the only closed set containing $x$) and hence by Exercise 7.0.1 $\{x\}$ is not closed.

Example 590 Let $X$ be an infinite set. Let $\mathcal{O}$ contain $\emptyset$ and all subsets $A \subset X$ such that $X \setminus A$ is finite. This topology is called the topology of finite complements.
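The contrast between Examples 588 and 589 can be illustrated on a small finite set. The sketch below is an illustration under stated assumptions rather than the authors' construction: the three-point set, the helper names, and the choice of radius $1/2$ are all hypothetical. It checks that every subset is open for the discrete metric, and that in the trivial topology the closure of a singleton is the whole space.

```python
from itertools import combinations

def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def discrete_metric(x, y):
    return 0 if x == y else 1

def ball(X, d, x, r):
    """Open ball of radius r around x."""
    return frozenset(y for y in X if d(x, y) < r)

def is_metric_open(X, d, S, r=0.5):
    """S is open if some ball around each of its points lies inside S.
    Exhibiting one radius that works is enough; r = 1/2 is our choice."""
    return all(ball(X, d, x, r) <= S for x in S)

def closure(X, open_sets, C):
    """Intersection of all closed sets containing C."""
    X, C = frozenset(X), frozenset(C)
    result = X
    for A in open_sets:
        closed = X - A
        if C <= closed:
            result &= closed
    return result

X = frozenset({"a", "b", "c"})
discrete = {S for S in powerset(X) if is_metric_open(X, discrete_metric, S)}
trivial = {frozenset(), X}

print(discrete == powerset(X))                  # discrete metric: all sets open
print(sorted(closure(X, trivial, {"a"})))       # closure of {a} is all of X
```

Since singletons are closed in every metric space, the second result is what rules out a metric for the trivial topology.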
Exercise 7.0.3 Show that $\mathcal{O}$ in the preceding Example 590 is a topology on $X$ and that the closure of a set $A$ is
$$\bar{A} = \begin{cases} A & \text{if } A \text{ is finite,} \\ X & \text{if } A \text{ is infinite.} \end{cases}$$

This topological space is not metrizable, as we will see in the next subsection on separation axioms.

Now we will define other topological properties the same way we did in metric spaces. Naturally, we cannot use the notion of distance here; all new objects and properties can be defined only in terms of open sets or in terms of other objects and properties originally defined by open sets.

A neighborhood of a point $x$ is any open set containing $x$. A point $x$ is a cluster point of a set $A$ if any neighborhood of $x$ contains a point of $A$ different from $x$.

Exercise 7.0.4 Show that $A$ is closed iff $A$ contains all its cluster points.

As in Definition 153 we say that $S \subset X$ is dense in $X$ if $\bar{S} = X$.

In many cases it is rather difficult to define the collection of all the open sets; we often use a small subcollection of open sets to define all open sets. This is exactly the method that we used in a metric space, where open sets were defined in terms of open balls (see Definition 106).

Definition 591 A collection $\mathcal{B}_x$ of open sets is called a local basis at $x$ if $x \in B$, $\forall B \in \mathcal{B}_x$, and if for any neighborhood $U$ of $x$ there exists $B \in \mathcal{B}_x$ such that $B \subset U$. $\mathcal{B}$ is called a topological basis of $X$ if for any $x \in X$ there exists $\mathcal{B}_x \subset \mathcal{B}$ that is a local basis at $x$.

Hence $\mathcal{B}$ is a topological basis of $X$ iff for any $x \in X$ and for any neighborhood $U$ of $x$ there exists $B \in \mathcal{B}$ such that $x \in B$ and $B \subset U$.

Exercise 7.0.5 Show that the collection of open balls is a topological basis of $n$-dimensional Euclidean space $\mathbb{R}^n$.

If $\mathcal{B}$ is a topological basis of a topological space $X$ then it satisfies the following: (i) for any $x \in X$ there exists $B \in \mathcal{B}$ such that $x \in B$; and (ii) if $B_1, B_2 \in \mathcal{B}$ and $x \in (B_1 \cap B_2)$, there exists $B_3 \in \mathcal{B}$ such that $x \in B_3 \subset (B_1 \cap B_2)$.
Assume now that we have a set $X$ (without a topology) and a collection $\mathcal{B}$ of subsets of $X$ satisfying conditions (i) and (ii). We say that $S \subset X$ is open if for any $x \in S$ there exists $B \in \mathcal{B}$ such that $x \in B \subset S$. The collection of all these open sets is a topology on $X$.

Exercise 7.0.6 Prove the above statement.

The method of defining a topology on $X$ through a basis is very important. This method was used in Chapter 3 in defining a topology on $\mathbb{R}$, where the basis $\mathcal{B}$ was the collection of all open intervals.

Definition 592 Let $X$ be a topological space with a topology $\mathcal{O}$ and let $X_0 \subset X$. Then we can define a topology $\mathcal{O}_0$ on $X_0$ as the collection of all sets of the form $O \cap X_0$ where $O \in \mathcal{O}$. $\mathcal{O}_0$ is called the relative topology on $X_0$ created by $\mathcal{O}$, and $X_0$ is called a topological subspace of $X$.

Example 593 Let $X = \mathbb{R}$ be a topological space with the usual topology in Definition 104. Let $X_0 = [0,1)$. Then the relative topology $\mathcal{O}_0$ on $X_0$ is the collection of all sets of the form $O \cap [0,1)$ where $O$ is open in $\mathbb{R}$. For example, the set $\left[0, \tfrac{1}{2}\right)$ is open in $X_0$ because $\left[0, \tfrac{1}{2}\right) = \left(-1, \tfrac{1}{2}\right) \cap [0,1)$ and $\left(-1, \tfrac{1}{2}\right)$ is open in $\mathbb{R}$.

7.1 Continuous Functions and Homeomorphisms

Let $X, Y$ be topological spaces and $f : X \to Y$ be a function from $X$ to $Y$. We say that $f$ is continuous at $x_0 \in X$ if for any neighborhood $V$ of $f(x_0)$ in $Y$ the inverse image $f^{-1}(V)$ contains a neighborhood of $x_0$ in $X$. $f$ is continuous on $X$ if $f$ is continuous at every $x \in X$. This is similar to Definition 244, with neighborhoods taking the place of open balls.

Exercise 7.1.1 Prove that $f : X \to Y$ is continuous iff $f^{-1}(V)$ is open (closed) in $X$ for any $V$ open (closed) in $Y$.

Definition 594 Let $f : X \to Y$ be a function from $X$ to $Y$. Assume that there exists an inverse function $f^{-1} : Y \to X$ and let both $f$ and $f^{-1}$ be continuous. Then we say that $f$ is a homeomorphism of $X$ onto $Y$ and that $X$ and $Y$ are homeomorphic.

Homeomorphic means topologically equivalent (i.e. the same from a topological point of view).
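For finite spaces, continuity and Definition 594 can be tested directly using the open-set characterization of Exercise 7.1.1 (preimages of open sets are open). The sketch below is only an illustration: the two-point spaces, the maps `f`, `g`, `identity`, and the function names are hypothetical choices, with maps represented as Python dictionaries.

```python
def preimage(f, V):
    """Inverse image of the set V under a map given as a dict."""
    return frozenset(x for x, y in f.items() if y in V)

def is_continuous(f, OX, OY):
    """Exercise 7.1.1: f is continuous iff every open V in Y has open preimage."""
    OX = {frozenset(A) for A in OX}
    return all(preimage(f, V) in OX for V in OY)

def is_homeomorphism(f, X, OX, Y, OY):
    """Definition 594: f is a bijection and both f and its inverse are continuous."""
    values = list(f.values())
    is_bijection = (set(f) == set(X) and set(values) == set(Y)
                    and len(set(values)) == len(values))
    if not is_bijection:
        return False
    inverse = {y: x for x, y in f.items()}
    return is_continuous(f, OX, OY) and is_continuous(inverse, OY, OX)

# Two copies of a two-point space in which exactly one singleton is open.
X, OX = {"a", "b"}, [set(), {"a"}, {"a", "b"}]
Y, OY = {0, 1}, [set(), {0}, {0, 1}]
f = {"a": 0, "b": 1}   # matches the open singletons
g = {"a": 1, "b": 0}   # preimage of {0} is {"b"}, which is not open in X

# Identity map from the discrete to the trivial topology on X.
discrete_X = [set(), {"a"}, {"b"}, {"a", "b"}]
trivial_X = [set(), {"a", "b"}]
identity = {"a": "a", "b": "b"}

print(is_homeomorphism(f, X, OX, Y, OY))
print(is_continuous(g, OX, OY))
print(is_continuous(identity, discrete_X, trivial_X),
      is_homeomorphism(identity, X, discrete_X, X, trivial_X))
```

The last line shows why Definition 594 asks for continuity of $f^{-1}$ as well: the identity from the discrete to the trivial topology is a continuous bijection whose inverse is not continuous.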
Example 595 Let $X = \mathbb{R}^2$ with the topology determined by the Euclidean metric $d_2\big((x_1,y_1),(x_2,y_2)\big) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. Let $Y = \mathbb{R}^2$ with the topology determined by the sup metric $d_\infty\big((x_1,y_1),(x_2,y_2)\big) = \max\{|x_1 - x_2|,\ |y_1 - y_2|\}$. Then $X$ and $Y$ are homeomorphic. Hence from a topological point of view a circle and a square are indistinguishable.

We could define other topological properties like compactness, connectedness, and separability, but we will only touch upon them. All these notions are defined in Chapter 4. Notice that although they are defined in a metric space, these definitions don't use the notion of distance (they are simply formulated in terms of open sets).

7.2 Separation Axioms

Since the notion of distance is very natural for us, a metric space is more easily envisioned than a topological space. That is why we take many results for granted. For instance, in a metric space, given two different elements $x, y \in X$, $x \ne y$, there exist two disjoint open sets $U, V$, each containing exactly one of the two elements (i.e. $x \in U$, $y \in V$, and $U \cap V = \emptyset$). See Figure 7.3. But this is not necessarily true in a general topological space. Before we show this we will state the separation axioms.

Definition 596 A topological space $X$ is called: (i) a $T_0$-space if for any two distinct elements $x, y$ at least one of them has a neighborhood not containing the other; (ii) a $T_1$-space if for any two distinct elements $x, y$ there exists a neighborhood $U$ of $x$ not containing $y$ and a neighborhood $V$ of $y$ not containing $x$; (iii) a $T_2$-space (or Hausdorff space) if any two distinct elements $x, y$ have disjoint neighborhoods (i.e. there exist two open sets $U, V$ such that $x \in U$, $y \in V$ and $U \cap V = \emptyset$); (iv) a $T_3$-space if any closed set $A$ and an element $x \notin A$ have disjoint neighborhoods (i.e. any point $x$ and any closed set $A$ not containing $x$ can be separated by disjoint open sets); (v) a $T_4$-space if any two disjoint closed sets have two disjoint neighborhoods (i.e.
any two disjoint closed sets can be separated by disjoint open sets).

These axioms can be pictured in Figure 7.4. A regular space is a $T_3$-space which is also a $T_1$-space. A normal space is a $T_4$-space which is also a $T_1$-space. Note that there may be slightly different terminology in the literature depending on the book you reference.

Exercise 7.2.1 Show that the following sequence of implications holds true: normal space $\Longrightarrow$ regular space $\Longrightarrow$ Hausdorff space $\Longrightarrow$ $T_1$-space $\Longrightarrow$ $T_0$-space.

Exercise 7.2.2 Show that any metric space is normal. Hint: A positive distance can always be bisected.

Combining the statements of Exercises 7.2.1 and 7.2.2 we get that any metric space satisfies all the separation axioms. Now we are going to show that none of the implications of Exercise 7.2.1 can be reversed.

Example 597 A set $X$ containing at least two distinct elements with the trivial topology (defined in Example 589) is not a $T_0$-space. To see this, let $x \ne y$. $x$ cannot be separated by an open set from $y$ because the only open set containing $x$ is the whole set $X$ (which also contains $y$).

Before giving other examples it is useful to state the following theorem that you should prove as an exercise.

Exercise 7.2.3 A topological space $X$ is $T_1$ iff every singleton is closed.

Example 598 Let $X = \{a, b\}$ and let $\mathcal{O} = \{\emptyset, \{a\}, X\}$ be a topology on $X$. Note that by definition, $\{a\}$ is open. To show that $\mathcal{O}$ is a topology we must verify the conditions in Definition 585. Obviously $\emptyset, X \in \mathcal{O}$ by construction. On closedness with respect to arbitrary union, $\{a\} \cup \emptyset = \{a\} \in \mathcal{O}$ and $\{a\} \cup X = X \in \mathcal{O}$. On closedness with respect to finite intersection, $\{a\} \cap X = \{a\} \in \mathcal{O}$ and $\{a\} \cap \emptyset = \emptyset \in \mathcal{O}$. Now $(X, \mathcal{O})$ is a $T_0$-space because for $a \ne b$, the open set $\{a\}$ is a neighborhood of the element $a$ not containing $b$. Note that $\{b\}$ is closed. According to Exercise 7.2.3, $(X, \mathcal{O})$ is not a $T_1$-space because $\{a\}$ is not closed.

Example 599 Let $X = \mathbb{N}$ with the topology of finite complements (defined in Example 590).
This is a $T_1$-space because $\{x\}$ is closed for $x \in \mathbb{N}$ (note that $\mathbb{N} \setminus \{x\}$ is open, since its complement $\{x\}$ is finite). It is not Hausdorff since: an open set $A$ containing $1$ has the form $A = \mathbb{N} \setminus \{x_1, \dots, x_m\}$ where $x_i \ne 1$; an open set $B$ containing $2$ has the form $B = \mathbb{N} \setminus \{y_1, \dots, y_n\}$ where $y_i \ne 2$; and the sets $A, B$ are not disjoint, because $A \cap B$ omits only finitely many elements of $\mathbb{N}$.

Example 600 Let $X = \mathbb{R}$ and let the topology $\mathcal{O}$ consist of: (i) all open sets in the usual Euclidean topology (i.e. the topology induced by the Euclidean metric); and (ii) all sets of the form $U \setminus K$ where $K = \left\{\tfrac{1}{n} : n \in \mathbb{N}\right\}$ and $U$ is open in the usual Euclidean topology. This is a Hausdorff space because open sets of type (i) can be used for separating two distinct points. It is not a $T_3$-space because $K$ is closed and $0 \notin K$ cannot be separated from $K$.

None of the topological spaces in Examples 597 to 600 are metrizable. Why? Take the space $(X, \mathcal{O})$ of Example 597 and suppose it is metrizable by $d$. Then $(X, d)$ is a metric space. Let $\mathcal{O}_0$ be the collection of all open sets of $(X, d)$. Then $\mathcal{O}_0$ coincides with $\mathcal{O}$ (this is what it means for $(X, \mathcal{O})$ to be metrizable). Then $(X, \mathcal{O})$ and $(X, \mathcal{O}_0)$ are identical topological spaces, one of which is not $T_0$ while the other is normal (because it is a metric space, by Exercise 7.2.2). Hence, there is a contradiction. The same argument can be used in the other examples. Hence being a normal topological space is a necessary condition for the space to be metrizable. But this condition is not sufficient. For further reading see Kelley (???).

7.3 Convergence and Completeness

In Section 4.1 the notion of the convergence of a sequence in a metric space was defined. In Definition 143 we characterize a closed set $A$ as a set containing the limit points of all convergent sequences from $A$. Having defined closed sets, we can also define open sets as their complements. This means we can define a topology. That is, a topology in a metric space can be defined in terms of convergence of a sequence (as we did in Chapter 4). Can this procedure be used in a topological space?
First, we must address whether convergence of a sequence can be introduced in a topological space. In Definition 136 we see that the concept of distance (metric) is used (we say that $\langle x_n \rangle \to x$ if for any $\varepsilon > 0$, $\exists N$ such that $\forall n \ge N$, $d(x_n, x) < \varepsilon$). In this definition, $\varepsilon$ represents an $\varepsilon$-ball around $x$ (i.e. a neighborhood of $x$). Hence the definition can be reformulated as follows: $\langle x_n \rangle \to x$ if for any neighborhood $U$ of $x$ there exists $N$ such that $\forall n \ge N$, $x_n \in U$. In this new version only topological notions are used and hence convergence of a sequence can be defined in a topological space.

You may wonder if a topology can be built only in terms of convergence of sequences (the same way it is done in a metric space). The answer is: not always. Loosely speaking it is possible in topological spaces that are separable (i.e. containing a countable dense set). Thus separability of a topological (metric) space is an important property.

Exercise 7.3.1 Show that if $X$ has a countable basis then it is separable.

Among topological spaces that are not separable there exist spaces whose topology cannot be fully built only in terms of convergence of sequences. If we want to build a topology in these spaces in terms of convergence, then the notion of a sequence has to be replaced by the more general notion of a net. We will not deal with it here (again see Kelley).

The last important property of a metric space is completeness. Is this property topological? That is, can completeness be defined in terms of open sets? As we know, defining completeness requires the notion of a Cauchy sequence, and Definition 169 of a Cauchy sequence is based on the concept of distance. It cannot be defined without a metric. In other words, a Cauchy sequence cannot be defined in a general topological space. Hence completeness is not a topological property, as the next example shows.

Example 601 Let $X = ((0,1], |\cdot|)$ and $Y = ([1,\infty), |\cdot|)$ be two metric spaces.
Then $f : X \to Y$ given by $f(x) = \frac{1}{x}$ is a homeomorphism ($f$ is a bijection and $f$ and $f^{-1}$ are continuous). Hence these two metric spaces are topologically equivalent, but $X$ is not complete whereas $Y$ is.

Total boundedness is not a topological property either. Example 601 shows this since $X$ is totally bounded whereas $Y$ is not. Theorem 198 says that compactness in a metric space is equivalent to completeness and total boundedness. Compactness is a topological property, while completeness and total boundedness are not topological properties individually; but if they occur simultaneously they are a topological property.
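A short numerical illustration of Example 601 may help. The sketch below is an assumption-laden illustration, not part of the original text: it takes the particular sequence $x_n = 1/n$ in $X = (0,1]$, truncated at 200 terms, and compares it with its image under $f(x) = 1/x$. The tail of $\langle x_n \rangle$ has small diameter (the sequence is Cauchy, but its would-be limit $0$ lies outside $X$), whereas the image sequence $f(x_n) = n$ keeps its terms at distance at least 1 and so is not Cauchy in $Y$. A finite truncation can only suggest this behavior, not prove it.

```python
from fractions import Fraction

def f(x):
    """The homeomorphism of Example 601, f(x) = 1/x, in exact arithmetic."""
    return 1 / x

def tail_diameter(seq, start):
    """Largest distance |a - b| between terms of seq from index start on."""
    tail = seq[start:]
    return max(tail) - min(tail)

# x_n = 1/n for n = 1, ..., 200 (the truncation length is an arbitrary choice)
xs = [Fraction(1, n) for n in range(1, 201)]
ys = [f(x) for x in xs]          # the image sequence, y_n = n

print(tail_diameter(xs, 100))    # small: the tail of x_n clusters near 0
print(tail_diameter(ys, 100))    # large: the tail of y_n keeps spreading out
```

The homeomorphism carries a Cauchy sequence to a non-Cauchy one, which is why the Cauchy property, and with it completeness, depends on the metric and not only on the open sets.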