An Introduction to Mathematical Analysis
in Economics [1]
Dean Corbae and Juraj Zeman
December 2002
[1] Still Preliminary. Not to be photocopied or distributed without permission of
the authors.
Contents
1 Introduction
  1.1 Rules of logic
  1.2 Taxonomy of Proofs
  1.3 Bibliography for Chapter 1
2 Set Theory
  2.1 Set Operations
    2.1.1 Algebraic properties of set operations
  2.2 Cartesian Products
  2.3 Relations
    2.3.1 Equivalence relations
    2.3.2 Order relations
  2.4 Correspondences and Functions
    2.4.1 Restrictions and extensions
    2.4.2 Composition of functions
    2.4.3 Injections and inverses
    2.4.4 Surjections and bijections
  2.5 Finite and Infinite Sets
  2.6 Algebras of Sets
  2.7 Bibliography for Chapter 2
  2.8 End of Chapter Problems
3 The Space of Real Numbers
  3.1 The Field Axioms
  3.2 The Order Axioms
  3.3 The Completeness Axiom
  3.4 Open and Closed Sets
  3.5 Borel Sets
  3.6 Bibliography for Chapter 3
  3.7 End of Chapter Problems
4 Metric Spaces
  4.1 Convergence
    4.1.1 Convergence of functions
  4.2 Completeness
    4.2.1 Completion of a metric space
  4.3 Compactness
  4.4 Connectedness
  4.5 Normed Vector Spaces
    4.5.1 Convex sets
    4.5.2 A finite dimensional vector space: R^n
    4.5.3 Series
    4.5.4 An infinite dimensional vector space: ℓ_p
  4.6 Continuous Functions
    4.6.1 Intermediate value theorem
    4.6.2 Extreme value theorem
    4.6.3 Uniform continuity
  4.7 Hemicontinuous Correspondences
    4.7.1 Theorem of the Maximum
  4.8 Fixed Points and Contraction Mappings
    4.8.1 Fixed points of functions
    4.8.2 Contractions
    4.8.3 Fixed points of correspondences
  4.9 Appendix - Proofs in Chapter 4
  4.10 Bibliography for Chapter 4
  4.11 End of Chapter Problems
5 Measure Spaces
  5.1 Lebesgue Measure
    5.1.1 Outer measure
    5.1.2 L-measurable sets
    5.1.3 Lebesgue meets Borel
    5.1.4 L-measurable mappings
  5.2 Lebesgue Integration
    5.2.1 Riemann integrals
    5.2.2 Lebesgue integrals
  5.3 General Measure
    5.3.1 Signed Measures
  5.4 Examples Using Measure Theory
    5.4.1 Probability Spaces
    5.4.2 L_1
  5.5 Appendix - Proofs in Chapter 5
  5.6 Bibliography for Chapter 5
6 Function Spaces
  6.1 The set of bounded continuous functions
    6.1.1 Completeness
    6.1.2 Compactness
    6.1.3 Approximation
    6.1.4 Separability of C(X)
    6.1.5 Fixed point theorems
  6.2 Classical Banach spaces: L_p
    6.2.1 Additional Topics in L_p(X)
    6.2.2 Hilbert Spaces (L_2(X))
  6.3 Linear operators
  6.4 Linear Functionals
    6.4.1 Dual spaces
    6.4.2 Second Dual Space
  6.5 Separation Results
    6.5.1 Existence of equilibrium
  6.6 Optimization of Nonlinear Operators
    6.6.1 Variational methods on infinite dimensional vector spaces
    6.6.2 Dynamic Programming
  6.7 Appendix - Proofs for Chapter 6
  6.8 Bibliography for Chapter 6
7 Topological Spaces
  7.1 Continuous Functions and Homeomorphisms
  7.2 Separation Axioms
  7.3 Convergence and Completeness
Acknowledgements
To my family: those who put up with me in the past - Jo and Phil - and
especially those who put up with me in the present - Margaret, Bethany,
Paul, and Elena. D.C.
To my family. J.Z.
Preface
The objective of this book is to provide a simple introduction to
mathematical analysis with applications in economics. There is increasing
use of real and functional analysis in economics, but few books cover that
material at an elementary level. Our rationale for writing this book is to
bridge the gap between basic mathematical economics books (which deal with
introductory calculus and linear algebra) and advanced economics books such
as Stokey and Lucas' Recursive Methods in Economic Dynamics that presume a
working knowledge of functional analysis. The major innovations in this book
relative to classic mathematics books in this area (such as Royden's Real
Analysis or Munkres' Topology) are that we provide: (i) extensive simple
examples (we believe strongly that examples provide the intuition necessary
to grasp difficult ideas); (ii) sketches of complicated proofs (followed by
the complete proof at the end of the book); and (iii) only material that is
relevant to economists (which means we drop some material and add other
topics; e.g. we focus extensively on set valued mappings instead of just
point valued ones). It is important to emphasize that while we aim to make
this material as accessible as possible, we have not excluded demanding
mathematical concepts used by economists and that the book is self-contained
(i.e. virtually any theorem used in proving a given result is itself proven
in our book).
Road Map
Chapter 1 is a brief introduction to logical reasoning and how to construct
direct versus indirect proofs. Proving the truth of the compound statement
"If A, then B" captures the essence of mathematical reasoning; we take the
truth of statement "A" as given and then establish logically that the truth
of statement "B" follows. We do so by introducing logical connectives and
the idea of a truth table.
We introduce set operations, relations, functions and correspondences
in Chapter 2. Then we study the "size" of sets and show the differences
between countable and uncountable infinite sets. Finally, we introduce the
notion of an algebra (just a collection of sets that satisfy certain properties)
and "generate" (i.e. establish that there always exists) a smallest collection
of subsets of a given set where all results of set operations (like complements,
union, and intersection) remain in the collection.
Chapter 3 focuses on the set of real numbers (denoted R), which is one
of the simplest but most economic (both literally and figuratively) sets to
introduce students to the ideas of algebraic, order, and completeness
properties. Here we expose students to the most elementary notions of distance,
openness and closedness, boundedness, and simple facts like between any two
real numbers is another real number. One critical result we prove is the
Bolzano-Weierstrass Theorem, which says that every bounded infinite subset of
R has a point such that any open interval around it contains infinitely many
points of the subset. This result has important implications for issues like
convergence of a sequence of points, which is introduced in more general
metric spaces. We end by generating the smallest σ-algebra containing all
open sets in R, known as the Borel (σ-)algebra.
In Chapter 4 we introduce sequences and the notions of convergence,
completeness, compactness, and connectedness in general metric spaces, where
we augment an arbitrary set with an abstract notion of a "distance" function.
Understanding these "C" properties is absolutely essential for economists.
For instance, the completeness of a metric space is a very important property
for problem solving. In particular, one can construct a sequence of
approximate solutions that get closer and closer together and, provided the
space is complete, the limit of this sequence exists and is the solution of
the original problem. We also present properties of normed vector spaces and
study two important examples, both of which are used extensively in economics:
finite dimensional Euclidean space (denoted R^n) and the space of (infinite
dimensional) sequences (denoted ℓ_p). Then we study continuity of functions
and hemicontinuity of correspondences. Particular attention is paid to the
properties of a continuous function on a connected domain (a generalization
of the Intermediate Value Theorem) as well as a continuous function on a
compact domain (a generalization of the Extreme Value Theorem). We end
by providing fixed point theorems for functions and correspondences that
are useful in proving, for instance, the existence of general equilibrium with
competitive markets or a Nash Equilibrium of a noncooperative game.
Chapter 5 focuses primarily on Lebesgue measure and integration since
almost all applications that economists study are covered by this case and
because it is easy to conceptualize the notion of distance through that of
the restriction of an outer measure. We show that the collection of Lebesgue
measurable sets is a σ-algebra and that the collection of Borel sets is a subset
of the Lebesgue measurable sets. Then we provide a set of convergence
theorems for the existence of a Lebesgue integral which are applicable under
a wide variety of conditions. Next we introduce general and signed measures,
where we show that a signed measure can be represented simply by an integral
(the Radon-Nikodym Theorem). To prepare for the following chapter, we end
by studying a simple function space (the space of integrable functions) and
prove it is complete.
We study properties such as completeness and compactness in two important
function spaces in Chapter 6: the space of bounded continuous functions
(denoted C(X)) and the space of p-integrable functions (denoted L_p(X)). A
fundamental result on approximating continuous functions in C(X) is given in
a very general set of Theorems by Stone and Weierstrass. Also, the Brouwer
Fixed Point Theorem of Chapter 4 on finite dimensional spaces is generalized
to infinite dimensional spaces in the Schauder Fixed Point Theorem. Moving
on to the L_p(X) space, we show that it is complete in the Riesz-Fischer
Theorem. Then we introduce linear operators and functionals, as well as
the notion of a dual space. We show that one can construct bounded linear
functionals on a given set X in the Hahn-Banach Theorem, which is used to
prove certain separation results such as the fact that two disjoint convex sets
can be separated by a linear functional. Such results are used extensively
in economics; for instance, they are employed to establish the Second Welfare
Theorem. The chapter ends with nonlinear operators and focuses particularly
on optimization in infinite dimensional spaces. First we introduce the
weak topology on a normed vector space and develop a variational method of
optimizing nonlinear functions. Then we consider another method of finding
the optimum of a nonlinear functional by dynamic programming.
Chapter 7 provides a brief overview of general topological spaces and
the idea of a homeomorphism (i.e. when two topological spaces X and Y
have "similar topological structure", which occurs when there is a one-to-one
and onto mapping f from elements in X to elements in Y such that both f
and its inverse are continuous). We then compare and contrast topological
and metric properties, as well as touch upon the metrizability problem (i.e.
finding conditions on a topological space X which guarantee that there exists
a metric on the set X that induces the topology of X).
Uses of the book
We taught this manuscript in the first year PhD core sequence at the
University of Pittsburgh and as a PhD class at the University of Texas. The
program at the University of Pittsburgh begins with an intensive, one month
remedial summer math class that focuses on calculus and linear algebra. Our
manuscript was used in the Fall semester class. Since we were able to quickly
explain theorems using sketches of proofs, it was possible to teach the entire
book in one semester. If the book were used for upper level undergraduates,
we would suggest simply teaching Chapters 1 to 4. While we used the
manuscript in a classroom, we expect it will be beneficial to researchers; for
instance, anyone who reads a book like Stokey and Lucas' Recursive Methods
must understand the background concepts in our manuscript. In fact, it
was because one of the authors found that his students were ill prepared to
understand Stokey and Lucas in his upper level macroeconomics class that
this project began.
Chapter 1
Introduction
In this chapter we hope to introduce students to applying logical reasoning
to prove the validity of economic conclusions (B) from well-defined premises
(A). For example, A may be the statement "An allocation-price pair (x,p)
is a Walrasian equilibrium" and B the statement "the allocation x is Pareto
efficient". In general, statements such as A and/or B may be true or false.
1.1 Rules of logic
In many cases, we will be interested in establishing the truth of statements
of the form "If A, then B." Equivalently, such a statement can be written
as: "A ⇒ B"; "A implies B"; "A only if B"; "A is sufficient for B"; or "B is
necessary for A." Applied to the example given in the previous paragraph, "If
A, then B" is just a statement of the First Fundamental Theorem of Welfare
Economics. In other cases, we will be interested in the truth of statements of
the form "A if and only if B." Equivalently, such a statement can be written:
"A ⇒ B and B ⇒ A", which is just "A ⇔ B"; "A implies B and B implies
A"; "A is necessary and sufficient for B"; or "A is equivalent to B."
Notice that a statement of the form "A ⇒ B" is simply a construct of two
simple statements connected by "⇒". Proving the truth of the statement
"A ⇒ B" captures the essence of mathematical reasoning; we take the truth
of A as given and then establish logically that the truth of B follows. Before
actually setting out on that path, let us define a few terms. A Theorem or
Proposition is a statement that we prove to be true. A Lemma is a theorem
we use to prove another theorem. A Corollary is a theorem whose proof is
obvious from the previous theorem. A Definition is a statement that is true
by interpreting one of its terms in such a way as to make the statement true.
An Axiom or Assumption is a statement that is taken to be true without
proof. A Tautology is a statement which is true without assumptions (for
example, x = x). A Contradiction is a statement that cannot be true (for
example, A is true and A is false).
There are other important logical connectives for statements besides "⇒"
and "⇔": "∧" means "and"; "∨" means "or"; and "~" means "not". The
meaning of these connectives is given by a truth table, where "T" stands for
a true statement and "F" stands for a false statement. One can consider the
truth table as an Axiom.
Table 1
A  B  ~A  A∧B  A∨B  A⇒B  A⇔B
T  T  F   T    T    T    T
T  F  F   F    T    F    F
F  T  T   F    T    T    F
F  F  T   F    F    T    T
To read the truth table, consider row two where A is true and B is false.
Then ~A is false since A is true, A∧B is false since B is, A∨B is true since
at least one statement (A) is true, A⇒B is false since A can't imply B when
A is true and B isn't, and A⇔B is false since A and B differ. Notice that
if A is false, then A⇒B is always true since B can be anything.
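The rows of Table 1 can also be generated mechanically. The following is a minimal sketch, assuming Python booleans stand in for "T" and "F"; the helper names `implies` and `truth_table` are illustrative choices and not part of the text.

```python
from itertools import product

def implies(a, b):
    # "A implies B" is false only when A is true and B is false.
    return (not a) or b

def truth_table():
    # One row per assignment of (A, B), in the order TT, TF, FT, FF.
    rows = []
    for a, b in product([True, False], repeat=2):
        # Columns: A, B, not A, A and B, A or B, A implies B, A iff B.
        rows.append((a, b, not a, a and b, a or b, implies(a, b), a == b))
    return rows

if __name__ == "__main__":
    print("A B ~A and or implies iff")
    for row in truth_table():
        print(" ".join("T" if value else "F" for value in row))
```

Treating implication as `(not a) or b` is a design choice that anticipates the equivalence the text later labels (1.7).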
Manipulating these connectives, we can prove some useful tautologies.
The first set of tautologies are the commutative, associative, and distributive
laws. To prove these tautologies, one can simply generate the appropriate
truth table. For example, the truth table to prove (A∨(B∧C)) ⇔ ((A∨B)∧
(A∨C)) is:
A  B  C  B∧C  A∨(B∧C)  A∨B  A∨C  (A∨B)∧(A∨C)
T  T  T  T    T        T    T    T
T  T  F  F    T        T    T    T
T  F  T  F    T        T    T    T
T  F  F  F    T        T    T    T
F  T  T  T    T        T    T    T
F  T  F  F    F        T    F    F
F  F  T  F    F        F    T    F
F  F  F  F    F        F    F    F
Since in every case in which A∨(B∧C) is true (respectively false), so is
(A∨B)∧(A∨C), the two statements are equivalent.
Theorem 1 Let A, B, and C be any statements. Then
(A∨B) ⇔ (B∨A) and (A∧B) ⇔ (B∧A) (1.1)
((A∨B)∨C) ⇔ (A∨(B∨C)) and ((A∧B)∧C) ⇔ (A∧(B∧C)) (1.2)
(A∨(B∧C)) ⇔ ((A∨B)∧(A∨C)) and (A∧(B∨C)) ⇔ ((A∧B)∨(A∧C)) (1.3)
Exercise 1.1.1 Complete the proof of Theorem 1.
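The truth-table method used for the distributive law extends to every law in Theorem 1. The sketch below enumerates all truth assignments and checks that both sides agree; the function name `equivalent` and the dictionary `checks` are hypothetical names introduced only for this illustration.

```python
from itertools import product

def equivalent(f, g, n):
    # f and g are equivalent when they agree on every row of the truth table.
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))

checks = {
    "(1.1) or": equivalent(lambda a, b: a or b, lambda a, b: b or a, 2),
    "(1.1) and": equivalent(lambda a, b: a and b, lambda a, b: b and a, 2),
    "(1.2) or": equivalent(lambda a, b, c: (a or b) or c,
                           lambda a, b, c: a or (b or c), 3),
    "(1.2) and": equivalent(lambda a, b, c: (a and b) and c,
                            lambda a, b, c: a and (b and c), 3),
    "(1.3) or over and": equivalent(lambda a, b, c: a or (b and c),
                                    lambda a, b, c: (a or b) and (a or c), 3),
    "(1.3) and over or": equivalent(lambda a, b, c: a and (b or c),
                                    lambda a, b, c: (a and b) or (a and c), 3),
}

if __name__ == "__main__":
    for name, holds in checks.items():
        print(name, holds)
```

Because there are only finitely many truth assignments, exhausting them is itself a complete proof of each law, which is exactly the argument the truth table makes by hand.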
The next set of results forms the basis of the methods of logical reasoning
we will be pursuing in this book. The first (direct) approach (1.4) is the
syllogism, which says that "if A is true and A implies B, then B is true". The
second (indirect) approach (1.5) is the contradiction, which says in words
that "if not A leads to a false statement of the form B and not B, then A is
true". That is, one way to prove A is to hypothesize ~A, and show this leads
to a contradiction. Another (indirect) approach (1.6) is the contrapositive,
which says that "A implies B is the same as whenever B is false, A is false".
Theorem 2
(A∧(A⇒B)) ⇒ B (1.4)
((~A) ⇒ (B∧(~B))) ⇒ A (1.5)
(A⇒B) ⇔ ((~B) ⇒ (~A)). (1.6)
Proof. Before proceeding, we need a few results (we could have established
these in the form of a lemma, but we're just starting here). The first result [1]
we need is that
(A⇒B) ⇔ ((~A)∨B) (1.7)
and the second is
~(~A) ⇔ A. (1.8)
In the case of (1.4), (A∧(A⇒B)) ⇔ (A∧((~A)∨B)) by (1.7), ⇔ ((A∧(~A))∨(A∧B))
by (1.3), ⇒ B by Table 1.
In the case of (1.5), ((~A) ⇒ (B∧(~B))) ⇔ (A∨(B∧(~B))) by (1.7) and (1.8),
⇒ A by Table 1.
In the case of (1.6), (A⇒B) ⇔ ((~A)∨B) by (1.7), ⇔ (B∨(~A)) by (1.1),
⇔ (~(~B)∨(~A)) by (1.8), ⇔ ((~B) ⇒ (~A)) by (1.7).
[1] The result follows from
A  B  A⇒B  ~A∨B
T  T  T    T
T  F  F    F
F  T  T    T
F  F  T    T
Note that the contrapositive of "A⇒B" is not the same as the converse
of "A⇒B", which is "B⇒A".
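The three proof patterns of Theorem 2, and the warning about the converse, can be checked by exhausting the four truth assignments. This is a sketch under the assumption that implication is encoded as in (1.7); `is_tautology` and the four lambda names are illustrative and do not come from the text.

```python
from itertools import product

def implies(a, b):
    # Encodes (1.7): A implies B is the same as (not A) or B.
    return (not a) or b

def is_tautology(f, n):
    # A tautology is true on every row of its truth table.
    return all(f(*v) for v in product([True, False], repeat=n))

# (1.4) syllogism: (A and (A implies B)) implies B
syllogism = lambda a, b: implies(a and implies(a, b), b)
# (1.5) contradiction: ((not A) implies (B and not B)) implies A
contradiction = lambda a, b: implies(implies(not a, b and not b), a)
# (1.6) contrapositive: (A implies B) iff ((not B) implies (not A))
contrapositive = lambda a, b: implies(a, b) == implies(not b, not a)
# The converse is NOT equivalent to the original implication.
converse = lambda a, b: implies(a, b) == implies(b, a)

if __name__ == "__main__":
    for name, f in [("syllogism", syllogism), ("contradiction", contradiction),
                    ("contrapositive", contrapositive), ("converse", converse)]:
        print(name, is_tautology(f, 2))
```

The converse fails on the row where A is true and B is false, which is the single counterexample needed to show it is not a tautology.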
Another important way to "construct" complicated statements from simple
ones is by the use of quantifiers. In particular, a quantifier allows a
statement A(x) to vary across elements x in some universe U. For example,
x could be a price (whose universe is the set of positive numbers) with the
property that demand equals supply. When there is an x with the property
A(x), we write (∃x)A(x) to mean that for some x in U, A(x) is true. [2] In the
context of the previous example, this establishes there exists an equilibrium
price. When all x have the property A(x), we write (∀x)A(x) to mean that for
all x, A(x) is true. [3] There are obvious relations between "∃" and "∀". In
particular
~((∃x)A(x)) ⇔ (∀x)(~A(x)) (1.9)
~((∀x)A(x)) ⇔ (∃x)(~A(x)). (1.10)
The second tautology is important since it illustrates the concept of a
counterexample. In particular, (1.10) states "If it is not true that A(x) is true
for all x, then there must exist a counterexample (that is, an x satisfying
~A(x)), and vice versa." Counterexamples are an important tool, since
while hundreds of examples do not make a theorem, a single counterexample
kills one.
One should also note that the symmetry we experienced with "∨" and
"∧" in (1.1) to (1.3) may break down with quantifiers. Thus while
(∃x)(A(x)∨B(x)) ⇔ ((∃x)A(x) ∨ (∃x)B(x)) (1.11)
can be expressed as a tautology (i.e. "⇔"), it's the case that
(∃x)(A(x)∧B(x)) ⇒ ((∃x)A(x) ∧ (∃x)B(x)) (1.12)
cannot be expressed that way (i.e. it is only "⇒"). To see why (1.12) cannot
hold as an "if and only if" statement, suppose x ranges over the set of
countries in the world, A(x) is the property that x is above average gross
domestic product and B(x) is the property that x is below average gross
domestic product. Then there will be at least one country above the mean and
at least one country below the mean (i.e. ((∃x)A(x) ∧ (∃x)B(x)) is true), but
clearly there cannot be a country that is both above and below the mean (i.e.
(∃x)(A(x)∧B(x)) is false).
[2] Thus, we let "∃" denote "for some" or "there exists a".
[3] Thus, we let "∀" denote "for all".
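On a finite universe, "for some" and "for all" correspond to Python's built-in `any` and `all`, so the negation rules (1.9) and (1.10) and the failure of the reverse direction in (1.12) can be illustrated directly. The sketch below uses three hypothetical countries with GDP figures invented purely for illustration.

```python
# Hypothetical GDP figures, invented purely for illustration.
gdp = {"P": 10.0, "Q": 20.0, "R": 60.0}
universe = list(gdp)
mean = sum(gdp.values()) / len(gdp)  # 30.0 for these made-up numbers

def above(x):
    # A(x): country x is above average GDP.
    return gdp[x] > mean

def below(x):
    # B(x): country x is below average GDP.
    return gdp[x] < mean

# Negating "there exists" gives "for all ... not", and vice versa.
neg_exists = (not any(above(x) for x in universe)) == all(not above(x) for x in universe)
neg_forall = (not all(above(x) for x in universe)) == any(not above(x) for x in universe)

# Right-hand side of (1.12): some country is above and some country is below.
exists_each = any(above(x) for x in universe) and any(below(x) for x in universe)
# Left-hand side of (1.12): a single country is both above and below.
exists_both = any(above(x) and below(x) for x in universe)

if __name__ == "__main__":
    print(neg_exists, neg_forall, exists_each, exists_both)
```

Here `exists_each` holds while `exists_both` does not, so the right-hand side of (1.12) cannot imply the left-hand side.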
We can make increasingly complex statements by adding more variables
(e.g. the statement A(x,y) can vary across elements x and y in some universe
Ũ). For instance, when A(x,y) states that "y is larger than x" where
x and y are in the universe of real numbers, the statement (∀x)(∃y)(x<y)
says "for every x there is a y that is larger than x", while the statement
(∃y)(∀x)(x<y) says "there is a y which is larger than every x". Note,
however, the former statement is true, but the latter is false.
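The order of quantifiers matters even on a finite universe. Since the real numbers cannot be enumerated, the sketch below swaps in a different, hypothetical statement A(x,y), namely "y differs from x" on the universe {0, 1, 2}; this substitution is an assumption made only so the check can run.

```python
universe = [0, 1, 2]

def A(x, y):
    # Stand-in statement (an assumption for this sketch): y differs from x.
    return x != y

# "For every x there is a y with A(x, y)": true, pick any other element.
forall_exists = all(any(A(x, y) for y in universe) for x in universe)

# "There is a y with A(x, y) for every x": false, since x = y always fails.
exists_forall = any(all(A(x, y) for x in universe) for y in universe)

if __name__ == "__main__":
    print(forall_exists, exists_forall)
```

As with "x < y" on the reals, the first statement lets y depend on x, while the second demands a single y that works for every x.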
1.2 Taxonomy of Proofs
While the previous section introduced the basics of the rules of logic (how to
manipulate connectives and quantifiers to establish the truth of statements),
here we will discuss broadly the methodology of proofs you will frequently
encounter in economics. The most intuitive is the direct proof in the form of
"A⇒B", discussed in (1.4). The work is to fill in the intermediate steps so
that A ⇒ A_1 and A_1 ⇒ A_2 and ... A_{n−1} ⇒ B are all tautologies.
In some cases, it may be simpler to prove a statement like A ⇒ B by
splitting B into cases. For example, if we wish to prove the uniqueness of the
least upper bound of a set A ⊂ R, we can consider two candidate least upper
bounds x_1 and x_2 in R and split B into the cases where we assume x_1 is the
least upper bound, implying x_1 ≤ x_2, and another case where we assume x_2
is the least upper bound, implying x_2 ≤ x_1. But (x_1 ≤ x_2) ∧ (x_2 ≤ x_1) ⇒
(x_1 = x_2) so that the least upper bound is unique. In other instances, one
might want to split A into cases (call them A_1 and A_2), show A ⇒ (A_1 ∨ A_2)
and then show A_1 ⇒ B and A_2 ⇒ B. For example, to prove
(0 ≤ x ≤ 1) ⇒ (x² ≤ x)
we can use the fact that
(0 ≤ x ≤ 1) ⇒ (x = 0 ∨ (0 < x ≤ 1))
where the latter case allows us to consider the truth of B by dividing through
by x.
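Written out in full, the two cases of this example might look as follows. This is a sketch of the argument, with B denoting the statement that the square of x is at most x; the layout is illustrative rather than the authors' own.

```latex
\begin{align*}
\text{Case } x = 0:          &\quad x^2 = 0 \le 0 = x. \\
\text{Case } 0 < x \le 1:    &\quad x^2 \le x \iff x \le 1
                               \quad (\text{dividing both sides by } x > 0),
\end{align*}
and $x \le 1$ holds by hypothesis. Since $B$ holds in both cases, and the
cases exhaust $0 \le x \le 1$, the implication is proved.
```

Dividing by x preserves the direction of the inequality only because x is strictly positive in the second case, which is why x = 0 has to be handled separately.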
Another direct method of proof, called induction, works only for the
natural numbers N = {0,1,2,3,...}. Suppose we wish to show (∀n ∈ N)A(n) is
true. This is equivalent to proving A(0) ∧ (∀n ∈ N)(A(n) ⇒ A(n+1)). This
works since A(0) is true and A(0) ⇒ A(1) and A(1) ⇒ A(2) and so on. In
the next chapter, after we introduce set theory, we will show why induction
works.
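The structure of an induction argument, a base case plus a chain of implications, can be illustrated on a concrete statement. The sketch below takes A(n) to be the familiar formula 0 + 1 + ... + n = n(n+1)/2, which is an example chosen here and not one from the text, and it only checks finitely many steps, so it is an illustration of the chain rather than a proof.

```python
def A(n):
    # The statement A(n): 0 + 1 + ... + n equals n(n+1)/2.
    return sum(range(n + 1)) == n * (n + 1) // 2

def implies(a, b):
    return (not a) or b

# Base case A(0).
base_case = A(0)

# Inductive steps A(n) implies A(n+1), checked only for n = 0, ..., 199.
steps_hold = all(implies(A(n), A(n + 1)) for n in range(200))

if __name__ == "__main__":
    print(base_case, steps_hold)
```

A genuine induction proof replaces the finite loop with an argument that the step holds for an arbitrary n.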
As discussed before, two indirect forms of proof are the contrapositive
(1.6) and the contradiction (1.5). In the latter case, we use the fact that
~(A⇒B) ⇔ (A∧(~B)) and show (A∧(~B)) leads to a contradiction
(B∧(~B)). Although direct proofs seem more natural than indirect proofs, we
now give an indirect proof of the First Welfare Theorem, perhaps one of the
most important things you will learn in all of economics. It is so simple that
it is hard to find a direct counterpart. [4]
Definition 3 Given a finite vector of endowments y, an allocation x is
feasible if for each good k,
$$\sum_i x_{i,k} \le \sum_i y_{i,k} \tag{1.13}$$
where the summation is over all individuals in the economy.
Definition 4 A feasible allocation x is a Pareto efficient allocation if
there is no feasible allocation x' such that all agents prefer x' to x.
Definition 5 An allocation-price pair (x,p) in a competitive exchange
economy is a Walrasian equilibrium if it is feasible and each agent i is
maximizing within his budget set; that is, if x'_i is preferred by i to x_i, then
$$\sum_k p_k x'_{i,k} > \sum_k p_k y_{i,k} \tag{1.14}$$
(i.e. i's tastes outweigh his pocketbook).
Theorem 6 (First Fundamental Theorem of Welfare Economics) If (x,p)
is a Walrasian equilibrium, then x is Pareto efficient.
[4] See Debreu (1959, p.94).
Proof. By contradiction. Suppose x is not Pareto efficient. Let x' be a
feasible allocation that all agents prefer to x. Then by the definition of
Walrasian equilibrium, we can sum (1.14) across all individuals to obtain
$$\sum_i \Big( \sum_k p_k x'_{i,k} \Big) > \sum_i \Big( \sum_k p_k y_{i,k} \Big)
\iff \sum_k p_k \Big( \sum_i x'_{i,k} \Big) > \sum_k p_k \Big( \sum_i y_{i,k} \Big). \tag{1.15}$$
Since x' is a feasible allocation, multiplying (1.13) by the price p_k (taken
to be nonnegative) and summing over all goods we have
$$\sum_k \sum_i p_k x'_{i,k} \le \sum_k \sum_i p_k y_{i,k}. \tag{1.16}$$
But (1.15) and (1.16) imply
$$\sum_k \sum_i p_k y_{i,k} > \sum_k \sum_i p_k y_{i,k},$$
which is a contradiction.
Here B is the statement "x is Pareto Efficient". So the proof by
contradiction assumes ~B, which is "Suppose x is not Pareto Efficient". In that
case, by definition 4, there's a preferred allocation x' which is feasible. But
if x' is preferred to x, then it must cost too much if it wasn't chosen in the
first place (this is (1.14)). But this contradicts that x' was feasible.
1.3 Bibliography for Chapter 1
An excellent treatment of this material is in McAfee (1986, Economics 241
handout). See also Munkres (1975, p. 7-9) and Royden (1988, p. 2-3).
Chapter 2
Set Theory
The basic notions of set theory are those of a group of objects and the idea of
membership in that group. In what follows, we will fix a given universe (or
space) X and consider only sets (or groups) whose elements (or members)
are elements of X. We can express the notion of membership by "∈" so that
"x ∈ A" means "x is an element of the set A" and "x ∉ A" means "x is not
an element of A". Since a set is completely determined by its elements, we
usually specify its elements explicitly by saying "The set A is the set of all
elements x in X such that each x has the property A (i.e. that A(x) is true)"
and write A = {x ∈ X : A(x)}. [1] This also makes it clear that we identify
sets with statements.
Example 7 Agent i's budget set, denoted
$B_i(p, y_i) = \{x_i \in X : \sum_k p_k x_{i,k} \le \sum_k p_k y_{i,k}\}$,
is the set of all consumption bundles that can be purchased with
endowments y_i.
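Membership in the budget set is a single inequality between the cost of a bundle and the value of the endowment at prices p. The following is a minimal sketch with two goods; the prices, endowment, and candidate bundles are made-up numbers, and `expenditure` and `in_budget_set` are hypothetical helper names.

```python
def expenditure(prices, bundle):
    # Value of a bundle at the given prices: the sum over goods k of p_k * x_k.
    return sum(p * q for p, q in zip(prices, bundle))

def in_budget_set(prices, endowment, bundle):
    # x_i is in B_i(p, y_i) when it costs no more than the endowment is worth.
    return expenditure(prices, bundle) <= expenditure(prices, endowment)

# Made-up numbers for illustration only.
prices = [2.0, 1.0]
endowment = [3.0, 4.0]       # worth 2*3 + 1*4 = 10
affordable = [4.0, 2.0]      # costs 2*4 + 1*2 = 10
unaffordable = [5.0, 1.0]    # costs 2*5 + 1*1 = 11

if __name__ == "__main__":
    print(in_budget_set(prices, endowment, affordable))
    print(in_budget_set(prices, endowment, unaffordable))
```

The set-builder notation {x ∈ X : A(x)} appears here as a predicate: the function `in_budget_set` plays the role of the statement A(x).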
Definition 8 If each x ∈ A is also in the set B (i.e. x ∈ A ⇒ x ∈ B), then
we say A is a subset of B (denoted A ⊂ B). If A ⊂ B and ∃x ∈ B such
that x ∉ A, then A is a proper subset of B. If A ⊂ B, then it is equivalent
to say that B contains A (denoted B ⊃ A).
Definition 9 A collection is a set whose elements are subsets of X. The
power set of X, denoted P(X), is the set of all possible subsets of X (it has
2^{#(X)} elements, where #(X) denotes the number of elements (or cardinality)
of the set X). A family is a set whose elements are collections of subsets of
X.
[1] In those instances where the space is understood, we sometimes abbreviate this as
A = {x : A(x)}.
Definition 10 Two sets are equal if (A ⊂ B)∧(B ⊂ A) (denoted A = B).
Definition 11 A set that has no elements is called empty (denoted ∅).
Thus, ∅ = {x ∈ X : A(x)∧(~A(x))}.
The empty set serves the same role in the theory of sets as 0 serves in the
counting numbers; it is a placeholder.
Example 12 Let the universe be given by X = {a,b,c}. We could let
A = {a,b}, B = {c} be subsets of X, C = {A,B}, D = {∅}, P(X) =
{∅,{a},{b},{c},{a,b},{a,c},{b,c},X} be collections, and F = {C} be a
family.
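The sets in Example 12 can be built explicitly, which also confirms the count of 2^{#(X)} elements in the power set. This is a sketch using `frozenset` so that sets can themselves be elements of other sets; the helper name `power_set` is an illustrative choice.

```python
from itertools import chain, combinations

def power_set(xs):
    # All subsets of xs, from the empty set up to xs itself.
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

X = frozenset({"a", "b", "c"})
P = power_set(X)

A = frozenset({"a", "b"})   # a subset of X
B = frozenset({"c"})        # a subset of X
C = frozenset({A, B})       # a collection: its elements are subsets of X
F = frozenset({C})          # a family: its elements are collections

if __name__ == "__main__":
    print(len(P), 2 ** len(X))
    print(sorted(sorted(s) for s in P))
```

Ordinary Python sets are unhashable, so `frozenset` is used wherever a set has to appear as a member of another set.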
The next result provides the first example of the relation between set
theory and the logical rules we developed in Chapter 1. In particular, it relates
"⊂" and "⇒" as well as "=" and "⇔".
Theorem 13 Let A = {x ∈ X : A(x)} and B = {x ∈ X : B(x)}. Then (a)
A ⊂ B ⇔ (∀x ∈ X)(A(x) ⇒ B(x)) and (b) A = B ⇔ (∀x ∈ X)(A(x) ⇔
B(x)).
Proof. Just use definition (8) in (a) A ⊂ B ⇔ (x ∈ A ⇒ x ∈ B) ⇔ (A(x) ⇒
B(x)) and definition (10) in (b) A = B ⇔ (A ⊂ B) ∧ (B ⊂ A) ⇔ (∀x ∈
X)(A(x) ⇔ B(x)).
The following are some of the most important sets we will encounter in
this book:
• N = {1,2,3,...}, the natural or "counting" numbers.
• Z = {...,−2,−1,0,1,2,...}, the integers. Z_+ = {0,1,2,...}, the
non-negative integers.
• Q = {m/n : m,n ∈ Z, n ≠ 0}, the rational numbers.
• Chapter 3 will discuss the real numbers, which we denote R. This set
just adds what are called irrational numbers to the above rationals.
There are several important results you will see at the end of this chapter.
The first establishes that there are fundamentally different sizes of infinite
sets. While some infinite sets can be counted, others are uncountable. These
results are summarized in Theorem 71 and Theorem 80. The second result
establishes that there always exists a smallest collection of subsets of a
given set where all results of set operations (like complements, union, and
intersection) remain in the collection (Theorem 87).
2.1 Set Operations
The following operations help us construct new sets from old ones. The first
three play the same role for sets as the connectives "~", "∧", and "∨" played
for statements.
Definition 14 If A ⊂ X, we define the complement of A (relative to
X) (denoted A^c) to be the set of all elements of X that do not belong to A.
That is, A^c = {x ∈ X : x ∉ A}.
Definition 15 If A,B ⊂ X, we define their intersection (denoted A∩B)
to be the set of all elements that belong to both A and B. That is, A∩B =
{x ∈ X : x ∈ A ∧ x ∈ B}.
Definition 16 If A,B ⊂ X and A∩B = ∅, then we say A and B are
disjoint.
Definition 17 If A,B ⊂ X, we define their union (denoted A∪B) to be
the set of all elements that belong to A or B or both (i.e. or is inclusive).
That is, A∪B = {x ∈ X : x ∈ A ∨ x ∈ B}.
Definition 18 If A,B ⊂ X, we define their difference (or relative
complement of B in A) (denoted A\B) to be the set of all elements of A that
do not belong to B. That is, A\B = {x ∈ X : x ∈ A ∧ x ∉ B}.
Each of these definitions can be visualized in Figure 2.1.1 through the
use of Venn Diagrams. These definitions can easily be extended to arbitrary
collections of sets. Let Λ be an index set (e.g. Λ = N or a finite subset of N)
and let A_i, i ∈ Λ, be subsets of X. Then ∪_{i∈Λ} A_i = {x ∈ X : (∃i)(x ∈ A_i)}.
Indexed families of sets will be defined formally after we develop the notion
of a function in Section 5.2.
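Python's built-in `set` type provides each of these operations, so the definitions can be tried on a small example. The universe X, the sets A and B, and the indexed sets in `family` below are arbitrary choices made for this sketch.

```python
X = set(range(10))           # the universe
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

complement_A = X - A         # A^c: elements of X not in A
intersection = A & B         # elements in both A and B
union = A | B                # elements in A or B or both
difference = A - B           # A\B: elements of A not in B

# An indexed union over the index set {0, 1, 2}, with A_i = {i, i+1}.
family = {i: {i, i + 1} for i in range(3)}
indexed_union = {x for x in X if any(x in family[i] for i in family)}

if __name__ == "__main__":
    print(complement_A, intersection, union, difference, indexed_union)
```

The comprehension for `indexed_union` mirrors the definition literally: x belongs to the union when there exists an index i with x in A_i.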
2.1.1 Algebraic properties of set operations
The following commutative, associative, and distributive properties of sets
are natural extensions of Theorem 1 and easily seen in Figure 2.1.2.
Theorem 19 Let A, B, C be any sets. Then (i) (C) A∩B = B∩A,
A∪B = B∪A; (ii) (A) (A∩B)∩C = A∩(B∩C), (A∪B)∪C = A∪(B∪C);
and (iii) (D) A∩(B∪C) = (A∩B)∪(A∩C), A∪(B∩C) = (A∪B)∩(A∪C).
Exercise 2.1.1 Prove Theorem 19. This amounts to applying the logical
connectives and above definitions. Besides using Venn Diagrams, we can just use
the definition of ∩ and ∪. For example, to show A∩B = B∩A, it is sufficient
to note x ∈ A∩B ⇔ (x ∈ A) ∧ (x ∈ B) ⇔ (x ∈ B) ∧ (x ∈ A) (by (1.1))
⇔ x ∈ B∩A.
The following properties are used extensively in probability theory and
are easily seen in Figure 2.1.3.
Theorem 20 (DeMorgan's Laws) If A, B, C are any sets, then (a) A\(B∪
C) = (A\B)∩(A\C), and (b) A\(B∩C) = (A\B)∪(A\C).
Proof. (a) 2 parts.
(i, ⊂) Suppose x ∈ A\(B∪C). Then x ∈ A and x ∉ (B∪C). Thus x ∈ A
and (x ∉ B and x ∉ C). This implies x ∈ A\B and x ∈ A\C. But this is
just x ∈ (A\B)∩(A\C).
(ii, ⊃) Suppose x ∈ (A\B)∩(A\C). Then x ∈ (A\B) and x ∈ (A\C).
Thus x ∈ A and (x ∉ B and x ∉ C). This implies x ∈ A and x ∉ (B∪C).
But this is just x ∈ A\(B∪C).
Exercise 2.1.2 Finish the proof of Theorem 20.
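Both of DeMorgan's Laws can be checked exhaustively over every triple of subsets of a small universe. The sketch below uses the three-element universe {1, 2, 3}, an arbitrary choice; a check on one finite universe illustrates the laws but does not replace the element-wise proof.

```python
from itertools import combinations

def subsets(xs):
    # Every subset of xs, as ordinary Python sets.
    xs = list(xs)
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

X = {1, 2, 3}
S = subsets(X)   # 8 subsets, so 8**3 = 512 triples (A, B, C)

# (a) A\(B ∪ C) = (A\B) ∩ (A\C)
law_a = all(A - (B | C) == (A - B) & (A - C) for A in S for B in S for C in S)
# (b) A\(B ∩ C) = (A\B) ∪ (A\C)
law_b = all(A - (B & C) == (A - B) | (A - C) for A in S for B in S for C in S)

if __name__ == "__main__":
    print(law_a, law_b)
```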
2.2 Cartesian Products
There is another way to construct new sets out of given ones; it involves the
notion of an "ordered pair" of objects. That is, in the set {a,b} there is no
preference given to a over b; i.e. {a,b} = {b,a} so that it is an unordered
pair. We can also consider ordered pairs (a,b) where we distinguish between
the first and second elements. [2]
[2] Don't confuse this notation with the interval consisting of all real numbers such that
a < x < b.
Definition 21 If A and B are nonempty sets, then the cartesian product
(denoted A×B) is just the set of all ordered pairs {(a,b) : a ∈ A and b ∈ B}.
Example 22 A = {1,2,3}, B = {4,5}, A×B = {(1,4),(1,5),(2,4),(2,5),
(3,4),(3,5)}.
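Example 22 can be reproduced with `itertools.product`, which enumerates ordered pairs. The sketch also contrasts unordered pairs (Python sets) with ordered pairs (Python tuples).

```python
from itertools import product

A = [1, 2, 3]
B = [4, 5]

# The cartesian product A x B as a list of ordered pairs (a, b).
AxB = list(product(A, B))

# Sets ignore order, tuples do not.
unordered_equal = {"a", "b"} == {"b", "a"}
ordered_equal = ("a", "b") == ("b", "a")

if __name__ == "__main__":
    print(AxB)
    print(unordered_equal, ordered_equal)
```

The number of pairs is the product of the sizes of A and B, here 3 × 2 = 6.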
Example 23 A = [0,1]∪[2,3], B = [1,2]∪[3,4], A×B in Figure 2.2.1.
This set operation also generalizes to finite and infinite index sets.
2.3 Relations
To be able to compare elements of a set, we need to define how they are
related. The general concept of a relation underlies all that will follow. For
instance, just comparing the real numbers 1 and 2 requires such a definition.
Furthermore, a correspondence or function is just a special case of a relation.
In what follows, our definitions of relations, correspondences, and functions
are meant to emphasize that they are simply special kinds of sets.
Definition 24 Given two sets A and B, a binary relation between members
of A and members of B is a subset R ⊂ A×B. We use the notation
(a,b) ∈ R to denote the relation R on A×B and read it "a is in the relation
R to b". If A = B we say that R is a relation on the set A.
Example 25 Let A = {Austin, Des Moines, Harrisburg} and B = {Texas,
Iowa, Pennsylvania}. Then the relation R = {(Austin, Texas), (Des Moines, Iowa),
(Harrisburg, Pennsylvania)} expresses "is the state capital of".
In general, we can consider n-nary relations between members of sets
A
1
, A
2
, ..., A
n
which is just the subset R ? A
1
×A
2
×...×A
n
.
A relation is characterized by a certain set of properties that it possesses.
We next consider important types of relations that di?er in their symmetry
properties.
2.3.1 Equivalence relations
Definition 26 An equivalence relation on a set A is a relation "∼"
having the following three properties: (i) Reflexivity, x ∼ x, ∀x ∈ A; (ii)
Symmetry, if x ∼ y, then y ∼ x, ∀x,y ∈ A; and (iii) Transitivity, if x ∼ y and
y ∼ z, then x ∼ z, ∀x,y,z ∈ A.
Example 27 Equality is an equivalence relation on R.
Example 28 Define the congruence modulo 4 relation "M" on Z by: ∀x,y ∈ Z,
xMy if the remainders obtained by dividing x and y by 4 are equal. For
example, 13M65 because dividing 13 and 65 by 4 gives the same remainder of 1.
Exercise 2.3.1 Show that congruence modulo 4 is an equivalence relation.
Definition 29 Given an equivalence relation ∼ on a set A and an element
x ∈ A, we define a certain subset E of A, called the equivalence class
determined by x, by the equation E = {y ∈ A : y ∼ x}. Note that the equivalence
class determined by x contains x since x ∼ x.
Example 30 The equivalence classes of Z for the relation congruence modulo 4
are determined by x ∈ {0,1,2,3}, where E_x = {z ∈ Z : z = 4k + x, k ∈ Z}
(i.e. x is the remainder when z is divided by 4).
Equivalence classes have the following property.
Theorem 31 Two equivalence classes E and E′ are either disjoint or equal.
Proof. Let E = {y ∈ A : y ∼ x} and E′ = {y ∈ A : y ∼ x′}. Consider E∩E′. It
can be either empty (in which case E and E′ are disjoint) or nonempty. Suppose
it is nonempty and let z ∈ E∩E′. We show that E = E′. Let w ∈ E. Then w ∼ x.
Since z ∈ E∩E′, we know z ∼ x and z ∼ x′, so that by symmetry and transitivity
x ∼ x′. Also by transitivity w ∼ x′, so that w ∈ E′. Thus E ⊂ E′. Symmetry
allows us to conclude that E′ ⊂ E as well. Hence E = E′.
Given an equivalence relation on A, let us denote by ℰ the collection of
all equivalence classes. Theorem 31 shows that distinct elements of ℰ are
disjoint. On the other hand, the union of all the elements of ℰ equals all of
A because every element of A belongs to an equivalence class. In this case
we say that ℰ is a partition of A.
Definition 32 A partition of a set A is a collection of disjoint subsets of
A whose union is all of A.
Example 33 It is clear that the equivalence classes of Z in Example 30 form a
partition since E_0 = {...,−8,−4,0,4,8,...}, E_1 = {...,−7,−3,1,5,...},
E_2 = {...,−6,−2,2,6,...}, E_3 = {...,−5,−1,3,7,...} are disjoint and their
union is all of Z. Another simple example is a coin toss experiment where the
sample space S = {Heads, Tails} has mutually exclusive events (i.e.
{Heads}∩{Tails} = ∅).
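The partition of Z into classes modulo 4 can be illustrated on a finite window of integers. The sketch below is an assumption-laden illustration: the window −8,...,8 and the helper name equivalence_classes are our own choices, and the grouping compares each element with a single representative of each class, which is justified by symmetry and transitivity.

```python
def congruent_mod_4(x, y):
    # Python's % returns a remainder in {0, 1, 2, 3} even for negative x.
    return x % 4 == y % 4

def equivalence_classes(elements, related):
    """Group elements into classes, comparing with one representative per class."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])  # x starts a new class
    return classes

window = range(-8, 9)  # a finite piece of Z, for illustration only
classes = equivalence_classes(window, congruent_mod_4)
by_remainder = {cls[0] % 4: cls for cls in classes}

for r in sorted(by_remainder):
    print(r, by_remainder[r])
```

Each integer in the window lands in exactly one of the four lists, which is the partition property of Definition 32 restricted to the window.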
2.3.2 Order relations
A relation that is reflexive and transitive but not symmetric is said to be an
order relation. If we consider special types of non symmetry, we have special
types of order relations.
Definition 34 A relation "R" on A is said to be a partial ordering of
a set A if it has the following properties: (i) Reflexivity, xRx, ∀x ∈ A; (ii)
Antisymmetry, if xRy and yRx, then x = y, ∀x,y ∈ A; and (iii) Transitivity,
if xRy and yRz, then xRz, ∀x,y,z ∈ A. We call (R,A) a partially ordered
set.
Example 35 "≤" is a partial ordering on R and "⊂" is a partial ordering on
P(A). It is clear that ≤ is not symmetric on R; just take x = 1 and y = 2.
It is also clear that ⊂ is not symmetric on P(A); if A = {a,b}, then while
{a} ⊂ A it is not the case that A ⊂ {a}. Finally, "≾_1" on R×R given by
(x_1,x_2) ≾_1 (y_1,y_2) if x_1 ≤ y_1 and x_2 ≤ y_2 is a partial ordering
since it is clear that ≾_1 is not symmetric on R×R because ≤ is not symmetric
even on R.
Definition 36 A partial ordering "R" on A is said to be a total (or
linear) ordering of A if it also satisfies (i) Completeness: for any two
elements x,y ∈ A we have either xRy or yRx. We call (R,A) a totally ordered
set. A chain in a partially ordered set is a subset on which the order is total.
Thus, a total ordering means that any two elements x and y in A can
be compared, unlike a partial ordering where there are elements that are
noncomparable.
Exercise 2.3.2 Show that if A ⊂ B and B is totally ordered, then A is
totally ordered.
We write x ≺ y if x ≾ y and x ≠ y, and call "≺" a strict partial or strict
total ordering.
Example 37 "<" is a strict total ordering on R while "≤" is a total ordering
on R, both of which follow by the completeness axiom of real numbers. "⊂" is
not a total ordering on P(A) since if A = {a,b}, there is no inclusion relation
between the sets {a} and {b}. "≾_1" on R×R given in Example 35 is not a
total ordering because we can't compare elements where x_1 < y_1 and x_2 > y_2.
However, a line passing through the origin having positive slope is a chain.
On the other hand, the relation "≾_2" on R×R given by (x_1,x_2) ≾_2 (y_1,y_2)
if x_1 < y_1, or if x_1 = y_1 and x_2 ≤ y_2, is a total ordering.³ This is
also known as a lexicographic ordering since the first coordinate has
the highest priority in determining the ordering, just as the first letter of a
word does in the ordering of a dictionary. We compare "≾_1" to "≾_2" in Figure
2.3.1 for the following three elements: x = (1/4, 1/4), y = (1/2, 1/2),
z = (1/4, 3/4) in R×R. There are 3 pairwise comparisons for each relation.
First consider "≾_1". We have x ≾_1 y and x ≾_1 z, but y and z are not
comparable under "≾_1", which is why we call it a partial ordering. Next
consider "≾_2", where each pair is comparable (i.e. we have x ≾_2 z, x ≾_2 y,
and z ≾_2 y), which is why we call it a total ordering. Notice that by
transitivity they can be ranked (all can be placed in the dictionary).
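The three pairwise comparisons of Example 37 can be checked directly. In this sketch the function names leq1, leq2 and comparable are our own, and the lexicographic order is coded with a strict inequality in the first coordinate (x_1 < y_1, or x_1 = y_1 and x_2 ≤ y_2), which is the reading under which the relation is antisymmetric. Exact fractions avoid any rounding.

```python
from fractions import Fraction as F

def leq1(p, q):
    # Componentwise order: both coordinates must be weakly smaller.
    return p[0] <= q[0] and p[1] <= q[1]

def leq2(p, q):
    # Lexicographic order: first coordinate decides, ties go to the second.
    return p[0] < q[0] or (p[0] == q[0] and p[1] <= q[1])

def comparable(leq, p, q):
    return leq(p, q) or leq(q, p)

x = (F(1, 4), F(1, 4))
y = (F(1, 2), F(1, 2))
z = (F(1, 4), F(3, 4))

print(comparable(leq1, x, y), comparable(leq1, x, z), comparable(leq1, y, z))
print(comparable(leq2, x, y), comparable(leq2, x, z), comparable(leq2, y, z))
# Python compares tuples lexicographically, so sorting ranks the points by leq2.
print(sorted([y, z, x]) == [x, z, y])
```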
There are other types of order relations.
Definition 38 A weak order relation assumes: (i) transitivity; (ii)
completeness; and (iii) non symmetry (just the negation of symmetry defined in
Definition 26).⁴
Weak order relations form the basis for consumer choice.
Example 39 Preference relations: We can represent consumer preferences
by the binary relation ≿ defined on a non-empty, closed, convex consumption
set X. If (x_1,x_2) ∈ ≿, or x_1 ≿ x_2, we say "consumption bundle x_1 is at
least as good as x_2". We embody rationality or consistency by completeness and
transitivity.⁵
Exercise 2.3.3 Why aren't preference relations just total orderings? Why
are they weak orderings? Show why indifference is an equivalence relation.
Because elements of a partially ordered set are not necessarily comparable,
it may be the case that a maximum and/or minimum of a two element set
doesn't even exist. We turn to this next.
³ Don't be confused into thinking that we have left out a case (i.e. x_1 > y_1)
by considering only x_1 < y_1, or x_1 = y_1 and x_2 ≤ y_2. For instance, if the
two elements we are considering are (2,3) and (1,7), simply take x = (1,7) and
y = (2,3). The point is that any two real numbers can be compared using "≤".
⁴ Reflexivity is implied by completeness.
⁵ Experiments show that transitivity is often violated.
Definition 40 Let ≾ be a partial ordering of X. An upper bound for a set
A ⊂ X is an element u ∈ X satisfying x ≾ u, ∀x ∈ A. The supremum
of a set is its least upper bound and when the set contains its supremum we
call it a maximum. A lower bound for a set A ⊂ X is an element l ∈ X
satisfying l ≾ x, ∀x ∈ A. The infimum of a set is its greatest lower bound
and when the set contains its infimum we call it a minimum.
Definition 41 A set S is bounded above if it has an upper bound; bounded
below if it has a lower bound; bounded if it has an upper and lower bound;
unbounded if it lacks either an upper or a lower bound.
We define the operators x∨y to denote the supremum and x∧y the
infimum of the two point set {x,y}.⁶ If X is totally ordered, then x and y
are comparable, so that one must be bigger or smaller than the other, in
which case x∨y = max{x,y} and x∧y = min{x,y}. However, if X is only
partially ordered, then x and y may not be comparable, but we may still be
able to find their supremum and infimum.
Definition 42 A lattice is a partially ordered set in which every pair of
elements has a supremum and an infimum.
Exercise 2.3.4 Show that: (i) every finite set in a lattice has a supremum
and an infimum; and (ii) if a lattice is totally ordered, then every pair of
elements has a minimum and a maximum. Hint: (i) sup{x_1,x_2,x_3} =
sup{sup{x_1,x_2},x_3}.
Exercise 2.3.5 Show that a totally ordered set L is always a lattice.
Next we give examples of partially ordered sets that are not totally
ordered yet have a lattice structure. For any set X, an example is P(X) with
⊂, which is a lattice where if A,B ∈ P(X), then A∨B = A∪B and A∧B = A∩B.
Example 43 Let X = {a,b}, so that P(X) = {∅,{a},{b},{a,b}}. Then,
for instance, {a}∨{b} = {a,b}, {a}∧{b} = ∅, {a}∧{a,b} = {a}, and
{a}∨∅ = {a}.
⁶ Here's another place where we don't have enough good symbols to go around.
Don't confuse "∨" and "∧" here with the logical connectives in Chapter 1.
Example 44 R×R is a lattice with the ordering "≾_1". The supremum and
infimum of any two points x,y are given by x∨y = (max{x_1,y_1}, max{x_2,y_2})
and x∧y = (min{x_1,y_1}, min{x_2,y_2}). See Figure 2.3.2 where we consider
the noncomparable elements x = (1,0) and y = (0,1).
Example 45 The next example shows that not every partially ordered set is
a lattice. We show this by resorting to the following subset X = {(x_1,x_2) ∈
R×R : x_1^2 + x_2^2 ≤ 1}. For ≾_1 on X, sup{(0,1),(1,0)} does not exist. See
Figure 2.3.3.
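Examples 44 and 45 can be made concrete with a few lines of code. This is a sketch under our own naming (join, meet, in_unit_disk are hypothetical helper names): in R×R the componentwise maximum and minimum give the supremum and infimum of x = (1,0) and y = (0,1), but the supremum (1,1) falls outside the unit disk. In fact any upper bound u of both points must satisfy u_1 ≥ 1 and u_2 ≥ 1, so u_1^2 + u_2^2 ≥ 2 and no upper bound lies in the disk at all.

```python
def join(p, q):
    # Supremum under the componentwise order on R x R.
    return (max(p[0], q[0]), max(p[1], q[1]))

def meet(p, q):
    # Infimum under the componentwise order on R x R.
    return (min(p[0], q[0]), min(p[1], q[1]))

def in_unit_disk(p):
    return p[0] ** 2 + p[1] ** 2 <= 1

x = (1, 0)
y = (0, 1)

print(join(x, y), meet(x, y))
print(in_unit_disk(join(x, y)), in_unit_disk(meet(x, y)))
```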
While the next result is stated as a lemma, we will take it as an axiom.⁷
It will prove useful in separation theorems, which are used extensively in
economics.
Lemma 46 (Zorn) If A is a partially ordered set such that each totally
ordered subset (a chain) has an upper bound in A, then A has a maximal
element.
Example 47 (1,1) is the maximal element of "≾_1" on A = [0,1]×[0,1]. The
upper bounds of each chain in A are given by the intersection of the lines
(chains) with the x = 1 or y = 1 axes. See Figure 2.3.4.
2.4 Correspondences and Functions
In your first economics classes you probably saw downward sloping demand
and upward sloping supply functions, and perhaps even correspondences (e.g.
backward bending labor supply curves). Given that we have already introduced
the idea of a relation, here we will define correspondences and functions
simply as relations which have certain properties.
Definition 48 Let A and B be any two sets. A correspondence G, denoted
G : A →→ B, is a relation between A and P(B) (i.e. G ⊂ A×P(B)).
That is, G is a rule that assigns a subset G(a) ⊂ B to each element a ∈ A.
⁷ It can be shown to be equivalent to the Axiom of Choice.
Definition 49 Let A and B be any two sets. A function (or mapping)
f, denoted f : A → B, is a relation between A and B (i.e. f ⊂ A×B)
satisfying the following property: if (a,b) ∈ f and (a,b′) ∈ f, then b = b′.
That is, f is a rule that assigns a unique element f(a) ∈ B to each a ∈ A.
A is called the domain of f, sometimes denoted D(f). The range of f,
denoted R(f), is {b ∈ B : ∃a ∈ A such that (a,b) ∈ f}. The graph of f is
G(f) = {(a,b) ∈ f : a ∈ A}.
Thus, a function can be thought of as a single valued correspondence. A
function is defined if the following is given: (i) The domain D(f). (ii) An
assignment rule a → f(a) = b, a ∈ D(f). Then R(f) is determined by
these two. See Figure 2.4.1a for a function and 2.4.1b for a correspondence,
as well as Figure 2.4.2a and Figure 2.4.2b for another interpretation which
emphasizes "mapping".
Example 50 A sequence is a function f : N → B for some set B.
Definition 51 Let f be an arbitrary function with domain A and R(f) ⊂ B.
If E ⊂ A, then the (direct) image of E under f, denoted f(E), is the subset
{f(a) | a ∈ E∩D(f)} ⊂ R(f). See Figure 2.4.3a.
Theorem 52 Let f be a function with domain A and R(f) ⊂ B and let
E,F ⊂ A. (a) If E ⊂ F, then f(E) ⊂ f(F). (b) f(E∩F) ⊂ f(E)∩f(F),
(c) f(E∪F) = f(E)∪f(F), (d) f(E\F) ⊂ f(E).
Proof. (a) If a ∈ E, then a ∈ F so f(a) ∈ f(F). But this is true ∀a ∈ E,
hence f(E) ⊂ f(F).
Exercise 2.4.1 Finish the proof of Theorem 52.
Definition 53 If H ⊂ B, then the inverse image of H under f, denoted
f^{-1}(H), is the subset {a | f(a) ∈ H} ⊂ D(f). See Figure 2.4.3b.
It is important to note that the inverse image is different from the inverse
function (to be discussed shortly). The inverse function need not exist when
the inverse image does. See Example 65.
Theorem 54 Let G,H ⊂ B. (a) If G ⊂ H, then f^{-1}(G) ⊂ f^{-1}(H). (b)
f^{-1}(G∩H) = f^{-1}(G)∩f^{-1}(H), (c) f^{-1}(G∪H) = f^{-1}(G)∪f^{-1}(H), (d)
f^{-1}(G\H) = f^{-1}(G)\f^{-1}(H).
Proof. (a) If a ∈ f^{-1}(G), then f(a) ∈ G ⊂ H so a ∈ f^{-1}(H).
Exercise 2.4.2 Finish the proof of Theorem 54.
Exercise 2.4.3 Let f : A → B be a function. Prove that if b ≠ b′ are elements
of B, then the inverse images f^{-1}({b}) and f^{-1}({b′}) are disjoint.
2.4.1 Restrictions and extensions
Example 55 Let A = R\{0} and f(a) = 1/a. Then R(f) = R\{0}. See
Figure 2.4.4a.
To deal with the above "hole" in the domain of f in Example 55, we can
employ the idea of restricting or extending the function to a given set.
Definition 56 Let Ω ⊂ D(f). The restriction of f to the set Ω, which
we will denote f|_r Ω, is given by {(a,b) ∈ f : a ∈ Ω}. Let Ω ⊃ D(f).
The extension of f on the set Ω, which we will denote f|_e Ω, is given by
{(a,b) : b = f(a) if a ∈ D(f), b = g(a) if a ∈ Ω\D(f)},
where g is any function defined on Ω\D(f).
Example 57 See Figure 2.4.4b for a restriction of 1/a to Ω = R_++ and Figure
2.4.4c for an extension of 1/a on Ω = R given by
b = 1/a if a ≠ 0, b = 0 if a = 0.
Note that extensions are not generally unique.
2.4.2 Composition of functions
Definition 58 Let f : A → B and g : B′ → C. Let R(f) ⊂ B′. The
composition g∘f is the function from A to C given by g∘f = {(a,c) ∈
A×C : ∃b ∈ R(f) ⊂ B′ such that (a,b) ∈ f and (b,c) ∈ g}.⁸ See Figure 2.4.5.
Note that order matters, as the next example shows.
Example 59 Let A ⊂ R, f(a) = 2a, and g(a) = 3a^2 − 1. Then (g∘f)(a) =
3(2a)^2 − 1 = 12a^2 − 1 while (f∘g)(a) = 2(3a^2 − 1) = 6a^2 − 2.
⁸ Alternatively, we create a new function h(a) = g(f(a)).
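The computation in Example 59 can be checked numerically. The helper compose below is our own hypothetical name; it simply builds h(a) = g(f(a)). The sketch compares both compositions with the closed forms 12a^2 − 1 and 6a^2 − 2 on a few integers.

```python
def compose(outer, inner):
    # Returns the function a -> outer(inner(a)).
    return lambda a: outer(inner(a))

f = lambda a: 2 * a
g = lambda a: 3 * a ** 2 - 1

g_of_f = compose(g, f)  # a -> 3(2a)^2 - 1
f_of_g = compose(f, g)  # a -> 2(3a^2 - 1)

sample = range(-5, 6)
print(all(g_of_f(a) == 12 * a ** 2 - 1 for a in sample))
print(all(f_of_g(a) == 6 * a ** 2 - 2 for a in sample))
print(g_of_f(1), f_of_g(1))
```

The two compositions already differ at a = 1, which is the sense in which order matters.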
2.4.3 Injections and inverses
Definition 60 f : A → B is one-to-one or an injection if whenever
(a,b) ∈ f and (a′,b) ∈ f for a,a′ ∈ D(f), then a = a′.⁹
Definition 61 Let f be an injection. If g = {(b,a) ∈ B×A : (a,b) ∈ f},
then g is an injection with D(g) = R(f) and R(g) = D(f). The function g
is called the inverse to f and denoted f^{-1}. See Figure 2.4.6.
2.4.4 Surjections and bijections
Definition 62 If R(f) = B, f maps A onto B (in this case, we call f a
surjection). See Figure 2.4.7.
Definition 63 f : A → B is a bijection if it is one-to-one and onto (or an
injection and a surjection).
Example 64 Let E = [0,1] ⊂ A = R, H = [0,1] ⊂ B = R, and f(a) = 2a.
See Figure 2.4.8. R(f) = R so that f is a surjection, the image set is
f(E) = [0,2], the inverse image set is f^{-1}(H) = [0,1/2], f is an injection
and has inverse f^{-1}(b) = b/2, and as a consequence of being one-to-one and
onto, f is a bijection. Notice that if F = [−1,0], then f(F)∩f(E) = {0} and
f(E∩F) = f({0}) = {0}, so that in the special case of injections statement (b)
of Theorem 52 holds with equality.
Example 65 Let E = [0,1] ⊂ A = R, H = [0,1] ⊂ B = R, and f(a) = a^2.
See Figure 2.4.9. R(f) = R_+ so that f is not a surjection, the image set is
f(E) = [0,1], the inverse image set is f^{-1}(H) = [−1,1], f is not an injection
(since, for instance, f(−1) = f(1) = 1), and is obviously not a bijection.
However, the restriction of f to R_+ or R_− (in particular, let f_+ ≡ f|_r R_+
and f_− ≡ f|_r R_−) is an injection, with f_+^{-1}(b) = √b while
f_−^{-1}(b) = −√b. Finally, notice that if F = [−1,0], then f(F)∩f(E) = [0,1]
but f(E∩F) = f({0}) = {0}, which is why we cannot generally prove equality in
statement (b) of Theorem 52.
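A discrete analogue of Examples 64 and 65 shows the same behaviour. In the sketch below the helpers image and inverse_image, the integer domain −3,...,3 and the two-point sets E and F are all our own illustrative choices (the text works with intervals of R). For the injection a → 2a the two sides of Theorem 52(b) agree, while for a → a^2 the inclusion is strict.

```python
def image(f, E):
    # Direct image f(E) = {f(a) : a in E}.
    return {f(a) for a in E}

def inverse_image(f, H, domain):
    # Inverse image f^{-1}(H) = {a in domain : f(a) in H}.
    return {a for a in domain if f(a) in H}

domain = range(-3, 4)
double = lambda a: 2 * a
square = lambda a: a * a

E = {0, 1}
F = {-1, 0}

# Injection: f(E ∩ F) equals f(E) ∩ f(F).
print(image(double, E & F), image(double, E) & image(double, F))
# Not an injection: f(E ∩ F) is a proper subset of f(E) ∩ f(F).
print(image(square, E & F), image(square, E) & image(square, F))
# The inverse image exists even though square has no inverse function.
print(sorted(inverse_image(square, {0, 1}, domain)))
```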
The next theorem shows that composition preserves surjection. It is useful
for proving statements about infinite sets.
⁹ Alternatively, we can say f is one-to-one if f(a) = f(a′) only when a = a′.
Theorem 66 Let f : A → B and g : B → C be surjections. Their composition
g∘f is a surjection.
Exercise 2.4.4 Prove Theorem 66. Answer: We must show that for g∘f :
A → C given by (g∘f)(a) = g(f(a)), it is the case that ∀c ∈ C, there exists
a ∈ A such that (g∘f)(a) = c. To see this, let c ∈ C. Since g is a surjection,
∃b ∈ B such that g(b) = c. Similarly, since f is a surjection, ∃a ∈ A such
that f(a) = b. Then (g∘f)(a) = g(f(a)) = g(b) = c.
2.5 Finite and Infinite Sets
The purpose of this section is to compare sizes of sets with respect to the
number of elements they contain. Take two sets A = {1,2,3} and B =
{a,b,c,d}. The number of elements of the set A (also called the cardinality
of A, denoted card(A)) is three and of the set B is four. In this case we say
that the set B is bigger than the set A.
It is hard, however, to apply this same concept in comparing, for instance,
the set of all natural numbers N with the set of all integers Z. Both are
infinite. Is the "infinity" that represents card(N) smaller than the "infinity"
that represents card(Z)? One might think the statement was true because
there are integers that are not natural numbers (e.g. −1,−2,−3,...). We will
show however that this statement is false, but first we have to introduce a
different concept of the size of a set known as countability and uncountability.
To illustrate it, one of the authors placed a set of 3 coins in front of his 3
year old daughter and asked her "Is that collection of coins countable?". She
proceeded to pick up the first coin with her right hand, put it in her left
hand, and said "1", pick up the second coin, put it in her left hand, and said
"2", and pick up the final coin, put it in her left hand, and said "3". Thus, she
put the set of coins into a one-to-one assignment with the first three natural
numbers. We will now make use of one-to-one assignments between elements
of two sets.
Definition 67 Two sets A and B are equivalent if there is a bijection
f : A → B.
Definition 68 An initial segment (or section) of N is the set a_n = {i ∈
N : i ≤ n}.
Definition 69 A set A is finite if it is empty or there exists a bijection
f : A → a_n for some n ∈ N. In the former case A has zero elements and in
the latter case A has n elements.
Lemma 70 Let B be a proper subset of a finite set A. There does not exist
a bijection f : A → B.
Proof. (Sketch) Since A is finite, there exists a bijection f : A → a_n for
some n. If B is a proper subset of A, then it contains m < n elements. But
there cannot be a bijection between n and m elements.
Exercise 2.5.1 Prove Lemma 70 more formally. See Lemma 6.1 in Munkres.
Lemma 70 says that a proper subset of a finite set cannot be equivalent
with the whole set. This is quite clear. But is it true for any set? Let's
consider N = {1,2,3,4,...} and a proper subset N\{1} = {2,3,4,...}. We
can construct a one-to-one assignment from N onto N\{1} (i.e. 1 → 2,
2 → 3,...). Thus, in this case, it is possible for a set to be equivalent with its
proper subset. Given Lemma 70, we must conclude the following.
Theorem 71 N is not finite.
Proof. By contradiction. Suppose N is finite. Then f : N → N\{1} defined
by f(n) = n + 1 is a bijection of N with a proper subset of itself. This
contradicts Lemma 70.
Definition 72 A set A is infinite if it is not finite. It is countably
infinite if there exists a bijection f : N → A.
Thus, N is countably infinite since f can be taken to be the identity
function (which is a bijection).
Definition 73 A set is countable if it is finite or countably infinite. A set
that is not countable is uncountable.
Next we examine whether the set of integers, Z, is countable. That is,
are N and Z equivalent? This isn't apparent since N = {1,2,...} has one
end of the set that goes to infinity, while Z = {...,−2,−1,0,1,2,...} has two
ends of the set that go to infinity. But it is possible to reorganize Z in a
way that looks like N since we can simply construct Z = {0,1,−1,2,−2,...}.
One can think of this set as being constructed from two rows {0,1,2,...}
and {−1,−2,...} by alternating between the first and second rows. This is
formalized in the next example.
Example 74 The set of integers, Z, is countably infinite. The function
f : Z → N defined by
    f(z) = 2z         if z > 0,
    f(z) = −2z + 1    if z ≤ 0
is a bijection.
Exercise 2.5.2 Prove that a finite union of countable sets is countable.
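The bijection of Example 74 can be tabulated on a finite window of integers. In the sketch below, f is the map from the example and f_inverse is our own candidate inverse (even numbers go back to positive integers, odd numbers to the non-positive ones); the window −50,...,50 is an arbitrary illustrative choice.

```python
def f(z):
    # The map Z -> N from Example 74.
    return 2 * z if z > 0 else -2 * z + 1

def f_inverse(n):
    # Even n come from positive z, odd n from z <= 0.
    return n // 2 if n % 2 == 0 else (1 - n) // 2

print([f(z) for z in [0, 1, -1, 2, -2]])

window = range(-50, 51)
values = sorted(f(z) for z in window)
print(values == list(range(1, 102)))          # hits 1,...,101 exactly once each
print(all(f_inverse(f(z)) == z for z in window))
```

The first line of output follows the reorganized listing Z = {0,1,−1,2,−2,...} described before the example.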
Next we examine whether N×N ∼ N. As in the preceding example where
we had two rows, we can think about enumerating the set N×N in Figure
2.5.1. As in the preceding example, each row has infinitely many elements
but now there are an infinite number of rows. Yet all of the elements of
this "infinite matrix" can be enumerated if we start from (1,1) and then
continue by following the arrows. This enumeration provides us with the
desired bijection as shown next.
Example 75 The cartesian product N×N is countably infinite. First, let
the bijection g : N×N → A, where A ⊂ N×N consists of pairs (x,y) for
which y ≤ x, be given by g(x,y) = (x+y−1, y). Next construct a bijection
h : A → N given by h(x,y) = (x−1)x/2 + y. Then the composition f = h∘g
is the desired bijection.
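The enumeration in Example 75 can be checked on a finite triangle of pairs. The sketch below codes g and h exactly as in the example and verifies that f = h∘g sends the pairs on the first n anti-diagonals (those with x + y − 1 ≤ n) onto 1,...,n(n+1)/2 without repeats; the cutoff n = 20 is an arbitrary illustrative choice.

```python
def g(x, y):
    # N x N -> A = {(x, y) : y <= x}
    return (x + y - 1, y)

def h(x, y):
    # A -> N; (x - 1) * x is always even, so integer division is exact.
    return (x - 1) * x // 2 + y

def f(x, y):
    return h(*g(x, y))

n = 20
pairs = [(x, y) for x in range(1, n + 1) for y in range(1, n + 1)
         if x + y - 1 <= n]
values = sorted(f(x, y) for x, y in pairs)

print([f(x, y) for x, y in [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3)]])
print(values == list(range(1, n * (n + 1) // 2 + 1)))
```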
We can actually weaken the condition for proving countability of a given
set A. The next theorem accomplishes this.
Theorem 76 Let A be a non-empty set. The following statements are
equivalent: (i) There is a surjection f : N → A. (ii) There is an injection
g : A → N. (iii) A is countable.
Proof. (Sketch) (i)⇒(ii). Given f, define g : A → N by g(a) = smallest
element of f^{-1}({a}). Since f is a surjection, the inverse image f^{-1}({a}) is
non-empty so that g is well defined. g is an injection since if a ≠ a′, the sets
f^{-1}({a}) and f^{-1}({a′}) are disjoint (recall Exercise 2.4.3), so their
smallest elements are distinct.
(ii)⇒(iii). Since g : A → R(g) is a surjection by definition and g is an
injection, g : A → R(g) is a bijection. Since R(g) ⊂ N, A must be countable.
(iii)⇒(i). By definition.
Exercise 2.5.3 Finish parts (ii)⇒(iii) and (iii)⇒(i) of the proof of Theorem
76. See Munkres 7.1 (this uses well ordering).
Example 77 The set of positive rationals, Q_++, is countably infinite. Define
a surjection g : N×N → Q_++ by g(n,m) = m/n. Since N×N is countable
(Example 75), there is a surjection h : N → N×N. Then f = g∘h : N → Q_++
is a surjection (Theorem 66) so by Theorem 76, Q_++ is countable.
The intuition for the preceding example follows simply from Figure 2.5.1
if you replace the "," with "/". That is, replace (1,1) with the rational 1/1,
(1,2) with the rational 1/2, (3,2) with the rational 3/2, etc.
Theorem 78 A countable union of countable sets is countable.
Proof. Let {A_i, i ∈ Λ} be an indexed family of countable sets where Λ is
countable. Because each A_i is countable, for each i we can choose a surjection
f_i : N → A_i. Similarly, we can choose a surjection g : N → Λ. Define h :
N×N → ∪_{i∈Λ} A_i by h(n,m) = f_{g(n)}(m), which is a surjection. Since N×N
is in bijective correspondence with N (recall Example 75), the countability
of the union follows from Theorem 76.
The next theorem provides an alternative proof of Example 75.
Theorem 79 A finite product of countable sets is countable.
Proof. Let A and B be two non-empty, countable sets. Choose surjective
functions g : N → A and h : N → B. Then the function f : N×N → A×B
defined by f(n,m) = (g(n),h(m)) is surjective. By Theorem 76, A×B is
countable. Proceed by induction for any finite product.
While it's tempting to think that this result could be extended to show
that a countable product of countable sets is countable, the next theorem
shows this is false. Furthermore, it gives us our first example of an
uncountable set.
Theorem 80 Let X = {0,1}. The set of all functions x : N → X, denoted
X^ω, is uncountable.¹⁰
¹⁰ An alternative statement of the theorem is that the set of all infinite
sequences of X is uncountable.
Proof. We show that any function g : N → X^ω is not a surjection. Let
g(n) = (x_n1, x_n2, ..., x_nm, ...) where each x_ij is either 0 or 1. Define a
point y = (y_1, y_2, ..., y_n, ...) of X^ω by letting
    y_n = 0    if x_nn = 1,
    y_n = 1    if x_nn = 0.
Now y ∈ X^ω and y is not in the image of g. That is, given n, g(n) and y
differ in at least one coordinate, namely the n-th. Thus g is not a surjection.
The diagonal argument used above (see Figure 2.5.2) will be useful to
establish the uncountability of the reals, which we save until Chapter 3.
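The diagonal construction in the proof of Theorem 80 can be illustrated for the first few terms of any proposed enumeration. In the sketch below the enumeration g is a hypothetical one chosen purely for illustration (g(n) is the sequence of binary digits of n); the point is only that the constructed y differs from g(n) in coordinate n, whatever g is.

```python
def g(n):
    # A hypothetical enumeration: g(n) is the 0/1 sequence whose m-th term
    # is the m-th binary digit of n (least significant digit first).
    return lambda m: (n >> (m - 1)) & 1

def diagonal_point(enumeration, n_terms):
    # y_n = 1 - x_nn flips the n-th coordinate of the n-th sequence.
    return [1 - enumeration(n)(n) for n in range(1, n_terms + 1)]

N_TERMS = 10
y = diagonal_point(g, N_TERMS)
differs = [y[n - 1] != g(n)(n) for n in range(1, N_TERMS + 1)]

print(y)
print(all(differs))
```

Only finitely many coordinates can be computed, so this illustrates the construction rather than proving uncountability.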
Exercise 2.5.4 Consider the following game known as "matching pennies".
You (A) and I (B) each hold a penny. We simultaneously reveal either
"heads" (H) or "tails" (T) to each other. If both faces match (i.e. both
heads or both tails) you receive a penny, otherwise I get the penny. The
action sets for each player are S_A = S_B = {H,T}. Now suppose we decide
to play this game every day for the indefinite (infinite) future (we're
optimistic about medical technology). Before you begin, you should think of all
the different combinations of actions you may employ in the infinitely
repeated game. For instance, you may alternate H and T starting with H in
the first round. Prove that although the number of actions you play in the
infinitely repeated game is countable and the set of actions S_A is finite, the
set of possible combinations of actions (S_A×S_A×...) is uncountable.
2.6 Algebras of Sets
An algebra is just a collection of sets (which could be infinite) that is closed
under (finite) union and complementation. It is used extensively in
probability and measure theory.
Definition 81 A collection 𝒜 of subsets of X is called an algebra of sets
if (i) A^c ∈ 𝒜 if A ∈ 𝒜 and (ii) A∪B ∈ 𝒜 if A,B ∈ 𝒜.
Note that ∅, X ∈ 𝒜 since, for instance, A ∈ 𝒜 ⇒ A^c ∈ 𝒜 by (i) and then
A∪A^c = X ∈ 𝒜 by (ii). It also follows from De Morgan's laws that (iii)
A∩B ∈ 𝒜 if A,B ∈ 𝒜. The definition extends to larger finite collections (just
take unions two at a time).
Theorem 82 Given any collection 𝒞 of subsets of X, there is a smallest
algebra 𝒜 which contains 𝒞.
Proof. (Sketch) It is sufficient to show there is an algebra 𝒜 containing
𝒞 such that if ℬ is any algebra containing 𝒞, then ℬ ⊃ 𝒜. Let ℱ be the
family of all algebras that contain 𝒞 (which is nonempty since P(X) ∈ ℱ).
Let 𝒜 = ∩{ℬ : ℬ ∈ ℱ}. Then 𝒞 is a subcollection of 𝒜 since each ℬ in ℱ
contains 𝒞. All that remains to be shown is that 𝒜 is an algebra (i.e. if A
and B are in 𝒜, then A∪B and A^c are in ∩{ℬ : ℬ ∈ ℱ}). It follows from
the definition of 𝒜 that ℬ ⊃ 𝒜 for every ℬ ∈ ℱ. See Figure 2.6.1.
Exercise 2.6.1 Finish the proof of Theorem 82. Answer: If A and B are in 𝒜,
then for each ℬ ∈ ℱ, we have A ∈ ℬ and B ∈ ℬ. Since ℬ is an algebra, A∪B
∈ ℬ. Since this is true for every ℬ ∈ ℱ, we have A∪B ∈ ∩{ℬ : ℬ ∈ ℱ}.
Similarly, if A ∈ 𝒜, then A^c ∈ 𝒜.
We say that the smallest algebra containing 𝒞 is the algebra generated
by 𝒞. By construction, the smallest algebra is unique. Notice the
proof makes clear that the intersection of any collection of algebras is itself
an algebra.
Example 83 Let X = {a,b,c}. The following three collections are algebras:
𝒞_1 = {∅,X}, 𝒞_2 = {∅,{a},{b,c},X}, 𝒞_3 = P(X). The following two
collections are not algebras: 𝒞_4 = {∅,{a},X} since, for instance, {a}^c =
{b,c} ∉ 𝒞_4, and 𝒞_5 = {∅,{a},{b},{b,c},X} since {b}^c = {a,c} ∉ 𝒞_5.
However, the smallest algebra which contains 𝒞_4 is just 𝒞_2. To see this, we
can apply the argument in Theorem 82. Let ℱ = {𝒞_2, P(X)} be the family of all
algebras that contain 𝒞_4. But 𝒜 = 𝒞_2∩P(X) = 𝒞_2.
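On a finite set the generated algebra of Example 83 can be computed directly. The sketch below does not intersect all algebras as in the proof of Theorem 82; instead it uses a different but, for finite X, equivalent route: start from the collection together with ∅ and X, and keep adding complements and pairwise unions until nothing new appears. The function name generated_algebra is our own.

```python
def generated_algebra(X, collection):
    """Close a collection of subsets of a finite X under complement and union."""
    X = frozenset(X)
    algebra = {frozenset(), X} | {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        current = list(algebra)  # snapshot, since algebra grows in the loop
        for a in current:
            for c in [X - a] + [a | b for b in current]:
                if c not in algebra:
                    algebra.add(c)
                    changed = True
    return algebra

X = {"a", "b", "c"}
C4 = [set(), {"a"}, X]
C5 = [set(), {"a"}, {"b"}, {"b", "c"}, X]

alg4 = generated_algebra(X, C4)
alg5 = generated_algebra(X, C5)

print(sorted(sorted(s) for s in alg4))
print(len(alg5))
```

Starting from the collection called C4 this returns the four sets of the second collection in Example 83, while starting from C5 it returns all eight subsets of X.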
Exercise 2.6.2 Let X = N. Show that the collection 𝒜 = {A_i : A_i is finite
or N\A_i is finite} is an algebra on N and that it is a proper subset of P(N).
The next theorem proves that it is always possible to construct a new
collection of disjoint sets from an existing algebra with the property that its
union is equivalent to the union of subsets in the existing algebra. This will
become very useful when we begin to think about probability measures.
Theorem 84 Let 𝒜 be an algebra comprised of subsets {A_i : i ∈ Λ}.¹¹ Then
there is a collection of subsets {B_i : i ∈ Λ} in 𝒜 such that B_n∩B_m = ∅ for
n ≠ m and ∪_{i∈Λ} B_i = ∪_{i∈Λ} A_i.
¹¹ Note that the index set Λ can be countably or even uncountably infinite.
Proof. (Sketch) The theorem is trivial when the collection is finite (see
Example 85 below). When the collection is indexed on N, we let B_1 = A_1 and
for each n ∈ N\{1} define
    B_n = A_n \ [A_1∪A_2∪...∪A_{n−1}]
        = A_n∩A_1^c∩A_2^c∩...∩A_{n−1}^c.
Since the complements and intersections of sets in 𝒜 are in 𝒜, B_n ∈ 𝒜 and
by construction B_n ⊂ A_n. The remainder of the proof amounts to showing
that the above constructed sets are disjoint and yield the same union as the
algebra.
Exercise 2.6.3 Finish the proof of Theorem 84 above. See Royden, Proposition 2,
p. 17.
Note that Theorem 84 does not say that the new collection {B_i : i ∈ Λ}
is necessarily itself an algebra. The next example shows this.
Example 85 Let X = {a,b,c} and algebra 𝒜 = P(X) with A_1 = {a},
A_2 = {b}, A_3 = {c}, A_4 = {a,b}, A_5 = {a,c}, A_6 = {b,c}, A_7 = ∅, A_8 = X.
Let B_1 = A_1. By construction B_2 = A_2\A_1 = {b}, B_3 = A_3\{A_1∪A_2} =
{c}, B_n = A_n\{A_1∪A_2∪...∪A_{n−1}} = ∅ for n ≥ 4. Note that the new
collection {{a},{b},{c}} is not itself an algebra, since it's not closed under
complementation, and that if we chose a different sequence of A_i we could
obtain a different collection {B_i : i ∈ Λ}.
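The construction B_n = A_n \ (A_1∪...∪A_{n−1}) from the proof of Theorem 84 is easy to run on Example 85. In this sketch the helper name disjointify is our own; the second call, on the reversed sequence, is an added illustration of the remark that a different ordering of the A_i yields a different collection of B_i.

```python
def disjointify(sets):
    # B_n is A_n with everything already covered by A_1, ..., A_{n-1} removed.
    seen = set()
    result = []
    for a in sets:
        result.append(set(a) - seen)
        seen |= set(a)
    return result

X = {"a", "b", "c"}
A_seq = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, set(), X]

B_seq = disjointify(A_seq)
B_alt = disjointify(list(reversed(A_seq)))  # same sets, opposite order

print(B_seq)
print(B_alt)
```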
In the next chapter, we will learn an important result: any (open) set
of real numbers can be represented as a countable union of disjoint open
intervals. Hence we cannot guarantee that the set is in an algebra, which
is closed only under finite union, even if all the intervals belong to the
algebra. Thus we extend the notion of an algebra to countable collections that
are closed under complementation and countable union.
Definition 86 A collection 𝒳 of subsets of X is called a σ-algebra of sets
if (i) A^c (= X\A) ∈ 𝒳 if A ∈ 𝒳 and (ii) ∪_{n∈N} A_n ∈ 𝒳 if each A_n ∈ 𝒳.
As in the case of algebras, ∅, X ∈ 𝒳 and ∩_{n=1}^∞ A_n = (∪_{n=1}^∞ A_n^c)^c ∈ 𝒳,
which means that a σ-algebra is closed under countable intersections as well.
Furthermore, we can always construct the unique smallest σ-algebra
containing a given collection 𝒞 (called the σ-algebra generated by 𝒞) by
forming the intersection of all the σ-algebras containing 𝒞. This result is an
extension of Theorem 82.
Theorem 87 Given any collection 𝒞 of subsets of X, there is a smallest
σ-algebra that contains 𝒞.
Exercise 2.6.4 Prove Theorem 87.
Exercise 2.6.5 Let 𝒞, 𝒟 be collections of subsets of X. (i) Show that
the smallest algebra generated by 𝒞 is contained in the smallest σ-algebra
generated by 𝒞. (ii) If 𝒞 ⊂ 𝒟, show the smallest σ-algebra generated by 𝒞 is
contained in the smallest σ-algebra generated by 𝒟.
Figures for Sections 2.1 to 2.2
Figure 2.1.1: Set Operations
Figure 2.1.2: Distributive Property
Figure 2.1.3: DeMorgan's Laws
Figure 2.2.1: Cartesian Product
Figures for Sections 2.3 to 2.4
Figure 2.3.1: Illustrating Partial vs Total Ordering
Figure 2.3.2: A Lattice
Figure 2.3.3: A Partially Ordered Set that's not a Lattice
Figure 2.3.4: Chains and Upper Bounds
Figure 2.4.1a: $f : A \to B$ and 2.4.1b: Not a function
Figure 2.4.2a: Graph of $f$ in $A \times B$ and 2.4.2b: Not a function in $A \times B$
Figure 2.4.3a: Image Set and Figure 2.4.3b: Inverse Image Set
Figure 2.4.4a: Graph of $\frac{1}{a}$, Figure 2.4.4b: Restriction of $\frac{1}{a}$ and Figure 2.4.4c: Extension of $\frac{1}{a}$
Figure 2.4.5: Composition
Figure 2.4.6: Inverse Function
Figure 2.4.7: Surjection
Figure 2.4.8: Graph of $2a$
Figure 2.4.9: Graph of $a^2$
Figures for Section 2.5
Figure 2.5.1: Countability of $\mathbb{N} \times \mathbb{N}$
Figure 2.5.2: Uncountability of $\{0,1\}^{\omega}$
Figures for Section 2.6
Figure 2.5.1: Smallest σ-algebras
2.7 Bibliography for Chapter 2
Sections 2.1 to 2.2 drew on Bartle (1978, Ch. 1), Munkres (1975, Ch. 1), and
Royden (1988, Ch. 1, Sec. 1, 3, 4). The material on relations and
correspondences in Section 2.3 is drawn from Royden (1988, Ch. 1, Sec. 7), Munkres
(1975, Ch. 1, Sec. 3), Aliprantis and Border (1999, Ch. 1, Sec. 2), and
Mas-Colell, Whinston, and Green (1995, Ch. 1, Sec. B). The material on functions
in Section 2.4 is drawn from Bartle (1976, Ch. 2) and Munkres (1975, Ch. 1,
Sec. 2). Section 2.5 drew from Munkres (1975, Ch. 1, Sec. 6-7) and Bartle
(1976, Ch. 3).
2.8 End of Chapter Problems.
1. Let $f : A \to B$ be a function. Prove the following statements are
equivalent.
   (i) $f$ is one-to-one on $A$.
   (ii) $f(C \cap D) = f(C) \cap f(D)$ for all subsets $C$ and $D$ of $A$.
   (iii) $f^{-1}[f(C)] = C$ for every subset $C$ of $A$.
   (iv) For all disjoint subsets $C$ and $D$ of $A$, the images $f(C)$ and
   $f(D)$ are disjoint.
   (v) For all subsets $C$ and $D$ of $A$ with $D \subset C$, we have
   $f(C \setminus D) = f(C) \setminus f(D)$.
2. Prove that a finite union of countable sets is a countable set.
Chapter 3
The Space of Real Numbers
In this chapter we introduce the most common set that economists will
encounter. The real numbers can be thought of as being built up using the
set operations and order relations that we introduced in the preceding
chapter. In particular we can start with the most elementary set $\mathbb{N}$ (the counting
numbers we all learned in pre-kindergarten) upon which certain operations
like "+" and "·" are defined. The naturals are closed under these operations
(i.e. for any two counting numbers, say $n_1$ and $n_2$, the sum $n_1 + n_2$ is
contained in $\mathbb{N}$). However, $\mathbb{N}$ is not closed with respect to certain other
operations like "-" since for example $2 - 4 \notin \mathbb{N}$. To handle that example we
need the integers $\mathbb{Z}$, which is closed under "+", "·", and "-" (i.e. $2 - 4 \in \mathbb{Z}$).
However, $\mathbb{Z}$ can't handle operations like dividing 2 pies between 3 people
(i.e. $\frac{2}{3} \notin \mathbb{Z}$). To handle that example we need the rationals $\mathbb{Q}$, which is
closed under "+", "-", "·", and "÷" (i.e. $\frac{2}{3} \in \mathbb{Q}$). But the rationals can't
handle something as simple as finding the length of the diagonal of a unit
square. That is, $\sqrt{2} \notin \mathbb{Q}$. To extend $\mathbb{Q}$ to include such cases, besides the
operations "+", "-", "·", and "÷", we could use Dedekind cuts, which make use
of the order relation "≤". A Dedekind cut in $\mathbb{Q}$ is an ordered pair $(D, E)$ of
nonempty subsets of $\mathbb{Q}$ with the properties $D \cap E = \emptyset$, $D \cup E = \mathbb{Q}$, and
$d < e$, $\forall d \in D$ and $\forall e \in E$. An example of a cut in $\mathbb{Q}$ is, for $\xi \in \mathbb{Q}$,
$$D = \{x \in \mathbb{Q} : x \le \xi\}, \quad E = \{x \in \mathbb{Q} : x > \xi\}.$$
In this case, we say that $\xi \in \mathbb{Q}$ represents the cut $(D, E)$. If a cut can be
represented by a rational number, it is called a rational cut. It is simple
to see that there are cuts in $\mathbb{Q}$ which cannot be represented by a rational
number. For example, take the cut
$$D' = \{x \in \mathbb{Q} : x \le 0 \text{ or } x^2 \le 2\}, \quad E' = \{x \in \mathbb{Q} : x > 0 \text{ and } x^2 > 2\}.$$
As we will show in this chapter, $(D', E')$ cannot be represented by a rational
number. Such cuts are called irrational cuts. Each irrational cut defines a
unique number. The set of all such numbers is called the irrational numbers.
In this way, we can extend the rationals by adding in these irrationals.
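The irrational cut built from the condition $x^2 \le 2$ can be explored with exact rational arithmetic. The sketch below is illustrative only: the helper names `in_lower` and `in_upper` are hypothetical, and a finite sample of rationals can support, but not prove, the three defining properties of a cut.

```python
from fractions import Fraction

def in_lower(x):
    """Membership in the lower set {x in Q : x <= 0 or x^2 <= 2}."""
    return x <= 0 or x * x <= 2

def in_upper(x):
    """Membership in the upper set {x in Q : x > 0 and x^2 > 2}."""
    return x > 0 and x * x > 2

# A finite sample of rationals m/n with small numerators and denominators.
sample = [Fraction(m, n) for n in range(1, 8) for m in range(-20, 21)]

lower = [x for x in sample if in_lower(x)]
upper = [x for x in sample if in_upper(x)]

# Every sampled rational lies in exactly one of the two sets,
# and every element of the lower set is below every element of the upper set.
print(all(in_lower(x) != in_upper(x) for x in sample))
print(max(lower) < min(upper))
```

Note that the lower set has no largest sampled element that squares to exactly 2, which is the content of Theorem 88.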
Rather than build up the real numbers as discussed above, our approach
will simply be to take the real numbers as given, list a set of axioms for
them, and derive properties of the real numbers as consequences of these
axioms. The first group of axioms describes the algebraic properties, the
second group the order properties, and we shall call the third the completeness
axiom. With these three groups of axioms we can completely characterize
the real numbers. In the next chapter we will focus on important issues like
convergence, compactness, completeness, and connectedness in spaces more
general than the real numbers. However, to understand those concepts it is
often helpful to provide examples from $\mathbb{R}$, which is why we start here.
In this chapter we focus on four important results in $\mathbb{R}$. The first (see
Theorem 108) is that any open set in $\mathbb{R}$ can be written as a countable
union of disjoint open intervals. The next two results are proven using the Nested
Intervals Property (see Theorem 116), which says that a decreasing sequence
of closed, bounded, nonempty intervals "converges" to a nonempty set. The
first important result that this is used to prove is the Bolzano-Weierstrass
Theorem (118), which says that every bounded infinite subset of $\mathbb{R}$ has a
cluster point, that is, a point with other points of the set arbitrarily close
to it. It is also used to prove the important "size" result (see Theorem 122)
that intervals such as $[0,1]$ in $\mathbb{R}$ are uncountable.
3.1 The Field Axioms
The functions or binary operations "+" and "·" on $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$ satisfy the
following axioms. It shouldn't be surprising that we require the operations
to satisfy commutative, associative, and distributive properties as we did in
Chapter 2 with respect to the set operations "∪" and "∩".
Axiom 1 (Algebraic Properties of $\mathbb{R}$) $x, y, z \in \mathbb{R}$ satisfy:
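A1. $x + y = y + x$.
A2. $(x + y) + z = x + (y + z)$.
A3. $\exists\, 0 \in \mathbb{R}$ such that $x + 0 = x$, $\forall x \in \mathbb{R}$.
A4. $\forall x \in \mathbb{R}$, $\exists\, w \in \mathbb{R}$ such that $x + w = 0$.
A5. $x \cdot y = y \cdot x$.
A6. $(x \cdot y) \cdot z = x \cdot (y \cdot z)$.
A7. $\exists\, 1 \in \mathbb{R}$ such that $1 \neq 0$ and $x \cdot 1 = x$, $\forall x \in \mathbb{R}$.
A8. $\forall x \in \mathbb{R}$ such that $x \neq 0$, $\exists\, w \in \mathbb{R}$ such that $x \cdot w = 1$.
A9. $x \cdot (y + z) = x \cdot y + x \cdot z$.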
Any set that satisfies Axiom 1 is called a field (under "+" and "·"). If
we have a field, we can perform all the operations of elementary algebra,
including the solution of simultaneous linear equations. It follows from A1
that the 0 in A3 is unique, which was used in formulating A4, A7, and A8.
It also follows that the $w$ in A4 is unique and denoted "$-x$". Subtraction
"$x - y$" is defined as "$x + (-y)$". That 1 in A7 is unique follows from A5.
The $w$ in A8 can also be shown to be unique.
Exercise 3.1.1 Let $a, b \in \mathbb{R}$. Prove that the equation $a + x = b$ has the
unique solution $x = (-a) + b$. With $a \neq 0$, prove that the equation $a \cdot x = b$
has the unique solution $x = \left(\frac{1}{a}\right) \cdot b$. (This is Theorem 4.4 of Bartle.)
In what follows, we drop the "·" to denote multiplication and write $xy$
for $x \cdot y$. Furthermore, we write $x^2$ for $xx$ and generally $x^{n+1} = (x^n)x$ with
$n \in \mathbb{N}$. It follows by mathematical induction that $x^{n+m} = x^n x^m$ for $x \in \mathbb{R}$
and $n, m \in \mathbb{N}$. We shall also write $\frac{x}{y}$ instead of $\left(\frac{1}{y}\right) \cdot x$. Recall that we defined
the rationals as $\mathbb{Q} = \{\frac{m}{n} : m, n \in \mathbb{Z},\ n \neq 0\}$.
Theorem 88 There does not exist a rational number $q \in \mathbb{Q}$ such that $q^2 = 2$.
Proof. Suppose not. Then $\left(\frac{m}{n}\right)^2 = 2$ for $m, n \in \mathbb{Z}$, $n \neq 0$. Assume, without
loss of generality, that $m$ and $n$ have no common factors. Since $m^2 = 2n^2$
is an even integer, $m$ must be an even integer.[1] In that case we can
represent it as $m = 2k$ for some integer $k$. Hence $(m^2 =)\ 4k^2 = 2n^2$ or
$n^2 = 2k^2$, which implies that $n$ is also even. But this implies that $m$ and $n$
are both divisible by 2, which contradicts the assumption that $m$ and $n$ have
no common factors.
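A finite search cannot replace the proof, but it gives a concrete feel for the statement. The sketch below (the function name is hypothetical) looks for integer pairs $(m, n)$ with $m^2 = 2n^2$ for all denominators up to a bound, using exact integer arithmetic so no rounding is involved.

```python
from math import isqrt

def rational_square_roots_of_two(max_den):
    """Return all pairs (m, n), 1 <= n <= max_den, m >= 0, with m*m == 2*n*n."""
    hits = []
    for n in range(1, max_den + 1):
        m = isqrt(2 * n * n)  # largest integer with m*m <= 2*n*n
        if m * m == 2 * n * n:
            hits.append((m, n))
    return hits

print(rational_square_roots_of_two(10_000))
```

As Theorem 88 predicts, the list of hits is empty for every bound.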
Theorem 88 says that the cut $(D', E')$ in the introduction to this chapter
is not rational.
Definition 89 All of the elements of $\mathbb{R}$ which are not rational numbers are
irrational numbers.
In Section 3.3 we provide a complementary result to Theorem 88 to
establish the existence of irrational numbers.
3.2 The Order Axioms
The next class of properties possessed by the real numbers has to do with
the fact that they are ordered. The order relation "≤" defined on $\mathbb{R}$ is a
special, and most important, case of the more general relations discussed in
Chapter 2.[2]
Axiom 2 (Order Properties of $\mathbb{R}$) The subset $P$ of positive real numbers
satisfies[3]
B1. If $x, y \in P$, then $x + y \in P$.
B2. If $x, y \in P$, then $x \cdot y \in P$.
B3. If $x \in \mathbb{R}$, then one and only one of the following holds: $x \in P$, $x = 0$,
or $-x \in P$.
Note that B3 implies that if $x \in P$, then $-x \notin P$. More importantly, B3
guarantees that $\mathbb{R}$ is totally ordered with respect to the order relation "≤".
[1] Otherwise, if $m$ is odd we can represent it as $m = 2k + 1$ for some integer $k$. But then
$m^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$ is odd, contradicting the fact that $m^2$ is even.
[2] In fact, the order relations in Chapter 2 were developed to generalize these concepts
to more abstract spaces than $\mathbb{R}$.
[3] Later, we will associate $P$ with the notation $\mathbb{R}_{++}$.
Definition 90 Any system satisfying Axiom 1 and Axiom 2 is called an
ordered field.
By definition then, $\mathbb{R}$ is an ordered field.
Definition 91 Let $x, y \in \mathbb{R}$. If $x - y \in P$, then we say $x > y$ and if
$x - y \in P \cup \{0\}$, then we say $x \ge y$. If $-(x - y) \in P$, then we say $x < y$ and if
$-(x - y) \in P \cup \{0\}$, then we say $x \le y$.
Exercise 3.2.1 Show that $(\mathbb{R}, \le)$ is a totally ordered set.
The following properties are a consequence of Axiom 2.
Theorem 92 Let $x, y, z \in \mathbb{R}$. (i) If $x > y$ and $y > z$, then $x > z$. (ii)
Exactly one holds: $x > y$, $x = y$, $x < y$. (iii) If $x \ge y$ and $y \ge x$, then
$x = y$.
Proof. (i) If $x - y \in P$ and $y - z \in P$, then B1 $\Rightarrow (x - y) + (y - z) \in P$ or
$(x - z) \in P$.
Exercise 3.2.2 Finish the proof of Theorem 92. (Bartle 5.4)
The next theorem is one of the simplest we will encounter, yet it is one of
the most far-reaching. For one thing, it implies that given any strictly positive
real number, there is another smaller and strictly positive real number so that
there is no smallest strictly positive real number![4]
Theorem 93 (Half the distance to the goal line) If $x, y \in \mathbb{R}$ with $x > y$,
then $x > \frac{1}{2}(x + y) > y$.
Proof. $x > y \Rightarrow x + x > x + y$ and $x + y > y + y \Rightarrow 2x > x + y > 2y$.
Multiplying through by $\frac{1}{2} > 0$ gives the result.
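The consequence that there is no smallest strictly positive number can be illustrated by repeatedly taking the midpoint between 0 and the current value. This is a small sketch with exact rationals; the helper names `midpoint` and `halving_sequence` are hypothetical.

```python
from fractions import Fraction

def midpoint(x, y):
    """The point (x + y) / 2 that Theorem 93 places strictly between y and x."""
    return (x + y) / 2

def halving_sequence(x, steps):
    """Repeatedly replace the current value by its midpoint with 0."""
    values = [x]
    for _ in range(steps):
        values.append(midpoint(values[-1], Fraction(0)))
    return values

seq = halving_sequence(Fraction(1), 5)
print(seq)  # each term is strictly positive and strictly smaller than the last
```

Each step produces a strictly smaller number that is still strictly positive, so no candidate for a smallest positive real survives.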
Now we define a very useful function on $\mathbb{R}$ that assigns to each real number
its distance from the origin.
Definition 94 If $x \in \mathbb{R}$, the absolute value of $x$, denoted $|\cdot| : \mathbb{R} \to \mathbb{R}_+$,
is defined by
$$|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases}$$
[4] Another thing it proves is that even if a defense is continuously penalized half the
distance to the goal line, the offense will never score unless they finally run a play.
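This function satisfies the well-known property of triangles; that is, the
length of any side of a triangle is less than the sum of the lengths of the other
two sides.
Theorem 95 (Triangle Inequality) If $x, y \in \mathbb{R}$, then $|x + y| \le |x| + |y|$.
Proof. (Sketch) Since $x \le |x|$ and $y \le |y|$, then $|x| - x \in P \cup \{0\}$ and
$|y| - y \in P \cup \{0\}$. By Axiom B1, $(|x| - x) + (|y| - y) \in P \cup \{0\}$. But
$(|x| - x) + (|y| - y) = (|x| + |y|) - (x + y)$, so $(|x| + |y|) - (x + y) \in P \cup \{0\}$
or $(x + y) \le (|x| + |y|)$. The same argument applied to $-x \le |x|$ and
$-y \le |y|$ gives $-(x + y) \le (|x| + |y|)$. Since $|x + y|$ equals either $x + y$ or
$-(x + y)$, it follows that $|x + y| \le |x| + |y|$.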
3.3 The Completeness Axiom
This axiom distinguishes $\mathbb{R}$ from other totally ordered fields like $\mathbb{Q}$. To
begin, we use the definition of upper and lower bounds from Definition 40 with the
order relation "≤" on $\mathbb{R}$. If $S \subset \mathbb{R}$ has an upper and/or lower bound, it has
infinitely many (e.g. if $u$ is an ub of $S$, then $u + n$ is an ub for $n \in \mathbb{N}$).
Supremum and infimum, which were defined in Definition 40 for the general case, can be
characterized in $\mathbb{R}$ by the following lemma.
Lemma 96 Let $S \subset \mathbb{R}$. Then $u \in \mathbb{R}$ is a supremum (or sup or least
upper bound (lub)) of $S$ iff (i) $s \le u$, $\forall s \in S$ and (ii) $\forall \varepsilon > 0$, $\exists s \in S$ such
that $u - \varepsilon < s$.[5] Similarly, $\ell \in \mathbb{R}$ is an infimum (or inf or greatest lower
bound (glb)) of $S$ iff (i) $\ell \le s$, $\forall s \in S$ and (ii) $\forall \varepsilon > 0$, $\exists s \in S$ such that
$s < \ell + \varepsilon$. See Figure 3.3.1.
Proof. ($\Rightarrow$) (i) holds by definition (just use "≤" on $\mathbb{R}$ in Definition 40). To see (ii),
suppose $u$ is the least upper bound. Because $u - \varepsilon < u$, $u - \varepsilon$ cannot
be an upper bound. This implies $\exists s \in S$ such that $u - \varepsilon < s$.
($\Leftarrow$) (i) implies $u$ is an upper bound again by definition. To see that (ii)
implies $u$ is the least upper bound, consider $v < u$. Then $u - v = \varepsilon > 0$ (or
$v = u - \varepsilon$). By (ii), $\exists s \in S$ such that $u - \varepsilon = v < s$, hence $v$ is not an upper
bound.
In the case where $S$ does not have an upper (lower) bound, we assign
$\sup S = \infty$ ($\inf S = -\infty$).
[5] Statement (i) makes $u$ an ub while (ii) makes it the lub.
Example 97 A set may not contain its sup. To see this, let $S = \{x \in \mathbb{R} : 0 < x < 1\}$
and $S' = \{x \in \mathbb{R} : 0 \le x \le 1\}$. Any number $u \ge 1$ is an ub for
both sets, but while $S'$ contains the ub 1, $S$ does not contain any of its ub!
Also, it's clear that no number $c < 1$ can be an ub for $S$. To see this, just
apply our famous Theorem 93. That is, if $0 < c < 1$, then $\exists s = \frac{1 + c}{2} > c$ and
$s \in S$ (and if $c \le 0$, every element of $S$ exceeds $c$).
Theorem 98 There can be only one supremum for any $S \subset \mathbb{R}$.
Proof. If $u_1$ and $u_2$ are lub, then they are both ub. Since $u_1$ is lub and $u_2$ is
ub, then $u_1 \le u_2$. Similarly, since $u_2$ is lub and $u_1$ is ub, then $u_2 \le u_1$. Then
$u_1 = u_2$.
The next axiom is critical to establish that $\mathbb{R}$ does not have any "holes" in
it. In particular, it will be sufficient to establish that the set $\mathbb{R}$ is "complete"
(a term that will be made precise in Chapter 4). Don't be fooled however, it
takes more work than just stating the Axiom to establish completeness.
Axiom 3 (Completeness Property of $\mathbb{R}$) Every non-empty set $S \subset \mathbb{R}$
which has an upper bound has a supremum.
From the completeness axiom, it is easy to establish that every non-empty
set which has a lower bound has an infimum.
A consequence of Axiom 3 is that $\mathbb{N}$ (a subset of $\mathbb{R}$) is not bounded above
in $\mathbb{R}$.
Theorem 99 (Archimedean Property) If $x \in \mathbb{R}$, $\exists n_x \in \mathbb{N}$ such that $x < n_x$.
Proof. Suppose not. Then $x$ is an ub for $\mathbb{N}$ and hence by Axiom 3 $\mathbb{N}$ has a
sup, call it $u$, and $u \le x$.[6] Since $u - 1$ is not an ub, $\exists n_1 \in \mathbb{N}$ such that $u - 1 < n_1$.
Then $u < n_1 + 1$ and since $n_1 + 1 \in \mathbb{N}$, this contradicts that $u$ is an ub of $\mathbb{N}$.
It follows from Theorem 99 that there exists a rational number between
any two real numbers.
Theorem 100 If $x, y \in \mathbb{R}$ with $x < y$, $\exists q \in \mathbb{Q}$ such that $x < q < y$.
[6] Note that $u$ is not necessarily in $\mathbb{N}$, which is why we choose to subtract $1 \in \mathbb{N}$ in the
next statement.
Exercise 3.3.1 Prove Theorem 100. (Royden p.35)
The following theorem complements the result that there are elements
of $\mathbb{R}$ which are not rational in Theorem 88 of Section 3.1. It provides an
existence proof of an irrational. We present it since it makes use of Axiom
3. Without the Axiom, the set $S = \{y \in \mathbb{R}_+ : y^2 \le 2\}$ does not have a
supremum.
Theorem 101 $\exists x \in \mathbb{R}_+$ such that $x^2 = 2$.
Proof. Let $S = \{y \in \mathbb{R}_+ : y^2 \le 2\}$. Clearly $S$ is non-empty (take 1) and $S$ is
bounded above (take 1.5). Let $x = \sup S$, which exists by Axiom 3. Suppose
$x^2 \neq 2$. Then either $x^2 < 2$ or $x^2 > 2$. First take $x^2 < 2$. Let $n \in \mathbb{N}$ be
sufficiently large so that $\frac{2x + 1}{n} < 2 - x^2$. Then
$\left(x + \frac{1}{n}\right)^2 \le x^2 + \frac{2x + 1}{n} < 2$.[7]
This means $x + \frac{1}{n} \in S$, which contradicts that $x$ is an upper bound. Next
take $x^2 > 2$. Let $m \in \mathbb{N}$ be sufficiently large so that $\frac{2x}{m} < x^2 - 2$. Then
$\left(x - \frac{1}{m}\right)^2 > x^2 - \frac{2x}{m} > 2$. Since $x = \sup S$, $\exists s' \in S$ such that $x - \frac{1}{m} < s'$.
But this implies $(s')^2 > \left(x - \frac{1}{m}\right)^2$ (or $(s')^2 > 2$), which contradicts $s' \in S$.
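The supremum in this proof can be bracketed numerically. The sketch below is not the proof's argument but a bisection under the same setup: it keeps a lower value that belongs to the set (its square is at most 2) and an upper value that is an upper bound (its square exceeds 2), starting from the witnesses 1 and 1.5 named in the proof. The function name `bracket_sup` is hypothetical.

```python
from fractions import Fraction

def bracket_sup(steps):
    """Bracket sup{y >= 0 : y^2 <= 2} between a member and an upper bound."""
    lo, hi = Fraction(1), Fraction(3, 2)  # 1 is in the set; 1.5 is an upper bound
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid   # mid belongs to the set
        else:
            hi = mid   # mid is an upper bound
    return lo, hi

lo, hi = bracket_sup(30)
print(float(lo), float(hi))
```

Every bracket endpoint is rational, so by Theorem 88 neither ever squares to exactly 2; the bracket only shrinks around the supremum, whose existence is what Axiom 3 supplies.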
Exercise 3.3.2 Why doesn't $S = \{y \in \mathbb{R}_+ : y^2 + 1 \le 0\}$ work?
The next theorem complements the result in Theorem 100 and establishes
that between any two real numbers there exists an irrational number.
Theorem 102 Let $x, y \in \mathbb{R}$ with $x < y$. If $\iota$ is any irrational number, then
$\exists q \in \mathbb{Q}$ such that the irrational number $\iota q$ satisfies $x < \iota q < y$.
Exercise 3.3.3 Prove Theorem 102. (Bartle)
In fact, there are infinitely many of both kinds of numbers between $x$ and
$y$.
[7] The first weak inequality holds with equality only if $n = 1$.
3.4 Open and Closed Sets
In this section we define the most common subsets of real numbers and
determine some of their properties.
Definition 103 If $a, b \in \mathbb{R}$, then the set $\{x \in \mathbb{R} : a < x < b\}$ ($\{x \in \mathbb{R} : a \le x \le b\}$,
$\{x \in \mathbb{R} : a \le x < b\}$) is called an open (closed, half-open) cell
denoted $(a, b)$ ($[a, b]$, $[a, b)$) respectively with endpoints $a$ and $b$. If $a \in \mathbb{R}$,
then the set $\{x \in \mathbb{R} : a < x\}$ ($\{x \in \mathbb{R} : a \le x\}$) is called an open (closed)
ray denoted $(a, \infty)$ ($[a, \infty)$), respectively. An interval in $\mathbb{R}$ is either a
cell, a ray, or all of $\mathbb{R}$.
A generalization of the notion of an open interval is that of an open set.
Definition 104 A set $O \subset \mathbb{R}$ is open if for each $x \in O$, there is some $\delta > 0$
such that the open interval $B_\delta(x) = \{y \in \mathbb{R} : |x - y| < \delta\} \subset O$.
Example 105 $(0, 1) \subset \mathbb{R}$ is open since for any $x$ arbitrarily close to 1 (i.e.
$x = 1 - \varepsilon$, $\varepsilon > 0$ arbitrarily small), there is an open interval $B_{\varepsilon/2}(1 - \varepsilon) \subset (0, 1)$
by Theorem 93. $(0, 1]$ is not open since there does not exist $\delta > 0$ for
which $B_\delta(1) \subset (0, 1]$. That is, no matter how small $\delta > 0$ is, there exists
$x' = 1 + \frac{\delta}{2} \in B_\delta(1)$ by Theorem 93 which is not contained in $(0, 1]$. See
Figure 3.4.1.
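Theorem 106 (i) $\emptyset$ and $\mathbb{R}$ are open. (ii) The intersection of any finite
collection of open sets in $\mathbb{R}$ is open. (iii) The union of any collection of open
sets in $\mathbb{R}$ is open.
Proof. (i) $\emptyset$ contains no points, hence Definition 104 is trivially satisfied.[8]
$\mathbb{R}$ is open since for any $x \in \mathbb{R}$ and any $\delta > 0$, every point of $B_\delta(x)$ is already in $\mathbb{R}$.
(ii) Let $\{O_i : O_i \subset \mathbb{R},\ O_i \text{ open},\ i = 1, \ldots, k\}$ be a finite collection of open
sets. We must show $O = \bigcap_{i=1}^{k} O_i$ is open. Assume $x \in O$. By definition of an
intersection, $x \in O_i$, $\forall i = 1, \ldots, k$. Since each $O_i$ is open, we can find $B_{\delta_i}(x) \subset O_i$
for each $i$. Let $\delta = \min\{\delta_i : i = 1, \ldots, k\}$. Then $B_\delta(x) \subset B_{\delta_i}(x) \subset O_i$, $\forall i$.
This implies $B_\delta(x) \subset O$.
(iii) Take $x \in O = \bigcup_{i \in \Lambda} O_i$, where $\Lambda$ is either a finite or infinite index set.
Then $x \in O_i$ for some $i \in \Lambda$. Since $O_i$ is open, $\exists B_\delta(x) \subset O_i \subset \bigcup_{i \in \Lambda} O_i$.
[8] In particular, the statement $x \in \emptyset$ is always false. Thus, according to the truth table,
any implication of the form $x \in \emptyset \Rightarrow P(x)$ is true.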
Example 107 Property (ii) of Theorem 106 does not necessarily hold for
infinite intersections. Consider the following counterexample. Let
$O_n = \{x \in \mathbb{R} : -\frac{1}{n} < x < \frac{1}{n}\}$, $n \in \mathbb{N}$. Then $\bigcap_{n=1}^{\infty} O_n = \{0\}$, but a singleton set is
not open since there does not exist $\delta > 0$ such that $B_\delta(0) \subset \{0\}$. See Figure
3.4.2.
The following theorem provides a characterization of open sets in R.
Theorem 108 (Open Sets Property in R) Every open set in R is the
union of a countable collection of disjoint open intervals.
Proof. The proof is in several steps. First, construct an open interval around
each $y \in O$. Let $O$ be open. Then, for each $y \in O$, $\exists$ an open interval $(x, z)$
such that $x < y < z$ and $(x, z) \subset O$. Let $b = \sup\{z : (y, z) \subset O\}$ and
$a = \inf\{x : (x, y) \subset O\}$. Then $a < y < b$ and $I_y = (a, b)$ is an open interval
containing $y$.
Second, show the constructed interval is contained in $O$. Take any $w \in (a, b)$
with $w > y$. Then $y < w < b$ and by the definition of $b$ (i.e. it is
the sup), there is some $z > w$ with $(y, z) \subset O$, so we know $w \in O$. An identical
argument establishes that if $w < y$, $w \in O$.
Third, show the endpoints of the constructed interval are excluded (i.e. $a, b \notin O$). If $b \in O$, then
since $O$ is open, $\exists \varepsilon > 0$ such that $(b - \varepsilon, b + \varepsilon) \subset O$ and hence $(y, b + \varepsilon) \subset O$,
which contradicts the definition of $b$. The same argument applies to $a$.
Fourth, show the union of constructed intervals is $O$. Let $w \in O$. Then
$w \in I_w$ and hence $w \in \bigcup_{y \in O} I_y$.
Fifth, establish that the intervals are disjoint. Suppose $y \in (a_1, b_1) \cap (a_2, b_2)$.
Since $b_1 = \sup\{z : (y, z) \subset O\}$ and $(y, b_2) \subset O$, then $b_2 \le b_1$. Since
$b_2 = \sup\{z : (y, z) \subset O\}$ and $(y, b_1) \subset O$, then $b_1 \le b_2$. But $b_1 \le b_2$ and
$b_1 \ge b_2$ implies $b_1 = b_2$. A similar argument establishes that $a_1 = a_2$. Thus,
two different intervals in $\{I_y\}$ are disjoint.
Sixth, establish that $\{I_y\}$ is countable. In each $I_y$, $\exists q \in \mathbb{Q}$ such that
$q \in I_y$ by Theorem 100. Since the $I_y$ are disjoint, $q \in I_y$ and $q' \in I_{y'}$, for $I_y \neq I_{y'}$,
implies $q \neq q'$. Hence there exists a one-to-one correspondence between the
collection $\{I_y\}$ and a subset of the rational numbers. Thus, $\{I_y\}$ is countable
by an argument similar to that in Example 77.
Figure 3.4.3 illustrates the theorem for the open set $O = O_1 \cup O_2$ where
$O_1 = (-1, 0)$ and $O_2 = (\sqrt{2}, \infty)$. Part (a) of the figure illustrates steps 1
to 4. For example, take $y = -\frac{1}{4} \in O_1$. Then the supremum of the set of
upper interval endpoints around $-\frac{1}{4}$ contained in $O_1$ is $b_{-\frac{1}{4}} = 0$ and the
infimum of the set of lower interval endpoints around $-\frac{1}{4}$ contained in $O_1$ is
$a_{-\frac{1}{4}} = -1$ so that $I_{-\frac{1}{4}} = (-1, 0)$, which is just $O_1$. Similarly take $y = \frac{3}{2}$. Then
the supremum of the set of upper interval endpoints around $\frac{3}{2}$ contained in
$O_2$ is $b_{\frac{3}{2}} = \infty$ and the infimum of the set of lower interval endpoints around
$\frac{3}{2}$ contained in $O_2$ is $a_{\frac{3}{2}} = \sqrt{2}$ so that $I_{\frac{3}{2}} = (\sqrt{2}, \infty)$, which is just $O_2$. Part
(b) of the figure illustrates step 6, where the injection is finite (and hence
countable).
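When an open set is handed to us as a finite union of open intervals, the disjoint intervals promised by Theorem 108 can be computed by merging overlaps. The following is a sketch under that finiteness assumption; `disjoint_components` is a hypothetical helper, intervals are pairs `(a, b)` with `a < b`, and `math.inf` stands in for an open ray.

```python
import math

def disjoint_components(intervals):
    """Merge a finite list of open intervals (a, b) into disjoint open intervals.

    Two open intervals are merged only if they overlap strictly:
    (0, 1) and (1, 2) stay separate because the point 1 is in neither.
    """
    components = []
    for a, b in sorted(intervals):
        if components and a < components[-1][1]:
            left, right = components[-1]
            components[-1] = (left, max(right, b))
        else:
            components.append((a, b))
    return components

pieces = [(-1, -0.5), (-0.75, 0), (math.sqrt(2), 5), (3, math.inf)]
print(disjoint_components(pieces))
```

Applied to pieces covering the two sets in the illustration above, the merge recovers the two components, one bounded cell and one open ray.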
Now we move on to closed sets.
Definition 109 $C \subset \mathbb{R}$ is closed if its complement (i.e. $\mathbb{R} \setminus C$) is open.
Example 110 $[0, 1] \subset \mathbb{R}$ is closed since its complement $\mathbb{R} \setminus [0, 1] = (-\infty, 0) \cup (1, \infty)$
is open, since the union of open sets is open by Theorem 106. $(0, 1]$ is
not closed since its complement, $(-\infty, 0] \cup (1, \infty)$, is not open. The singleton
set $\{1\}$ is closed since its complement, $(-\infty, 1) \cup (1, \infty)$, is open. The set
$\mathbb{N}$ is closed since its complement, $(-\infty, 1) \cup \left(\bigcup_{n=1}^{\infty} (n, n + 1)\right)$, is a countable
union of open sets and hence by Theorem 106 is open.
There is another way to describe closed sets which uses cluster points.
Definition 111 A point $x \in \mathbb{R}$ is a cluster point of a subset $A \subset \mathbb{R}$ if
any open ball around $x$ intersects $A$ at some point other than $x$ itself (i.e.
$(B_\delta(x) \setminus \{x\}) \cap A \neq \emptyset$ for every $\delta > 0$).
Note that the point $x$ may lie in $A$ or not. A cluster point must have
points of $A$ sufficiently near to it, as the next examples show.
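Example 112 (i) Let $A = (0, 1]$. Then every point in the interval $[0, 1]$ is a
cluster point of $A$. In particular, the point 0 is a cluster point since for any
$\delta > 0$, $\exists y = \min\{\frac{\delta}{2}, 1\}$ such that $y \in (B_\delta(0) \setminus \{0\}) \cap A$. (ii) Let $A = \{\frac{1}{n}, n \in \mathbb{N}\}$. Then
0 is the only cluster point of $A$. To see why 0 is a cluster point, for any $\delta > 0$ just take
$n_\delta \in \mathbb{N}$ with $n_\delta > \frac{1}{\delta}$ (which exists by Theorem 99), in which case
$y = \frac{1}{n_\delta} \in (B_\delta(0) \setminus \{0\}) \cap A$. (iii)
Let $A = \{0\} \cup (1, 2)$. Then the points of $[1, 2]$ are the only cluster points of $A$; 0 is not a
cluster point since for any $\delta \in (0, 1)$, $(B_\delta(0) \setminus \{0\}) \cap A = \emptyset$. (iv) $\mathbb{N}$ has no cluster points for the same reason as
(iii). (v) Let $A = \mathbb{Q}$. The set of cluster points of $A$ is $\mathbb{R}$. This follows from
Theorem 100 that between any two real numbers lies a rational. See Figure
3.4.4.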
We next use Axiom 3 to prove a very important property of $\mathbb{R}$; every
nested sequence of closed intervals has a common point (and we can take
that common point to either be the sup of the lower endpoints or the inf of
the upper endpoints). First we must make that statement precise.
Example 113 Returning to Example 97 where the open interval $S = (0, 1)$
and the closed interval $S' = [0, 1]$ are both bounded (and hence both possess
a supremum by Axiom 3), only the closed interval $S'$ contains its supremum
of 1 (i.e. has a maximum).
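Definition 114 A set of intervals $\{I_n, n \in \mathbb{N}\}$ is nested if
$I_1 \supset I_2 \supset \ldots \supset I_n \supset I_{n+1} \supset \ldots$
Example 115 A nested set of intervals does not necessarily have a common
point (i.e. $\bigcap_{n=1}^{\infty} I_n = \emptyset$). For example, neither $I_n = (n, \infty)$ (so that
$(1, \infty) \supset (2, \infty) \supset \ldots$) nor $I_n = (0, \frac{1}{n})$ (so that $(0, 1) \supset (0, \frac{1}{2}) \supset \ldots$) have common
points. Why? It follows from the Archimedean Property 99 that for any
$x \in \mathbb{R}$ with $x > 0$, $\exists n \in \mathbb{N}$ such that $x < n$ and such that $0 < \frac{1}{n} < x$. See Figure 3.4.5.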
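Theorem 116 (Nested Intervals Property in $\mathbb{R}$) If $\{I_n, n \in \mathbb{N}\}$ is a set
of non-empty, closed, bounded, nested intervals in $\mathbb{R}$, then $\exists x \in \mathbb{R}$ such that
$x \in \bigcap_{n=1}^{\infty} I_n$ (i.e. $\bigcap_{n=1}^{\infty} I_n \neq \emptyset$).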
Proof. Let $I_n = [a_n, b_n]$ with $a_n \le b_n$. Since $I_1 \supset I_n$, then $b_1 \ge b_n \ge a_n$.
Hence $\{a_n : n \in \mathbb{N}\}$ is bounded above and let $\alpha$ be its sup. To establish the
claim, it is sufficient to show $\alpha \le b_n$, $\forall n \in \mathbb{N}$. Suppose not. Then $\exists m \in \mathbb{N}$
such that $b_m < \alpha$. Since $\alpha = \sup\{a_n : n \in \mathbb{N}\}$, $\exists a_p > b_m$. Let $q = \max\{p, m\}$.
Then $b_q \le b_m < a_p \le a_q$. But $b_q < a_q$ contradicts that $I_q$ is a non-empty interval.
Thus $a_n \le \alpha \le b_n$ or $\alpha \in I_n$, $\forall n \in \mathbb{N}$. If $I_n$ is not closed, then the last
statement ($\alpha \in I_n$) doesn't necessarily hold.
See Figure 3.4.6. Note that the same arguments can be applied so that
$\beta = \inf\{b_n \mid n \in \mathbb{N}\}$ is in every interval.
Example 117 Let us return to Example 115. Instead of the open interval
$I_n = (0, \frac{1}{n})$ consider the closed interval $I_n = [0, \frac{1}{n}]$ for which
$\sup\{a_n \mid n \in \mathbb{N}\} = 0$. But it is clear that 0 is indeed in every nested interval. Another
example of Theorem 116 may be $I_n = [-\frac{1}{n}, 1 + \frac{1}{n}]$. Obviously this is nested
since $[-1, 2] \supset [-\frac{1}{2}, \frac{3}{2}] \supset [-\frac{1}{3}, \frac{4}{3}] \supset \ldots$ In this case
$\sup\{a_n : n \in \mathbb{N}\} = \sup\{-1, -\frac{1}{2}, -\frac{1}{3}, \ldots\} = 0$, which is again in every interval. See Figure 3.4.7.
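The second family of intervals in this example can be checked on a finite prefix. The sketch below uses exact rationals; `interval` and `check_nested` are hypothetical names. It verifies nestedness and reports the largest left endpoint and the smallest right endpoint among the first few intervals, which approach 0 and 1 respectively.

```python
from fractions import Fraction

def interval(n):
    """The closed interval [-1/n, 1 + 1/n] as a pair of exact endpoints."""
    return (Fraction(-1, n), 1 + Fraction(1, n))

def check_nested(count):
    """Check nestedness of the first `count` intervals and bound the common part."""
    cells = [interval(n) for n in range(1, count + 1)]
    nested = all(a1 <= a2 and b2 <= b1
                 for (a1, b1), (a2, b2) in zip(cells, cells[1:]))
    largest_left = max(a for a, _ in cells)
    smallest_right = min(b for _, b in cells)
    return nested, largest_left, smallest_right

nested, largest_left, smallest_right = check_nested(100)
print(nested, largest_left, smallest_right)
```

Every point between the two reported endpoints, in particular every point of [0, 1], lies in all of the first 100 intervals, consistent with both the sup of the left endpoints and the inf of the right endpoints being common points.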
We need the following important result to show that $\mathbb{R}$ doesn't have any
"holes" in it.[9] In Section 4.2, we will show the precise meaning of this
"absence of holes" property known as completeness. For now, one should
simply recognize that to rule out holes, we need to draw out the implications
of the Completeness Axiom 3. We do this through the next theorem.
Theorem 118 (Bolzano-Weierstrass) Every bounded infinite subset $A \subset \mathbb{R}$
has a cluster point.
Proof. (Sketch) If $A$ is bounded, then there is a closed interval $I$ such
that $A \subset I$. Bisect $I$. There are infinitely many elements in at least one of
the bisections. Denote such a bisection $I_1 \subset I$. Bisect $I_1$. Again, there are
infinitely many elements in at least one of the bisections. Denote such a
bisection $I_2 \subset I_1$. By continuing this process we construct a set $\{I_n, n \in \mathbb{N}\}$
of non-empty, closed, nested intervals in $\mathbb{R}$. By Theorem 116, there is a point
$x^* \in \bigcap_{n=1}^{\infty} I_n$, which is a cluster point of $A$.
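The bisection in this sketch of a proof cannot be run literally, since a program cannot test whether a half contains infinitely many points. As a rough illustration only, the code below replaces that test with a heuristic on a large finite sample: keep the half containing more sample points. The function name and the choice of sample are assumptions made for this illustration.

```python
from fractions import Fraction

def bisect_toward_cluster(points, lo, hi, steps):
    """Heuristic bisection: keep the closed half holding more sample points.

    This mimics, but does not implement, the proof's rule of keeping a half
    with infinitely many elements.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [p for p in points if lo <= p <= mid]
        right = [p for p in points if mid <= p <= hi]
        if len(left) >= len(right):
            points, hi = left, mid
        else:
            points, lo = right, mid
    return lo, hi

# Finite sample of the bounded infinite set {1/n : n in N}.
sample = [Fraction(1, n) for n in range(1, 10_001)]
print(bisect_toward_cluster(sample, Fraction(0), Fraction(1), 10))
```

For this sample the nested halves close in on 0, which is the cluster point of the set of reciprocals discussed in Example 112.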
Exercise 3.4.1 Show that $x^* \in \bigcap_{n=1}^{\infty} I_n$ in Theorem 118 is a cluster point of
$A$ to finish the proof.
In the proof we enclosed $A$ in a closed interval $I = [a, b]$ and showed that
any infinite subset of $I$ has a cluster point. This special property of $[a, b]$ is
called the Bolzano-Weierstrass property.
Definition 119 A subset $A \subset \mathbb{R}$ has the Bolzano-Weierstrass property
if every infinite subset of $A$ has a cluster point belonging to $A$.
Note that we did not show that the cluster point of an infinite subset of $A$
belongs to $A$ itself. The next example illustrates this.
Example 120 Let $A = (a, b)$ with $b - a > 1$. Define $B = \{a + \frac{1}{n}, n \in \mathbb{N}\} \subset A$.
The only cluster point of $B$ is $a$, which doesn't belong to $A$. Thus open sets
like $(a, b)$ don't have the Bolzano-Weierstrass property. Boundedness is also
important. Let $A = \mathbb{R}$. Then $\mathbb{N}$ is an infinite subset of $\mathbb{R}$ which does not have
a cluster point.
[9] In Section 4.2, we will show the precise meaning of the "absence of holes" property
known as completeness. For now, one should simply recognize that to rule out holes, we
need to draw out the implications of the completeness axiom. We do this through the
Bolzano-Weierstrass Theorem.
We next present a necessary and sufficient condition for a subset of $\mathbb{R}$ to
have the Bolzano-Weierstrass property. This important result is known as
the Heine-Borel theorem.[10]
Theorem 121 (Heine-Borel) $A \subset \mathbb{R}$ has the Bolzano-Weierstrass property
iff $A$ is closed and bounded.
Proof. ($\Leftarrow$) If $A$ is finite, then $A$ has the B-W property vacuously, since a
finite set has no infinite subsets.[11] Let $A$ be infinite and let $B$ be an infinite
subset of $A$. Since $B$ is bounded, it can be enclosed in a closed interval. Using
the same procedure as in Theorem 118 we construct a cluster point $x^*$ of $B$
and hence also of $A$. Since $A$ is closed, $x^* \in A$.
($\Rightarrow$) "closedness". Let $x^*$ be a cluster point of $A$. Then for each $\delta = \frac{1}{n}$,
$\exists x_n \in A$ with $x_n \neq x^*$ such that $|x^* - x_n| < \frac{1}{n}$. The set $\{x_n\}_{n \in \mathbb{N}}$ is an infinite subset of $A$
whose only cluster point is $x^*$, and $A$ has the B-W property, so that $x^* \in A$.
"boundedness". By contradiction. Suppose $A$ is unbounded, say above. Then we
can choose $x_1 \in A$ and, for each $n$, $x_{n+1} \in A$ such that $x_{n+1} > x_n + 1$. Then $\{x_n\}$ is an infinite subset of $A$
which doesn't have a cluster point (since $|x_m - x_n| > 1$ for all $m \neq n$). But this
contradicts the B-W property.
We next use the Nested Intervals Property in R (Theorem 116) to estab-
lish the uncountability of the set of real numbers.
Theorem 122 [0,1] is uncountable.
Proof. Suppose not. Then there exists a bijection $b : \mathbb{N} \to [0, 1]$. Then
all elements from $[0, 1]$ can be numbered $\{x_1, x_2, \ldots, x_n, \ldots\}$. Divide $[0, 1]$ into
three closed intervals: $I^1_1 = [0, \frac{1}{3}]$, $I^1_2 = [\frac{1}{3}, \frac{2}{3}]$, $I^1_3 = [\frac{2}{3}, 1]$. This implies $x_1$ is
not contained in at least one of these three intervals.[12] WLOG, say it is $I^1_1$, and call it $I_1$.
Divide $I_1$ into three closed intervals: $I^2_1 = [0, \frac{1}{9}]$, $I^2_2 = [\frac{1}{9}, \frac{2}{9}]$, $I^2_3 = [\frac{2}{9}, \frac{1}{3}]$. This
implies $\exists I_2$ among these three such that $x_2 \notin I_2$. Notice that $I_2 \subset I_1$ and that $x_1, x_2 \notin I_2$. In
this way we can construct a sequence $\{I_n\}_{n=1}^{\infty}$ with the following properties:
(i) $I_n$ is closed; (ii) $I_1 \supset I_2 \supset \ldots \supset I_n \supset \ldots$ (i.e. nested intervals); and
(iii) $x_i \notin I_n$, $\forall i = 1, \ldots, n$. From (i) and (ii), Theorem 116 implies
$\exists x' \in \bigcap_{n=1}^{\infty} I_n \subset [0, 1]$. So we have found a real number $x' \in [0, 1]$ which is different
[10] Experienced readers may associate Heine-Borel with compactness. Since
we wanted to keep this section simple, we'll put off the treatment of compactness until we
work with more general metric spaces in Section 4.3.
[11] And from a false statement, the implication is true by the truth table.
[12] It is possible $x_1$ is an element of 2 closed intervals (e.g. $x_1 = \frac{1}{3}$).
from any $x_i$, $i = 1, 2, \ldots$ This contradicts our assumption that $\{x_1, x_2, \ldots\}$ are
all real numbers from $[0, 1]$.
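The thirds construction in this proof can be run on any finite list of candidate points. The sketch below (with the hypothetical name `avoid_all`) processes the list in order, each time moving to a closed third of the current interval that misses the next point. A point lies in at most two of the three closed thirds, so such a third always exists.

```python
from fractions import Fraction

def avoid_all(points):
    """Return a closed subinterval [lo, hi] of [0, 1] containing none of `points`."""
    lo, hi = Fraction(0), Fraction(1)
    for x in points:
        third = (hi - lo) / 3
        for k in range(3):
            a, b = lo + k * third, lo + (k + 1) * third
            if not (a <= x <= b):
                lo, hi = a, b  # nested: the new cell sits inside the old one
                break
    return lo, hi

listed = [Fraction(1, 3), Fraction(0), Fraction(1, 2), Fraction(1, 9), Fraction(1)]
lo, hi = avoid_all(listed)
print(lo, hi)
```

Because the intervals are nested, a point excluded at one step stays excluded at all later steps, which is property (iii) in the proof. For an infinite enumeration, Theorem 116 is what guarantees that the intersection of all the cells is nonempty.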
While the above theorem establishes that $[0, 1]$ is uncountable (i.e. and
hence really big in one sense), we next provide an example of an uncountable
subset of $[0, 1]$ that is somehow small in another sense. This concrete example
is known as the Cantor set and is constructed in the following way (see Figure
3.4.8). First, divide $[0, 1]$ into three "equal" parts: $[0, \frac{1}{3}]$, $(\frac{1}{3}, \frac{2}{3})$, $[\frac{2}{3}, 1]$.[13]
Define $F_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$ or equivalently $F_1 = [0, 1] \setminus A_1$ where $A_1 = (\frac{1}{3}, \frac{2}{3})$.
That is, to construct $F_1$ we take out the center of $[0, 1]$. Second, divide each
part of $F_1$ into three equal parts (giving us now 6 intervals). Define
$F_2 = [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{3}{9}] \cup [\frac{6}{9}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$ or $F_2 = [0, 1] \setminus A_2$ where
$A_2 = (\frac{1}{9}, \frac{2}{9}) \cup (\frac{3}{9}, \frac{6}{9}) \cup (\frac{7}{9}, \frac{8}{9})$.
That is, to construct $F_2$ we take out the center of each of the two intervals
in $F_1$. By this process of removing the open "middle third" intervals, we
construct $F_n$, $\forall n \in \mathbb{N}$. The Cantor set is just the intersection of the sets $F_n$.
That is,
$$F = \bigcap_{n \in \mathbb{N}} F_n \left( \equiv [0, 1] \setminus \bigcup_{n \in \mathbb{N}} A_n \right).$$
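The finite stages of this construction are easy to generate exactly. The following sketch (the function name `cantor_level` is hypothetical) returns the closed intervals making up the n-th stage by keeping the two outer thirds of every interval from the previous stage.

```python
from fractions import Fraction

def cantor_level(n):
    """Closed intervals whose union is the n-th stage of the construction."""
    cells = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_cells = []
        for a, b in cells:
            third = (b - a) / 3
            next_cells.append((a, a + third))   # keep the left third
            next_cells.append((b - third, b))   # keep the right third
        cells = next_cells
    return cells

stage2 = cantor_level(2)
print(stage2)
print(sum(b - a for a, b in cantor_level(5)))  # total length of stage 5
```

Stage n consists of $2^n$ intervals of total length $(2/3)^n$, so the total length shrinks toward zero even though, as noted in the properties that follow, the limiting set is uncountable.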
The Cantor set has the following properties:
1. $F$ is nonempty (by Theorem 116).
2. $F$ is closed because it is the intersection of the closed sets $F_n$ (by (iii)
of Corollary ??, each $F_n$ is closed because it is the union of finitely
many closed intervals).
3. $F$ doesn't contain any interval $(a, b)$ with $a < b$ (by construction).
4. $F$ is uncountable (by the same argument used in the proof of Theorem
122).
There are two important things to note about the Cantor set. First, while
Theorem 108 says that any open set can be expressed as a countable union
of open intervals, properties (1)-(4) of the Cantor set show that there is no
analogous result for closed sets. That is, a closed set may not in general be
written as a countable union of closed intervals. In this sense, closed sets
[13] The sense in which we mean equal parts is that while the sets are different (some are
closed, some open), they have the same distance between endpoints of $\frac{1}{3}$ (more formally,
they have the same measure).
can have a more complicated structure than open sets. Second, property (4)
above shows that even though the sets $F_n$ seem to be getting smaller and smaller in
one sense (i.e. each has more and more holes in it) in Figure 3.4.9, $F$ is uncountable
(and hence large in another sense).
3.5 Borel Sets
Since the intersection of a countable collection of open sets need not be open
(e.g. Example 107), the collection of all open sets in $\mathbb{R}$ is not a σ-algebra. By
Theorem 87, however, there exists a smallest σ-algebra containing all open
sets.
Definition 123 The smallest σ-algebra generated by the collection of all
open sets in $\mathbb{R}$, denoted $\mathcal{B}$, is called the Borel σ-algebra in $\mathbb{R}$.
Just as Example 83 showed in the case of algebras, even though $\mathcal{B}$ is the
smallest σ-algebra containing all open sets, it is bigger than just the collection
of open sets. For example, we have to add back in singleton sets like those
in Example 107 (i.e. the closed set $\{0\} = \bigcap_{n \in \mathbb{N}} (-\frac{1}{n}, \frac{1}{n})$) in order to keep it
closed under countable intersection.[14] In fact, almost any set that you can
conceive of is contained in the Borel σ-algebra: open sets, closed sets, half
open intervals $(a, b]$, sets of the form $\bigcap_{n \in \mathbb{N}} O_n$ with $O_n$ open (which we saw
is not necessarily open), sets of the form $\bigcup_{n \in \mathbb{N}} F_n$ with $F_n$ closed (which we
saw is not necessarily closed), and more. On the other hand, while finding
a subset of $\mathbb{R}$ which is not Borel requires a rather sophisticated construction
(see p.??? of Jain and Gupta (1986)), the size of the collection of non-Borel
sets is much bigger than the size of $\mathcal{B}$. Loosely speaking, $\mathcal{B}$ is as thin in $\mathcal{P}(\mathbb{R})$
as $\mathbb{N}$ is in $\mathbb{R}$ (as we will see in Chapter 5).
Exercise 3.5.1 Prove that the following sets in $\mathbb{R}$ belong to $\mathcal{B}$: (i) any closed
set; (ii) $(a, b]$.
Borel sets can be generated by even smaller collections than all open sets
as the next theorem shows.
$^{14}$ Recall in Example 83, for underlying set X = {a,b,c}, we showed that while
$C_4 = \{\emptyset,\{a\},X\}$ was not an algebra (just as the collection of all open sets is not an algebra), we
can create an algebra generated by {a} (whose analogue is the Borel σ-algebra) which is
just $C_2 = \{\emptyset,\{a\},\{b,c\},X\} \subset P(X)$ and is "bigger" in the sense of $C_4 \subset C_2$ (where {b,c}
plays the analogue of the other sets we have to add in).
Theorem 124 The collection of all open rays {(a,∞):a ∈R} generates B.
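Proof. It is sufficient to show that any open set A can be constructed in
terms of open rays. By Theorem 108, we know that $A = \cup_{n=1}^{\infty} I_n$ where the $I_n$ are
disjoint open intervals. But $(a,b) = (a,\infty) \setminus \left[\cap_{n=1}^{\infty} \left(b - \frac{1}{n}, \infty\right)\right]$ with a < b,
since $\cap_{n=1}^{\infty} \left(b - \frac{1}{n}, \infty\right) = [b,\infty)$.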
Exercise 3.5.2 Using the same idea, show that B can be generated by the
collection of all closed intervals {[a,b]:a,b ∈R,a<b}.
Figures for Chapter 3
Figure 3.3.1: ub, lb, sup, inf
Figure 3.4.1: Open and Half-open unit intervals
Figure 3.4.2: Example where Countable Intersection of Open Intervals is
not Open
Figure 3.4.3a&b: Open Sets as a Countable Union of Disjoint Intervals
Figure 3.4.4: Examples of Cluster points
Figure 3.4.5: Examples of Nested Cells without a Common Point
Figure 3.4.6: Nested Cells Property
Figure 3.4.7: Example of a Common Point in Nested Cells
Figure 3.4.8 Cantor Set
3.6 Bibliography for Chapter 3
Sections 3.1 to 3.3 are based on Bartle (Sec 4-6) and Royden (Ch 2., Sec 1
and 2).
3.7 End of Chapter Problems.
1. Let D be non-empty and let f : D → R have bounded range. If $D_0$ is
a non-empty subset of D, prove that
$\inf\{f(x) : x \in D\} \le \inf\{f(x) : x \in D_0\} \le \sup\{f(x) : x \in D_0\} \le \sup\{f(x) : x \in D\}$
2. Let X and Y be non-empty sets and let f : X × Y → R have bounded
range in R. Let
$f_1(x) = \sup\{f(x,y) : y \in Y\}$, $f_2(y) = \sup\{f(x,y) : x \in X\}$
Establish the Principle of Iterated Suprema:
$\sup\{f(x,y) : x \in X, y \in Y\} = \sup\{f_1(x) : x \in X\} = \sup\{f_2(y) : y \in Y\}$
(We sometimes express this as $\sup_{x,y} f(x,y) = \sup_x \sup_y f(x,y) = \sup_y \sup_x f(x,y)$).
3. Let f and $f_1$ be as in the preceding exercise and let
$g_2(y) = \inf\{f(x,y) : x \in X\}$.
Prove that
$\sup\{g_2(y) : y \in Y\} \le \inf\{f_1(x) : x \in X\}$
(We sometimes express this as $\sup_y \inf_x f(x,y) \le \inf_x \sup_y f(x,y)$).
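As a numerical sanity check of Problems 2 and 3 (an illustration, not a proof), the sketch below evaluates a bounded function on finite grids standing in for X and Y. The function f and the grids are arbitrary choices made here for illustration; they do not come from the text.

```python
# Finite grids stand in for X and Y; f is an arbitrary illustrative choice.
X = [i / 10 for i in range(11)]
Y = [j / 10 for j in range(11)]

def f(x, y):
    return (x - y) ** 2

# Problem 2: the joint supremum equals either iterated supremum.
sup_all = max(f(x, y) for x in X for y in Y)
sup_x_sup_y = max(max(f(x, y) for y in Y) for x in X)   # sup over x of f_1(x)
sup_y_sup_x = max(max(f(x, y) for x in X) for y in Y)   # sup over y of f_2(y)

# Problem 3: sup-inf never exceeds inf-sup.
sup_inf = max(min(f(x, y) for x in X) for y in Y)       # sup over y of g_2(y)
inf_sup = min(max(f(x, y) for y in Y) for x in X)       # inf over x of f_1(x)

print(sup_all, sup_x_sup_y, sup_y_sup_x)
print(sup_inf, inf_sup)
```

For this particular f the inequality in Problem 3 is strict, which shows that the order of sup and inf cannot in general be exchanged.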
Chapter 4
Metric Spaces
There are three basic theorems about continuous functions in the study of
calculus (upon which most of calculus depends) that will prove extremely
useful in your study of economics. They are the following:
1. The Intermediate Value Theorem. If f : [a,b] → R is continuous and if
r ∈ R is such that f(a) ≤ r ≤ f(b), then ∃c ∈ [a,b] such that f(c) = r.
2. The Extreme Value Theorem. If f : [a,b] → R is continuous, then
∃c ∈ [a,b] such that f(x) ≤ f(c), ∀x ∈ [a,b].
3. The Uniform Continuity Theorem. If f : [a,b] → R is continuous, then
given ε > 0, ∃δ > 0 such that $|f(x_1) - f(x_2)| < \varepsilon$, $\forall x_1, x_2 \in [a,b]$ for
which $|x_1 - x_2| < \delta$.
These theorems are used in a number of places. The intermediate value
theorem forms the basis for fixed point problems such as the existence of
equilibrium. The extreme value theorem is useful since we often seek solutions to
problems where we maximize a continuous objective function over a compact
constraint set. The uniform continuity theorem is used to prove that every
continuous function is integrable, which is important for proving properties
of the value function in stochastic dynamic programming problems.
While we write these theorems in terms of real numbers, they can be
formulated in more general spaces than R. To this end, we will introduce
(alliteratively) the 6 C's: convergence, closedness, completeness, compactness,
connectedness, and continuity. In this chapter, we formulate these properties
in terms of sequences. Each of the C properties uses some notion of distance.
For instance, convergence requires the distance between a limit point and
elements in the sequence to eventually get smaller. Our goal in this chapter
is to consider theorems like those above but for an arbitrary set X. Doing
so, however, requires X to be equipped with a distance function.
How will we proceed? First we will clarify what is meant by a distance
function on an arbitrary set X. Then using the notion of convergence, which
relies on distance, we will define closed sets in X. The collection of all
closed (or by complementation open) sets is called a topology on a set X and
it is the main building block in real analysis. That means properties such as
continuity, compactness, and connectedness are defined directly or indirectly
in terms of closed or open sets and for this reason are called topological
properties. While there is an even more general way of defining a topology
on X that doesn't use the notion of distance, we will wait until Chapter 7 to
discuss it.
Definition 125 A metric space (X,d) is a nonempty set X of elements
(called points) together with a function d : X × X → R such that ∀x,y,z ∈ X:
(i) d(x,y) ≥ 0; (ii) d(x,y) = 0 iff x = y; (iii) d(x,y) = d(y,x); and (iv)
d(x,z) ≤ d(x,y) + d(y,z). The function d is called a metric.
Example 126 We give three examples. First, let X be a set (e.g. X =
{a,b,c,d}) and define a metric d(x,y) = 0 for x = y, and d(x,y) = 1
for x ≠ y. This is called the "discrete metric". It is easy to check that
(X,d) is a metric space. Second, (R,|·|), where d(x,y) = |x − y| is simply the
absolute value of the difference and property (iv) is simply a statement of the
triangle inequality. Thus, Chapter 3 should be seen as a special case of this
chapter. Third, let X be the set of all continuous functions on [a,b] and
d(f,g) = sup{|f(x) − g(x)|, x ∈ [a,b]}. In Chapter 6, we will see that this, as well
as other metrics, yields a valid metric space.
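The axioms of Definition 125 can be checked mechanically on a finite sample of points. The sketch below is only a finite spot check under that assumption (it cannot prove the axioms on an infinite set), and the helper names `is_metric`, `discrete`, `absolute` and `squared` are introduced here for illustration.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check axioms (i)-(iv) of Definition 125 on a finite list of points."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0:                          # (i) non-negativity
            return False
        if (d(x, y) == 0) != (x == y):           # (ii) d(x,y) = 0 iff x = y
            return False
        if d(x, y) != d(y, x):                   # (iii) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:    # (iv) triangle inequality
            return False
    return True

def discrete(x, y):
    return 0 if x == y else 1

def absolute(x, y):
    return abs(x - y)

def squared(x, y):
    # Not a metric: it violates the triangle inequality.
    return (x - y) ** 2

print(is_metric(["a", "b", "c", "d"], discrete))
print(is_metric([-2.0, 0.0, 0.5, 3.0], absolute))
print(is_metric([0.0, 1.0, 2.0], squared))
```

The third call fails on the triple (0, 1, 2), where the squared difference gives 4 on the left of (iv) but only 1 + 1 on the right.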
It should be emphasized that a metric space is not just the set of points X
but the metric d as well. To see this, we introduce the notion of the cartesian
product of metric spaces. Let $(X,d_x)$ and $(Y,d_y)$ be two metric spaces, then
we can construct a metric d on X × Y from the metrics $d_x$ and $d_y$. In fact, there
are many metrics we can construct. Writing $x = (x_1,x_2)$ and $y = (y_1,y_2)$ for two
points of X × Y (so that $x_1,y_1 \in X$ and $x_2,y_2 \in Y$), two of them are
$d_2(x,y) = \sqrt{(d_x(x_1,y_1))^2 + (d_y(x_2,y_2))^2}$
and $d_\infty(x,y) = \sup\{d_x(x_1,y_1), d_y(x_2,y_2)\}$.
Exercise 4.0.1 Show that $d_2$ and $d_\infty$ are metrics in X × Y.
Next, metrics provide us with the ability to measure the distance between
two sets (if one of the sets is a singleton, then we can measure the distance
of a point from a set).
Definition 127 Let A ⊂ X and B ⊂ X. The distance between sets A
and B is d(A,B) = inf{d(x,y), x ∈ A, y ∈ B}.
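For finite sets the infimum in the definition of the distance between sets is attained, so it can be computed as a minimum. The sketch below assumes finite A and B and uses the absolute value metric on R; the name `set_distance` is introduced here for illustration.

```python
def set_distance(A, B, d):
    """inf{d(x, y) : x in A, y in B}; for finite sets the inf is a min."""
    return min(d(x, y) for x in A for y in B)

def absolute(x, y):
    return abs(x - y)

# Distance between two finite sets of reals.
print(set_distance([0, 1], [3, 5], absolute))
# Distance of a point from a set: take one of the sets to be a singleton.
print(set_distance([4], [3, 5], absolute))
```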
We note that any subset of a metric space is a metric space itself.
Definition 128 If (X,d) is a metric space and H ⊂ X, then $(H, d|_H)$ is
also a metric space called the subspace of (X,d).$^{1}$
Example 129 ([0,1],|·|) is a metric space which is a subspace of (R,|·|).
In a metric space, we can extend the notion of open intervals in Definition
104.
Definition 130 For x ∈ X, we call the set $B_\delta(x) = \{y \in X : d(x,y) < \delta\}$
an open ball with center x and radius δ. In this case, G is open if
∀x ∈ G, ∃δ > 0 such that $B_\delta(x) \subset G$.$^{2}$
Don't assume that an open ball is an open set. We still don't know what
an open set is. We will prove this in the next section. Also note that a ball
is defined relative to the space X, so that if for example X = N, then a ball
of size δ = 1.5 around 5 is just {4,5,6}. The next example shows that balls
don't need to be "round". Their shape depends on their metric.
Example 131 In $\mathbb{R}^2$, Figure 4.1 illustrates a ball with metric
$d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|$, one with a Euclidean metric
$d_2(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$,
and one with a sup metric $d_\infty(x,y) = \sup\{|x_1 - y_1|, |x_2 - y_2|\}$.
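To see concretely that the shape of a ball depends on the metric, the sketch below tests membership of two sample points in the unit ball around the origin under each of the three metrics of Example 131, and also reproduces the ball of radius 1.5 around 5 in N. The sample points and the helper `in_ball` are illustrative choices made here.

```python
import math

def d1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dinf(x, y):
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

def in_ball(center, radius, point, d):
    """Membership in the open ball {y : d(center, y) < radius}."""
    return d(center, point) < radius

origin = (0.0, 0.0)
p = (0.6, 0.6)   # inside the d2 and d_inf unit balls, outside the d1 ball
q = (0.9, 0.9)   # inside the d_inf unit ball only

for name, d in (("d1", d1), ("d2", d2), ("dinf", dinf)):
    print(name, in_ball(origin, 1.0, p, d), in_ball(origin, 1.0, q, d))

# A ball is relative to the underlying space: in X = N the ball of
# radius 1.5 around 5 contains only natural numbers.
ball_in_N = [y for y in range(1, 20) if abs(5 - y) < 1.5]
print(ball_in_N)
```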
Before proceeding, we briefly mention some of the important results that
you will see in this chapter. Here we extend the Heine Borel Theorem 121
of Chapter 3 to provide necessary and sufficient conditions for compactness
in general metric spaces in Theorem 198. We also introduce the notion of a
Banach space (a complete normed vector space) and for the first time give
an example of an infinite dimensional Banach space. In many theorems that
follow, the dimensionality of a Banach space plays a crucial role. Another
important set of results pertains to the properties of a continuous function on
a connected domain (a generalization of the Intermediate Value Theorem is
given in Theorem 254) as well as a continuous function on a compact domain
(a generalization of the Extreme Value Theorem is given in Theorem 261 and
the Uniform Continuity Theorem in general metric spaces is given in
Theorem ??). Since many applications in economics result in correspondences,
we spend considerable time on upper and lower hemicontinuous
correspondences. Probably one of the most important theorems in economics is Berge's
Theorem of the Maximum 295. The chapter concludes with a set of fixed
point theorems that are useful in proving the existence of general equilibrium
or existence of a solution to a dynamic programming problem.
$^{1}$ Note that we restrict the metric function to the set H using Definition 56.
$^{2}$ Don't assume that an open ball is an open set. We will prove this in Section X.
4.1 Convergence
In this section we will build all the topological properties of a metric space
in terms of convergent sequences (as an alternative to building upon open
sets). In many cases, the sequence version (of definitions and theorems) is
more convenient, easier to verify, and/or easier to picture.
Definition 132 If X is any set, a finite sequence (or ordered N-tuple)
in X is a function $f : \Psi_N \to X$ denoted $\langle x_n \rangle_{n=1}^{N}$. An infinite sequence
in X is a function f : N → X denoted $\langle x_n \rangle_{n=1}^{\infty}$ (or $\langle x_n \rangle$ for short).
When there is no misunderstanding, we assume all sequences are infinite
unless otherwise noted. We use the $\langle x_n \rangle$ notation to reinforce the
difference from $\{x_n \mid n \in \mathbb{N}\}$ since order matters for a sequence.
Example 133 There are many ways of defining sequences. Consider the
sequence of even numbers $\langle 2,4,6,... \rangle$. One way to list it is $\langle 2n \rangle_{n\in\mathbb{N}}$.
Another way is to specify an initial value $x_1$ and a rule for obtaining
$x_{n+1}$ from $x_n$. In the above case $x_1 = 2$ and $x_{n+1} = x_n + 2$, n ∈ N.
It is possible that a sequence lacks some desired property while a
subsequence of it has that property.
Definition 134 A mapping g : N → N is monotone if (n > m) implies
(g(n) > g(m)). If f : N → X is an (infinite) sequence, then h is an (infinite)
subsequence of f if there is a monotone mapping g : N → N such that
h = f ∘ g, denoted $\langle x_{g(n)} \rangle$.
Example 135 Consider the sequence f : N → {−1,1} given by $\langle (-1)^n \rangle_{n\in\mathbb{N}}$.
If g(n) = 2n for n ∈ N (i.e. the even indices), then the subsequence h = f ∘ g
is simply $\langle 1,1,... \rangle$ while if g(n) = 2n − 1 (i.e. the odd indices), then the
subsequence is $\langle -1,-1,... \rangle$. See Figure 4.1.1.
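Since a sequence is a function on N and a subsequence is the composition f ∘ g with a monotone index map g, Example 135 translates directly into code. The sketch below is illustrative; the helper name `subsequence` is an assumption introduced here.

```python
def f(n):
    """The sequence <(-1)^n> of Example 135, viewed as a function on N."""
    return (-1) ** n

def subsequence(f, g):
    """Return h = f o g for an index map g : N -> N (assumed monotone)."""
    return lambda n: f(g(n))

evens = subsequence(f, lambda n: 2 * n)       # g(n) = 2n
odds = subsequence(f, lambda n: 2 * n - 1)    # g(n) = 2n - 1

first_even = [evens(n) for n in range(1, 6)]
first_odd = [odds(n) for n in range(1, 6)]
print(first_even)
print(first_odd)
```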
Definition 136 A sequence $\langle x_n \rangle$ from a metric space (X,d) converges
to the point x ∈ X (or has x as a limit), if given any δ > 0, ∃N (which
may depend on δ) such that $d(x,x_n) < \delta$, ∀n ≥ N(δ). In geometric terms,
this says that $\langle x_n \rangle$ converges to x if every ball around x contains all but
a finite number of terms of the sequence. We write $x = \lim x_i$ or $x_i \to x$
to mean that x is the limit of $\langle x_i \rangle$. If a sequence has no limit, we say it
diverges.
Example 137 To see an example of a limit, consider the sequence f :
N → R given by $\langle \frac{1}{n} \rangle_{n\in\mathbb{N}}$. In this case $\lim \langle \frac{1}{n} \rangle_{n\in\mathbb{N}} = 0$. To
see why, notice that for any δ > 0, it is possible to find an N(δ) such that
$d(0,x_n) = |x_n| < \delta$, ∀n ≥ N(δ). For instance, if δ = 1, then N(1) = 2
(respects the strict inequality), if $\delta = \frac{1}{2}$, then $N\left(\frac{1}{2}\right) = 3$, etc. In general, let
$N(\delta) = w\left(\frac{1}{\delta}\right) + 1$ where w(x) denotes an operator which takes the whole part
of the real number x. Such natural numbers always exist by the Archimedean
Property (Theorem 99). See Figure 4.1.2.
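The rule N(δ) = w(1/δ) + 1 can be checked with exact rational arithmetic. The sketch below takes w to be the floor function and tests only finitely many terms of the tail, so it illustrates the definition rather than proving convergence; the helper `tail_within` and the horizon of 1000 terms are assumptions made here.

```python
from fractions import Fraction
from math import floor

def N(delta):
    """Index N(delta) = w(1/delta) + 1, with w the whole-part (floor) operator."""
    return floor(1 / Fraction(delta)) + 1

def tail_within(delta, horizon=1000):
    """Check |1/n - 0| < delta for n = N(delta), ..., N(delta) + horizon - 1."""
    start = N(delta)
    return all(Fraction(1, n) < delta for n in range(start, start + horizon))

for delta in (Fraction(1), Fraction(1, 2), Fraction(3, 100)):
    print(delta, N(delta), tail_within(delta))
```

Using `Fraction` avoids floating-point rounding in 1/δ, which matters exactly at the boundary cases such as δ = 1/2 where the strict inequality decides the answer.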
Theorem 138 (Uniqueness of Limit Points) A sequence in (X,d) can
have at most one limit.
Proof. Suppose, to the contrary, x' and x'' are limits of $\langle x_n \rangle$ and x' ≠ x''.
Let $B_\delta(x')$ (= {x ∈ X : d(x,x') < δ}) and $B_\delta(x'')$ be disjoint open balls
around x' and x'', respectively.$^{3}$ Furthermore, let N', N'' ∈ N be such that
if n ≥ N' and n ≥ N'', then $x_n \in B_\delta(x')$ and $x_n \in B_\delta(x'')$, respectively. Let
k = max{N',N''}. But then $x_k \in B_\delta(x') \cap B_\delta(x'')$, a contradiction.
Lemma 139 If $\langle x_n \rangle$ in (X,d) converges to x ∈ X, then any subsequence
$\langle x_{g(n)} \rangle$ also converges to x.
Proof. By Definition 136, given δ > 0, ∃N(δ) such that $d(x,x_n) < \delta$, ∀n ≥ N(δ). Let
$\langle x_{g(n)} \rangle$ be a subsequence of $\langle x_n \rangle$. Since g(n) ≥ n, if n ≥ N(δ) then g(n) ≥ N(δ),
in which case $d(x,x_{g(n)}) < \delta$.
The next definition gives another notion of convergence to a point which
is the sequential version of Definition 111.
$^{3}$ It is always possible to construct such disjoint balls. Just let $\delta = \frac{1}{4} d(x',x'')$.
Definition 140 A sequence $\langle x_n \rangle$ from a metric space (X,d) has a cluster
point $x^* \in X$ if given any δ > 0 and given any N, ∃n ≥ N such that
$d(x^*,x_n) < \delta$. In geometric terms, this says that $x^*$ is a cluster point of
$\langle x_n \rangle$ if each ball around $x^*$ contains infinitely many terms of the sequence.
Thus if $x = \lim \langle x_n \rangle$, then it is a cluster point.$^{4}$ However, if $x^*$ is a
cluster point, it need not be a limit. To see this, note that the key difference
between Definitions 136 and 140 lies in what terms in the sequence qualify
as a limit or cluster point. If x is a limit point we know ∃N(δ) for which
$d(x,x_n) < \delta$ for all n ≥ N(δ). For a cluster point, given N, it is sufficient
to find just one term in the sequence $x_n$ sufficiently far out that satisfies
$d(x^*,x_n) < \delta$. This is why a limit is a cluster point: just take n in Definition 140 as
n = max{N(δ),N}. To see that the converse fails, consider the next example.
Example 141 Consider the sequence $\langle (-1)^n \rangle_{n\in\mathbb{N}}$ from Example 135. This
sequence has no limit point but two cluster points. To see why, notice that the
only candidate limit points are {−1,1}. Consider x = 1. For all δ ∈ (0,1),
$d(1,x_i) = |1 - x_i| = 2 > \delta$ for any odd index i = 2n − 1, n ∈ N. A similar
argument holds for x = −1. To see why $x^* = 1$ satisfies the definition of a
cluster point, notice that for any N, there exists i = 2N (an even index with i ≥ N)
such that for any δ > 0, $d(1,x_i) = 0 < \delta$. For this particular sequence, there are
actually an infinite number of such indices. See Figure 4.1.1.
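The difference between "some term beyond N is close" (cluster point) and "every term beyond N is close" (limit) can be illustrated numerically for this sequence. The sketch below inspects only a finite window of indices, so it is an illustration of Definitions 136 and 140 rather than a verification; the helper names and the window length are assumptions made here.

```python
def x(n):
    return (-1) ** n

def some_term_near(point, delta, N, horizon=100):
    """Cluster-point test: is some x_n with N <= n < N + horizon within delta?"""
    return any(abs(x(n) - point) < delta for n in range(N, N + horizon))

def all_terms_near(point, delta, N, horizon=100):
    """Limit test: are all x_n with N <= n < N + horizon within delta?"""
    return all(abs(x(n) - point) < delta for n in range(N, N + horizon))

delta = 0.5
starts = range(1, 51)

cluster_at_1 = all(some_term_near(1, delta, N) for N in starts)
cluster_at_minus_1 = all(some_term_near(-1, delta, N) for N in starts)
limit_is_1 = any(all_terms_near(1, delta, N) for N in starts)
limit_is_minus_1 = any(all_terms_near(-1, delta, N) for N in starts)

print(cluster_at_1, cluster_at_minus_1, limit_is_1, limit_is_minus_1)
```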
Example 141 provides a sequence which does not have a limit point (and
hence the assumption of Lemma 139 does not apply). However, it is easy
to see that there is a subsequence $\langle (-1)^{g(n)} \rangle_{n\in\mathbb{N}}$ of odd indices that has a
limit point (which is one of the cluster points of the original sequence). The
following lemma applies to such cases.
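Lemma 142 $x^*$ is a cluster point of $\langle x_n \rangle$ iff there exists a subsequence
$\langle x_{n_k} \rangle \subset \langle x_n \rangle$ such that $\langle x_{n_k} \rangle \to x^*$ as k → ∞.
Proof. (⇒) If $x^*$ is a cluster point, then for each k ∈ N we can apply Definition 140
with $\delta = \frac{1}{k} > 0$ and any N. Choosing N larger than the previously selected index
$n_{k-1}$, ∃$x_{n_k}$ with $n_k > n_{k-1}$ such that $d(x^*,x_{n_k}) < \frac{1}{k}$. Hence $\langle x_{n_k} \rangle \to x^*$.
(⇐) is trivial.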
$^{4}$ One shouldn't be confused between cluster points for a set and for a sequence. For
instance, singletons like {1} do not have cluster points, whereas the constant sequence
$\langle 1,1,1,... \rangle$ does (which is 1).
Definition 143 Let (X,d) be a metric space and A ⊂ X. A is closed if any
convergent sequence of elements from A has its limit point in A.
Theorem 144 (Closed Sets Properties) (i) ∅ and X are closed. (ii) The
intersection of any collection of closed sets in X is closed. (iii) The union
of any finite collection of closed sets in X is closed.
Proof. (i) Trivial.
(ii) Let $A = \cap_{i\in\Lambda} A_i$ where $A_i \subset X$ is closed ∀i ∈ Λ, which is any index
set. Take any convergent sequence from A and show its limit point is in A
as well. That is, let $\langle x_n \rangle \subset A$ and $\langle x_n \rangle \to x$. Then $\langle x_n \rangle \subset A_i$ ∀i ∈ Λ,
and because each $A_i$ is closed, $\langle x_n \rangle \to x$ implies $x \in A_i$ ∀i ∈ Λ, which implies
$x \in \cap_{i\in\Lambda} A_i$.
(iii) Let $A = \cup_{i=1}^{n} A_i$ where $A_i \subset X$ is closed ∀i ∈ {1,...,n}. Again, take
any convergent sequence from A and show its limit point is in A as well.
In particular, let $\langle x_n \rangle \subset A$ and $\langle x_n \rangle \to x$. There exists $A_j$ containing
infinitely many elements of $\langle x_n \rangle$ (i.e. $A_j$ contains a subsequence $\langle x_{n_k} \rangle$). By
Lemma 139, $\langle x_{n_k} \rangle \to x$ and because $\langle x_{n_k} \rangle \subset A_j$ and $A_j$ is closed,
$x \in A_j$, which implies $x \in A = \cup_{i=1}^{n} A_i$.
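Example 145 Property (iii) of Theorem 144 does not necessarily hold for
an infinite union. Consider the following counterexample. Let
$F_n = [-1,-\frac{1}{n}] \cup [\frac{1}{n},1]$. Then $\cup_{n=1}^{\infty} F_n = [-1,0) \cup (0,1]$, which is not
closed since $\langle \frac{1}{n} \rangle$ lies in the union but its limit 0 does not. See Figure 4.1.3.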
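The counterexample can be explored with exact rationals. The sketch below truncates the union at a finite `n_max`, which is an assumption made here for computability; the exclusion of 0 nevertheless holds for every n because 1/n > 0.

```python
from fractions import Fraction

def in_F(n, x):
    """Membership in F_n = [-1, -1/n] U [1/n, 1]."""
    return -1 <= x <= -Fraction(1, n) or Fraction(1, n) <= x <= 1

def in_union(x, n_max=1000):
    """Membership in F_1 U ... U F_{n_max} (a finite truncation of the union)."""
    return any(in_F(n, x) for n in range(1, n_max + 1))

# Each term 1/n of the sequence lies in F_n, hence in the union ...
terms_in_union = all(in_F(n, Fraction(1, n)) for n in range(1, 1001))
# ... but the limit 0 lies in no F_n at all.
zero_in_union = in_union(Fraction(0))

print(terms_in_union, zero_in_union)
print(in_union(Fraction(1, 500)), in_union(Fraction(-1, 3)))
```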
Closed sets can also be described in terms of cluster points.
Theorem 146 A subset of X is closed iff it contains all its cluster points.
Proof.$^{5}$ (⇒) By contradiction. Let x be a cluster point of a closed set A
and let x ∉ A. Then x ∈ X\A. Because X\A is open, there exists an open
ball $B_{\delta_x}(x)$ such that $B_{\delta_x}(x) \subset X \backslash A$. Thus $B_{\delta_x}(x)$ is a neighborhood of x
having empty intersection with A. This contradicts the assumption that x is
a cluster point of A.
(⇐) Let x ∈ X\A. Then x is not a cluster point of A since A contains
all its cluster points by assumption. Then there exists an open ball $B_{\delta_x}(x)$
such that $A \cap B_{\delta_x}(x) = \emptyset$. This implies $B_{\delta_x}(x) \subset X \backslash A$ or that X\A is open,
in which case A is closed.
$^{5}$ From Munkres Theorem 6.6, p. 97.
Exercise 4.1.1 Explain why a singleton set {x} is consistent with Theorem
146.
We now introduce another topological notion that permits us to
characterize closed sets in other terms.
Definition 147 Given a set A ⊂ X, the union of all its points and all its
cluster points is called the closure of A, denoted $\bar{A}$ (i.e. $\bar{A} = A \cup A'$ where
A' is the set of all cluster points of A).
Notice that $\bar{A} = A \cup A'$ is not a partition since A and A' are not necessarily disjoint.
Take (iii) of Example 112 where A = {0} ∪ (1,2), in which case A' = [1,2]
and $\bar{A}$ = {0} ∪ [1,2].
As an exercise, prove the following theorems.
Theorem 148 Let A ⊂ X. $x \in \bar{A}$ iff any open ball around x has a
non-empty intersection with A.
Theorem 149 The closure of A is the intersection of all closed sets
containing A.
Theorem 150 A is closed iff $A = \bar{A}$.
Exercise 4.1.2 Prove that $A \subset \bar{A}$ and $\overline{A \cup B} = \bar{A} \cup \bar{B}$. Give an example
to show that $\overline{A \cap B} = \bar{A} \cap \bar{B}$ may not hold.
Example 151 (i) A = {2,3}. Then $\bar{A}$ = {2,3}. (ii) A = N. Then $\bar{A}$ = N.
(iii) A = (0,1]. Then $\bar{A}$ = [0,1]. (iv) A = {x ∈ Q : x ∈ (0,1)}. Then
$\bar{A}$ = [0,1].
Intuitively, one would expect that a point x lies in the closure of A if there
is a sequence of points in A converging to x. This is not necessarily true in a
general topological space, but it is true in a metric space as the next lemma
shows.
Lemma 152 Let (X,d) be a metric space and A ⊂ X. Then $x \in \bar{A}$ iff it is a
limit point of a sequence $\langle x_n \rangle$ of points from A (i.e. ∃ $\langle x_n \rangle \subset A$ such
that $\langle x_n \rangle \to x$).
Proof. (⇒) Take any $x \in \bar{A}$. By Theorem 148, ∀$\delta = \frac{1}{n} > 0$, ∃$x_n \in A$ such
that $x_n \in B_{\frac{1}{n}}(x)$. Hence this sequence $\langle x_n \rangle \to x$ (a limit point). (⇐) If
$\langle x_n \rangle \to x$ such that $\langle x_n \rangle \subset A$, then in every open ball around x there
is $x_n$ (actually, infinitely many of them) inside this ball. Then by Theorem
148 $x \in \bar{A}$.
Exercise 4.1.3 (i) Show that if A is closed and d(x,A) = 0, then x ∈ A. Does
(i) hold without assuming closedness of A? (ii) Show that if A is closed and
$x_0 \notin A$, then $d(x_0,A) > 0$.
In the previous Example 151, we see that in some cases the closure of a
set is the set itself (i), "brings" finitely many new points to the set (iii), or
brings uncountably many points (iv). This leads us to the notion of density.
Definition 153 Given the metric space (X,d), a subset A ⊂ X is dense in
X if $\bar{A} = X$.
Example 154 To see that Q is dense in R, we know that in any ball around
x ∈ R there is a rational number. Hence by Theorem 148 x is from $\bar{\mathbb{Q}}$. Thus
we have $\mathbb{R} \subset \bar{\mathbb{Q}}$. Obviously, $\bar{\mathbb{Q}} \subset \mathbb{R}$ as well, so $\mathbb{R} = \bar{\mathbb{Q}}$. A similar argument
establishes that the set of irrationals is dense in R.
Intuitively, if A is dense in X then for any x ∈ X, there exists a point in A
that is sufficiently close to (or approximates) x. From the previous example,
since Q is dense in R, any real number can be approximated by a rational
number, and the set of rationals is countable. More importantly for applied economists, we
might take X to be the set of continuous functions and A the set of
polynomials with rational coefficients which is again countable. Then, provided
the set of such polynomials is dense in the set of all continuous functions,
working with polynomials will yield a good approximation to the continuous
function we are interested in.
Definition 155 A metric space (X,d) is separable if it contains a dense
subset that is countable.
Example 156 (R, |·|) is separable since Q is a countable dense subset of
R.
So far in a general metric space we have dealt only with closed sets. Now
we can introduce open sets as follows.
Definition 157 A set A ⊂ X is open if its complement is closed.
Exercise 4.1.4 Show that an open ball is an open set.
Example 158 Let A = (0,1) × {2} = {(x,y) : 0 < x < 1, y = 2} ⊂ $\mathbb{R}^2$
equipped with $d_2$. A is not open since no matter how small δ is, there exist
$y' = 2 \pm \frac{\delta}{2}$ such that $(x,y') \in B_\delta((x,2))$ is not contained in A. See Figure
4.1.4.
We could have proven the properties of open sets as we did in Theorem
106, but we will not repeat it. Here we will simply mention a few concepts
that will be useful. The first concept is that of a neighborhood.
DeTnition 159 A neighborhood of x ∈ X is an open set containing x.
Sometimes it is more convenient to use the concept of a neighborhood of
x rather than an open ball around x, but you should realize that these two
concepts are equivalent since an open ball $B_\varepsilon(x)$ is a neighborhood of x and
conversely, if $V_x$ is a neighborhood of x, then there is a $B_\varepsilon(x) \subset V_x$. See Figure
4.1.5.
There is another way to describe closed sets which uses boundary points.
Definition 160 A point x ∈ X is a boundary point of A if every open
ball around x contains points in A and in X\A (i.e. $B_\delta(x) \cap A \neq \emptyset$ and
$B_\delta(x) \cap (X \backslash A) \neq \emptyset$, ∀δ > 0). A point x ∈ X is an interior point of A if
∃$B_\delta(x) \subset A$. See Figure 4.1.6.
Note that a boundary point need not be contained in the set. For example,
the boundary points of (0,1] are 0 and 1.
Example 161 The set of boundary points of Q is R since in any open ball
around a real number there are both rationals and irrationals by
Theorems 100 and 102.
The next theorem provides an alternative characterization of a closed set.
Theorem 162 A set A ⊂ X is closed iff it contains its boundary points.
Proof. (⇒) Suppose A is closed and x is a boundary point. If x ∉ A,
then x ∈ X\A (which is open), so some open ball around x lies in X\A and
contains no points of A, contrary to x being a boundary point. (⇐)
Suppose A contains all its boundary points. If y ∉ A, then y is not a boundary
point, so ∃$B_\delta(y)$ containing no points of A, i.e. $B_\delta(y) \subset X \backslash A$. Since this is
true ∀y ∉ A, X\A is open so A is closed.
Unlike properties like closedness and openness, boundedness is defined
relative to the distance measure and hence is a metric property rather than
a topological property.
Definition 163 Given (X,d), A ⊂ X is bounded if ∃M > 0 such that
d(x,y) ≤ M, ∀x,y ∈ A.
Boundedness cannot be defined only in terms of open sets. It requires
the notion of distance. Thus it is not a topological property.
Theorem 164 A convergent sequence in the metric space (X,d) is bounded.
Proof. Taking δ = 1, we know by Definition 136, ∃N(1) such that $|x_n - x| < 1$,
∀n ≥ N(1).$^{6}$ By the triangle inequality, we know
$|x_n| = |x_n - x + x| \le |x_n - x| + |x| < 1 + |x|$, ∀n ≥ N(1). Since there are a finite number of
indices n < N(1), we set $M = \sup\{|x_1|, |x_2|, ..., |x_{N(1)-1}|, |x| + 1\}$. Hence,
$|x_n| \le M$, ∀n ∈ N, so that $\langle x_i \rangle$ is bounded.
4.1.1 Convergence of functions
While we will focus on convergence of functions in Chapter 6, it will be
necessary for some results in the upcoming sections to introduce a form of
functional convergence. A sequence of functions is simply a sequence whose
elements $f_n(x)$ contain two variables, n and x, where n indicates the order in
the sequence and x is the variable of a function. For example,
$\langle f_n(x) \rangle = \langle x^n \rangle = \langle x, x^2, x^3, ... \rangle$ for x ∈ [0,1].
What does it mean for a sequence of functions to be convergent? There
are basically two different answers to this question. If we work in a metric
space whose elements are functions themselves with a certain metric, then
convergence of functions is nothing other than convergence of elements (the
element being a function) with respect to the given metric. We will deal
with this type of convergence in Chapter 6 on function spaces. The second
approach is to take any set X along with a metric space $(Y,d_Y)$ and let
$f_n : X \to Y$ for all n. Fix $x_0 \in X$. Then $\langle f_n(x_0) \rangle$ is a sequence of elements
(the element being a point) in Y. If this sequence is convergent, then it
converges to a certain point $y_0$ (i.e. $\langle f_n(x_0) \rangle \to y_0 = f(x_0)$). This leads to
the following definition.
$^{6}$ Since the sequence converges, we are free to choose any ε > 0. Here we simply choose
ε = 1.
Definition 165 Given any set X and a metric space $(Y,d_Y)$, let $\langle f_n \rangle$ be a
sequence of functions from X to Y. The sequence $\langle f_n \rangle$ is said to converge
pointwise to a function f : X → Y if for every $x_0 \in X$, $\lim_{n\to\infty} f_n(x_0) = f(x_0)$.
We call f a pointwise limit of $\langle f_n \rangle$. In other words, $\langle f_n \rangle$
converges pointwise to f on X if ∀$x_0 \in X$ and ∀ε > 0, ∃$N(x_0,\varepsilon)$ such that
∀$n > N(x_0,\varepsilon)$ we have $d_Y(f_n(x_0), f(x_0)) < \varepsilon$.
Notice that if $x_0$ is fixed, then $\langle f_n(x_0) \rangle$ is simply a sequence of elements
in the metric space $(Y,d_Y)$.
Example 166 Let $f_n : \mathbb{R} \to \mathbb{R}$ be given by $f_n(x) = \frac{x}{n}$ and f : R → R be given
by f(x) = 0. Thus, this example is a simple generalization of Example 137.
Then $\langle f_n \rangle$ converges pointwise to f since we can always find a natural
number $N(x,\varepsilon) = w\left(\left|\frac{x}{\varepsilon}\right|\right) + 1$ by the Archimedean Property. See Figure 4.1.7.
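Example 167 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = x^n$ and f : [0,1] → R be
given by
$f(x) = \begin{cases} 0 & 0 \le x < 1 \\ 1 & x = 1 \end{cases}$.
It is clear that when x = 1, then $f_n(x) = 1^n = 1 = f(x)$ so that $f_n(1) \to 1$
trivially. To see that for x ∈ [0,1), $f_n(x) \to f(x)$, note that the case x = 0 is
trivial, while for x ∈ (0,1) we can write $x = \frac{1}{1+a}$ with a > 0 and use Bernoulli's
inequality $(1+a)^n \ge 1 + na$ to obtain
$0 < x^n = \left(\frac{1}{1+a}\right)^n \le \frac{1}{1+na} < \frac{1}{na}$, so that we can take $N(x,\varepsilon) = w\left(\frac{1}{a\varepsilon}\right) + 1$.
See Figure 4.1.8.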
Notice that the rate of convergence $N(x_0,\varepsilon)$ can be very different for
each $x_0$. In Example 166, the rate is very low for very large x (we say the
rate of convergence is smaller the larger is N). However, if we restrict the
domain for $f_n$, say $f_n : [0,2] \to \mathbb{R}$, then the smallest possible rate for a given
ε is $N(2,\varepsilon) = w\left(\left|\frac{2}{\varepsilon}\right|\right) + 1$. If it is possible, for a given ε, to find the rate
independently of x, then we call this type of convergence uniform.
Definition 168 Given X and a metric space $(Y,d_Y)$, let $\langle f_n \rangle$ be a
sequence of functions from X to Y. The sequence $\langle f_n \rangle$ is said to converge
uniformly to a function f : X → Y if ∀ε > 0, ∃N(ε) such that ∀n > N(ε)
we have $d_Y(f_n(x),f(x)) < \varepsilon$, ∀x ∈ X.
It is apparent from the definition that uniform convergence implies
pointwise convergence, but as Examples 166 and 167 show the converse does not
necessarily hold (i.e. the above two sequences of functions do not converge
uniformly but do converge pointwise). In the first case, ∀ε > 0 and ∀n ∈ N,
∃$x_0 \in \mathbb{R}$ such that $x_0 > \varepsilon n$ by the Archimedean property X so that $\frac{x_0}{n} > \varepsilon$.
Similarly, in the second example, ∀ε ∈ (0,1) and ∀n ∈ N, we have $1 > \varepsilon^{\frac{1}{n}} > 0$.
Then ∃$x_0 \in (0,1)$ such that $1 > x_0 > \varepsilon^{\frac{1}{n}} > 0$ in which case $x_0^n > \varepsilon$. On the
other hand, if we restrict the domain of the first example to [−1,1] (or for
that matter any bounded set) $\langle f_n \rangle$ is uniformly convergent since for any ε we
can take $N = \frac{1}{\varepsilon} + 1$.
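The failure of uniform convergence for $x^n$ on [0,1), and its success for x/n on [−1,1], can be illustrated numerically. The sketch below uses floating-point arithmetic and a specific witness point chosen here (the midpoint between ε and 1, raised to the power 1/n); both the witness rule and the helper names are assumptions for illustration.

```python
def witness(n, eps):
    """A point x0 in (0,1) with x0**n > eps, so |x^n - 0| < eps fails at x0."""
    target = (1 + eps) / 2          # strictly between eps and 1
    return target ** (1 / n)

def sup_error_linear(n, bound=1.0):
    """sup of |x/n - 0| over [-bound, bound], attained at the endpoints."""
    return bound / n

eps = 0.5

# Example 167: for every n there is a bad point x0, so no single N works.
for n in (1, 10, 100, 1000):
    x0 = witness(n, eps)
    print(n, x0, x0 ** n > eps)

# Example 166 restricted to [-1, 1]: one N works for all x at once.
N = 1 / eps + 1
uniform_ok = all(sup_error_linear(n) < eps for n in range(int(N) + 1, 200))
print(uniform_ok)
```

As n grows the witness point moves toward 1, which is exactly where the pointwise limit jumps.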
4.2 Completeness
The completeness of a metric space is a very important property for problem
solving. For instance, to prove the existence of the solution of a problem,
we usually manage to find the solution of an approximate problem. That
is, we construct a sequence of solutions that are getting closer and closer
to one another via the method of successive approximations. But for this
method to work, we need a guarantee that the limit point exists. If the space
is complete, then the limit of this sequence exists and is the solution of the
original problem. For this reason, we turn to establishing when a given space
is complete.
Definition 169 A sequence $\langle x_n \rangle$ from a metric space (X,d) is a Cauchy
sequence if given δ > 0, ∃N(δ) such that $d(x_m,x_n) < \delta$, ∀m,n ≥ N(δ).
Note that if $\langle x_n \rangle$ is convergent, then there is a limit point x which
elements of $\langle x_n \rangle$ eventually approach. If $\langle x_n \rangle$ is Cauchy, then elements
of $\langle x_n \rangle$ eventually approach a point which may or may not exist. Hence all
Cauchy sequences can be divided into two different classes: those for which
∃x such that $\langle x_n \rangle \to x$ (i.e. convergent Cauchy sequences); and those for
which ∄x such that $\langle x_n \rangle \to x$ (i.e. nonconvergent Cauchy sequences).
Example 170 Suppose we did not know that there existed a limit in Example
137 where $\langle \frac{1}{n} \rangle_{n\in\mathbb{N}}$ in (R,|·|). We can, however, establish that this sequence
of real numbers is a Cauchy sequence and hence has a limit. Let m,n ≥ N(δ)
and without loss of generality let m ≤ n. Then $d(s_n,s_m) = \left|\frac{1}{m} - \frac{1}{n}\right| < \frac{1}{m}$.
Hence a sufficient condition for this sequence to be Cauchy for any δ > 0 is
that $N(\delta) = w\left(\frac{1}{\delta}\right) + 1$.
Example 171 Consider the metric space (X,d) with X = (0,1] and d = |·|.
Then by Example 170, we've established that $\langle \frac{1}{n} \rangle_{n\in\mathbb{N}}$ is a Cauchy sequence
that converges (in R) to a limit 0 ∉ X.
We now list some results that are not so useful in and of themselves but
will be used repeatedly to prove important theorems in the next few sections.
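Lemma 172 Given (X,d), if $\langle x_n \rangle$ converges, then $\langle x_n \rangle$ is a Cauchy
sequence.
Proof. Let $x = \lim \langle x_n \rangle$. Then given δ > 0, ∃$N\left(\frac{\delta}{2}\right)$ such that if
$n \ge N\left(\frac{\delta}{2}\right)$, then $d(x,x_n) < \frac{\delta}{2}$. Thus if $n,m \ge N\left(\frac{\delta}{2}\right)$, then
$d(x_m,x_n) \le d(x_m,x) + d(x,x_n) < \frac{\delta}{2} + \frac{\delta}{2} = \delta$.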
Lemma 173 If a subsequence $\langle x_{g(n)} \rangle$ of a Cauchy sequence $\langle x_n \rangle$
converges to x, then $\langle x_n \rangle$ also converges to x.
Exercise 4.2.1 Prove Lemma 173.
Lemma 174 A Cauchy sequence in (X,d) is bounded.
Exercise 4.2.2 Prove Lemma 174. It is similar to Theorem 164 in the
preceding section.
The converse of Lemma 172 is not necessarily true. Those spaces for
which the converse of Lemma 172 is true are called complete.
Definition 175 If (X,d) has the property that every Cauchy sequence
converges to some point in the metric space, then (X,d) is complete.
Establishing that a metric space is complete is a difficult task since we must
show that every Cauchy sequence converges. In fact, due to Lemma 173 we
can (somewhat) weaken this definition, which gives us the following lemma.
Lemma 176 (X,d) is complete if every Cauchy sequence has a convergent
subsequence.
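Proof. It is sufficient to show that if $\langle x_n \rangle$ is a Cauchy sequence that
has a subsequence $\langle x_{g(n)} \rangle$ which converges to x, then $\langle x_n \rangle$ converges
to x. Since $\langle x_n \rangle$ is a Cauchy sequence, given δ > 0, we can choose $N\left(\frac{\delta}{2}\right)$
large enough such that $d(x_m,x_n) < \frac{\delta}{2}$, ∀m,n ≥ $N\left(\frac{\delta}{2}\right)$ by Definition 169.
Since $\langle x_{g(n)} \rangle$ is a convergent subsequence, given δ > 0, we can choose
$N\left(\frac{\delta}{2}\right)$ large enough such that $d(x_{g(n)},x) < \frac{\delta}{2}$, ∀g(n) ≥ $N\left(\frac{\delta}{2}\right)$ by Definition
136. Combining these two facts and using (iv) of Definition 125,
$d(x_n,x) \le d(x_n,x_{g(n)}) + d(x_{g(n)},x) < \frac{\delta}{2} + \frac{\delta}{2} = \delta$
for all n beyond the larger of the two thresholds (recall that g(n) ≥ n).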
Another useful fact is that if we know a space is complete, then we know
a closed subspace is complete.
Theorem 177 A closed subset of a complete metric space is complete.
Proof. Any Cauchy sequence in the closed subset is a Cauchy sequence in
the metric space. Since the metric space is complete, it is convergent. Since
the subset is closed, the limit also must be from this set.
Establishing that a metric space is not complete is an easier task since
we must only show that one Cauchy sequence does not converge to a point
in the space. Just take ((0,1],|·|) in Example 171 since the limit of $\langle \frac{1}{n} \rangle$
is 0 which is not contained in (0,1].
Example 178 Consider the sequence f : N → R given by $\langle \left(1+\frac{1}{n}\right)^n \rangle_{n\in\mathbb{N}}$. It
can be shown that this sequence is increasing and bounded above. Then by the
Monotone Convergence Theorem 324, which is proven in the End of Chapter
Exercises, this sequence converges in (R,|·|). The limit of this sequence
is called the Euler number e (e = 2.71828...), which is irrational. But then
(Q,|·|) is not complete; the sequence $\langle \left(1+\frac{1}{n}\right)^n \rangle_{n\in\mathbb{N}} \subset \mathbb{Q}$ is Cauchy (because
it is convergent in R) but is not convergent in Q.
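Every term of this sequence is rational, which can be made explicit by computing with exact fractions. The sketch below checks monotonicity and boundedness for the first 200 terms only (an arbitrary cutoff chosen here) and compares the last term with the floating-point value of e; it illustrates the example and does not prove irrationality of the limit.

```python
from fractions import Fraction
import math

def term(n):
    """The n-th term (1 + 1/n)^n as an exact rational number."""
    return (1 + Fraction(1, n)) ** n

terms = [term(n) for n in range(1, 201)]

all_rational = all(isinstance(t, Fraction) for t in terms)
increasing = all(a < b for a, b in zip(terms, terms[1:]))
bounded_above = all(t < 3 for t in terms)
gap_to_e = math.e - float(terms[-1])

print(all_rational, increasing, bounded_above)
print(gap_to_e)
```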
Example 179 While $\mathbb{Q}$ is not complete, $\mathbb{N}$ is complete because the only Cauchy sequences in $\mathbb{N}$ are sequences that are eventually constant (e.g. $\langle 1, 1, 1, \ldots \rangle$), which are also convergent.
We next take up the important question of completeness of (R,|·|). This
takes some work.
Theorem 180 (Bolzano-Weierstrass for Sequences) A bounded sequence
in R has a convergent subsequence.
Proof. Let $A = \langle x_n \rangle$ be bounded. If there are only a finite number of distinct values in the sequence, then at least one of these values must occur infinitely often. If we define a subsequence $\langle x_{g(n)} \rangle$ of $\langle x_n \rangle$ by selecting this element each time it appears, we obtain a convergent (constant) subsequence.
If the sequence $\langle x_n \rangle$ contains infinitely many distinct values, then $A = \langle x_n \rangle$ is infinite and bounded. By the Bolzano-Weierstrass Theorem 118 for sets (which rested upon the Nested Cells Property, which in turn rested upon the Completeness Axiom), there is a cluster point $x^*$ of $A = \langle x_n \rangle$. Then by Theorem 142 there is a subsequence $\langle x_{g(n)} \rangle \to x^*$.
Theorem 181 (Cauchy Convergence Criterion) A sequence in $\mathbb{R}$ is convergent iff it is a Cauchy sequence.
Proof. ($\Rightarrow$) is true in any metric space by Lemma 172.
($\Leftarrow$) Let $\langle x_n \rangle$ be a Cauchy sequence in $\mathbb{R}$. Then it is bounded by Lemma 174 and by Theorem 180 there is a convergent subsequence $\langle x_{n_k} \rangle \to x$. Then by Lemma 173, the whole sequence $\langle x_n \rangle$ converges to $x$.
Hence, since completeness requires that every Cauchy sequence converges, we know from Theorem 181 that $(\mathbb{R}, |\cdot|)$ is complete.
4.2.1 Completion of a metric space.
Every metric space can be made complete. The idea is a simple one. Let $(X,d)$ be a metric space that is not complete. Let $CS[X]$ be the set of all Cauchy sequences on the incomplete metric space and let $\langle a_n \rangle, \langle b_n \rangle \in CS[X]$. Define (as in Definition 26) the equivalence relation "$\sim$" by $\langle x_n \rangle \sim \langle y_n \rangle$ iff $\lim_{n \to \infty} d(x_n, y_n) = 0$. This relation forms a partition of $CS[X]$ where every equivalence class consists of all sequences which have the same limit. Let $X^*$ be the set of all equivalence classes of $CS[X]$. Then $X^*$ with the metric $\hat{d}([\langle a_n \rangle], [\langle b_n \rangle]) = \lim_{n \to \infty} d(a_n, b_n)$ is a complete metric space.
Example 182 Reconsider Example 171. The completion of $((0,1), |\cdot|)$ is $([0,1], |\cdot|)$. Notice that we added two Cauchy sequences $\langle \frac{1}{n} \rangle$ and $\langle 1 - \frac{1}{n} \rangle$.
Next we demonstrate the process of completion of the metric space $(\mathbb{Q}, |\cdot|)$, which we know by Example 178 is not complete since $\langle (1 + \frac{1}{n})^n \rangle$ is a non-convergent Cauchy sequence (in $\mathbb{Q}$). Let $CS(\mathbb{Q})$ be the set of all Cauchy sequences. An equivalence relation, defined in Definition 26, partitions $CS(\mathbb{Q})$ into classes like those shown in Figure 4.2.1: classes of convergent Cauchy sequences such as $\langle 1 + \frac{1}{n} \rangle$ and $\langle 1 - \frac{1}{n} \rangle$ (which converge to the rational number 1) and classes of non-convergent (in $\mathbb{Q}$) Cauchy sequences like $\langle (1 + \frac{1}{n})^n \rangle$ (which converges to $e$, which is not in $\mathbb{Q}$). Loosely speaking, we can then assign the number 1 to the class of convergent Cauchy sequences and can assign $e$ to the class of the non-convergent Cauchy sequence. How can we compare two metric spaces with completely different objects (e.g. one containing classes of Cauchy sequences and the other containing real numbers)?
Definition 183 Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. Let $f : X \to Y$ have the following property for all $x, y \in X$:
$d_X(x,y) = d_Y(f(x), f(y))$. (4.1)
A function $f$ having this property is called an isometry.
By (4.1) it is clear that an isometry is always an injection. If $f$ is also a surjection, then it is a bijection and in this case we say that $(X, d_X)$ and $(Y, d_Y)$ are isometric. Two isometric spaces might have completely different objects, but due to (4.1) and the fact that $f$ is a bijection, they are exact replicas of one another. Their objects just have different names.
Getting back to our example, consider the spaces $(\mathbb{R}, |\cdot|)$ and $(CS(\mathbb{Q}), \hat{d})$. Let $f : \mathbb{R} \to CS(\mathbb{Q})$ be given by $f(x) = \{\langle x_n \rangle_1, \langle x_n \rangle_2, \ldots\}$ where $\langle x_n \rangle_1 \to x$, $\langle x_n \rangle_2 \to x$, etc. are Cauchy sequences of one class converging to $x$ (e.g. $\langle x_n \rangle_1 = \langle 1 + \frac{1}{n} \rangle$ and $\langle x_n \rangle_2 = \langle 1 - \frac{1}{n} \rangle$, which converge to 1). $f$ is a surjection because $(\mathbb{R}, |\cdot|)$ is complete. One can show that $f$ is an isometry. Thus, these two metric spaces are isometric. The above construction implies the fact that for every real number $x \in \mathbb{R}$ there exists a sequence $\langle x_n \rangle$ of rational numbers converging to $x$ (i.e. $\lim x_n = x$ where $x_n \in \mathbb{Q}$).
4.3 Compactness
When we listed the three important theorems at the beginning of this chapter, there was a common assumption; the domain of $f$ was taken to be the closed interval $[a,b]$. What properties of $[a,b]$ guarantee the validity of these theorems? What properties of the domain of $f$ are necessary for the validity of comparable theorems in more general metric spaces? In more general metric spaces, closed intervals may not even be defined. If we replaced the closed interval above with a closed ball, would the results continue to hold? As we will see later in the chapter, they may not.
In Chapter 3 we showed (Theorem 121) that $[a,b]$ has the Bolzano-Weierstrass property: any sequence of elements of $[a,b]$ has a subsequence that converges to a point in $[a,b]$. As we will see, this property can also be defined for a subset $A$ of a general metric space $(X,d)$ to guarantee the validity of the theorems we're interested in. In fact, if we dealt with metric spaces only, we could have defined compactness only in terms of sets which satisfy the Bolzano-Weierstrass property. The more general approach we take next, which can be applied in any topological space, uses a different definition which is seemingly unrelated to the Bolzano-Weierstrass property.
Definition 184 A collection $\mathcal{C} = \{A_i : i \in \Lambda, A_i \subset X\}$ covers a metric space $(X,d)$ if $X = \cup_{i \in \Lambda} A_i$. $\mathcal{C}$ is called an open covering if its elements $A_i$ are open subsets of $X$.
Definition 185 A metric space $(X,d)$ is compact if every open covering $\mathcal{C}$ of $X$ contains a finite subcovering of $X$.$^{7}$ A subset $H$ of $(X,d)$ is compact if every open covering of $H$ by open sets of $X$ has a finite subcovering of $H$.
In order to apply this definition to show that a set $H$ is compact we must examine every open covering of $H$ and hence it is virtually impossible to use it in determining compactness of a set. The exception is the case of a finite subset $H = \{x_1, \ldots, x_m\}$ of a metric space $X$. For if every point $x_n \in H$ is in some open set $A_i \in \mathcal{C}$, then at most $m$ carefully selected subsets of $\mathcal{C}$ (one for each point of $H$) will have the property that their union contains $H$. Thus any finite subset $H \subset X$ is compact.
On the other hand, to show that a set $H$ is not compact, it is sufficient to show that only one open covering cannot be replaced by a finite subcollection that also covers $H$.
$^{7}$ That is, if every open covering $\mathcal{C}$ of $X$ contains a finite subcollection $\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\}$ with $A_{i_j} \in \mathcal{C}$ that also covers $X$.
Example 186 Let $(X,d) = (\mathbb{R}, |\cdot|)$ and $H = \{x \in \mathbb{R} : x \geq 0\} = [0, \infty)$. Let $\mathcal{C} = \{A_n : n \in \mathbb{N}, A_n = (-1, n)\}$ so that every $A_n \subset \mathbb{R}$ and $H \subset \cup_{n \in \mathbb{N}} A_n$. If $\{A_{n_1}, A_{n_2}, \ldots, A_{n_k}\}$ is a finite subcollection, let $M = \max\{n_1, n_2, \ldots, n_k\}$ so $A_{n_j} \subset A_M$ and hence $A_M = \cup_{j=1}^{k} A_{n_j}$. However, since $A_M$ is open, $M \notin A_M$ and hence the real number $M > 0$ does not belong to a finite open subcovering of $H$. Thus we have provided one particular covering of $H$ by open sets $(-1, n)$ which cannot be replaced by a finite subcollection that also covers $H$. This was sufficient to show that $H$ is not compact. This example shows that boundedness of a set is a likely necessary condition for compactness.$^{8}$ See Figure 4.3.1.
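As a small illustration of Example 186 (a sketch only; the helper names `covered` and `uncovered_point` are hypothetical), the code below takes any finite list of indices $n_1, \ldots, n_k$ and exhibits the point $M = \max\{n_1, \ldots, n_k\}$ of $H = [0, \infty)$ that none of the intervals $(-1, n_j)$ contains.

```python
def covered(x, indices):
    """True if x lies in some open interval A_n = (-1, n) with n in indices."""
    return any(-1 < x < n for n in indices)

def uncovered_point(indices):
    """Return M = max(indices), a point of H = [0, inf) missed by the
    finite subcollection {A_n : n in indices}."""
    return max(indices)

for idx in ([1, 2, 3], [5, 40, 7]):
    M = uncovered_point(idx)
    print(idx, "misses the point", M, "->", not covered(M, idx))
```

The check succeeds for every finite index list because $M$ is the right endpoint of the largest interval, and open intervals exclude their endpoints.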
Lemma 187 Let $H \subset X$. If $H$ is compact, then $H$ is bounded.
Proof. Let $x_0 \in H$. Let $A_m = \{x \in X : d(x_0, x) < m\}$. Here we construct an increasing nested sequence of open sets $A_m$ whose countable union contains $H$. That is, $H \subset \cup_{m=1}^{\infty} A_m = X$ and $A_1 \subset A_2 \subset \ldots \subset A_m \subset \ldots$ It follows from the definition of compactness that there is a finite number $M$ such that $A_1 \cup A_2 \cup \ldots \cup A_M$ covers $H$. Then $H \subset A_M$ and hence $H$ is bounded.
Example 188 $H = [0,1)$ cannot be covered by a finite subcollection of the sets $A_n = (-1, 1 - \frac{1}{n})$ for $n \in \mathbb{N}$. It is simple to see that $H \subset \cup_{n \in \mathbb{N}} A_n$ and each $A_n \subset \mathbb{R}$. However, if $\{A_{n_1}, A_{n_2}, \ldots, A_{n_k}\}$ is a finite subcollection and we let $M = \max\{n_1, n_2, \ldots, n_k\}$, then $A_{n_j} \subset A_M$ and hence $A_M = \cup_{j=1}^{k} A_{n_j}$. However, since $A_M$ is open, $1 - \frac{1}{M} \notin A_M$ and hence the real number $1 - \frac{1}{M} \in H$ does not belong to a finite open subcovering of $H$. This example shows that closedness of a set is a likely necessary condition for compactness.$^{9}$ See Figure 4.3.2.
Lemma 189 Let $H \subset X$. If $H$ is compact, then $H$ is closed.
Proof. $H$ is closed $\Leftrightarrow$ $X \backslash H$ is open. Let $x \in X \backslash H$ and construct an increasing nested sequence of open sets $A_k$ around but not including $x$ with the property that their countable union is $X \backslash \{x\}$. That is, $A_k = \{y \in X : d(x,y) > \frac{1}{k}\}$, $k \in \mathbb{N}$, in $X$. Then $X \backslash \{x\} = \cup_{k \in \mathbb{N}} A_k$. Since $x \notin H$, each element of $H$ is in some set $A_k$ by an application of Corollary 100 so that $H \subset \cup_{k \in \mathbb{N}} A_k$. Since $H$ is compact, it follows from the definition of compactness that there is a finite $K \in \mathbb{N}$ such that $H \subset \cup_{k=1}^{K} A_k = A_K$. In that case, there is an open ball around $x$ such that $B_{\frac{1}{K+1}}(x) \subset X \backslash H$ with $B_{\frac{1}{K+1}}(x) \cap H = \emptyset$. Since $x$ was arbitrary, each point in the complement of $H$ is contained in an open ball in $X \backslash H$. Thus $X \backslash H$ is open, in which case $H$ is closed. See Figure 4.3.3.
$^{8}$ Unboundedness in this example was really just an application of the Archimedean Theorem 99.
$^{9}$ Lack of closedness in this example was really just an application of the Corollary to the Archimedean Theorem 100.
Theorem 190 A closed subset of a compact set is compact.
Proof. Let $X$ be compact and $H \subset X$ be closed. Let $\mathcal{C} = \{A_i\}$ be an open covering of $H$. Then $\mathcal{G} = \{A_i\} \cup \{X \backslash H\}$ is an open covering of $X$ since $X \backslash H$ is open (because $H$ is closed). Since $X$ is compact, there exists a finite subcollection $\mathcal{F}$ of $\mathcal{G}$ covering $X$. Since $\mathcal{F}$ also covers $H$, then $\mathcal{F} \backslash \{X \backslash H\}$ also covers $H$ and is a subcollection of $\mathcal{C}$. Then $H$ is compact.
Lemmas 187 and 189 provide necessary conditions for a set to be compact. But we would like to have sufficient conditions that guarantee compactness of a set. To that end, Theorem 190 is useful but has limited applicability. The original space has to be compact in order to be able to use it. Are the necessary conditions of Lemmas 187 and 189 in fact sufficient? Not necessarily, as the next example shows.
Example 191 Consider the metric space $(\mathbb{R}, d_0)$ with $d_0(x,y) = \min\{|x - y|, 1\}$. In this case, $\mathbb{R}$ is bounded since $d_0(x,y) \leq 1$, $\forall x,y \in \mathbb{R}$. We also know that $\mathbb{R}$ is closed. It is clear, however, that $\mathbb{R}$ is not compact in $(\mathbb{R}, d_0)$ since the collection $\mathcal{A} = \{(-n, n), n \in \mathbb{N}\}$ covers $\mathbb{R}$ but it doesn't contain a finite subcollection that also covers $\mathbb{R}$.
Exercise 4.3.1 Show that $d_0$ is a metric on $\mathbb{R}$.
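Before attempting Exercise 4.3.1, it may help to spot-check the metric axioms numerically. The sketch below is not a proof: it only tests non-negativity, symmetry, the identity of indiscernibles and the triangle inequality for $d_0$ on a random finite sample. The names `d0`, `check_metric` and `sample` are illustrative.

```python
import itertools
import random

def d0(x, y):
    """The bounded metric of Example 191: min(|x - y|, 1)."""
    return min(abs(x - y), 1.0)

def check_metric(points, tol=1e-12):
    """Spot-check the metric axioms for d0 on a finite set of points."""
    for x, y, z in itertools.product(points, repeat=3):
        if d0(x, y) < 0 or d0(x, y) != d0(y, x):
            return False                      # non-negativity, symmetry
        if (d0(x, y) == 0) != (x == y):
            return False                      # d0(x, y) = 0 iff x = y
        if d0(x, z) > d0(x, y) + d0(y, z) + tol:
            return False                      # triangle inequality
    return True

random.seed(0)
sample = [random.uniform(-5, 5) for _ in range(25)]
print(check_metric(sample))
print(max(d0(x, y) for x in sample for y in sample))  # never exceeds 1
```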
The space $(\mathbb{R}, |\cdot|)$ provides a clue as to a set of sufficient conditions to establish compactness. Since Lemmas 187 and 189 apply to any metric space, we know compactness implies boundedness and closedness. But in $(\mathbb{R}, |\cdot|)$, boundedness and closedness are equivalent to the Bolzano-Weierstrass property by Theorem 121 (which we attributed to Heine-Borel). Thus, isn't the Bolzano-Weierstrass property sufficient for compactness? In $(\mathbb{R}, |\cdot|)$, this is true and we now show that the Bolzano-Weierstrass property is also sufficient in any metric space. Before we do this, we begin by formulating compactness in terms of sequences (consistent with the approach we are taking in this chapter).
Definition 192 A subset $H$ of a metric space $X$ is sequentially compact if every sequence in $H$ has a subsequence that converges to a point in $H$.
Next we turn to establishing that the Bolzano-Weierstrass property, sequential compactness, and compactness are equivalent in any metric space.
Theorem 193 Let $(X,d)$ be a metric space. Let $H \subset X$. The following are equivalent: (i) $H$ is compact; (ii) Every infinite subset of $H$ has a cluster point; (iii) Every sequence in $H$ has a convergent subsequence.
Proof. (Sketch) (i $\Rightarrow$ ii) It is sufficient to prove the contrapositive that if $A \subset H$ has no cluster point, then $A$ must be finite. If $A$ has no cluster point, then $A$ contains all its cluster points (because the set of cluster points is empty and every set contains the empty set). Therefore $A$ is closed. Since $A$ is a closed subset of a compact space $H$, it is compact. For each $x \in A$, $\exists \varepsilon_x > 0$ such that $B_{\varepsilon_x}(x) \cap \{A \backslash \{x\}\} = \emptyset$ since $x$ is not a cluster point of $A$ by Definition 140. Thus, the collection $\{B_{\varepsilon_x}(x), x \in A\}$ forms an open covering of $A$. Since $A$ is compact, it is covered by finitely many $B_{\varepsilon_x}(x)$. Since each $B_{\varepsilon_x}(x)$ contains only one point of $A$, $A$ is finite.
(ii $\Rightarrow$ iii) Given $\langle s_i \rangle$, consider the set $S = \{s_i \in H : i \in \mathbb{N}\}$. If $S$ is finite, then $s^* = s_i$ for infinitely many values of $i$, in which case $\langle s_i \rangle$ has a subsequence that is constant (and hence converges automatically). If $S$ is infinite, then by (ii) it has a cluster point $s^*$. Since $s^*$ is a cluster point, we know by Definition 140 that $\forall \varepsilon = \frac{1}{n}$ there exist $s_{i_n} \in B_{\frac{1}{n}}(s^*)$ such that $s_{i_n} \neq s^*$. This allows us to construct a subsequence $\langle s_{i_1}, s_{i_2}, \ldots \rangle$ which converges to $s^*$.$^{10}$
(iii $\Rightarrow$ i) First, we show that $\forall \varepsilon > 0$, $\exists$ a finite subcovering of $H$ (by $\varepsilon$-balls). Once again, it is sufficient to prove the contrapositive: if for some $\varepsilon > 0$, $H$ has no finite subcover by $\varepsilon$-balls, then some sequence in $H$ has no convergent subsequence. If $H$ cannot be covered with a finite number of balls, construct $\langle s_i \rangle$ as follows: Choose any $s \in H$, say $s_1$. Since $B_\varepsilon(s_1)$ is not all of $H$ (which would contradict that $H$ has no finite subcover), choose $s_2 \notin B_\varepsilon(s_1)$. In general, given $\langle s_1, s_2, \ldots, s_n \rangle$, choose $s_{n+1} \notin B_\varepsilon(s_1) \cup B_\varepsilon(s_2) \cup \ldots \cup B_\varepsilon(s_n)$ since these balls don't cover $H$. By construction $d(s_{n+1}, s_i) \geq \varepsilon$ for $i = 1, \ldots, n$. Thus, $\langle s_i \rangle$ can have no convergent subsequence. The above procedure can be used to construct a finite subcollection that covers $H$.$^{11}$
$^{10}$ More specifically, define the subsequence $\langle s_{g(i)} \rangle$ approaching $s^*$ inductively as follows: Choose $i_1$ such that $s_{i_1} \in B_1(s^*)$. Since $s^*$ is a cluster point of the set $S$, it is also a cluster point of the set $S_2 = \{s_i \in S : i \in \mathbb{N}, i > i_1\}$ obtained by deleting a finite number of elements of $S$. Therefore, there is an element $s_{i_2}$ of $S_2$ which is an element of $B_{\frac{1}{2}}(s^*)$ with $i_2 > i_1$. Continuing by induction, given $i_{n-1}$, choose $i_n > i_{n-1}$ such that $s_{i_n} \in B_{\frac{1}{n}}(s^*)$.
When we put this theorem together with the result that every closed and bounded set has the Bolzano-Weierstrass property we get a simple criterion for determining compactness of a subset in $(\mathbb{R}, |\cdot|)$. In particular, all we have to do is establish that a set in $(\mathbb{R}, |\cdot|)$ is closed and bounded to know it is compact.
Corollary 194 (Heine-Borel) Given $(\mathbb{R}, |\cdot|)$, $H \subset \mathbb{R}$ is compact iff $H$ is closed and bounded.
Proof. Follows from Theorem 193 together with the Heine-Borel Theorem 121.
Corollary 194 is the "more familiar" version of the Heine-Borel Theorem and can easily be extended to $\mathbb{R}^n$ with the Euclidean metric.
Is there any relation between compactness and completeness? While it may not appear so by their definitions, the next result establishes that they are in fact related.
Lemma 195 Let $(X,d)$ be a metric space. If $X$ is compact, then it is complete.
Proof. From Theorem 193 every Cauchy sequence has a convergent subsequence. Completeness follows by Lemma 173.
The converse of Lemma 195 does not necessarily hold; that is, it doesn't follow that if $X$ is complete, then it is compact, as Example 191 shows. We need a stronger condition than boundedness to prove an analogue of the Heine-Borel Corollary 194 for general metric spaces. The condition was actually already used in part (iii) of Theorem 193.
Definition 196 A metric space $(X,d)$ is totally bounded if $\forall \varepsilon > 0$, there is a finite covering of $X$ by $\varepsilon$-balls.
$^{11}$ See the Lebesgue Number Lemma, p. 179, of Munkres for this construction. For the case of $X = \mathbb{R}^n$, see Bartle Theorem 23.3 p. 160.
As we can see, the definition of total boundedness is quite similar to Definition 185 of compactness. One might ask how to check if a metric space is totally bounded. Though there is no satisfactory answer in a general metric space, there are various criteria for specific spaces. For instance, total boundedness in $\mathbb{R}^n$ is equivalent to boundedness.
Total boundedness of a metric space implies boundedness, but the converse is not true.
Example 197 While we established that $(\mathbb{R}, d_0)$ with $d_0(x,y) = \min\{|x - y|, 1\}$ was bounded in Example 191, it is not totally bounded. This follows because all of $\mathbb{R}$ cannot be covered with finitely many balls of radius, say, $\frac{1}{4}$.
Next we establish an analogue of the Heine-Borel Theorem for general metric spaces.
Theorem 198 A metric space $(X,d)$ is compact iff it is complete and totally bounded.
Proof. ($\Rightarrow$) Completeness follows by Lemma 195. Total boundedness follows from Definition 196 given $X$ is compact: for any $\varepsilon > 0$ the collection $\{B_\varepsilon(x) : x \in X\}$ is an open covering of $X$ and so has a finite subcovering.
($\Leftarrow$) By Theorem 193 it suffices to show that if $\langle x_n \rangle$ is a sequence in $X$, then there exists a subsequence $\langle x_{g(n)} \rangle$ that converges. Since $X$ is complete, it suffices to construct a subsequence that is Cauchy. Since $X$ is totally bounded, there exist finitely many $\varepsilon = 1$ balls that cover $X$. At least one of these balls, say $B_1$, contains $x_n$ for infinitely many indices. Let $J_1 \subset \mathbb{N}$ denote the set of all such indices for which $x_n \in B_1$. Next cover $X$ by finitely many $\varepsilon = \frac{1}{2}$ balls. Since $J_1$ is infinite, at least one of these balls, say $B_2$, contains $x_n$ for infinitely many indices $n \in J_1$. Let $J_2 \subset J_1$ denote the set of all such indices for which $n \in J_1$ and $x_n \in B_2$. Using this construction, we obtain a sequence $\langle J_k \rangle$ of infinite sets such that $J_k \supset J_{k+1}$. Choose $n_1 \in J_1$ and, inductively, $n_k \in J_k$ with $n_k > n_{k-1}$. If $i,j \geq k$, then $n_i, n_j \in J_k$ and $x_{n_i}, x_{n_j}$ are contained in a ball $B_k$ of radius $\frac{1}{k}$, so $d(x_{n_i}, x_{n_j}) < \frac{2}{k}$. Hence $\langle x_{n_i} \rangle$ is Cauchy. See Figure 4.3.4.
So how does one test for total boundedness? ???EXAMPLE????
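As one small, hedged illustration of Definition 196 (an assumption-laden sketch, not the example the text leaves open), the code below builds an explicit finite set of centers whose $\varepsilon$-balls cover $[0,1]$ under $|\cdot|$. The function names and the grid of test points are hypothetical choices.

```python
import math

def epsilon_net_unit_interval(eps):
    """Centers of finitely many eps-balls that cover [0, 1] under |.|.

    Uses k = ceil(1/eps) equal steps, so neighbouring centers are at
    most eps apart and every point is within eps/2 of some center."""
    k = math.ceil(1 / eps)
    return [i / k for i in range(k + 1)]

def is_covered(x, centers, eps):
    """True if x lies in the open eps-ball around at least one center."""
    return any(abs(x - c) < eps for c in centers)

for eps in (0.5, 0.1, 0.03):
    centers = epsilon_net_unit_interval(eps)
    grid_ok = all(is_covered(j / 1000, centers, eps) for j in range(1001))
    print(eps, len(centers), grid_ok)
```

The number of centers grows like $1/\varepsilon$ but stays finite for every $\varepsilon > 0$, which is exactly what total boundedness of $[0,1]$ requires. By contrast, Example 197 shows that no finite set of centers works for $(\mathbb{R}, d_0)$ with radius $\frac{1}{4}$.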
4.4 Connectedness
Connectedness of a space is very simple. A space is "disconnected" if it can be broken up into separate globs, otherwise it is connected. More formally,
Definition 199 Let $(X,d)$ be a metric space. $S \subset X$ is disconnected (or separated) if there exist a pair of open sets $T, U$ such that $S \cap U$ and $S \cap T$ are disjoint, non-empty and have union $S$. $S$ is connected if it is not disconnected. See Figure 4.4.1.
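Example 200 (a) Let $(X,d) = (\mathbb{R}, |\cdot|)$ and $H = \mathbb{N}$. Then $\mathbb{N}$ is disconnected in $\mathbb{R}$ since we can take $T = \{x \in \mathbb{R} : x > \frac{3}{2}\}$ and $U = \{x \in \mathbb{R} : x < \frac{3}{2}\}$. Then $T \cap \mathbb{N} \neq \emptyset \neq U \cap \mathbb{N}$, $(T \cap \mathbb{N}) \cap (U \cap \mathbb{N}) = \emptyset$, and $\mathbb{N} = (T \cap \mathbb{N}) \cup (U \cap \mathbb{N})$. (b) Let $(X,d) = (\mathbb{R}, |\cdot|)$ and $H = \mathbb{Q}_+$. Then $\mathbb{Q}_+$ is disconnected in $\mathbb{R}$ since we can take $T = \{x \in \mathbb{R} : x > \sqrt{2}\}$ and $U = \{x \in \mathbb{R} : x < \sqrt{2}\}$. See Figure 4.4.2.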
Theorem 201 $I = [0,1]$ is a connected subset of $\mathbb{R}$.
Proof. Suppose, to the contrary, that there are two disjoint non-empty open sets $A, B$ whose union is $I$. Since $A$ and $B$ are open, they do not consist of a single point. WLOG let $a \in A$ and $b \in B$ such that $0 < a < b < 1$. Let $c = \sup\{x \in A : x < b\}$, which exists by Axiom 3. Since $0 < c < 1$, $c \in A \cup B$. If $c \in A$, then $c \neq b$ and since $A$ is open, there is a point $a_1 \in A$ with $c < a_1$ such that $[c, a_1]$ is contained in $\{x \in A : x < b\}$. But this contradicts the definition of $c$. A similar argument can be made if $c \in B$.
This result can easily be extended to any (open, closed, half open, etc.) interval of $\mathbb{R}$ and to show $\mathbb{R}$ itself is connected. Furthermore, it is possible to construct cartesian products of connected sets which are themselves connected.$^{12}$
4.5 Normed Vector Spaces
Before moving on to the next topological concept (continuity), we give an example of a specific type of metric space called a normed vector space. Normed vector spaces are by far the most important type of metric space we will deal with in this book. A normed vector space has features that a metric space doesn't have in general; it possesses a certain algebraic structure. Elements of a vector space (called vectors) can be added, subtracted, and multiplied by a number (called a scalar). See Figure 4.5.1 for the relation between metric spaces and vector spaces.
12
See Munkres p.150.
Definition 202 A vector space (or linear space) is a set $V$ of arbitrary elements (called vectors) on which two binary operations are defined: (i) closed under vector addition (if $u, v \in V$, then $u + v \in V$) and (ii) closed under scalar multiplication (if $a \in \mathbb{R}$ and $v \in V$, then $av \in V$) which satisfy the following axioms:
C1. $u + v = v + u$, $\forall u,v \in V$
C2. $(u + v) + w = u + (v + w)$, $\forall u,v,w \in V$
C3. $\exists 0 \in V$ such that $v + 0 = v = 0 + v$, $\forall v \in V$
C4. For each $v \in V$, $\exists (-v) \in V$ such that $v + (-v) = 0 = (-v) + v$
C5. $1v = v$, $\forall v \in V$
C6. $a(bv) = (ab)v$, $\forall a,b \in \mathbb{R}$ and $\forall v \in V$
C7. $a(u + v) = au + av$, $\forall a \in \mathbb{R}$ and $\forall u,v \in V$
C8. $(a + b)v = av + bv$, $\forall a,b \in \mathbb{R}$ and $\forall v \in V$
Example 203 $\mathbb{R}$ is the simplest vector space. The elements are real numbers where "$+$" and "$\cdot$" were introduced in Axiom 1. $\mathbb{R}^2$ is also a vector space whose basic elements are 2-tuples, say $(x_1, x_2)$. We interpret $(x_1, x_2)$ not as a point in $\mathbb{R}^2$ with coordinates $(x_1, x_2)$ but as a displacement from some location. For instance, the vector $(1,2)$ means move one unit to the right and two units up from your current location. See Figure 4.5.2a for an example of the vector $(1,2)$ from two different initial locations. Often we take the initial location to be the origin. Vector addition (see Figure 4.5.2b) is then defined as $(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$ and scalar multiplication (see Figure 4.5.2c) is defined as $a(x_1, x_2) = (ax_1, ax_2)$. Let $F(X, \mathbb{R})$ be the set of all real valued functions $f : X \to \mathbb{R}$. Then we can define $(f + g)(x) = f(x) + g(x)$ and $(\alpha f)(x) = \alpha f(x)$. These two operations satisfy Axioms C1 - C8 and hence $F(X, \mathbb{R})$ is a vector space. We will consider such sets extensively in Chapter 6.
Definition 204 A vector subspace $U \subset V$ is a subset of $V$ which is a vector space itself.
Example 205 Let $V = \mathbb{R}^2 = \{(x,y) : x,y \in \mathbb{R}\}$ and $U = \{(x,y) : y = 2x$, where $x,y \in \mathbb{R}\}$. Then $U$ is a vector subspace of $V$. Note that $Z = \{(x,y) : y = 2x + 1$, where $x,y \in \mathbb{R}\}$ is a subset of $V$ but it is not a vector subspace of $V$ since $0 \notin Z$.
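The contrast in Example 205 can be checked on a few concrete vectors. The sketch below is illustrative only (the helper names `in_U`, `in_Z`, `add` and `scale` are assumptions, and a finite check is not a proof): it verifies closure of $U$ under the two operations for sample vectors and shows that $Z$ fails both to contain the zero vector and to be closed under addition.

```python
from fractions import Fraction

def in_U(p):
    """Membership in U = {(x, y) : y = 2x}."""
    return p[1] == 2 * p[0]

def in_Z(p):
    """Membership in Z = {(x, y) : y = 2x + 1}."""
    return p[1] == 2 * p[0] + 1

def add(p, q):
    """Vector addition in R^2."""
    return (p[0] + q[0], p[1] + q[1])

def scale(a, p):
    """Scalar multiplication in R^2."""
    return (a * p[0], a * p[1])

u, v = (1, 2), (Fraction(-3, 2), -3)       # two elements of U
z1, z2 = (0, 1), (1, 3)                    # two elements of Z

print(in_U(add(u, v)), in_U(scale(Fraction(7, 3), u)), in_U((0, 0)))
print(in_Z((0, 0)), in_Z(add(z1, z2)))     # Z misses 0 and is not closed
```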
The algebraic structure of a vector space by itself doesn't allow us to measure distance between elements and hence doesn't allow us to define topological properties. This can be accomplished in a vector space through a distance function called the norm.
Definition 206 If $V$ is a vector space, then a norm on $V$ is a function from $V$ to $\mathbb{R}$, denoted $\|\cdot\| : V \to \mathbb{R}$, which satisfies the following properties $\forall u,v \in V$ and $\forall a \in \mathbb{R}$: (i) $\|v\| \geq 0$, (ii) $\|v\| = 0$ iff $v = 0$, (iii) $\|av\| = |a| \|v\|$, and (iv) $\|u + v\| \leq \|u\| + \|v\|$. A vector space in which a norm has been defined is called a normed space.
Notice that the algebraic operations vector addition and scalar multiplication are used in defining a norm. Thus, a norm cannot be defined in a general metric space which is not equipped with these operations. But a vector space equipped with a norm can be seen as a metric space and a metric space which has a linear structure is also a normed vector space. The following theorem establishes this relationship.
Theorem 207 Let $V$ be a vector space. Then
(i) If $(V, d)$ is a metric space then $(V, \|\cdot\|)$ is a normed vector space with the norm $\|\cdot\| : V \to \mathbb{R}$ defined by $\|x\| = d(x, 0)$, $\forall x \in V$.
(ii) If $(V, \|\cdot\|)$ is a normed vector space then $(V, \rho)$ is a metric space with the metric $\rho : V \times V \to \mathbb{R}$ defined by $\rho(x,y) = \|x - y\|$, $\forall x,y \in V$.
Exercise 4.5.1 Prove Theorem 207.
Note that whenever a metric space has the additional algebraic structure given in Definition 202, we will use the norm rather than the metric and hence work in normed vector spaces.
Exercise 4.5.2 Redefine convergence, open balls, and boundedness in terms of normed vector spaces.
Definition 208 A complete normed vector space is called a Banach space.
Some vector spaces are endowed with another operation, called an inner (or dot) product, that assigns a real number to each pair of vectors. The inner product enables us to measure the "angle" between elements of a vector space.$^{13}$
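Definition 209 If $V$ is a vector space, then an inner product is a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ which satisfies the following properties $\forall u,v,w \in V$ and $\forall a \in \mathbb{R}$: (i) $\langle v,v \rangle \geq 0$, (ii) $\langle v,v \rangle = 0$ iff $v = 0$, (iii) $\langle u,v \rangle = \langle v,u \rangle$, (iv) $\langle u, (v + w) \rangle = \langle u,v \rangle + \langle u,w \rangle$, (v) $\langle (au), v \rangle = a \langle u,v \rangle = \langle u, (av) \rangle$. A vector space in which an inner product has been defined is called an inner product space.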
The inner product can be used to define a norm (in particular the Euclidean measure of distance) in the following way.
Theorem 210 Let $V$ be an inner product space and define $\|v\| = \sqrt{\langle v,v \rangle}$. Then $\|\cdot\| : V \to \mathbb{R}$ is a norm which satisfies the Cauchy-Schwarz inequality $\langle u,v \rangle \leq \|u\| \|v\|$.
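Proof. (Sketch) Since $\langle v,v \rangle \geq 0$ by part (i) of Definition 209, $\sqrt{\langle v,v \rangle}$ exists and is nonnegative, establishing part (i) of Definition 206. Part (ii) also follows from (ii) of Definition 209. By part (v) of Definition 209, $\sqrt{\langle (av), (av) \rangle} = \sqrt{a^2 \langle v,v \rangle} = |a| \sqrt{\langle v,v \rangle} = |a| \|v\|$, establishing part (iii). To establish Cauchy-Schwarz, let $w = au - bv$ for $a,b \in \mathbb{R}$ and $u,v \in V$. By Definition 202, $w \in V$. Then
$0 \leq \langle w,w \rangle = a^2 \langle u,u \rangle - 2ab \langle u,v \rangle + b^2 \langle v,v \rangle$
$= \|v\|^2 \|u\|^2 - 2 \|v\| \|u\| \langle u,v \rangle + \|u\|^2 \|v\|^2$
$= 2 \|u\| \|v\| (\|u\| \|v\| - \langle u,v \rangle)$
where the second equality follows by letting $a = \|v\|$ and $b = \|u\|$, which were free parameters in the first place. If $\|u\| \|v\| > 0$, dividing by $2 \|u\| \|v\|$ gives $\langle u,v \rangle \leq \|u\| \|v\|$; if $\|u\| \|v\| = 0$, then $u = 0$ or $v = 0$ and both sides are zero.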
To get some intuition for this result, notice that if $\theta$ is the angle between vectors $u$ and $v$, then the relationship between the inner product and norms of the vectors is given by $\langle u,v \rangle = \|v\| \|u\| \cos\theta$. The inequality then follows since $\cos\theta \in [-1,1]$. See Figure 4.5.6 for this geometric interpretation of the Cauchy-Schwarz inequality.
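The inequality and its geometric reading can be spot-checked numerically. The sketch below assumes the standard dot product on $\mathbb{R}^n$ as the inner product; the helper names `inner`, `norm` and `angle_cos` are illustrative and a random check is of course not a proof.

```python
import math
import random

def inner(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Norm induced by the inner product: sqrt(<v, v>)."""
    return math.sqrt(inner(v, v))

def angle_cos(u, v):
    """cos(theta) = <u, v> / (||u|| ||v||) for nonzero u and v."""
    return inner(u, v) / (norm(u) * norm(v))

random.seed(1)
for _ in range(5):
    u = [random.uniform(-1, 1) for _ in range(4)]
    v = [random.uniform(-1, 1) for _ in range(4)]
    print(inner(u, v) <= norm(u) * norm(v), round(angle_cos(u, v), 3))
```

Orthogonal vectors such as $(1,0)$ and $(0,1)$ give `inner` equal to zero, while parallel vectors give `angle_cos` equal to one (up to rounding), the case in which Cauchy-Schwarz holds with equality.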
$^{13}$ For instance, orthogonality is just $\langle u,v \rangle = 0$.
Exercise 4.5.3 Finish the proof of Theorem 210 (i.e. establish the triangle inequality in part (iv) of Definition 206).
Whereas some norms (e.g. the Euclidean norm) can be induced from an inner product, other norms (e.g. the sup norm) cannot be.
Definition 211 A complete inner product space is called a Hilbert space.
Note that a Hilbert space is also a Banach space.
4.5.1 Convex sets
Definition 212 We say that a linear combination of $x_1, \ldots, x_n \in V$ is $\sum_{i=1}^{n} \alpha_i x_i$ with $\alpha_i \in \mathbb{R}$, $i = 1, \ldots, n$. We say that a convex combination of $x_1, \ldots, x_n \in V$ is $\sum_{i=1}^{n} \alpha_i x_i$ with $\alpha_i \geq 0$, $i = 1, \ldots, n$, and $\sum_{i=1}^{n} \alpha_i = 1$.
Definition 213 A subset $S$ of a vector space $V$ is a convex set if for every $x,y \in S$, the convex combination $\alpha x + (1 - \alpha) y \in S$, for $0 \leq \alpha \leq 1$.$^{14}$
Example 214 In $\mathbb{R}$, any interval (e.g. $(a,b)$) is convex but $(a,b) \cup (c,d)$ with $b < c$ is not convex. See Figure 4.5.3 for convex sets in $\mathbb{R}^2$.
Definition 215 The sum (difference) of two subsets $S_1$ and $S_2$ of a vector space $V$ is $S_1 \pm S_2 = \{v \in V : v = x \pm y, x \in S_1, y \in S_2\}$. See Figure 4.5.4.
Theorem 216 (Properties of Convex Sets) If $K_1$ and $K_2$ are convex sets, then the following sets are convex: (i) $K_1 \cap K_2$; (ii) $\lambda K_1$; (iii) $K_1 \pm K_2$.
Proof. (iii) Let $x,y \in K_1 + K_2$ so that $x = x_1 + x_2$, $x_1 \in K_1$, $x_2 \in K_2$, and $y = y_1 + y_2$, $y_1 \in K_1$, $y_2 \in K_2$. Then $\alpha x + (1 - \alpha) y = \alpha (x_1 + x_2) + (1 - \alpha)(y_1 + y_2) = (\alpha x_1 + (1 - \alpha) y_1) + (\alpha x_2 + (1 - \alpha) y_2) \equiv z_1 + z_2$. Since $K_1$ and $K_2$ are convex, $z_1 \in K_1$, $z_2 \in K_2$. Thus, $\alpha x + (1 - \alpha) y \in K_1 + K_2$.
Example 217 It is simple to show cases where $K_1$ and $K_2$ are convex sets, but $K_1 \cup K_2$ is not convex. See Example 214 in $\mathbb{R}$.
$^{14}$ For instance, for $x_1, \ldots, x_n \in S$, the convex combination is $\sum_{i=1}^{n} \alpha_i x_i$, where $\sum_{i=1}^{n} \alpha_i = 1$ and $\alpha_i \geq 0$.
As we will see later, convexity of a set is a desirable property. If a set S
is not convex, we may replace it with the smallest convex set containing S
called the convex hull.
Definition 218 Let $S \subset V$. The convex hull of $S$ is the set of all convex combinations of elements from $S$, denoted $co(S)$. That is, $co(S) = \{x \in V : x = \sum_{i=1}^{n} \alpha_i x_i, x_i \in S, \alpha_i \geq 0, \sum_{i=1}^{n} \alpha_i = 1\}$.
Example 219 In $\mathbb{R}^n$, consider two vectors $A \neq B$. Then $co(\{A,B\})$ is just a line segment with endpoints $A$, $B$. If $A,B,C$ do not lie on the same line, then $co(\{A,B,C\})$ is the triangle $A,B,C$. See Figure 4.5.5.
Exercise 4.5.4 Show that if $V$ is convex, then (i) $co(V) = V$, and (ii) if $S \subset V$, then $co(S) \subset V$ is the smallest convex set containing $S$.
4.5.2 A finite dimensional vector space: $\mathbb{R}^n$
The most familiar vector space is just $\mathbb{R}^n$, with $n \in \mathbb{N}$ and $n < \infty$. $\mathbb{R}^n$ is the collection of all ordered $n$-tuples $(x_1, x_2, \ldots, x_n)$ with $x_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$. Vector addition is defined as $(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$ and scalar multiplication is defined as $a(x_1, x_2, \ldots, x_n) = (ax_1, ax_2, \ldots, ax_n)$.
Exercise 4.5.5 Verify $\mathbb{R}^n$ is a vector space under these operations.
Example 220 Since $(\mathbb{R}, |\cdot|)$ is a complete metric space with absolute value metric, it is a Banach space with the norm $\|x\| = |x|$. Since $(\mathbb{R}^n, \|x\|_2)$ is a complete metric space with Euclidean metric, it is a Banach space with Euclidean norm $\|x\|_2 = \sqrt{\sum_{i=1}^{n} (x_i)^2}$. Since $(\mathbb{R}^n, \|x\|_\infty)$ is a complete metric space with supremum metric, it is also a Banach space with sup norm $\|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}$.
The next result provides a useful characterization of the relationship between $\|x\|_\infty$ and $\|x\|_2$.
Theorem 221 If $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then $\|x\|_\infty \leq \|x\|_2 \leq \sqrt{n} \|x\|_\infty$.
Proof. Since $(\|x\|_2)^2 = \sum_{i=1}^{n} (x_i)^2$, it is clear that $|x_i| \leq \|x\|_2$, $\forall i$. Similarly, if $M = \max\{|x_1|, \ldots, |x_n|\}$, then $(\|x\|_2)^2 \leq n M^2$, so $\|x\|_2 \leq \sqrt{n} M$.
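A quick numerical sanity check of Theorem 221 follows. It is a sketch with illustrative helper names (`norm_sup`, `norm_2`, `satisfies_theorem_221`), and a small tolerance is an added assumption to absorb floating-point rounding.

```python
import math

def norm_sup(x):
    """Sup norm: max |x_i|."""
    return max(abs(t) for t in x)

def norm_2(x):
    """Euclidean norm: sqrt(sum x_i^2)."""
    return math.sqrt(sum(t * t for t in x))

def satisfies_theorem_221(x, tol=1e-12):
    """Check ||x||_inf <= ||x||_2 <= sqrt(n) ||x||_inf up to rounding."""
    n = len(x)
    return (norm_sup(x) <= norm_2(x) + tol
            and norm_2(x) <= math.sqrt(n) * norm_sup(x) + tol)

for x in ([3, -4], [1, 1, 1, 1], [0, 0, 7]):
    print(x, norm_sup(x), norm_2(x), satisfies_theorem_221(x))
```

The vector $(1,1,1,1)$ attains the upper bound ($\|x\|_2 = 2 = \sqrt{4} \cdot 1$) and $(0,0,7)$ attains the lower bound ($\|x\|_\infty = \|x\|_2 = 7$), so neither inequality can be tightened in general.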
Example 220 shows that $\mathbb{R}^n$ can be endowed with two different norms. One might ask if these two normed vector spaces are somehow related and if so, in what sense? Distances between two points with respect to these two norms are generally different. See Figure 4.5.7. In this case, these two normed vector spaces are not isometric. On the other hand, these two spaces have identical topological properties like openness, closedness, compactness, connectedness, and continuity. In this case, we say that these two normed vector spaces are homeomorphic or topologically equivalent.
To show that two metric spaces (or normed vector spaces according to Theorem 207) are topologically equivalent it suffices to show that the collections of open sets in both spaces are identical. This is because all topological properties can be defined in terms of open sets. The fact that open sets are identical follows from Theorem 221. To see this, let $A$ be open in $\mathbb{R}^n$ under the Euclidean norm. Then $\forall x \in A$, $\exists \varepsilon > 0$ such that $\{y \in \mathbb{R}^n : \|y - x\|_2 < \varepsilon\} \subset A$. But from the second part of the inequality in Theorem 221 we know that $\|y - x\|_\infty < \varepsilon / \sqrt{n}$ implies $\|y - x\|_2 \leq \sqrt{n} \|y - x\|_\infty < \varepsilon$, so $\{y \in \mathbb{R}^n : \|y - x\|_\infty < \varepsilon / \sqrt{n}\} \subset A$. Hence $A$ is open in $\mathbb{R}^n$ under the sup norm. The converse can be shown the same way using the first part of the inequality. We will discuss this at further length after we introduce continuity.
Example 222 In $\mathbb{R}^n$, define $\langle (x_1, x_2, \ldots, x_n), (y_1, y_2, \ldots, y_n) \rangle = x_1 y_1 + x_2 y_2 + \ldots + x_n y_n$. $\mathbb{R}^n$ with the inner product defined this way is a Hilbert space.
Exercise 4.5.6 Verify the dot product in Example 222 defines an inner product on $\mathbb{R}^n$.
Theorem 223 In the Euclidean space $\mathbb{R}^n$ a sequence of vectors $\langle x_m \rangle$ converges to a vector $x = (x^1, \ldots, x^n)$ if and only if each component $\langle x_m^i \rangle$ converges to $x^i$, $i = 1, \ldots, n$.
Exercise 4.5.7 Prove Theorem 223.
We next introduce the simplest kind of convex set in $\mathbb{R}^n$.
Definition 224 A nondegenerate simplex in $\mathbb{R}^n$ is the set of all points
$S = \{x \in \mathbb{R}^n : x = \alpha_0 v_0 + \alpha_1 v_1 + \ldots + \alpha_n v_n, \alpha_0 \geq 0, \ldots, \alpha_n \geq 0$ and $\sum_{i=0}^{n} \alpha_i = 1\}$ (4.2)
where $v_0, v_1, \ldots, v_n$ are vectors from $\mathbb{R}^n$ such that $v_1 - v_0, v_2 - v_0, \ldots, v_n - v_0$ are linearly independent. Vectors $v_0, v_1, \ldots, v_n$ are called vertices. The numbers $\alpha_0, \ldots, \alpha_n$ are called barycentric coordinates (the weights of the convex combinations with respect to $n + 1$ fixed vertices) of the point $x$.
Example 225 A nondegenerate simplex in $\mathbb{R}^1$ is a line segment, in $\mathbb{R}^2$ is a triangle, in $\mathbb{R}^3$ is a tetrahedron. In $\mathbb{R}^3$, for example, the simplex is determined by 4 vertices, any 3 vertices determine a boundary face, any 2 vertices determine a boundary segment. See Figure 4.5???(4.8.2.???)
A simplex is just the convex hull of the set of all vertices $V = \{v_0, v_1, \ldots, v_n\}$. By the following theorem, any point of a convex hull of $V$ can be expressed as a convex combination of these vertices. Do not confuse the $n + 1$ barycentric coordinates (the $\alpha_i$) of $x$ with the $n$ cartesian coordinates of $x$.
Theorem 226 (Caratheodory) If $X \subset \mathbb{R}^n$ and $x \in co(X)$, then $x = \sum_{i=1}^{n+1} \lambda_i x_i$ for some $\lambda_i \geq 0$, $\sum_{i=1}^{n+1} \lambda_i = 1$, $x_i \in X$, $\forall i$.
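Proof. (Sketch) Since $x \in co(X)$, it can be written as a convex combination of $m$ points by Theorem ??. If $m \leq n + 1$, we are done. If not, then the generated vectors
$\begin{pmatrix} x_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ 1 \end{pmatrix}, \ldots, \begin{pmatrix} x_m \\ 1 \end{pmatrix}$
are $m > n + 1$ vectors in $\mathbb{R}^{n+1}$ and hence linearly dependent, so a combination of them will be zero (i.e.
$\sum_{i=1}^{m} \mu_i \begin{pmatrix} x_i \\ 1 \end{pmatrix} = 0$
with $\mu_i$ not all zero). If $\lambda_i$ are the coefficients of $x_i$, we can choose $\alpha$ to reduce the number of vectors with nonzero coefficients below $m$ by setting $\theta_i \equiv \lambda_i - \alpha \mu_i$.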
We know that in $\mathbb{R}^n$ each vector can be written as a linear combination of $n$ linearly independent vectors (called a basis). That is, $x = \sum_{i=1}^{n}\alpha_i x_i$, where $\{x_1,\ldots,x_n\}$ is a basis. There is no restriction on the coefficients $\alpha_i$. In Theorem 226 there are additional assumptions put on the $\alpha_i$ (i.e. $\sum_{i=1}^{n}\alpha_i = 1$ and $\alpha_i \ge 0$). Now adding one more variable (from $n$ to $n+1$) to the system yields a unique solution for vectors belonging to $co(V)$ and no solution for other vectors.
The following two examples demonstrate the difference between cartesian coordinates and barycentric coordinates in $\mathbb{R}^2$.
Example 227 Let $V = \{(0,1),(1,0),(1,1)\}$. Say we want to express the vector $(\frac{2}{3},\frac{2}{3})$ as a linear combination of $(0,1)$ and $(1,0)$, two basis vectors. That is, $(\frac{2}{3},\frac{2}{3}) = \frac{2}{3}(1,0)+\frac{2}{3}(0,1)$, but this is not a convex combination since $\frac{2}{3}+\frac{2}{3} = \frac{4}{3} \ne 1$. But any point from $co(V)$ can be uniquely expressed as a convex combination of vectors from $V$. For instance,
$$(\tfrac{2}{3},\tfrac{2}{3}) = \alpha(1,0)+\beta(0,1)+(1-\alpha-\beta)(1,1)$$
where $0 \le \alpha,\beta \le 1$. Letting $\alpha = \beta = \frac{1}{3}$, we have
$$(\tfrac{2}{3},\tfrac{2}{3}) = \tfrac{1}{3}(1,0)+\tfrac{1}{3}(0,1)+\tfrac{1}{3}(1,1).$$
On the other hand, a vector outside $co(V)$ (like $(\frac{1}{3},0)$) cannot be expressed as a convex combination of vectors from $V$. See Figure 4.5.8.
Example 228 Let the fixed vertices be given by $v_0 = (0,1)$, $v_1 = (0,3)$, $v_2 = (2,0)$ and consider the point $x_1 = (1,1)$ in the interior of the simplex. See Figure 4.5????(4.8.3.) The barycentric coordinates of $x_1$ with respect to the vertices $v_0,v_1,v_2$ are $(\frac{1}{4},\frac{1}{4},\frac{1}{2})$ since $(1,1) = \frac{1}{4}(0,1)+\frac{1}{4}(0,3)+\frac{1}{2}(2,0)$. In the case of $x_2 = (0,2)$, the barycentric coordinates are $(\frac{1}{2},\frac{1}{2},0)$ since $(0,2) = \frac{1}{2}(0,1)+\frac{1}{2}(0,3)+0\cdot(2,0)$. In the case of $x_3 = (2,0)$, the barycentric coordinates are $(0,0,1)$ since $(2,0) = 0\cdot(0,1)+0\cdot(0,3)+1\cdot(2,0)$. Notice that $x_1$ is an interior point of the simplex so that all its barycentric coordinates are positive, that $x_2$ is on the boundary so that one barycentric coordinate is 0, and that $x_3$ is a vertex so that it has 2 barycentric coordinates which are zero.
In this section, we will always mean by $\alpha_i$ the barycentric coordinates of a point inside the fixed simplex (including boundary points).
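The computations in Examples 227 and 228 can be checked mechanically. The following is only a minimal sketch for $\mathbb{R}^2$: the function name `barycentric_2d` is ours (hypothetical), and the use of exact rational arithmetic and Cramer's rule is an implementation choice, not something prescribed by the text.

```python
from fractions import Fraction

def barycentric_2d(v0, v1, v2, x):
    """Return (a0, a1, a2) with x = a0*v0 + a1*v1 + a2*v2 and a0 + a1 + a2 = 1.

    Hypothetical helper: solves a1*(v1 - v0) + a2*(v2 - v0) = x - v0
    by Cramer's rule, then recovers a0 from the sum-to-one constraint.
    """
    e1 = (v1[0] - v0[0], v1[1] - v0[1])
    e2 = (v2[0] - v0[0], v2[1] - v0[1])
    r = (x[0] - v0[0], x[1] - v0[1])
    det = Fraction(e1[0] * e2[1] - e1[1] * e2[0])
    if det == 0:
        # v1 - v0 and v2 - v0 are linearly dependent: the simplex is degenerate
        raise ValueError("degenerate simplex")
    a1 = (r[0] * e2[1] - r[1] * e2[0]) / det
    a2 = (e1[0] * r[1] - e1[1] * r[0]) / det
    return (1 - a1 - a2, a1, a2)

def in_simplex(coords):
    """A point lies in the simplex iff all barycentric coordinates are >= 0."""
    return all(a >= 0 for a in coords)

# Vertices of Example 228 and its three points.
v0, v1, v2 = (0, 1), (0, 3), (2, 0)
for point in [(1, 1), (0, 2), (2, 0)]:
    print(point, barycentric_2d(v0, v1, v2, point))
```

A point outside the convex hull shows up as a negative coordinate, which is how `in_simplex` separates the point $(\frac{1}{3},0)$ of Example 227 from $(\frac{2}{3},\frac{2}{3})$.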
The next result, while purely combinatorial, will be used in the proof of Brouwer's Fixed Point Theorem 302. While this can be proven for $\mathbb{R}^n$, here we present it for $\mathbb{R}^2$. First we must introduce an indexing scheme for points in the simplex as follows. Let $Z$ be a set of labels in $\mathbb{R}^2$ given by $\{0,1,2\}$. While the index function $I : S \to Z$ can take any value from $Z$ for points $x$ inside the simplex, it must satisfy the following restrictions on the boundary:
$$I(x) = \begin{cases} 0 \text{ or } 1 & \text{on the line segment } (v_0,v_1) \\ 0 \text{ or } 2 & \text{on the line segment } (v_0,v_2) \\ 1 \text{ or } 2 & \text{on the line segment } (v_1,v_2) \end{cases} \tag{4.3}$$
For example, on the boundary $(v_0,v_1)$, $I(x)$ cannot take the value 2. Thus $I(x) = 0$ or $1$ on the line segment $(v_0,v_1)$. See Figure 4.5???(4.8.6???). Note that (4.3) implies that $I(v_0) = 0$, $I(v_1) = 1$, and $I(v_2) = 2$ at the vertices.
Lemma 229 (Sperner) Form the barycentric subdivision of a nondegenerate simplex. Label each vertex with an index $I(x) = 0,1,2$ that satisfies the restrictions (4.3) on the boundary. Then there is an odd number of cells (thus at least 1) in the subdivision that have vertices with the complete set of labels 0,1,2.
Proof. By induction on $n$. We show just the first step (i.e. for $n = 1$) to get the idea. If $n = 1$, a nondegenerate simplex is a line segment and a face is a point. To obey the restrictions (4.3), one end has label 0, the other has label 1, and the rest is arbitrary. See Figure 4.5(4.8.11????). Next define a counting function $F$, where by $F(a,b)$ we mean the number of elements in the simplex of type $(a,b)$. For example, in Figure 4.5(4.8.11????), $F(0,0) = 2$, $F(0,1) = 3$, $F(1,1) = 1$, $F(0) = 4$, $F(1) = 3$. Permutations don't matter (i.e. $(0,1)$ and $(1,0)$ are the same type, which is why $F(0,1) = 3$ since we have two occurrences of $(0,1)$ and one of $(1,0)$). Consider the single points labeled 0. Two labels 0 occur in each cell of type $(0,0)$, and one label 0 occurs in each cell of type $(0,1)$. The sum $2F(0,0)+F(0,1)$ counts every interior 0 twice, since every interior 0 is a point that is shared by two cells, and the sum counts every boundary 0 once. Therefore $2F(0,0)+F(0,1) = 2F_i(0)+F_b(0)$ where $F_i(0)$ is the number of $i$ (for interior) 0's and $F_b(0)$ is the number of $b$ (for boundary) 0's. Clearly $F_b(0) = 1$. Hence
$$F(0,1) = 2[F_i(0)-F(0,0)]+1. \tag{4.4}$$
In Figure 4.5(4.8.9????) these numbers are $F(0,1) = 3$, $F_i(0) = 3$, $F(0,0) = 2$. In 1 dimension the number of cells having vertices with the complete set of labels 0,1 is $F(0,1)$ and from (4.4) we see that it is always an odd number.
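The one dimensional counting argument can be tried out on random labelings. The sketch below is illustrative only: the helper names `count_types` and `random_labeling` are ours, and the list `example` is one labeling we made up that is consistent with the counts quoted in the proof (the actual figure is not available to us).

```python
import random

def count_types(labels):
    """Return (F(0,0), F(0,1), F(1,1)) for a labeled subdivision of a segment.

    A cell is a pair of adjacent points; (0,1) and (1,0) are the same type.
    """
    cells = list(zip(labels, labels[1:]))
    f00 = sum(1 for a, b in cells if a == 0 and b == 0)
    f11 = sum(1 for a, b in cells if a == 1 and b == 1)
    f01 = len(cells) - f00 - f11
    return f00, f01, f11

def random_labeling(n_interior, rng):
    """Endpoints get labels 0 and 1, as (4.3) requires; interior labels are arbitrary."""
    return [0] + [rng.choice((0, 1)) for _ in range(n_interior)] + [1]

# A labeling with F(0,0) = 2, F(0,1) = 3, F(1,1) = 1, F(0) = 4, F(1) = 3.
example = [0, 0, 0, 1, 0, 1, 1]
f00, f01, f11 = count_types(example)
interior_zeros = example[1:-1].count(0)
# Identity (4.4): F(0,1) = 2[F_i(0) - F(0,0)] + 1, hence F(0,1) is odd.
print(f01, 2 * (interior_zeros - f00) + 1)
```

Running `count_types` on `random_labeling(k, random.Random(seed))` for various `k` gives an odd `F(0,1)` every time, which is the content of the lemma for $n = 1$.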
Example 230 The values of counting functions F for the simplex in Figure
4.5(4.8.9????) are:
F (0) = 7,F(0,0) = 4,F(1,1) = 8,F(0,0,0) = 0
F (1) = 9,F(0,1) = 16,F(1,2) = 8,F(0,0,1) = 5
F (2) = 5,F(0,2) = 8,F(2,2) = 1,F(0,0,2) = 2
F (0,1,1) = 6,F(0,2,2) = 1,F(1,1,1) = 2,F(2,2,2) = 0
F (0,1,2) = 7,F(1,1,1) = 2,F(1,2,2) = 0
4.5.3 Series
The fact that a normed vector space is the synthesis of two structures - topological and algebraic - enables us to introduce the notion of an infinite sum (i.e. a sum containing infinitely many terms). These objects are called series. As we will see in the subsection on $\ell_p$ spaces, norms will be defined in terms of functions of infinite sums, so understanding when they converge or diverge is critical.
Let $(V,\|\cdot\|)$ be a normed vector space and let $\langle x_n\rangle$ be a sequence in $V$. We can define a new sequence $\langle y_n\rangle$ by $y_n = \sum_{i=1}^{n} x_i$. The sequence $\langle y_n\rangle$ is called the sequence of partial sums of $\langle x_n\rangle$. Since $V$ is also a metric space, we can ask if $\langle y_n\rangle$ is convergent (i.e. if there exists an element $y \in V$ such that $\langle y_n\rangle \to y$ or equivalently $\|y_n-y\| \to 0$). If such an element exists we say that the series $\sum_{i=1}^{\infty} x_i$ is convergent and write $y = \sum_{i=1}^{\infty} x_i$. If $\langle y_n\rangle$ is not convergent, we say that $\sum_{i=1}^{\infty} x_i$ is divergent.
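Example 231 Consider $(\mathbb{R},|\cdot|)$ and let $\langle x_n\rangle = \langle \frac{1}{2^n}\rangle$, which is just a geometric sequence with quotient $\frac{1}{2}$. The sequence of partial sums is $y_1 = \frac{1}{2}$, $y_2 = \frac{1}{2}+\frac{1}{4} = \frac{3}{4}$, $y_3 = \frac{1}{2}+\frac{1}{4}+\frac{1}{8} = \frac{7}{8}$, ...,
$$y_n = \frac{1}{2}+\frac{1}{4}+\cdots+\frac{1}{2^n} = \frac{1}{2}\left(\frac{1-\frac{1}{2^n}}{1-\frac{1}{2}}\right) = 1-\frac{1}{2^n}.$$
Since $\langle y_n\rangle = \langle 1-\frac{1}{2^n}\rangle \to 1$, we write $\sum_{i=1}^{\infty}\frac{1}{2^i} = 1$.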
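Example 232 While we have already seen that $\langle \frac{1}{n}\rangle$ converges (to 0), the harmonic series $\sum_{n=1}^{\infty}\frac{1}{n}$ diverges (i.e. its partial sums are not bounded). To see this, note
$$\begin{aligned} \sum_{n=1}^{\infty}\frac{1}{n} &= 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8}+\cdots \\ &\ge 1+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\cdots \\ &= 1+\frac{1}{2}+\frac{1}{2}+\frac{1}{2}+\cdots \end{aligned}$$
The right hand side is the sum of infinitely many halves, which is not bounded.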
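The two examples above can be explored numerically. This is a small sketch, with the helper name `partial_sums` chosen by us; exact fractions are used so that the closed form $1-\frac{1}{2^n}$ and the grouping bound for the harmonic series can be compared without rounding.

```python
from fractions import Fraction

def partial_sums(term, n):
    """Return [y_1, ..., y_n] where y_k = term(1) + ... + term(k)."""
    sums, total = [], Fraction(0)
    for i in range(1, n + 1):
        total += term(i)
        sums.append(total)
    return sums

# Example 231: partial sums of the geometric series approach 1.
geometric = partial_sums(lambda i: Fraction(1, 2**i), 10)

# Example 232: the grouping argument gives y_{2^k} >= 1 + k/2,
# so the partial sums of the harmonic series are unbounded.
harmonic = partial_sums(lambda i: Fraction(1, i), 2**10)

for k in range(0, 11, 5):
    print(k, float(harmonic[2**k - 1]), 1 + k / 2)
```

The bound $y_{2^k} \ge 1+\frac{k}{2}$ grows without limit, though very slowly, which is why the divergence is not obvious from computing a few terms.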
Elements of a series can also be functions. We will deal with series of
functions in Chapter 6.
4.5.4 An infinite dimensional vector space: $\ell_p$
The example in the above subsection is of a finite dimensional vector space; that is, the Euclidean space $\mathbb{R}^n$ with either norm (there are at most $n$ linearly independent vectors in $\mathbb{R}^n$). Now we introduce an infinite dimensional vector space. As you will see, results from finite dimensional vector spaces do not always generalize to infinite dimensional vector spaces.
Definition 233 Let $\mathbb{R}^\omega$ be the set of all sequences in $\mathbb{R}$.$^{15}$ For $1 \le p < \infty$, let $\ell_p$ be the subset of $\mathbb{R}^\omega$ whose elements satisfy $\sum_{i=1}^{\infty}|x_i|^p < \infty$. The $\ell_p$-norm of a vector $x \in \ell_p$ is defined by
$$\|x\|_p = \Big(\sum_{i=1}^{\infty}|x_i|^p\Big)^{\frac{1}{p}} \quad \text{for } 1 \le p < \infty$$
and $\ell_\infty$ is the subset of all bounded sequences equipped with the norm $\|x\|_\infty = \sup\{|x_1|,\ldots,|x_n|,\ldots\}$.
15 Recall, $\mathbb{R}^\omega = \{f : \mathbb{N} \to \mathbb{R}\}$ where $\omega = card(\mathbb{N})$.
We note that there is a set of infinitely many linearly independent vectors in $\ell_p$, namely $\{e_i = \langle x_j\rangle, i \in \mathbb{N}$ where $x_j = 0$ for $i \ne j$ and $x_j = 1$ for $i = j\}$, which is called a basis.
Before proving that $\ell_p$ is a Banach space, we use the following example to illustrate some differences between finite dimensional Euclidean space $\mathbb{R}^n$ and infinite dimensional $\ell_2$. In particular, convergence by components is not sufficient for convergence in $\ell_2$ (i.e. the result of Theorem 223 is not necessarily true).
Example 234 Let $K = \{e_i = \langle x_j\rangle, i \in \mathbb{N}$ where $x_j = 0$ for $i \ne j$ and $x_j = 1$ for $i = j\}$. That is,
$$\begin{array}{rcl} e_1 &=& \langle 1, 0, 0, 0, \ldots\rangle \\ e_2 &=& \langle 0, 1, 0, 0, \ldots\rangle \\ e_3 &=& \langle 0, 0, 1, 0, \ldots\rangle \\ e_4 &=& \langle 0, 0, 0, 1, \ldots\rangle \\ &\vdots& \\ &\downarrow& \\ && \langle 0, 0, 0, 0, \ldots\rangle \end{array}$$
Observe that each component sequence $\langle x_n^i\rangle$ converges to 0 in $(\mathbb{R},|\cdot|)$ for each $i \in \mathbb{N}$. But the sequence $\langle e_i\rangle$ doesn't converge to 0 since $\|e_i-0\|_2 = 1$, $\forall i \in \mathbb{N}$. In fact $\langle e_i\rangle$ has no convergent subsequence since the distance between any two elements $e_i$ and $e_j$, $i \ne j$, is $\|e_i-e_j\|_2 = \sqrt{2}$. Thus according to Theorem 193, $K$ is not compact in $\ell_2$. But notice that $K$ is both bounded and closed and these two properties are sufficient for compactness in $\mathbb{R}^n$.
Notice that $K$ is not totally bounded. For if $\varepsilon = \frac{1}{2}$, the only non-empty subsets of $K$ with diameter less than $\varepsilon$ are the singleton sets with one point. Accordingly, the infinite subset $K$ cannot be covered by a finite number of disjoint subsets each with diameter less than $\frac{1}{2}$.
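To make the distances in Example 234 concrete, here is a small sketch that works with the first $N$ coordinates of each $e_i$. The truncation to $N$ coordinates and the helper names are our assumptions; a computer cannot hold an element of $\ell_2$ in full, but for the vectors $e_1,\ldots,e_N$ the truncated norms agree with the $\ell_2$ norms.

```python
import math

def unit_vector(i, length):
    """First `length` coordinates of e_i (1 in position i, 0 elsewhere), 1-indexed."""
    return [1.0 if j == i else 0.0 for j in range(1, length + 1)]

def l2_norm(x):
    return math.sqrt(sum(t * t for t in x))

def l2_distance(x, y):
    return l2_norm([a - b for a, b in zip(x, y)])

N = 8
K = [unit_vector(i, N) for i in range(1, N + 1)]

# Every coordinate sequence is eventually 0, yet no two members of K are close.
print([K[i][0] for i in range(N)])     # first coordinates of e_1, e_2, ...
print(l2_distance(K[0], K[5]))         # distance between e_1 and e_6
```

Each column of coordinates settles at 0 after one step, while every pairwise distance stays at $\sqrt{2}$, so no subsequence of $\langle e_i\rangle$ can be Cauchy.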
Now we prove that the $\ell_p$ space is a complete normed vector space (and hence that it is a Banach space) for any $p$ satisfying $1 \le p \le \infty$, and that $\ell_2$ is a Hilbert space with the inner product defined by $\langle x,y\rangle = \sum_{i=1}^{\infty} x_i y_i$. First, we need to show that $\|\cdot\|_p$ defines a norm on $\ell_p$, $1 \le p \le \infty$. An important role in investigating $\ell_p$ is played by another space $\ell_q$ whose exponent $q$ is associated with $p$ by the relation $\frac{1}{p}+\frac{1}{q} = 1$, where $p,q$ are non-negative extended real numbers. Two such numbers are called (mutually) conjugate numbers. If $p = 1$ the conjugate is $q = \infty$ since $\frac{1}{1}+\frac{1}{\infty} = 1+0 = 1$. Also notice that $q = \frac{p}{p-1} > 1$ for $p > 1$. If $p = 2$, then $q = 2$. It is straightforward to show that $\|\cdot\|_p$ satisfies the first three properties of a norm. The triangle property is a tricky one. Before showing it we shall establish some important inequalities.
Lemma 235 Let $a,b > 0$ and $p,q \in (1,\infty)$ with $\frac{1}{p}+\frac{1}{q} = 1$. Then $ab \le \frac{a^p}{p}+\frac{b^q}{q}$, with equality if $a^p = b^q$.
Proof. Since the exponential function is convex, we have $\exp(\lambda A+(1-\lambda)B) \le \lambda\exp A+(1-\lambda)\exp B$ for any real numbers $A$ and $B$. By substituting $A = p\log a$, $\lambda = \frac{1}{p}$, $B = q\log b$, and $1-\lambda = \frac{1}{q}$, we get the desired inequality. See Figure 4.5.9.
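The next result is the analogue of Cauchy-Schwarz in infinite dimensions.
Theorem 236 (Hölder inequality) Let $p,q \in [1,\infty]$ with $\frac{1}{p}+\frac{1}{q} = 1$. If the sequences $\langle x_n\rangle \in \ell_p$ and $\langle y_n\rangle \in \ell_q$, then $\langle x_n y_n\rangle \in \ell_1$ and
$$\sum_{n=1}^{\infty}|x_n y_n| \le \|\langle x_n\rangle\|_p\,\|\langle y_n\rangle\|_q \ \big( = \|x\|_p\,\|y\|_q \big) \tag{4.5}$$
where $x = \langle x_n\rangle$ and $y = \langle y_n\rangle$.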
Proof. For $p = 1$, $q = \infty$, we have
$$\sum_{i=1}^{\infty}|x_i y_i| \le \sup\{|y_n|, n \in \mathbb{N}\}\cdot\sum_{n=1}^{\infty}|x_n| = \|\langle x_n\rangle\|_1\,\|\langle y_n\rangle\|_\infty.$$
Next, let $p,q \in (1,\infty)$. If $\langle x_n\rangle$ or $\langle y_n\rangle$ is a zero vector, we have equality in (4.5). Now let $\langle x_n\rangle \ne 0$, $\langle y_n\rangle \ne 0$.$^{16}$ Substituting $a = \frac{|x_n|}{\|x\|_p}$, $b = \frac{|y_n|}{\|y\|_q}$ in Lemma 235 and summing over $n$, we have
$$\sum_{n=1}^{\infty}\frac{|x_n|}{\|x\|_p}\cdot\frac{|y_n|}{\|y\|_q} \le \frac{1}{p}\sum_{n=1}^{\infty}\left(\frac{|x_n|}{\|x\|_p}\right)^p+\frac{1}{q}\sum_{n=1}^{\infty}\left(\frac{|y_n|}{\|y\|_q}\right)^q = \frac{1}{p}\cdot\frac{1}{\|x\|_p^p}\cdot\|x\|_p^p+\frac{1}{q}\cdot\frac{1}{\|y\|_q^q}\cdot\|y\|_q^q = \frac{1}{p}+\frac{1}{q} = 1.$$
Multiplying both sides by $\|x\|_p\cdot\|y\|_q$, we get the result.
16 Note this means that not all terms in the sequence equal 0 (i.e. there is at least one term different from 0).
Note that if $p = q = 2$, Inequality (4.5) is called the Cauchy-Schwarz inequality.
Now we can prove that the $\|\cdot\|_p$ norm satisfies the triangle inequality.
Theorem 237 (Minkowski) Let $1 \le p \le \infty$ and $x = \langle x_n\rangle, y = \langle y_n\rangle \in \ell_p$. Then
$$\|x+y\|_p \le \|x\|_p+\|y\|_p. \tag{4.6}$$
Proof. If $p = 1$ or $p = \infty$, the proof is trivial. Let $p \in (1,\infty)$. By multiplying both sides of (4.6) by $(\|x+y\|_p)^{p-1}$ we get the equivalent inequality $(\|x+y\|_p)^p \le (\|x\|_p+\|y\|_p)(\|x+y\|_p)^{p-1}$. Since $|x_i+y_i|^p \le (|x_i|+|y_i|)\,|x_i+y_i|^{p-1}$ for every $i$, it is enough to show that
$$\sum_{i=1}^{\infty}|x_i|(|x_i+y_i|)^{p-1}+\sum_{i=1}^{\infty}|y_i|(|x_i+y_i|)^{p-1} \le \|x\|_p\cdot(\|x+y\|_p)^{p-1}+\|y\|_p\cdot(\|x+y\|_p)^{p-1}.$$
Due to the symmetry of $x,y$, it now suffices to show that
$$\sum_{i=1}^{\infty}|x_i|(|x_i+y_i|)^{p-1} \le \|x\|_p\cdot(\|x+y\|_p)^{p-1}. \tag{4.7}$$
Let $z_i = (|x_i+y_i|)^{p-1}$. Then
$$\|z\|_q = \Big(\sum_{i=1}^{\infty}(z_i)^q\Big)^{\frac{1}{q}} = \Big(\sum_{i=1}^{\infty}(|x_i+y_i|)^{(p-1)q}\Big)^{\frac{1}{q}} = \Big(\sum_{i=1}^{\infty}|x_i+y_i|^p\Big)^{\frac{p-1}{p}} = \|x+y\|_p^{p-1}$$
where we used the fact that $q(p-1) = p$ and $\frac{1}{q} = \frac{p-1}{p}$. Now by the Hölder inequality (4.5), we have $\sum_{i=1}^{\infty}|x_i\cdot z_i| \le \|x\|_p\cdot\|z\|_q$, which by plugging in $z_i$ yields $\sum_{i=1}^{\infty}|x_i|(|x_i+y_i|)^{p-1} \le \|x\|_p\cdot(\|x+y\|_p)^{p-1}$. This is just inequality (4.7).
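Inequalities (4.5) and (4.6) are easy to sanity-check on finite sequences (which are elements of every $\ell_p$ once padded with zeros). The following sketch is only illustrative; the function names `p_norm`, `conjugate`, `holder_gap` and `minkowski_gap` are ours, and floating point arithmetic means the comparisons are approximate.

```python
import math

def p_norm(x, p):
    """l_p norm of a finite sequence x; p may be math.inf."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def conjugate(p):
    """Return q with 1/p + 1/q = 1, using the convention 1/inf = 0."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)

def holder_gap(x, y, p):
    """Right side minus left side of (4.5); should be nonnegative."""
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return p_norm(x, p) * p_norm(y, conjugate(p)) - lhs

def minkowski_gap(x, y, p):
    """Right side minus left side of (4.6); should be nonnegative."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)

x = [1.0, -2.0, 0.5, 3.0]
y = [0.25, 4.0, -1.0, 2.0]
for p in (1, 1.5, 2, 3, math.inf):
    print(p, holder_gap(x, y, p), minkowski_gap(x, y, p))
```

Both gaps shrink to zero when $y$ is a nonnegative multiple of $x$ and $p = 2$, which is the familiar equality case of Cauchy-Schwarz.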
Now that we showed that for $1 \le p \le \infty$, $\ell_p$ with $\|\cdot\|_p$ is a normed vector space, we ask "Is it complete?" The answer is yes as the following theorem shows.
Theorem 238 For $1 \le p \le \infty$, the $\ell_p$ space is a complete normed vector space (i.e. a Banach space).
Proof. First we show it for $1 \le p < \infty$. Let $\langle x_m\rangle$ be a Cauchy sequence in $\ell_p$, where $x_m = \langle \xi_i^{(m)}\rangle$ (note that $\langle x_m\rangle$ is a sequence of sequences) such that $\sum_{i=1}^{\infty}|\xi_i^{(m)}|^p < \infty$ ($m = 1,2,\ldots$). Since $\langle x_m\rangle$ is Cauchy with respect to $\|\cdot\|_p$, this means that for $\varepsilon \in (0,1)$, $\exists N$ such that
$$\|x_m-x_n\|_p = \Big(\sum_{i=1}^{\infty}|\xi_i^{(m)}-\xi_i^{(n)}|^p\Big)^{\frac{1}{p}} < \varepsilon, \ \forall m,n \ge N \tag{4.8}$$
$$\Longrightarrow \ |\xi_i^{(m)}-\xi_i^{(n)}| < \varepsilon, \ \forall m,n \ge N, \ i = 1,2,\ldots$$
This shows that for each fixed $i$ the sequence $\langle \xi_i^{(m)}\rangle_{m=1}^{\infty}$ (the $i$th component of $\langle x_m\rangle$) is a Cauchy sequence in $\mathbb{R}$. Since $(\mathbb{R},|\cdot|)$ is complete, it converges in $\mathbb{R}$. Let $\xi_i^{(m)} \to \xi_i^*$ as $m \to \infty$, which generates a sequence $x = \langle \xi_1^*,\xi_2^*,\ldots\rangle$. We must show that $x \in \ell_p$ and $x_m \to x$ with respect to the $\|\cdot\|_p$ norm. From (4.8) we have $\sum_{i=1}^{k}|\xi_i^{(m)}-\xi_i^{(n)}|^p < \varepsilon^p$ for $m,n \ge N$, $k \in \mathbb{N}$. Letting $n \to \infty$ we obtain $\sum_{i=1}^{k}|\xi_i^{(m)}-\xi_i^*|^p \le \varepsilon^p$, $\forall m \ge N$, $k \in \mathbb{N}$, and letting $k \to \infty$ gives
$$\sum_{i=1}^{\infty}|\xi_i^{(m)}-\xi_i^*|^p \le \varepsilon^p \le \varepsilon, \ \forall m \ge N. \tag{4.9}$$
This shows that $x_m-x = \langle \xi_i^{(m)}-\xi_i^*\rangle \in \ell_p$. Since $x_m \in \ell_p$, it follows by the Minkowski Theorem 237 that $\|x\|_p = \|x_m+(x-x_m)\|_p \le \|x_m\|_p+\|x-x_m\|_p < \infty$, so $x \in \ell_p$. Furthermore, from (4.9) we obtain $\|x_m-x\|_p \le \varepsilon$, $\forall m \ge N$, which means $x_m \to x$ with respect to the $\|\cdot\|_p$ norm.
The proof works by taking a Cauchy sequence in $\ell_p$ (say $\langle x_1, x_2, \ldots, x_m, \ldots\rangle$) and showing that a sequence of components (say the first one is $\langle \xi_1^{(1)},\xi_1^{(2)},\ldots,\xi_1^{(m)},\ldots\rangle$) is also Cauchy in $\mathbb{R}$ (converging to say $\xi_1^*$). Then we show the original sequence of sequences converges to the sequence $\langle \xi_1^*,\xi_2^*,\ldots,\xi_m^*,\ldots\rangle$.
The following theorem shows that $\ell_p$ spaces can be ordered with respect to the set relation "$\subset$". That is, if a sequence belongs to $\ell_1$, then it belongs to $\ell_2$, etc. For example, $\langle \frac{1}{n}\rangle \notin \ell_1$, but $\langle \frac{1}{n}\rangle \in \ell_p$ for $p > 1$.
Theorem 239 If $1 < p < q < \infty$, then $\ell_1 \subset \ell_p \subset \ell_q \subset \ell_\infty$ and $\|\langle x_n\rangle\|_q \le \|\langle x_n\rangle\|_p$.
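Proof. Start with $\ell_p \subset \ell_\infty$. Let $x \in \ell_p$ (i.e. $\sum_{i=1}^{\infty}|x_i|^p < \infty$) so that $\langle |x_n|\rangle$ is bounded. Then $\sup\{|x_n|, n \in \mathbb{N}\} < \infty$ so that $x \in \ell_\infty$. We also have
$$\forall j, \ |x_j|^p \le \sum_{i=1}^{\infty}|x_i|^p \iff |x_j| \le \Big(\sum_{i=1}^{\infty}|x_i|^p\Big)^{\frac{1}{p}}.$$
Therefore $\sup\{|x_j|, j \in \mathbb{N}\} \le (\sum_{i=1}^{\infty}|x_i|^p)^{\frac{1}{p}}$ or $\|x\|_\infty \le \|x\|_p$.
Next we show $\ell_p \subset \ell_q$ for $p < q$ and $1 \le p,q < \infty$. For $x \ne 0$,
$$(\|x\|_q)^q = \sum_{i=1}^{\infty}|x_i|^q = (\|x\|_p)^q\sum_{i=1}^{\infty}\left(\frac{|x_i|}{\|x\|_p}\right)^q \le (\|x\|_p)^q\sum_{i=1}^{\infty}\left(\frac{|x_i|}{\|x\|_p}\right)^p = (\|x\|_p)^q\,\frac{\|x\|_p^p}{\|x\|_p^p} = (\|x\|_p)^q,$$
where the inequality follows since $\frac{|x_i|}{\|x\|_p} \le 1$ and $q > p$. Taking the $q$-th root of the above inequality gives $\|x\|_q \le \|x\|_p$. Now if $x \in \ell_p$ (i.e. $\|x\|_p < \infty$), then $\|x\|_q < \infty$ and $x \in \ell_q$.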
Example 240 Note that the inclusion $\ell_p \subset \ell_q$ for $p < q$ is strict. To see this, consider the sequence $\langle x_n\rangle = \langle \frac{1}{n^{1/p}}\rangle_{n=1}^{\infty}$. It is simpler to work with the $p$th power of a norm to avoid using the $p$th root. Hence, take
$$\Big(\Big\|\frac{1}{n^{1/p}}\Big\|_p\Big)^p = \sum_{n=1}^{\infty}\Big(\frac{1}{n^{1/p}}\Big)^p = \sum_{n=1}^{\infty}\frac{1}{n}$$
which is infinitely large (we showed this in the example of a harmonic series). Hence, $\langle \frac{1}{n^{1/p}}\rangle_{n=1}^{\infty} \notin \ell_p$. However $\langle \frac{1}{n^{1/p}}\rangle_{n=1}^{\infty} \in \ell_q$. To see this,
$$\Big(\Big\|\frac{1}{n^{1/p}}\Big\|_q\Big)^q = \sum_{n=1}^{\infty}\frac{1}{n^{q/p}}$$
where $\frac{q}{p} > 1$; this series is bounded (this can be shown by using the integral criterion - See Bartle).
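The ordering of the norms in Theorem 239 and the contrast between $\ell_1$ and $\ell_2$ for $\langle \frac{1}{n}\rangle$ can be seen on a truncated sequence. This is a rough numerical sketch: the truncation length 2000 and the name `truncated_norm` are arbitrary choices of ours, and a finite truncation can only suggest, not prove, divergence.

```python
import math

def truncated_norm(x, p):
    """l_p norm of the finite list x (a truncation of a sequence); p may be math.inf."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

N = 2000
x = [1.0 / n for n in range(1, N + 1)]   # first N terms of <1/n>
exponents = (1, 1.5, 2, 4, math.inf)
norms = [truncated_norm(x, p) for p in exponents]

for p, value in zip(exponents, norms):
    print(p, value)
```

The printed values decrease as $p$ increases. The value for $p = 1$ keeps growing like $\log N$ as $N$ is raised, while the value for $p = 2$ stays below $\pi/\sqrt{6}$.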
The fundamental difference between $\ell_p$ with $1 \le p < \infty$ and $\ell_\infty$ is the behavior of their tails. While it's easy to see that for $1 \le p < \infty$, if $x \in \ell_p$ then $\lim_{n\to\infty}\sum_{i=n}^{\infty}|x_i|^p = 0$, this is not true in $\ell_\infty$. For instance the sequence $\langle x_i\rangle = \langle 1,1,\ldots,1,\ldots\rangle \in \ell_\infty$ but the norm of its tail is 1. This is the reason why there are properties of $\ell_\infty$ that are different from those of $\ell_p$, $1 \le p < \infty$. One of these properties is separability (i.e. the existence of a dense countable subset).
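Theorem 241 $\ell_p$ is separable for $1 \le p < \infty$.
Proof. Let $\{e_i, i \in \mathbb{N}\}$ be a basis of unit vectors. Then the set of all finite linear combinations with rational coefficients $H = \{\sum_{i=1}^{n}\alpha_i e_i, \ \alpha_i \in \mathbb{Q}, \ n \in \mathbb{N}\}$ is countable and dense in $\ell_p$ because if $x = (x_1,x_2,\ldots) \in \ell_p$, then the tail of $x$ satisfies
$$\Big\|x-\sum_{i=1}^{n}x_i e_i\Big\|_p = \Big(\sum_{i=n+1}^{\infty}|x_i|^p\Big)^{\frac{1}{p}} \to 0 \ \text{ as } n \to \infty.$$
Since each $x_i$ can in turn be approximated by rationals, $x$ is approximated by an element of $H$.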
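Theorem 242 $\ell_\infty$ is not separable.
Proof. Let $S$ be the set of all sequences containing only 0 and 1; that is, $S = \{0,1\}^{\mathbb{N}}$. Clearly $S \subset \ell_\infty$ and if $x = \langle x_n\rangle$, $y = \langle y_n\rangle$ are two distinct elements of $S$, then $\|x-y\|_\infty = 1$. Hence $B_{\frac{1}{2}}(x) \cap B_{\frac{1}{2}}(y) = \emptyset$ for any $x,y \in S$, $x \ne y$. Let $A$ be a dense set in $\ell_\infty$. Then for $\varepsilon = \frac{1}{2}$ and given $x \in S \subset \ell_\infty$, there exists an element $a \in A$ such that $\|x-a\|_\infty < \frac{1}{2}$. Because the balls are disjoint, distinct elements of $S$ require distinct elements of $A$. Because $S$ is uncountable, $A$ must be uncountable; thus any dense set in $\ell_\infty$ must be uncountable.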
4.6 Continuous Functions
Now we return to another important topological concept in mathematics that
is employed extensively in economics. Before defining continuity, we amend
Definition 49 of a function in Section 5.2 in terms of general metric spaces.
Definition 243 A function $f$ from a metric space $(X,d_X)$ into a metric space $(Y,d_Y)$ is a rule that associates to each $x \in X$ a unique $y \in Y$.
Definition 244 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the function $f : X \to Y$ is (pointwise) continuous at $x$ if, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon,x) > 0$ such that if $d_X(x',x) < \delta(\varepsilon,x)$, then $d_Y(f(x),f(x')) < \varepsilon$. The function is continuous if it is continuous at each $x \in X$. See Figure 4.6.1.
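Example 245 Let $(X,d_X) = ((-\infty,0)\cup(0,\infty),|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define
$$f(x) = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0. \end{cases}$$
Then $f : X \to Y$ is continuous on $(X,d_X)$. See Figure 4.6.2.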
Example 246 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = bx$, $b \in \mathbb{R}\setminus\{0\}$. Then $f : X \to Y$ is continuous on $(X,d_X)$ since we can simply let $\delta(\varepsilon,x) = \frac{\varepsilon}{|b|}$. Then, for any $\varepsilon > 0$, if $|x'-x| < \delta(\varepsilon,x)$ we have $|bx'-bx| = |b||x'-x| < \varepsilon$. Notice that in the case of linear functions, $\delta$ is independent of $x$. Figure 4.6.3.
Example 247 Let $(X,d_X) = (\mathbb{R}\setminus\{0\},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = \frac{1}{x}$. For any $x \in X$,
$$|f(x')-f(x)| = \Big|\frac{1}{x'}-\frac{1}{x}\Big| = \frac{|x'-x|}{|xx'|}.$$
We wish to find a bound for the coefficient of $|x'-x|$ which is valid for $x'$ near $x$. If $|x'-x| < \frac{1}{2}|x|$, then $\frac{1}{2}|x| < |x'|$, in which case
$$|f(x')-f(x)| \le \frac{2}{|x|^2}|x'-x|.$$
In this case, $\delta(\varepsilon,x) = \inf\{\frac{1}{2}|x|, \frac{1}{2}\varepsilon|x|^2\}$. Figure 4.6.4.
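The choice of $\delta(\varepsilon,x)$ in Example 247 can be spot-checked numerically. The sketch below is illustrative only; it tests a handful of points just inside the $\delta$-ball, which of course does not replace the argument above, and the function names are ours.

```python
def f(x):
    return 1.0 / x

def delta(eps, x):
    """The delta(eps, x) of Example 247: min of |x|/2 and eps*|x|^2/2."""
    return min(0.5 * abs(x), 0.5 * eps * x * x)

def worst_gap(eps, x, steps=50):
    """Largest |f(x') - f(x)| over sample points x' with |x' - x| < delta(eps, x)."""
    d = delta(eps, x)
    samples = [x + d * (2.0 * k / steps - 1.0) * 0.999 for k in range(steps + 1)]
    return max(abs(f(xp) - f(x)) for xp in samples)

for x in (0.01, 0.5, 3.0, -2.0):
    for eps in (1.0, 0.1, 1e-3):
        print(x, eps, delta(eps, x), worst_gap(eps, x))
```

Notice how quickly `delta(eps, x)` shrinks as `x` approaches 0: the same $\varepsilon$ needs a much smaller $\delta$ near the origin, which is exactly the dependence on $x$ that pointwise continuity allows.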
There is an equivalent way to define pointwise continuity in terms of the inverse image (Definition 53) and in terms of sequences.
Theorem 248 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the following statements are equivalent: (i) the function $f : X \to Y$ is continuous; (ii) for each open subset $V$ of $Y$, the set $f^{-1}(V)$ is an open subset of $X$; and (iii) for every convergent sequence $x_i \to x$ in $X$, the sequence $f(x_i) \to f(x)$.
Proof. (Sketch) (ii)$\Rightarrow$(i) Any $\varepsilon$-ball around $f(x)$ is open so there is a $\delta$-ball around $x$ inside $f^{-1}(B_\varepsilon(f(x)))$. (iii)$\Rightarrow$(ii) If not, then there is an $x \in f^{-1}(V)$ such that for any $\frac{1}{n}$ neighborhood of it, we can find a point $x_n$ such that $f(x_n) \notin V$. But $\langle x_n\rangle$ contradicts (iii). (i)$\Rightarrow$(iii) From (i), for $x_n$ close enough to $x$, $f(x_n)$ will be as close to $f(x)$ as we want, so that $f(x_n) \to f(x)$.
The previous two examples go against the "conventional wisdom" that the graph of a continuous function is not interrupted and may raise the question of the existence of a function that is not continuous. The following example provides such a function.
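Example 249 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define$^{17}$
$$f(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0. \end{cases}$$
Then $f^{-1}((-\frac{1}{2},\frac{1}{2})) = \{0\}$, so the inverse image of an open set is closed and not open; therefore this function is not continuous on $(X,d_X)$. See Figure 4.6.5.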
Next we show that the composition of continuous functions preserves
continuity.
Theorem 250 Given metric spaces $(X,d_X)$, $(Y,d_Y)$, and $(Z,d_Z)$, and continuous functions $f : X \to Y$ and $g : Y \to Z$, then $h : X \to Z$ given by $h = g \circ f$ is continuous.
Proof. Let $U \subset Z$ be open. Then $g^{-1}(U)$ is open in $Y$ and $f^{-1}(g^{-1}(U))$ is open in $X$. But $f^{-1}(g^{-1}(U)) = (g \circ f)^{-1}(U)$.
It follows that certain simple operations with continuous functions preserve continuity.
Theorem 251 Given a metric space $(X,d_X)$ and a normed vector space $(Y,d_Y)$, and continuous functions $f : X \to Y$ and $g : X \to Y$, then the following are also continuous: (i) $f \pm g$; (ii) $f \cdot g$; (iii) $\frac{f}{g}$ (wherever $g(x) \ne 0$); (iv) $|f|$.
Exercise 4.6.1 Prove Theorem 251.
17 This is known as the "sgn" function.
It should be emphasized that Theorem 250 does not say that if f is
continuous and U is open in X then the image f(U)={f(x),x∈ U} is open
in Y.
Example 252 Let $(X,d_X) = (\mathbb{R},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = x^2$. Then $f((-1,1)) = [0,1)$ is the image of an open set which is not open. See Figure 4.6.7.
Therefore continuity does not preserve openness. It does not preserve
closedness either as the next example shows.
Example 253 Let $(X,d_X) = (\mathbb{R}\setminus\{0\},|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define $f(x) = \frac{1}{x}$. Then $f([1,\infty)) = (0,1]$ is the image of a closed set which is not closed. See Figure 4.6.8.
There are, however, important properties of a set which are preserved
under continuous mapping. The next subsections establish this.
4.6.1 Intermediate value theorem
Theorem 254 (Preservation of Connectedness) The image of a connected space under a continuous function is connected.
Proof. Let $f : X \to Y$ be a continuous function on $X$ and let $X$ be connected. We wish to prove that $Z = f(X)$ is connected. Assume the contrary. Then there exist open disjoint sets $A$ and $B$ such that $Z = (A \cap Z)\cup(B \cap Z)$ and $(A \cap Z)$, $(B \cap Z)$ is a separation of $Z$ into two disjoint, non-empty sets in $Z$. Then $f^{-1}(A \cap Z) = f^{-1}(A)\cap f^{-1}(Z) = f^{-1}(A)\cap X = f^{-1}(A)$ and $f^{-1}(B \cap Z) = f^{-1}(B)$ are disjoint sets whose union is $X$ $(= f^{-1}(A \cap Z)\cup f^{-1}(B \cap Z))$. They are open in $X$ because $f$ is continuous and non-empty because $f : X \to f(X)$ is a surjection. Therefore $f^{-1}(A)$ and $f^{-1}(B)$ form a separation of $X$, which contradicts the assumption that $X$ is connected.
In the special case where the metric space $(Y,d_Y) = (\mathbb{R},|\cdot|)$, the corollary of this theorem is the well-known Intermediate Value Theorem.
Corollary 255 (Intermediate Value Theorem) Let $f : X \to \mathbb{R}$ be a continuous function of a connected space $X$ into $\mathbb{R}$. If $a,b \in X$ and if $r \in \mathbb{R}$ is such that $f(a) \le r \le f(b)$, then $\exists c \in X$ such that $f(c) = r$. See Figure 4.6.9.
Exercise 4.6.2 Prove Corollary 255.
Note that it is connectedness that is required for the Intermediate Value Theorem and not compactness.
Example 256 Let $(X,d_X) = ([-2,-1]\cup[1,2],|\cdot|)$, $(Y,d_Y) = (\mathbb{R},|\cdot|)$, and define
$$f(x) = \begin{cases} 1 & \text{if } x \in [1,2] \\ -1 & \text{if } x \in [-2,-1]. \end{cases}$$
Then $f : X \to Y$ is continuous on the compact set $X$ but for $r = 0$, there doesn't exist $c \in X$ such that $f(c) = 0$.
A nice one dimensional example of how important the intermediate value theorem is for economics is the following fixed point theorem.
Corollary 257 (One Dimensional Brouwer) Let $f : [a,b] \to [a,b]$ be a continuous function. Then $f$ has a fixed point.
Proof. Let $g : [a,b] \to \mathbb{R}$ be defined by $g(x) = f(x)-x$. Clearly $g(a) = f(a)-a \ge 0$ since $f(a) \in [a,b]$ and $g(b) = f(b)-b \le 0$ for the same reason. Since $g(x)$ is a continuous function$^{18}$ with $g(b) \le 0 \le g(a)$, we know by the Intermediate Value Theorem 255 that $\exists x \in [a,b]$ such that $g(x) = 0$ or equivalently that $f(x) = x$.
The proof is illustrated in Figure 4.6.10. For a more general version of
this proof, see Section 4.8.
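The proof of Corollary 257 is not constructive, but the sign change of $g(x) = f(x)-x$ suggests a simple numerical procedure. The following sketch uses bisection, which is our choice of method and not part of the text; the function name `fixed_point` and the tolerance parameters are likewise hypothetical.

```python
import math

def fixed_point(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a fixed point of a continuous f: [a, b] -> [a, b] by bisection.

    Invariant: g(lo) >= 0 >= g(hi) for g(x) = f(x) - x, so by the
    Intermediate Value Theorem a zero of g lies in [lo, hi].
    """
    lo, hi = a, b
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# cos maps [0, 1] into [cos(1), 1], a subset of [0, 1].
x_star = fixed_point(math.cos, 0.0, 1.0)
print(x_star, math.cos(x_star))
```

Each step halves the interval while keeping a sign change of $g$ inside it, so the returned point is within `tol` of a true fixed point whenever $f$ is continuous and maps $[a,b]$ into itself.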
The next series of examples shows how connectedness of $\mathbb{R}_+$ can be used to construct a continuous "utility" function $u(x)$ that represents a preference relation $\succsim$. Before establishing this, however, we need to define continuity in terms of relations.
Definition 258 The preference relation $\succsim$ on $X$ is continuous if for any sequence of pairs $\langle (x_n,y_n)\rangle_{n=1}^{\infty}$ with $x_n \succsim y_n$ $\forall n$, $x = \lim_{n\to\infty}x_n$, and $y = \lim_{n\to\infty}y_n$, we have $x \succsim y$.
18 To see $g(x)$ is a continuous function, we must show $\forall \varepsilon > 0$ and $x,y \in [a,b]$, $\exists \delta_g > 0$ such that if $|x-y| < \delta_g$ then $|g(x)-g(y)| < \varepsilon$. But $|g(x)-g(y)| = |(f(x)-f(y))-(x-y)| \le |f(x)-f(y)|+|x-y|$ by the triangle inequality. Continuity of $f$ implies $\forall \varepsilon > 0$, $\exists \delta_f > 0$ such that $|x-y| < \delta_f$ implies $|f(x)-f(y)| < \frac{\varepsilon}{2}$. Thus, if we let $\delta_g = \min\{\delta_f, \frac{\varepsilon}{2}\}$, then $|g(x)-g(y)| < \varepsilon$.
An equivalent way to state this notion of continuity is that $\forall x \in X$, the upper contour set $\{y \in X : y \succsim x\}$ and the lower contour set $\{y \in X : x \succsim y\}$ are both closed; that is, for any $\langle y_n\rangle_{n=1}^{\infty}$ such that $x \succsim y_n$, $\forall n$, and $y = \lim y_n$, we have $x \succsim y$ (just let $x_n = x$, $\forall n$).
There are some preference relations that are not continuous as the following example shows.
Example 259 Lexicographic preferences (on $X = \mathbb{R}^2_+$) are defined in the following way: $x \succsim y$ if either "$x_1 > y_1$" or "$x_1 = y_1$ and $x_2 \ge y_2$". To see they are not continuous, consider the sequences of bundles $\langle x_n = (\frac{1}{n},0)\rangle$ and $\langle y_n = (0,1)\rangle$. For every $n$ we have $x_n \succ y_n$. But $\lim_{n\to\infty}y_n = (0,1) \succ (0,0) = \lim_{n\to\infty}x_n$. That is, as long as the first component of $x$ is larger than that of $y$, $x$ is preferred to $y$ even if $y_2$ is much larger than $x_2$. But as soon as the first components become equal, only the second components are relevant so that the preference ranking is reversed at the limit points.
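The reversal at the limit in Example 259 can be written out directly. The sketch below encodes the lexicographic relation on pairs; the function names are ours, and exact fractions are used only so that the comparisons $\frac{1}{n} > 0$ are not affected by rounding.

```python
from fractions import Fraction

def weakly_prefers(x, y):
    """Lexicographic relation: x is at least as good as y."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])

def strictly_prefers(x, y):
    """x is strictly preferred to y: x is at least as good as y but not vice versa."""
    return weakly_prefers(x, y) and not weakly_prefers(y, x)

y_n = (Fraction(0), Fraction(1))            # constant sequence y_n = (0, 1)
along_sequence = all(
    strictly_prefers((Fraction(1, n), Fraction(0)), y_n) for n in range(1, 1001)
)
x_limit = (Fraction(0), Fraction(0))        # limit of x_n = (1/n, 0)
at_limit = strictly_prefers(y_n, x_limit)   # ranking of the limit points

print(along_sequence, at_limit)
```

Both flags are true: $x_n$ is strictly preferred to $y_n$ for every $n$ tested, yet the limit of $y_n$ is strictly preferred to the limit of $x_n$.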
Now we establish that we can "construct" a continuous utility function.
Example 260 If the rational preference relation $\succsim$ on $X$ is continuous, then there is a continuous utility function $u(x)$ that represents $\succsim$. To see this, by continuity of $\succsim$, we know that the upper and lower contour sets are closed. Then the sets $A^+ = \{\alpha \in \mathbb{R}_+ : \alpha e \succsim x\}$ and $A^- = \{\alpha \in \mathbb{R}_+ : x \succsim \alpha e\}$, where $e$ is the unit vector, are nonempty and closed. By completeness of $\succsim$, $\mathbb{R}_+ \subset (A^+ \cup A^-)$. The nonemptiness and closedness of $A^+$ and $A^-$, along with the fact that $\mathbb{R}_+$ is connected, imply $A^+ \cap A^- \ne \emptyset$. Thus, $\exists \alpha$ such that $\alpha e \sim x$. By monotonicity of $\succsim$, $\alpha_1 e \succ \alpha_2 e$ whenever $\alpha_1 > \alpha_2$. Hence, there can be at most one scalar satisfying $\alpha e \sim x$. This scalar is $\alpha(x)$, which we take as the utility function.
4.6.2 Extreme value theorem
The next result is one of the most important ones for economists we will
come across in the book.
Theorem 261 (Preservation of Compactness) The image of a compact
set under a continuous function is compact.
Proof. Let $f : X \to Y$ be a continuous function on $X$ and let $X$ be compact. Let $\mathcal{G}$ be an open covering of $f(X)$ by sets open in $Y$. The collection $\{f^{-1}(G), G \in \mathcal{G}\}$ is a collection of sets covering $X$. These sets are open in $X$ because $f$ is continuous. Hence finitely many of them, say $f^{-1}(G_1),\ldots,f^{-1}(G_n)$, cover $X$. Then the sets $G_1,\ldots,G_n$ cover $f(X)$.
Again in the special case where $(Y,d_Y) = (\mathbb{R},|\cdot|)$, a direct consequence of this theorem is the well known Extreme Value Theorem of calculus.$^{19}$
Corollary 262 (Extreme Value Theorem) Let $f : X \to \mathbb{R}$ be a continuous function of a compact space $X$ into $\mathbb{R}$. Then $\exists c,d \in X$ such that $f(c) \le f(x) \le f(d)$ for every $x \in X$. $f(c)$ is called the minimum and $f(d)$ is called the maximum of $f$ on $X$.
Proof. Since $f$ is continuous and $X$ is compact, the set $A = f(X)$ is compact. We show that $A$ has a largest element $M$ and a smallest element $m$. Then since $m$ and $M$ belong to $f(X)$, we must have $m = f(c)$ and $M = f(d)$ for some points $c$ and $d$ of $X$.
If $A$ has no largest element, then the collection $\{(-\infty,a), a \in A\}$ forms an open covering of $A$. Since $A$ is compact, some finite subcollection $\{(-\infty,a_1),\ldots,(-\infty,a_n)\}$ covers $A$. Let $a_M = \max\{a_1,\ldots,a_n\}$; then $a_M \in A$ belongs to none of these sets, which contradicts the fact that they cover $A$.
A similar argument can be used to show that $A$ has a smallest element.
Exercise 4.6.3 Let $X = [0,1)$ and $f(x) = x$. Why doesn't a maximum exist? See Figure 4.6.11.
4.6.3 Uniform continuity
One might believe from part (iii) of Theorem 248 that if $\langle x_n\rangle$ is Cauchy and if $f$ is continuous, then $\langle f(x_n)\rangle$ is also Cauchy. The following examples show this is false if $f$ is only pointwise continuous.
Example 263 Take the sequence $\langle x_n\rangle = \langle \frac{(-1)^n}{n}\rangle$ and consider the function $f$ defined in Example 245. While $\langle x_n\rangle$ is Cauchy in $(-\infty,0)\cup(0,\infty)$, $\langle f(x_n)\rangle = \langle -1,1,-1,1,\ldots\rangle$, which is not Cauchy. See Figure 4.6.12.
Example 264 Let $f(x) = \frac{1}{x}$ on $(0,1]$, which was shown to be pointwise continuous in Example 247. Consider the Cauchy sequence $\langle \frac{1}{n}\rangle$ on $(0,1]$. It is clear that $\langle f(x_n)\rangle = \langle n\rangle$, which is obviously not Cauchy. See Figure 4.6.13.
19
Sometimes this is called the Maximum and Minimum Value Theorem. Since in the
next section we will introduce the Maximum Theorem, we choose the above terminology.
For the above intuition to hold, we need a stronger concept of continuity.
Definition 265 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, the function $f : X \to Y$ is uniformly continuous if $\forall \varepsilon > 0$, $\exists \delta(\varepsilon) > 0$ such that $\forall x,x' \in X$ with $d_X(x',x) < \delta(\varepsilon)$, we have $d_Y(f(x),f(x')) < \varepsilon$.
While this definition looks similar to that of pointwise continuity in Definition 244, the difference is that while $\delta$ generally depends on both $\varepsilon$ and $x$ in the case of pointwise continuity, it is independent of $x$ in the case of uniform continuity.
Theorem 266 Given metric spaces $(X,d_X)$ and $(Y,d_Y)$, let the function $f : X \to Y$ be uniformly continuous. If $\langle x_n\rangle$ is a Cauchy sequence in $X$, then $\langle f(x_n)\rangle$ is a Cauchy sequence in $Y$.
Proof. Let $\langle x_n\rangle$ be a Cauchy sequence in $X$. Because $f : X \to Y$ is uniformly continuous, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon) > 0$ such that $\forall x,x' \in X$ with $d_X(x',x) < \delta(\varepsilon)$, we have $d_Y(f(x),f(x')) < \varepsilon$. Since $\langle x_n\rangle$ is Cauchy, for the given $\delta(\varepsilon) > 0$ there $\exists N$ such that $\forall m,n \in \mathbb{N}$ with $m,n > N$, $d_X(x_m,x_n) < \delta(\varepsilon)$. But then $d_Y(f(x_m),f(x_n)) < \varepsilon$. Hence $\langle f(x_n)\rangle$ is a Cauchy sequence in $Y$.
According to this theorem the functions in Examples 263 and 264 are not uniformly continuous. Notice that the domains of each of the functions in the examples are not compact in $(\mathbb{R},|\cdot|)$. Let's consider another example.
Example 267 Let $f : [0,\infty) \to \mathbb{R}$ be given by $f(x) = x^2$. This function is continuous on $[0,\infty)$. Is it uniformly continuous? No. We show this by finding an $\varepsilon > 0$ such that $\forall \delta > 0$, $\exists x_1,x_2$ such that $d_X(x_1,x_2) < \delta$ and $d_Y(f(x_1),f(x_2)) \ge \varepsilon$. Let $\varepsilon = 2$ and take any $\delta > 0$. Then $\exists n \in \mathbb{N}$ such that $\frac{1}{n} < \delta$. Define $x_1 = n+\frac{1}{n}$ and $x_2 = n$. Then $d_X(x_1,x_2) = (n+\frac{1}{n}-n) = \frac{1}{n} < \delta$ and $d_Y(f(x_1),f(x_2)) = (n+\frac{1}{n})^2-n^2 = 2+\frac{1}{n^2} > 2$. Notice that the domain of this function, $[0,\infty)$, is not compact in $(\mathbb{R},|\cdot|)$.
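The construction in Example 267 can be turned into a small routine that, for any proposed $\delta$, returns a pair of points violating uniform continuity with $\varepsilon = 2$. This is a sketch with hypothetical names; exact fractions keep the identity $(n+\frac{1}{n})^2-n^2 = 2+\frac{1}{n^2}$ free of rounding error.

```python
import math
from fractions import Fraction

def witness(delta):
    """Return (x1, x2) with |x1 - x2| < delta but |x1**2 - x2**2| > 2.

    Follows Example 267: pick n with 1/n < delta, then x1 = n + 1/n, x2 = n.
    """
    n = math.floor(1 / delta) + 1          # n > 1/delta, hence 1/n < delta
    return Fraction(n) + Fraction(1, n), Fraction(n)

for k in range(0, 7, 2):
    delta = Fraction(1, 10**k)
    x1, x2 = witness(delta)
    print(delta, x1 - x2, x1**2 - x2**2)
```

However small $\delta$ is made, the witness pair just moves further out along $[0,\infty)$, which is only possible because the domain is unbounded.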
If the domain of a continuous function is compact, then the function is
also uniformly continuous as the following theorem asserts.
Theorem 268 (Uniform Continuity Theorem) Let $f : X \to Y$ be a continuous function of a compact metric space $(X,d_X)$ to the metric space $(Y,d_Y)$. Then $f$ is uniformly continuous.
Proof. (Sketch) For a given $\varepsilon > 0$, by continuity of $f$ around any $x \in X$ we can find a $\delta(\frac{1}{2}\varepsilon,x)$-ball such that for $x' \in B_{\delta(\frac{1}{2}\varepsilon,x)}(x)$ we have $d_Y(f(x),f(x')) < \frac{1}{2}\varepsilon$. Since the collection of such open balls is an open covering of $X$ and $X$ is compact, there exists a finite (say $n$) subcover of them. Then for $x,x' \in X$ such that $d_X(x',x) < \delta(\varepsilon) = \frac{1}{2}\min\{\delta(\frac{1}{2}\varepsilon,x_1),\ldots,\delta(\frac{1}{2}\varepsilon,x_n)\}$, there exists $k$ such that $x \in B_{\delta(\frac{1}{2}\varepsilon,x_k)}(x_k)$ and $x' \in B_{\delta(\frac{1}{2}\varepsilon,x_k)}(x_k)$. Therefore by the triangle inequality $d_Y(f(x),f(x')) < \varepsilon$.
The number $\delta(\varepsilon)$ that we constructed in the proof of Theorem 268 is called the Lebesgue number of the covering $\mathcal{G}$.
Exercise 4.6.4 Why is $f(x) = \frac{1}{x}$ not uniformly continuous on $X = (0,1]$ but it is on $[10^{-1000},1]$?
4.7 Hemicontinuous Correspondences
Many problems in economics result in set-valued mappings or correspondences as defined in Section 2.3. For instance, if preferences are linear, a household's demand for goods may be described by a correspondence, and in game theory we consider best response correspondences.
Before defining hemicontinuity, we amend Definition 48 of a correspondence in Section 2.3 in terms of general metric spaces.
Definition 269 A correspondence $\Gamma$ from a metric space $(X, d_X)$ into a metric space $(Y, d_Y)$ is a rule that associates to each $x \in X$ a subset $\Gamma(x) \subset Y$. Its graph is the set $A = \{(x,y) \in X \times Y : y \in \Gamma(x)\}$ which we will denote $Gr(\Gamma)$. The image of a set $D \subset X$, denoted $\Gamma(D) \subset Y$, is the set $\Gamma(D) = \cup_{x \in D} \Gamma(x)$. A correspondence is closed valued at $x$ if the image set $\Gamma(x)$ is closed in $Y$. A correspondence is compact valued at $x$ if the image set $\Gamma(x)$ is compact in $Y$. See Figure 4.7.1.
Unlike a (single-valued) function, there are two ways to define the inverse image of a correspondence $\Gamma$ of a subset $D$.
Definition 270 For $\Gamma : X \rightrightarrows Y$ and any subset $D \subset Y$ we define the inverse image (also lower or weak) as $\Gamma^{-1}(D) = \{x \in X : \Gamma(x) \cap D \neq \emptyset\}$ and the core (also upper or strong inverse image) as $\Gamma^{+1}(D) = \{x \in X : \Gamma(x) \subset D\}$.
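It is clear that $\Gamma^{+1}(D) \subset \Gamma^{-1}(D)$. Also observe that
$$\Gamma^{+1}(Y \setminus D) = X \setminus \Gamma^{-1}(D) \quad \text{and} \quad \Gamma^{-1}(Y \setminus D) = X \setminus \Gamma^{+1}(D).$$
See Figure 4.7.2. These two types of inverse image naturally coincide when $\Gamma$ is single-valued.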
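The two inverse images and the complement identities above are easy to verify on a small finite example. The sketch below is illustrative only: the sets `X`, `Y`, the correspondence `gamma` and the helper names are hypothetical, chosen so that every image set is non-empty.

```python
def lower_inverse(gamma, D):
    """Inverse image: points whose image set meets D."""
    return {x for x, image in gamma.items() if image & D}

def upper_inverse(gamma, D):
    """Core: points whose image set is contained in D."""
    return {x for x, image in gamma.items() if image <= D}

X = {0, 1, 2}
Y = {"a", "b", "c"}
gamma = {0: {"a"}, 1: {"a", "b"}, 2: {"c"}}   # a non-empty valued correspondence
D = {"a"}

print(lower_inverse(gamma, D))   # points 0 and 1
print(upper_inverse(gamma, D))   # point 0 only
# Complement identities from the text:
print(upper_inverse(gamma, Y - D) == X - lower_inverse(gamma, D))
print(lower_inverse(gamma, Y - D) == X - upper_inverse(gamma, D))
```

Point 1 is in the inverse image but not in the core of `D`, because its image set meets `D` without being contained in it; for a single-valued correspondence this gap cannot occur.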
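To make the notion of correspondence clearer we present a number of examples (see Figure 4.7.3a-3f).
Example 271 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
$$\Gamma(x) = \begin{cases} 1 & \text{if } x < \frac{1}{2} \\ \{0,1\} & \text{if } x = \frac{1}{2} \\ 0 & \text{if } x > \frac{1}{2}. \end{cases}$$
Example 272 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
$$\Gamma(x) = \begin{cases} 1 & \text{if } x < \frac{1}{2} \\ [0,1] & \text{if } x = \frac{1}{2} \\ 0 & \text{if } x > \frac{1}{2}. \end{cases}$$
Example 273 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by $\Gamma(x) = [x, 1]$.
Example 274 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
$$\Gamma(x) = \begin{cases} \left[0, \frac{1}{2}\right] & \text{if } x \neq \frac{1}{2} \\ [0,1] & \text{if } x = \frac{1}{2}. \end{cases}$$
Example 275 $\Gamma : [0,1] \rightrightarrows [0,1]$ defined by
$$\Gamma(x) = \begin{cases} [0,1] & \text{if } x \neq \frac{1}{2} \\ \left[0, \frac{1}{2}\right] & \text{if } x = \frac{1}{2}. \end{cases}$$
Example 276 $\Gamma : [0,\infty) \rightrightarrows \mathbb{R}$ defined by $\Gamma(x) = [e^{-x}, 1]$.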
We next define a set valued version of continuity.
Definition 277 Given metric spaces $(X, d_X)$ and $(Y, d_Y)$, the correspondence $\Gamma : X \rightrightarrows Y$ is lower hemicontinuous (lhc) at $x \in X$ if $\Gamma(x)$ is non-empty and if for every open set $V \subset Y$ with $\Gamma(x) \cap V \neq \emptyset$, there exists a neighborhood $U$ of $x$ such that $\Gamma(x') \cap V \neq \emptyset$ for every $x' \in U$. The correspondence is lower hemicontinuous if it is lhc at each $x \in X$.$^{20}$ See Figure 4.7.4.
$^{20}$ There are various names given to this concept. In many math books, this is called semicontinuity.
Note that the correspondences presented in Examples 273, 275, and 276 are lhc.
As in the case of continuity of a function, there are equivalent characterizations of lhc in terms of open (closed) sets or sequences as the next theorem shows.
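Theorem 278 Given metric spaces $(X, d_X)$ and $(Y, d_Y)$, for a correspondence $\Gamma : X \rightrightarrows Y$ the following statements are equivalent. (i) $\Gamma$ is lhc; (ii) $\Gamma^{-1}(V)$ is open in $X$ whenever $V \subset Y$ is open in $Y$; (iii) $\Gamma^{+1}(U)$ is closed in $X$ whenever $U \subset Y$ is closed in $Y$; and (iv) $\forall x \in X$, $\forall y \in \Gamma(x)$ and every sequence $\langle x_n \rangle \to x$, there exist $N$ and a sequence $\langle y_n \rangle \to y$ such that $y_n \in \Gamma(x_n)$, $\forall n \geq N$.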
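Proof. (i) $\Leftrightarrow$ (ii) Let $V$ be open in $Y$, $\Gamma^{-1}(V) = \{x \in X : \Gamma(x) \cap V \neq \emptyset\}$ and take $x_0 \in \Gamma^{-1}(V)$. Since $\Gamma$ is lhc at $x_0$, then $\exists U$ open such that $x_0 \in U$ and $\Gamma(x') \cap V \neq \emptyset$ for every $x' \in U$. Hence $U \subset \Gamma^{-1}(V)$ so that $\Gamma^{-1}(V)$ is open.
(ii) $\Leftrightarrow$ (iii) follows immediately from $X \setminus \Gamma^{-1}(U) = \Gamma^{+1}(Y \setminus U)$.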
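(i) $\Leftrightarrow$ (iv) First start with ($\Rightarrow$). Let $\langle x_n \rangle \to x$ and fix an arbitrary point $y \in \Gamma(x)$. For each $k \in \mathbb{N}$, $B_{\frac{1}{k}}(y) \cap \Gamma(x) \neq \emptyset$. Since $\Gamma$ is lhc at $x$, $\forall k$ there exists an open set $U_k$ containing $x$ such that $\forall x'_k \in U_k$ we have $\Gamma(x'_k) \cap B_{\frac{1}{k}}(y) \neq \emptyset$. Since $\langle x_n \rangle \to x$, $\forall k$ we can find $n_k$ such that $x_n \in U_k$, $\forall n \geq n_k$, and they can be assigned so that $n_{k+1} > n_k$. Also, since $x_n \in U_k$, $\forall n \geq n_k$, then $\Gamma(x_n) \cap B_{\frac{1}{k}}(y) \neq \emptyset$. Hence we can construct a companion sequence $\langle y_n \rangle$, with $y_n$ chosen from the set $\Gamma(x_n) \cap B_{\frac{1}{k}}(y)$ for each $n$ with $n_k \leq n < n_{k+1}$. As $k$, and hence $n$, increases the radius of the balls $B_{\frac{1}{k}}(y)$ shrinks to zero, implying $\langle y_n \rangle \to y$.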
Next we prove ($\Leftarrow$). In this case, it is sufficient to prove the contrapositive. Assume $\Gamma$ is not lhc at $x$. Then $\exists V$ open with $\Gamma(x) \cap V \neq \emptyset$ such that every neighborhood $U$ of $x$ contains a point $x'_U$ with $\Gamma(x'_U) \cap V = \emptyset$. Taking a sequence of such neighborhoods, $U_n = B_{\frac{1}{n}}(x)$, and a point in each of them, we obtain a sequence $\langle x_n \rangle$ which converges to $x$ by construction and has the property $\Gamma(x_n) \cap V = \emptyset$. Hence every companion sequence $\langle y_n \rangle$ with $y_n \in \Gamma(x_n)$ is contained in the complement of $V$, and if $\langle y_n \rangle \to y$ then $y$ is contained in the complement of $V$ since $Y \setminus V$ is closed. Thus no companion sequence of $\langle x_n \rangle$ can converge to a point in $V$.
Thus, $\Gamma$ is lhc at $x$ if any $y \in \Gamma(x)$ can be approached by a sequence from both sides. Also, if the correspondence $F$ is a function, then $F^{-1}(U)$ is the inverse image of a function, so (ii) states $F$ is lhc iff $F$ is continuous.
Definition 279 Given metric spaces $(X, d_X)$ and $(Y, d_Y)$, the correspondence $\Gamma : X \rightrightarrows Y$ is upper hemicontinuous (uhc) at $x \in X$ if $\Gamma(x)$ is non-empty and if for every open set $V \subset Y$ with $\Gamma(x) \subset V$, there exists a neighborhood $U$ of $x$ such that $\Gamma(x') \subset V$ for every $x' \in U$. The correspondence is upper hemicontinuous if it is uhc at each $x \in X$. See Figure 4.7.5.
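The correspondences presented in Examples 271-274 and 276 are uhc. (Example 275 is not uhc at $x = \frac{1}{2}$: the open set $V = [0, \frac{3}{4})$ contains $\Gamma(\frac{1}{2}) = [0, \frac{1}{2}]$, but $\Gamma(x') = [0,1] \not\subset V$ for every $x' \neq \frac{1}{2}$.)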
Again, uhc can be characterized in terms of open (closed) sets or se-
quences.
Theorem 280 Given metric spaces $(X, d_X)$ and $(Y, d_Y)$, for a correspondence $\Gamma : X \rightrightarrows Y$ the following statements are equivalent: (i) $\Gamma$ is uhc; (ii) $\Gamma^{+1}(V)$ is open in $X$ whenever $V \subset Y$ is open in $Y$; (iii) $\Gamma^{-1}(U)$ is closed in $X$ whenever $U \subset Y$ is closed in $Y$; and if $\Gamma$ is compact valued, then (iv) for every sequence $\langle x_n \rangle \to x$ and every sequence $\langle y_n \rangle$ such that $y_n \in \Gamma(x_n)$, $\forall n$, there exists a convergent subsequence $\langle y_{g(n)} \rangle \to y$ and $y \in \Gamma(x)$.
Proof. (Sketch) (i) and $\Gamma$ compact valued $\Rightarrow$ (iv). First, we must show that the companion sequence $\langle y_n \rangle$ is bounded. Since $\langle y_n \rangle$ is bounded, it is contained in a compact set so that by Theorem 193, there exists a convergent subsequence. Finally, we must show that the limit of this subsequence is in $\Gamma(x)$.
(i) $\Leftarrow$ (iv) Again, it is sufficient to prove the contrapositive: if $\Gamma$ is not uhc at $x$, then there is no subsequence converging to a point in $\Gamma(x)$.
Exercise 4.7.1 Finish the proof of Theorem 280.$^{21}$
???Thus, Γ is uhc at x if any y ∈Γ(x) can be approached by a sequence
from ????.
If the correspondence $F$ is a function, then $F^{+1}(U) = F^{-1}(U)$ is the inverse image of the function and so by (ii), $F$ is uhc iff $F$ is continuous.
Each type of hemicontinuity can be interpreted in terms of the restrictions on the "size" of the set $\Gamma(x)$ as $x$ changes.
• Suppose $\Gamma$ is uhc at $x$ and fix $V \supset \Gamma(x)$. As we move from $x$ to a nearby point $x'$, the set $V$ gives an "upper bound" on the size of $\Gamma(x')$ since we require $\Gamma(x') \subset V$. Hence uhc requires the image set $\Gamma(x)$ does not "explode" with small changes in $x$, but allows it to suddenly "implode".
• Suppose $\Gamma$ is lhc at $x$. As we move from $x$ to a nearby point $x'$, the set $V$ gives a "lower bound" on the size of $\Gamma(x')$ since we require $\Gamma(x') \cap V \neq \emptyset$. Hence lhc requires the image set $\Gamma(x)$ does not "implode" with small changes in $x$, but allows it to suddenly "explode".
$^{21}$ de la Fuente (Theorem 11.2, p. 110).
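The "explode" and "implode" language can be probed numerically for Examples 274 and 275. The sketch below is a heuristic check, not a proof. It assumes the image sets are closed intervals and uses the one-sided excess $\sup_{a \in A} d(a, B)$, which for compact-valued correspondences detects failure of uhc (when $\Gamma(x')$ sticks out of $\Gamma(x)$) and failure of lhc (when $\Gamma(x)$ sticks out of $\Gamma(x')$). The function names are hypothetical, and only one approaching sequence is tested.

```python
def excess(A, B):
    """sup over a in A of dist(a, B) for closed intervals A=(a0,a1), B=(b0,b1)."""
    (a0, a1), (b0, b1) = A, B
    return max(b0 - a0, a1 - b1, 0.0)

def gamma_274(x):
    return (0.0, 1.0) if x == 0.5 else (0.0, 0.5)

def gamma_275(x):
    return (0.0, 0.5) if x == 0.5 else (0.0, 1.0)

def probe(gamma, x, steps=20):
    """Return (explosion, implosion) along the sequence x_n = x + 2**-n."""
    tail = [x + 2.0 ** -n for n in range(steps - 4, steps + 1)]
    explosion = max(excess(gamma(xn), gamma(x)) for xn in tail)   # uhc fails if > 0
    implosion = max(excess(gamma(x), gamma(xn)) for xn in tail)   # lhc fails if > 0
    return explosion, implosion

print(probe(gamma_274, 0.5))   # image implodes away from 1/2: uhc but not lhc
print(probe(gamma_275, 0.5))   # image explodes away from 1/2: lhc but not uhc
```

A positive first component means nearby image sets are strictly larger than $\Gamma(x)$; a positive second component means they are strictly smaller.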
Definition 281 Given metric spaces $(X, d_X)$ and $(Y, d_Y)$, a correspondence $\Gamma : X \rightrightarrows Y$ is continuous at $x \in X$ if it is both lhc and uhc at $x$.
The correspondences in Examples 273 and 276 are continuous.
Example 282 Consider the following example of a best response correspondence derived from game theory. The game is played between two individuals who can choose between two actions, say go up ($U$) or go down ($D$). If both choose $U$ or both choose $D$, they meet. If one chooses $U$ and the other chooses $D$, they don't meet. Meetings are pleasurable and yield each player payoff 1, while if they don't meet they receive payoff 0. This is known as a coordination game. The players choose probability distributions over the two actions: say player 1 chooses $U$ with probability $p$ and $D$ with probability $1-p$ while player 2 chooses $U$ with probability $q$ and $D$ with probability $1-q$. We represent this game in "normal form" by the matrix in Table 4.7.1. Agent 1's payoff from playing $U$ with probability $p$ while his opponent is playing $U$ with probability $q$ is denoted $\pi_1(p,q)$ and given by
$$\pi_1(p,q) = p \cdot [q \cdot 1 + (1-q) \cdot 0] + (1-p) \cdot [q \cdot 0 + (1-q) \cdot 1] = 1 - q - p + 2pq.$$
Agent 1 chooses $p \in [0,1]$ to maximize $\pi_1(p,q)$. We call this choice a best response correspondence $p^*(q)$. It is simple to see that: for any $q < \frac{1}{2}$, profits are decreasing in $p$ so that $p^* = 0$ is a best response, for any $q > \frac{1}{2}$, profits are increasing in $p$ so that $p^* = 1$ is a best response, and at $q = \frac{1}{2}$, profits are independent of $p$ so that any choice of $p^* \in [0,1]$ is a best response.$^{22}$
Obviously, $p^*(q)$ is a correspondence. It is not lhc at $q = \frac{1}{2}$ since if we let $V = (\frac{1}{4}, \frac{3}{4})$, then $p^*(\frac{1}{2}) \cap V \neq \emptyset$ and there exists no neighborhood $U$ around $\frac{1}{2}$ such that $p^*(q) \cap V \neq \emptyset$ for $q \in U$ (e.g. for any $\frac{1}{2} \geq \varepsilon > 0$, $p^*(\frac{1}{2} - \varepsilon) = 0 \notin (\frac{1}{4}, \frac{3}{4})$ and $p^*(\frac{1}{2} + \varepsilon) = 1 \notin (\frac{1}{4}, \frac{3}{4})$). It is, however, uhc at $q = \frac{1}{2}$ since we must take $V = (a,b)$ with $a < 0$ and $b > 1$ to satisfy $p^*(\frac{1}{2}) = [0,1] \subset V$. But then there exist many neighborhoods $U$ around $\frac{1}{2}$ such that $p^*(q) \subset V$ for $q \in U$ (e.g. for any $\frac{1}{2} \geq \varepsilon > 0$, $p^*(\frac{1}{2} - \varepsilon) = 0 \in (a,b)$ and $p^*(\frac{1}{2} + \varepsilon) = 1 \in (a,b)$). See Figure 4.7.6. Finally, you should recognize that this game is symmetric so that agent 2's payoffs and hence best response correspondence is identical to that of agent 1.
$^{22}$ To see this, note that $\frac{d\pi}{dp} = 2q - 1$ so that $\frac{d\pi}{dp} < 0$ if $q < \frac{1}{2}$, $\frac{d\pi}{dp} = 0$ if $q = \frac{1}{2}$, and $\frac{d\pi}{dp} > 0$ if $q > \frac{1}{2}$.
Table 4.7.1
                            player 2
                       q: U        1-q: D
player 1     p: U      1,1         0,0
           1-p: D      0,0         1,1
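The best response correspondence of Example 282 can be traced out numerically. This is a rough sketch under the assumption that player 1's mixing probability is restricted to a finite grid on $[0,1]$; the function names and the grid size are hypothetical.

```python
def payoff(p, q):
    """Player 1's expected payoff pi_1(p, q) in the coordination game."""
    return p * q + (1 - p) * (1 - q)

def best_response(q, grid=101):
    """All grid points p in [0, 1] that maximize payoff(p, q)."""
    ps = [i / (grid - 1) for i in range(grid)]
    best = max(payoff(p, q) for p in ps)
    return [p for p in ps if abs(payoff(p, q) - best) < 1e-12]

print(best_response(0.25))        # only p = 0
print(best_response(0.75))        # only p = 1
print(len(best_response(0.5)))    # every grid point is optimal at q = 1/2
```

The jump from the single point $0$ to the whole interval and then to the single point $1$ as $q$ crosses $\frac{1}{2}$ is exactly the failure of lower hemicontinuity discussed in the example.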
Just as it was cumbersome to apply Definition 185 to establish compactness, it is similarly cumbersome to apply Definitions 277 and 279 to establish hemicontinuity. In the case of compactness, we provided simple sufficient conditions (e.g. the Heine-Borel Corollary 194). Here we supply another set of simple sufficient conditions to establish hemicontinuity.
Theorem 283 Let $\Gamma : X \rightrightarrows Y$ be a non-empty valued correspondence and let $A$ be its graph. If (i) $A$ is convex and (ii) for any bounded set $\hat{X} \subset X$, there is a bounded set $\hat{Y} \subset Y$ such that $\Gamma(x) \cap \hat{Y} \neq \emptyset$, $\forall x \in \hat{X}$, then $\Gamma$ is lhc at every interior point of $X$.
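Proof. Let $\hat{x}$ be an interior point of $X$, $\hat{y} \in \Gamma(\hat{x})$, and $\langle x_n \rangle \subset X$ with $x_n \to \hat{x}$. Since $x_n$ is convergent, choose $\varepsilon > 0$ such that $\hat{X} = B_\varepsilon(\hat{x}) \subset X$. Let $D$ denote the boundary set of $\hat{X}$. We can represent $x_n$ as a convex combination of $\hat{x}$ and a point in $D$. That is, $\exists \alpha_n, d_n$ such that $x_n = \alpha_n d_n + (1 - \alpha_n)\hat{x}$ where $\alpha_n \in [0,1]$ and $d_n \in D$. Since $D$ is a bounded set, $\alpha_n \to 0$ as $x_n \to \hat{x}$. Choose $\hat{Y}$ such that $\Gamma(x) \cap \hat{Y} \neq \emptyset$, $\forall x \in \hat{X}$. Then for each $n$, choose $\hat{y}_n \in \Gamma(d_n) \cap \hat{Y}$ and set $y_n = \alpha_n \hat{y}_n + (1 - \alpha_n)\hat{y}$. Since $(d_n, \hat{y}_n) \in A$, $\forall n$, $(\hat{x}, \hat{y}) \in A$, and $A$ is convex, then $(x_n, y_n) \in A$, $\forall n$. Since $\alpha_n \to 0$ and $\hat{y}_n \in \hat{Y}$, $y_n \to \hat{y}$. Hence $\langle (x_n, y_n) \rangle \subset A$ and converges to $(\hat{x}, \hat{y})$.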
Theorem 284 Let $\Gamma : X \rightrightarrows Y$ be a non-empty valued correspondence and let $A$ be its graph. If (i) $A$ is closed and (ii) for any bounded set $\hat{X} \subset X$, the set $\Gamma(\hat{X})$ is bounded, then $\Gamma$ is compact valued and uhc.
Proof. Compactness follows directly from (i) and (ii). Let $x_n \to x \in X$ with $\langle x_n \rangle \subset X$. Since $\Gamma$ is non-empty, $\exists y_n \in \Gamma(x_n)$, $\forall n$. Since $x_n \to x$, there is a bounded set $\hat{X} \subset X$ such that $\langle x_n \rangle \subset \hat{X}$ with $x \in \hat{X}$ by Theorem 164. Then by (ii), $\Gamma(\hat{X})$ is bounded. Hence $\langle y_n \rangle \subset \Gamma(\hat{X})$ has a convergent subsequence, $\langle y_{g(n)} \rangle \to y$. Thus, $\langle (x_{g(n)}, y_{g(n)}) \rangle$ is a sequence in $A$ converging to $(x,y)$. Since $A$ is closed, $(x,y) \in A$.
In a future section we will use the following relationship between uhc of a correspondence and the closedness of its graph.
Theorem 285 The graph of a uhc correspondence $\Gamma : X \rightrightarrows Y$ with closed values is closed.
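Proof. We have to prove that $X \times Y \setminus Gr(\Gamma)$ is open. Take $(x,y) \in X \times Y \setminus Gr(\Gamma)$ so that $y \notin \Gamma(x)$. Now we can choose an open neighborhood $V_y$ of $y$ in $Y$ and $V_{\Gamma(x)}$ of $\Gamma(x)$ in $Y$ such that $V_y \cap V_{\Gamma(x)} = \emptyset$. By (ii) of Theorem 280, $U_x = \Gamma^{+1}\left(V_{\Gamma(x)}\right)$ is an open neighborhood of $x$ in $X$; consequently $U_x \times V_y$ is an open neighborhood of $(x,y)$ in $X \times Y$. Because $(U_x \times V_y) \cap Gr(\Gamma) = \emptyset$ we have $U_x \times V_y \subset X \times Y \setminus Gr(\Gamma)$ and hence $X \times Y \setminus Gr(\Gamma)$ is open. See Figure 4.7.7.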
The converse of this theorem doesn't hold as the following example indicates.
Example 286 Consider the function $F : \mathbb{R} \to \mathbb{R}$ given by
$$F(x) = \begin{cases} \frac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}$$
$F$ has a closed graph but is not uhc since it is clear that for an open set $(-\varepsilon, \varepsilon)$ in $\mathbb{R}$, $F^{+1}((-\varepsilon, \varepsilon)) = (-\infty, -\frac{1}{\varepsilon}) \cup \{0\} \cup (\frac{1}{\varepsilon}, \infty)$, which is not open. However if the image $F(X)$ is compact, or a subset of a compact set, then the converse of Theorem 285 holds (i.e. a closed graph implies uhc). Hence, closedness of the graph can be used as a criterion of uhc. See Figure 4.7.8.
Theorem 287 Let $\Gamma : X \rightrightarrows Y$ be a correspondence such that $\Gamma(X) \subset K$ where $K$ is compact and the graph $Gr(\Gamma)$ is closed. Then $\Gamma$ is uhc.
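Proof. Assume to the contrary that $\Gamma$ is not uhc at $x_0$. Then there exists an open neighborhood $V_{\Gamma(x_0)}$ of $\Gamma(x_0)$ in $Y$ such that for every open neighborhood $U_{x_0}$ of $x_0$ in $X$ we have that $\Gamma(U_{x_0})$ is not contained in $V_{\Gamma(x_0)}$. We take $U_{x_0} = B_{\frac{1}{n}}(x_0)$, $n \in \mathbb{N}$. Then for every $n$ we get a point $x_n \in B_{\frac{1}{n}}(x_0)$ such that $\Gamma(x_n)$ is not contained in $V_{\Gamma(x_0)}$. Let $y_n \in \Gamma(x_n)$ and $y_n \notin V_{\Gamma(x_0)}$. Then we have $\langle x_n \rangle \to x_0$ and $\langle y_n \rangle \subset K$. Since $K$ is compact, there exists a subsequence $\langle y_{g(n)} \rangle \to y \in K$. Since $y_n \notin V_{\Gamma(x_0)}$, $\forall n$, this implies $y_n \in Y \setminus V_{\Gamma(x_0)}$. Since $Y \setminus V_{\Gamma(x_0)}$ is closed, then $y \in Y \setminus V_{\Gamma(x_0)}$ so that $y \notin V_{\Gamma(x_0)}$. Then we have $\langle (x_{g(n)}, y_{g(n)}) \rangle \subset Gr(\Gamma)$ and $\langle (x_{g(n)}, y_{g(n)}) \rangle \to (x_0, y)$. Since $Gr(\Gamma)$ is closed, $(x_0, y) \in Gr(\Gamma)$, that is, $y \in \Gamma(x_0) \subset V_{\Gamma(x_0)}$. But this contradicts $y \notin V_{\Gamma(x_0)}$.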
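Now we state a few lemmas that will be very useful in the next chapter.
Definition 288 Let $(X,d)$ be a metric space and $(Y, \|\cdot\|)$ be a normed vector space. Let $\Gamma : X \rightrightarrows Y$ be a correspondence. Then we can define two new correspondences: $\bar{\Gamma}$ (the closure of $\Gamma$) and $co(\Gamma)$ (the convex hull of $\Gamma$) by the following
$$\bar{\Gamma} : X \rightrightarrows Y \text{ given by } \bar{\Gamma}(x) = \overline{\Gamma(x)}, \ \forall x \in X$$
$$co(\Gamma) : X \rightrightarrows Y \text{ given by } (co(\Gamma))(x) = co(\Gamma(x)), \ \forall x \in X.$$
Note $\bar{\Gamma}$ is by definition always closed valued and $co(\Gamma)$ is by definition always convex valued.
Example 289 $\Gamma : [0,1] \rightrightarrows \mathbb{R}$ given by $\Gamma(x) = [0,x)$. Then $\bar{\Gamma}(x) = [0,x]$. See Figure 4.7.9.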
Example 290 $\Gamma : [0,1] \rightrightarrows \mathbb{R}$ given by $\Gamma(x) = \{0, x\}$. Then $co(\Gamma(x)) = [0,x]$. See Figure 4.7.10.
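Lemma 291 If $\Gamma : X \rightrightarrows Y$ is lhc then $\bar{\Gamma}$ is also lhc.
Proof. The proof uses the following result:
$$\text{If } G \text{ is open in } Y \text{ and if } A \subset Y, \text{ then } A \cap G \neq \emptyset \text{ iff } \bar{A} \cap G \neq \emptyset. \quad (4.10)$$
Since $A \cap G \subset \bar{A} \cap G$, one direction is clear. Let $\bar{A} \cap G \neq \emptyset$. If $x \in \bar{A} \cap G$, then $x \in \bar{A}$ and $x \in G$, so $\exists \langle x_n \rangle \to x$ with $x_n \in A$, $\forall n \in \mathbb{N}$. Since $G$ is open, $\exists \varepsilon$ such that $B_\varepsilon(x) \subset G$. Since $\langle x_n \rangle \to x$, we have $x_n \in B_\varepsilon(x) \subset G$, $\forall n$ sufficiently large. Hence $x_n \in A \cap G$ so that $A \cap G \neq \emptyset$.
Now we need to prove that $\bar{\Gamma}^{-1}(V)$ is open in $X$ if $V$ is open in $Y$. But from (4.10), $\bar{\Gamma}^{-1}(V) = \Gamma^{-1}(V)$ which is open because $\Gamma$ is lhc.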
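Lemma 292 If $Y$ is a normed vector space and $\Gamma : X \rightrightarrows Y$ is lhc, then $co(\Gamma)$ is lhc.
Proof. Let $x \in X$, $\langle x_n \rangle \to x$, and $y \in co(\Gamma(x))$. We need to show that $\exists \langle y_n \rangle$ such that $\langle y_n \rangle \to y$ and $y_n \in co(\Gamma(x_n))$. Since $y \in co(\Gamma(x))$, then $y = \sum_{i=1}^m \lambda_i y^i$, where $y^i \in \Gamma(x)$ and $\sum_{i=1}^m \lambda_i = 1$. Since $\Gamma$ is lhc, $\exists \langle y^i_n \rangle_{n=1}^\infty$ such that $y^i_n \in \Gamma(x_n)$ and $\langle y^i_n \rangle \to y^i$ for each $i = 1, \ldots, m$. Let $y_n = \sum_{i=1}^m \lambda_i y^i_n$. Then $\langle y_n \rangle \to y$ and $y_n \in co(\Gamma(x_n))$.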
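Given two correspondences $\Gamma_1 : X \rightrightarrows Y$ and $\Gamma_2 : X \rightrightarrows Y$, provided that $\Gamma_1(x) \cap \Gamma_2(x) \neq \emptyset$, $\forall x \in X$, we can define a new correspondence
$$\Gamma_1 \cap \Gamma_2 : X \rightrightarrows Y$$
given by
$$(\Gamma_1 \cap \Gamma_2)(x) = \Gamma_1(x) \cap \Gamma_2(x).$$
Also let $(X,d)$ be a metric space and $A \subset X$. The subset $A$ can be expanded by a non-negative factor $\beta$, denoted by $\beta + A$, where $\beta + A = \cup_{a \in A} B_\beta(a) = \{x \in X : d(x, A) < \beta\}$.$^{23}$ See Figure 4.7.11. Then for a correspondence $\Gamma : X \rightrightarrows Y$ where $Y$ is a normed vector space, we have $\beta + \Gamma(x) = \{y \in Y : \|\Gamma(x) - y\| < \beta\}$. We say that $\beta + \Gamma(x)$ is a $\beta$-band around the set $\Gamma(x)$. See Figure 4.7.12.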
We need the following lemma for Michael's selection theorem which is critical for the proof of a fixed point of a correspondence.
Lemma 293 If $Y$ is a normed vector space, if a correspondence $F$ is defined by $F(x) = \beta + f(x)$ where $f$ is a continuous function from $X$ to $Y$, and if $\Gamma : X \rightrightarrows Y$ is a lhc correspondence, then $F \cap \Gamma$ is lhc.$^{24}$
Proof. If $\langle x_n \rangle \to x$ and $y \in F(x) \cap \Gamma(x)$, then $y \in \Gamma(x)$. Since $\Gamma$ is lhc, $\exists \langle y_n \rangle$ such that $y_n \in \Gamma(x_n)$ and $\langle y_n \rangle \to y$. We need to show that $y_n \in F(x_n)$ (i.e. $y_n \in (f(x_n) - \beta, f(x_n) + \beta)$) for $n$ large enough. But by the triangle property of a norm, we have
$$\|y_n - f(x_n)\| \leq \|y_n - y\| + \|y - f(x)\| + \|f(x) - f(x_n)\|. \quad (4.11)$$
The first term is sufficiently small because $\langle y_n \rangle \to y$ and the third term is sufficiently small because $\langle x_n \rangle \to x$ and $f$ is continuous. Since $y \in F(x) = (f(x) - \beta, f(x) + \beta)$, the second term is less than $\beta$. Hence for $n$ large enough, the right hand side of (4.11) is less than $\beta$ and thus $y_n \in F(x_n)$.
$^{23}$ Note that this distance between a point and a set is defined in 127.
$^{24}$ Remember that $F$ is a correspondence not a function; $F(x) = (f(x) - \beta, f(x) + \beta)$ which is an interval for every $x$.
4.7.1 Theorem of the Maximum
In economics, often we wish to solve optimization problems where households maximize their utility subject to constraints on their purchases of goods or firms maximize their profits subject to constraints given by their technology. In particular, consider the following example.
Example 294 A household has preferences over two consumption goods $(c_1, c_2)$ characterized by a utility function $U : \mathbb{R}^2_+ \to \mathbb{R}$ given by $U(c_1, c_2) = c_1 + c_2$. The household has a positive endowment of good 2 denoted $\omega \in \mathbb{R}_+$. The household can trade its endowment on a competitive market to obtain good 1 where the price of good 1 in terms of good 2 is given by $p \in \mathbb{R}_+$. The household's purchases are constrained by its income; its budget set is given by
$$B(p, \omega) = \{(c_1, c_2) \in \mathbb{R}^2_+ : p c_1 + c_2 \leq \omega\}.$$
Taking prices as given, the household's problem is
$$v(p, \omega) = \max_{(c_1, c_2) \in B(p,\omega)} U(c_1, c_2). \quad (4.12)$$
The first question we might ask is does a solution to this problem exist? When is it unique? How does it change as we vary parameters? The maximum theorem gives us an answer to these questions.
Before turning to the theorem, let us continue to work with Problem (4.12). First, let us establish properties of the budget set. In particular, we establish that if $p \in \mathbb{R}_{++}$, then $B(p, \omega)$ is a compact-valued, continuous correspondence. In this case, we will establish that the graph of the budget correspondence $A = \{(p, \omega, c_1, c_2) \in \mathbb{R}^2_+ \times \mathbb{R}^2_+ : (c_1, c_2) \in B(p, \omega)\}$ satisfies the conditions of Theorems 283 and 284 only when $p > 0$. It is obviously non-empty since $(0,0) \in B(p,\omega)$ for any $(p, \omega) \in \mathbb{R}^2_+$. The problem is that for any bounded set $\hat{X} \subset \mathbb{R}^2_+$ of prices and incomes, there may not be a bounded set $\hat{Y} \subset \mathbb{R}^2_+$ of consumptions. In particular, if $p > 0$, $B(p, \omega)$ is bounded since $0 \leq c_2 \leq \omega$ and $0 \leq c_1 \leq \frac{\omega}{p}$, but if $p = 0$, $c_1$ is unbounded. See Figure 4.7.13. Under the assumption that $p > 0$, however, we have that $B(p, \omega)$ is a non-empty, compact valued, continuous correspondence.
Next we establish continuity of the utility function $U$. In particular, we show that $\forall \varepsilon > 0$, $\exists \delta > 0$ such that if
$$\sqrt{(c_1 - x_1)^2 + (c_2 - x_2)^2} < \delta, \quad (4.13)$$
then
$$|U(c_1, c_2) - U(x_1, x_2)| < \varepsilon. \quad (4.14)$$
Now rewrite the lhs of (4.14) as
$$|c_1 + c_2 - x_1 - x_2| \leq |c_1 - x_1| + |c_2 - x_2|$$
where the inequality follows from the triangle inequality. Each term on the right is no larger than the Euclidean distance in (4.13), so if we let $\delta = \frac{\varepsilon}{2}$, then (4.13) implies (4.14), establishing continuity of $U$. It is also instructive to graph the level sets (or "indifference curves") of $U$. These are just given by the equations $c_2 = U - c_1$ in Figure 4.7.14 as we vary $U$. In the same figure we also plot budget sets with $p > 1$, $p = 1$, and $0 < p < 1$. It is simple to see from the figure that the solution, which we denote by "$*$", to the household's problem (4.12) is given by the demand correspondence
$$(c_1^*, c_2^*) = \begin{cases} (0, \omega) & \text{if } p > 1 \\ (x, \omega - x) \text{ with } x \in [0, \omega] & \text{if } p = 1 \\ \left(\frac{\omega}{p}, 0\right) & \text{if } p < 1. \end{cases}$$
That is, if goods 1 and 2 are perfect substitutes for each other from the household's preference perspective, then if good 1 is expensive (inexpensive), the household consumes none of (only) it, while if the two goods are the same price the possibilities are uncountable! Notice that the value function is continuous and increasing
$$v(p, \omega) = \begin{cases} \omega & \text{if } p \geq 1 \\ \frac{\omega}{p} & \text{if } 1 > p > 0. \end{cases}$$
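The closed-form value function of Example 294 can be compared with a brute-force search. The sketch below assumes that the budget constraint binds at an optimum (reasonable here because $U$ is strictly increasing in both goods), so it searches only along the budget line $c_2 = \omega - p c_1$; the helper `brute_value` and the grid size are hypothetical.

```python
def value(p, omega):
    """Closed-form value function v(p, omega) from the text, for p > 0."""
    return omega if p >= 1 else omega / p

def brute_value(p, omega, n=1000):
    """Grid search for max of c1 + c2 along the budget line c2 = omega - p*c1."""
    c1_max = omega / p                      # largest affordable amount of good 1
    grid = (c1_max * i / n for i in range(n + 1))
    return max(c1 + (omega - p * c1) for c1 in grid)

for p in (0.5, 1.0, 2.0):
    print(p, value(p, 3.0), brute_value(p, 3.0))

# Continuity of v in p at the kink p = 1:
print(abs(value(1.0, 3.0) - value(1.0 - 1e-9, 3.0)))
```

Although the set of maximizers jumps from one corner of the budget set to the other as $p$ crosses 1, the maximized value itself does not jump, which is what the Theorem of the Maximum leads one to expect.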
There is a more formal way of establishing the existence of a solution to such mathematical programming problems and how the solution varies with parameters. In general, let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$, $f : X \times Y \to \mathbb{R}$ be a single valued function, $\Gamma : X \rightrightarrows Y$ be a non-empty correspondence and consider the problem $\sup_{y \in \Gamma(x)} f(x,y)$. If for each $x$, $f(x, \cdot)$ is continuous in $y$ and the set $\Gamma(x)$ is compact, then we know from the Extreme Value Theorem 262 that for each $x$ the maximum is attained. In this case,
$$v(x) = \max_{y \in \Gamma(x)} f(x,y) \quad (4.15)$$
is well defined and the set of values $y$ which attain the maximum
$$G(x) = \{y \in \Gamma(x) : f(x,y) = v(x)\} \quad (4.16)$$
is non-empty (but possibly multivalued). The Maximum theorem puts further restrictions on $\Gamma$ to ensure that $v$ and $G$ vary in a continuous way with $x$. The proof works in the following way. Consider a convergent sequence of elements in the constraint set $x_n \to x \in X$ (which we can always find since $\Gamma$ is compact valued). By the extreme value theorem, there is a corresponding sequence of optimizing choices $y_n \in G(x_n)$ and $y_n \to y$. We must show that the limit of that sequence $y$ is the optimizing choice in the constraint set defined at $x$. There are two parts to demonstrating this result. First we must show that $y$ is in the constraint set ($y \in \Gamma(x)$). Then we must show $y$ is the optimizing choice in $\Gamma(x)$.
Theorem 295 (Berge's Theorem of the Maximum) Let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$, $f : X \times Y \to \mathbb{R}$ be a continuous function, and $\Gamma : X \rightrightarrows Y$ be a nonempty, compact-valued, continuous correspondence. Then $v : X \to \mathbb{R}$ defined in (4.15) is continuous and the correspondence $G : X \rightrightarrows Y$ defined in (4.16) is nonempty, compact valued, and uhc.
Proof. The Extreme Value Theorem 262 ensures that for each $x$ the maximum is attained and $G(x)$ is nonempty. Since $G(x) \subset \Gamma(x)$ and $\Gamma(x)$ is compact, $G(x)$ is bounded. To show $G(x)$ is closed, we suppose $y_n \to y$ with $y_n \in G(x)$, $\forall n$, and need to show that $y \in G(x)$.$^{25}$ Since $\Gamma(x)$ is closed, $y \in \Gamma(x)$. Since $v(x) = f(x, y_n)$, $\forall n$, and $f$ is continuous, then $v(x) = f(x,y)$ and $y \in G(x)$. Thus, $G(x)$ is nonempty and compact for each $x$.
$^{25}$ This follows from Definition 111.
To see that $G(x)$ is uhc, let $x_n \to x$ and choose $y_n \in G(x_n)$. We need to show that there exists a convergent subsequence $\langle y_{g(n)} \rangle \to y$ and $y \in G(x)$. Since $\Gamma$ is uhc, $\exists \langle y_{g(n)} \rangle$ converging to $y \in \Gamma(x)$ by Theorem 280. Consider an alternative $z \in \Gamma(x)$. Since $\Gamma$ is lhc, $\exists \langle z_{g(n)} \rangle$ converging to $z$ with $z_{g(n)} \in \Gamma(x_{g(n)})$, $\forall g(n)$, by Theorem 278. Since $f(x_{g(n)}, y_{g(n)}) \geq f(x_{g(n)}, z_{g(n)})$, $\forall g(n)$, by optimality and $f$ is continuous, $f(x,y) \geq f(x,z)$. Since this holds for any $z \in \Gamma(x)$, then $y \in G(x)$, satisfying uhc.
To see that $v(x)$ is continuous, fix $x$ and let $x_n \to x$. Choose $y_n \in G(x_n)$, $\forall n$. Let $\bar{v} = \limsup v(x_n)$ and $\underline{v} = \liminf v(x_n)$. We can choose $\langle x_{g(n)} \rangle$ (a subsequence corresponding to $\langle y_{g(n)} \rangle$ above) such that $\bar{v} = \lim f(x_{g(n)}, y_{g(n)})$. Since $G$ is uhc, $\exists \langle y_{h(g(n))} \rangle$ converging to $y \in G(x)$. Hence $\bar{v} = \lim f(x_{h(g(n))}, y_{h(g(n))}) = f(x,y) = v(x)$. An analogous argument establishes that $\underline{v} = v(x)$. Hence $\langle v(x_n) \rangle$ converges to $v(x)$.
The next three examples illustrate the Maximum theorem with simple
mathematical problems.
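Example 296 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = y$ and $\Gamma : X \rightrightarrows Y$ be given by
$$\Gamma(x) = \begin{cases} [0,1] & \text{if } x \leq 1 \\ \frac{1}{2} & \text{if } x > 1. \end{cases}$$
Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then
$$v(x) = \begin{cases} 1 & \text{if } x \leq 1 \\ \frac{1}{2} & \text{if } x > 1 \end{cases} \quad \text{and} \quad G(x) = \begin{cases} 1 & \text{if } x \leq 1 \\ \frac{1}{2} & \text{if } x > 1. \end{cases}$$
Notice that $v(x)$ is not continuous and that $G(x)$ is not uhc. What condition of Theorem 295 did we violate? The constraint correspondence is not continuous; in particular, while $\Gamma(x)$ is uhc, it is not lhc. See Figure 4.7.15.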
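The jump in Example 296 is simple enough to reproduce directly. In this small sketch the constraint set is stored as a closed interval `(lo, hi)`, an encoding chosen here for illustration only, and the maximum of the increasing objective $f(y) = y$ is read off as the right endpoint.

```python
def gamma(x):
    """Constraint set of Example 296 as a closed interval (lo, hi)."""
    return (0.0, 1.0) if x <= 1 else (0.5, 0.5)

def v(x):
    """Value function: f(y) = y is increasing, so the max over [lo, hi] is hi."""
    lo, hi = gamma(x)
    return hi

def G(x):
    """Set of maximizers, a singleton in this example."""
    return {v(x)}

for x in (0.9, 1.0, 1.0 + 1e-9, 1.1):
    print(x, gamma(x), v(x), G(x))
```

Moving from $x = 1$ to any $x > 1$ shrinks the constraint set from $[0,1]$ to the single point $\frac{1}{2}$, and the value drops from 1 to $\frac{1}{2}$ no matter how small the step.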
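Example 297 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = \cos(y)$, and $\Gamma : X \rightrightarrows Y$ be given by $\Gamma(x) = \{y \in Y : -x \leq y \leq x \text{ for } x \geq 0 \text{ and } x \leq y \leq -x \text{ for } x < 0\}$. Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then $v(x) = 1$, $\forall x$, and
$$G(x) = \begin{cases} \{0\} & \text{if } |x| < 2\pi \\ \{-2\pi, 0, 2\pi\} & \text{if } 2\pi \leq |x| < 4\pi \\ \{-4\pi, -2\pi, 0, 2\pi, 4\pi\} & \text{if } 4\pi \leq |x| < 6\pi \\ \text{etc.} & \text{etc.} \end{cases}$$
Notice that $G(x)$ is uhc but not lhc since, for example, if we take $V = (2\pi - \varepsilon, 2\pi + \varepsilon)$ with $2\pi > \varepsilon > 0$, then $G(2\pi) \cap V \neq \emptyset$ but $\forall \delta > 0$ $\exists x' \in B_\delta(2\pi)$ such that $G(x') \cap V = \emptyset$ (in particular all those $x' < 2\pi$). See Figure 4.7.16.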
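Example 298 Let $X = \mathbb{R}$, $Y = \mathbb{R}$, $f : Y \to \mathbb{R}$ be given by $f(y) = y^2$ and $\Gamma : X \rightrightarrows Y$ be given by $\Gamma(x) = \{y \in Y : -x \leq y \leq x \text{ for } x \geq 0 \text{ and } x \leq y \leq -x \text{ for } x < 0\}$. Consider the problem $v(x) = \max_{y \in \Gamma(x)} f(y)$. Then
$$v(x) = x^2, \ \forall x, \quad \text{and} \quad G(x) = \{-x, x\}.$$
Notice that $G(x)$ is uhc and lhc but not convex valued. See Figure 4.7.17.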
If we put more restrictions on the objective function and the constraint
correspondence we can show that the set of maximizers G(x) is single-valued
and continuous.
Theorem 299 Let $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$. Let $\Gamma : X \rightrightarrows Y$ be a nonempty, compact- and convex-valued, continuous correspondence. Let $A$ be the graph of $\Gamma$ and assume $f : A \to \mathbb{R}$ is a continuous function and that $f(x, \cdot)$ is strictly concave, for each $x \in X$.$^{26}$ If we define
$$g^*(x) = \arg\max_{y \in \Gamma(x)} f(x,y),$$
then $g^*(x)$ is a continuous function. If $X$ is compact, then $g^*(x)$ is uniformly continuous.
Exercise 4.7.2 Prove Theorem 299.
We illustrate Theorem 299 through the next exercise.
Exercise 4.7.3 In Example 294 let the utility function $U : \mathbb{R}^2_+ \to \mathbb{R}$ be given by $u(c_1) + u(c_2)$ where $u : \mathbb{R}_+ \to \mathbb{R}$ is a strictly increasing, continuous, strictly concave function. Establish the following: (i) The objective function $U(c_1, c_2)$ is continuous and strictly concave on $\mathbb{R}^2$; (ii) The budget correspondence $B(p,y)$ is compact and convex; (iii) Existence and uniqueness of the set of maximizers; (iv) $v(p,y)$ is increasing in $y$ and decreasing in $p$; (v) $v(p,y)$ is continuous (try this as a proof by contradiction).
$^{26}$ We say $f : \mathbb{R} \to \mathbb{R}$ is strictly concave if $f(\alpha x + (1-\alpha)z) > \alpha f(x) + (1-\alpha) f(z)$ for $x, z \in \mathbb{R}$ with $x \neq z$ and $\alpha \in (0,1)$.
4.8 Fixed Points and Contraction Mappings
One way to prove the existence of an equilibrium of an economic environment amounts to showing there is a zero solution to a system of excess demand equations. In the case of Example 294, households take as given the relative price $p$ and optimization may induce a continuous aggregate excess demand correspondence $ED(p)$ for good 1. If there is excess demand (supply), prices rise (fall) until equilibrium ($ED(p) = 0$) is achieved. We may represent this "tatonnement" process by the mapping $f(p) = p + ED(p)$. In that case, an equilibrium is equivalent to a fixed point $p = f(p)$.
Definition 300 Let $(X,d)$ be a metric space and $f : X \to X$ be a function or correspondence. We call $x \in X$ a fixed point of the function if $x = f(x)$ or of the correspondence if $x \in f(x)$.
We now present four different fixed point theorems based upon different assumptions on the mapping $f$.
4.8.1 Fixed points of functions
The first fixed point theorem does not require continuity of $f$ but uses only the fact that $f$ is nondecreasing.
Theorem 301 (Tarski) Let $f : [a,b] \to [a,b]$ be a non-decreasing function (that is, if $x > y$ for $x, y \in [a,b]$, then $f(x) \geq f(y)$), $\forall a, b \in \mathbb{R}$ with $a < b$. Then $f$ has a fixed point.
Proof. Let $P = \{x \in [a,b] : x \leq f(x)\}$. We prove this in 4 parts. (i) Since $f(a) \in [a,b]$ implies $a \leq f(a)$, then $a \in P$ and hence $P$ is non-empty. (ii) Since $P \subset [a,b]$ and $[a,b]$ is bounded, then $P$ is bounded. Therefore, by the Completeness Axiom 3, $\bar{x} = \sup P$ exists. (iii) Since $\forall x \in P$, $x \leq \bar{x}$ by (ii), we have $f(x) \leq f(\bar{x})$ because $f$ is nondecreasing. Since $x \in P$, $x \leq f(x) \leq f(\bar{x})$ so that $f(\bar{x})$ is an upper bound of $P$. Therefore, $\bar{x} \leq f(\bar{x})$ since $\bar{x}$ is the least upper bound and hence $\bar{x} \in P$. (iv) Since $\bar{x} \leq f(\bar{x})$ implies $f(\bar{x}) \leq f[f(\bar{x})]$, we know $f(\bar{x}) \in P$. Therefore, $\bar{x} \geq f(\bar{x})$ since $\bar{x}$ is an upper bound of $P$. Given that $\bar{x} \leq f(\bar{x})$ and $\bar{x} \geq f(\bar{x})$ we know that $\bar{x} = f(\bar{x})$. Note that we have not ruled out that there may be other points $x'$ such that $x' = f(x')$. If so, then all such points $x' \in P$. Our solution $\bar{x}$ is the maximal fixed point.
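The set $P$ used in the proof suggests a direct computation. The following sketch approximates $\sup P$ on a finite grid, which is an assumption of the illustration and not part of the proof: on a grid the supremum is only approximated unless the fixed point happens to be a grid point, as it is for the hypothetical step function below. Note that the function is nondecreasing but discontinuous, so continuity plays no role.

```python
def maximal_fixed_point(f, a, b, n=10_000):
    """Approximate sup{x in [a, b] : x <= f(x)} on an evenly spaced grid."""
    grid = [a + (b - a) * i / n for i in range(n + 1)]
    P = [x for x in grid if x <= f(x)]     # P is non-empty because a <= f(a)
    return max(P)

def step(x):
    """A nondecreasing, discontinuous map of [0, 1] into itself."""
    return 0.25 if x < 0.5 else 0.75

x_bar = maximal_fixed_point(step, 0.0, 1.0)
print(x_bar, step(x_bar))     # the maximal fixed point and its image
print(step(0.25))             # 0.25 is a second, smaller fixed point
```

Here $P = [0, 0.25] \cup [0.5, 0.75]$, so the procedure returns the larger of the two fixed points, in line with the remark at the end of the proof.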
The proof is illustrated in Figure 4.8.1. For a more general version of this proof, see Aliprantis and Border (1999, Theorem 1.8).
The next result by Brouwer requires that $f$ be a continuous function. We saw a one dimensional version of it in section 4.6 which used the Intermediate Value Theorem 255. That proof was very simple but the method we used there cannot be extended to higher dimensions. As it turns out proving it in $\mathbb{R}^n$ where $n \geq 2$ is quite difficult. There are proofs that use calculus but we are going to present an elementary one based on simplexes which were introduced in section 4.5.2. Brouwer's fixed point theorem could be stated for a non-empty convex, compact subset of $\mathbb{R}^n$. Because a nondegenerate simplex is homeomorphic with (i.e. topologically equivalent to) a nonempty, convex, compact subset of $\mathbb{R}^n$ it suffices to state Brouwer's theorem for the simplex.
Exercise 4.8.1 Show that a simplex is homeomorphic with a nonempty, convex, compact subset $P = \{(p_1, p_2) \in \mathbb{R}^2 : 0 \leq p_1, p_2 \leq M, M \text{ finite}\}$.
For notational simplicity and better intuition we prove Brouwer's theorem in $\mathbb{R}^2$ but this simplification has no effect whatsoever on the logic of the proof. The proof for general $\mathbb{R}^n$ can be replicated with only minor notational changes.
Theorem 302 (Brouwer) If $f(x)$ maps a nondegenerate simplex continuously into itself then there is a fixed point $x^* = f(x^*)$.
Proof. (Sketch) The farther a point is from a vertex, the smaller is its barycentric coordinate. Thus, in Figure 4.8.12, $a$'s largest barycentric coordinate is the first one while $b$'s largest barycentric coordinate is the second one. For a given $f$, we introduce an indexing function $I(x)$ as follows. Let $y = f(x)$ and $I(x) = \min\{i : x_i > y_i\}$. If $b = f(a)$, then $I(a) = 0$ because $a_0 > b_0$ (the arrow connecting $a$ with $f(a)$ points away from the vertex $v_0$). $x^*$ is a fixed point of $f$ if $\alpha^*_i = \beta^*_i$, $i = 0, 1, 2$, where $\alpha^*_i$ and $\beta^*_i = f_i(x^*)$ are barycentric coordinates of $x^*$ and $f(x^*)$. See Figure 4.8.4. In the case of barycentric coordinates, instead of equality (4.24) it suffices to show the following inequalities:
$$\alpha^*_i \geq \beta^*_i, \quad i = 0, 1, 2 \quad (4.17)$$
because $\alpha^*_i \geq 0$, $\beta^*_i \geq 0$ and $\sum_{i=0}^2 \alpha^*_i = 1 = \sum_{i=0}^2 \beta^*_i$. Specifically, if $f$ doesn't have a fixed point, then $I(x)$ is well defined for all $x \in S$ and obtains values 0, 1, or 2 with certain restrictions on the boundary. Divide the simplex into $m^2$ equal subsimplexes and index all the vertices of the subsimplexes using $I(x)$ obeying restrictions on the boundary. Sperner's lemma guarantees that for each $m$ there is at least one simplex with a complete set of indices (i.e. arrows originating at these vertices point inside the triangle). By choosing one vertex of such a simplex for each $m$ we get an infinite sequence of points from $S$; that is, the sequence is bounded. Hence by the Bolzano-Weierstrass theorem there exists a convergent subsequence with the limit point $x^* \in S$. As $m \to \infty$, a triangle collapses into one point (which is $x^*$) (at this point all arrows point inside itself). Since $f$ is continuous, it preserves inequalities so that $x^*$ is a fixed point of $f$.
In Chapter 6, we will introduce an infinite dimensional version of Brouwer's fixed point theorem by Schauder.
Example 303 (On Existence of Equilibrium) Consider the following 2 period $t = 1, 2$ exchange problem with a large number $I$ of identical agents. Let $c^i_t, y^i_t, q_t$ denote an element (date $t$) of agent $i$'s consumption and endowment vector, as well as the price vector, respectively. Let a representative agent $i$'s budget set be given by $B(q, y^i) = \{c^i \in \mathbb{R}^2 : \sum_{t=1}^2 q_t c^i_t \leq \sum_{t=1}^2 q_t y^i_t\}$. Notice that an agent's budget set is homogeneous of degree zero in $q$.$^{27}$ That is, $B(\lambda q, y^i) = B(q, y^i)$. Thus, we are free to take $\lambda = \frac{1}{\sum_{t=1}^2 q_t} > 0$ and set $p = \left(\frac{q_1}{\sum_{t=1}^2 q_t}, \frac{q_2}{\sum_{t=1}^2 q_t}\right)$. This defines a one dimensional price simplex $S^1 = \left\{p \in \mathbb{R}^2_+ : \sum_{t=1}^2 p_t = 1\right\}$.
$^{27}$ We say a function $f(x)$ is homogeneous of degree $k = 0, 1, 2, \ldots$ if for any $\lambda > 0$, $f(\lambda x) = \lambda^k f(x)$.
Let the representative agent $i$'s utility function be given by $U(c^i) = \sum_{t=1}^2 \log(c^i_t)$. Exercise 4.7.3 establishes that $B(p, y^i)$ is a non-empty, compact- and convex-valued continuous correspondence and that $U(c^i)$ is strictly concave. Thus by version 299 of the Theorem of the Maximum the set of maximizers $\{c^i_1(p,y), c^i_2(p,y)\}$ are single valued and continuous functions. Since the sum of continuous functions is continuous, the aggregate excess demand function $z : S^1 \to \mathbb{R}^2$ given by $z(p) = \sum_{i=1}^I \left(c^i(p,y) - y^i\right)$ is continuous. It is a consequence of Walras Law that the inner product $\langle p, z(p) \rangle = 0$. To prove existence of equilibrium, we need to show that at the equilibrium price vector $p^*$ there is no excess demand (i.e. $z(p^*) \leq 0$). Specifically, we must show that if $z : S^1 \to \mathbb{R}^2$ is continuous and satisfies $\langle p, z(p) \rangle = 0$, then $\exists p^* \in S^1$ such that $z(p^*) \leq 0$ (in the case that all goods are desirable, this is $z(p^*) = 0$). To this end, define the mapping which raises the price of any good for which there is excess demand:
$$f_t(p) = \frac{p_t + \max(0, z_t(p))}{1 + \sum_{j=1}^2 \max(0, z_j(p))} \quad \text{for } t = 1, 2.$$
Notice that f
t
(p) is continuous since z
t
and max(·,·) are continuous func-
tions and that f(p) lies in S
1
since
P
2
t=1
f
t
(p)=1.ByBrouwer?sFixed
Point Theorem 302, there is a Txed point where f(p
?
)=p
?
. But this can be
shown (by applying Walras Law) to imply that z
t
(p
?
) ≤ 0.You should con-
vince yourself that with these preferences and endowments, the markets for
current and future goods are cleared if p
?
implies
q
2
q
1
=
y
1
y
2
so that the relative
price of future goods in terms of current goods is lower the more plentiful
future goods or less plentiful current goods are. Since
1
1+r
=
q
2
q
1
, this means
that interest rates are higher the smaller is current output relative to future
output. In other words, identical (representative) agents would like to borrow
against plentiful future output to smooth consumption if current output is
low; this would drive up the interest rate.
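The price-adjustment map of Example 303 can be sketched numerically. The following is a minimal illustration, not part of the text: it assumes a single representative agent, uses the standard closed-form demand for log utility ($c_t = \frac{p \cdot y}{2 p_t}$, derived here rather than taken from the example), and the names `excess_demand`, `price_map` and the endowment `y = (1, 3)` are hypothetical choices.

```python
def excess_demand(p, y):
    """Per-agent excess demand z_t(p) = c_t(p, y) - y_t for U(c) = log c_1 + log c_2.

    Assumes strictly positive prices. With log utility half of wealth is
    spent on each good, so c_t = (p . y) / (2 p_t).
    """
    wealth = p[0] * y[0] + p[1] * y[1]
    return [wealth / (2.0 * p[t]) - y[t] for t in range(2)]


def price_map(p, y):
    """The map f_t(p) = (p_t + max(0, z_t)) / (1 + sum_j max(0, z_j))."""
    z = excess_demand(p, y)
    gains = [max(0.0, z_t) for z_t in z]
    denom = 1.0 + sum(gains)
    return [(p[t] + gains[t]) / denom for t in range(2)]


# Hypothetical endowment: current output is scarce relative to future output.
y = (1.0, 3.0)

# Candidate equilibrium on the simplex with p_2 / p_1 = y_1 / y_2.
p_star = (y[1] / (y[0] + y[1]), y[0] / (y[0] + y[1]))
```

Evaluating `price_map(p_star, y)` returns `p_star` up to rounding, and `excess_demand(p_star, y)` is zero in both markets, which is the market-clearing condition $\frac{q_2}{q_1} = \frac{y_1}{y_2}$ stated in the example.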
4.8.2 Contractions
Note that while the above theorems proved existence, they said nothing about uniqueness. The next set of conditions on $f$ provides both.
Definition 304 Let $(X,d)$ be a metric space and $f : X \to X$ be a function. Then $f$ satisfies a Lipschitz condition if $\exists \gamma > 0$ such that $d(f(x), f(\tilde{x})) \leq \gamma d(x, \tilde{x})$, $\forall x, \tilde{x} \in X$. If $\gamma < 1$, then $f$ is a contraction mapping (with modulus $\gamma$).
One way to interpret the Lipschitz condition is as a restriction on the slope of $f$. That is, $\frac{\Delta y}{\Delta x} = \frac{d(f(x), f(\tilde{x}))}{d(x, \tilde{x})} \leq \gamma$. Then a contraction is simply a function whose slope is everywhere less than 1. If $f$ is Lipschitz, then it is uniformly continuous since we can take $\delta(\varepsilon) = \frac{\varepsilon}{\gamma}$, in which case $d(x, \tilde{x}) < \delta \Rightarrow d(f(x), f(\tilde{x})) < \varepsilon$. On the other hand, if $f$ is uniformly continuous, it may not satisfy the Lipschitz condition as the next example shows.
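Example 305 Let $f : [0,1] \to \mathbb{R}$ be given by $f(x) = \sqrt{x}$. To see that $f$ is uniformly continuous, for any $\varepsilon > 0$, let $\delta(\varepsilon) = \frac{\varepsilon^2}{2}$. Then if $|x - y| < \delta$, we have
$$|\sqrt{x} - \sqrt{y}| \leq \sqrt{2|x - y|} < \sqrt{2 \cdot \frac{\varepsilon^2}{2}} = \varepsilon$$
where the weak inequality follows since $\left(\sqrt{x} - \sqrt{y}\right)^2 = x - 2\sqrt{xy} + y \leq 2(\max\{x,y\} - \min\{x,y\}) = 2|x - y|$. To see that $f$ is not Lipschitz, suppose so. Then for some $\gamma > 0$, $|\sqrt{x} - \sqrt{y}| \leq \gamma |x - y|$ or $\frac{|\sqrt{x} - \sqrt{y}|}{|x - y|} \leq \gamma$, $\forall x \neq y \in [0,1]$. But
$$\frac{|\sqrt{x} - \sqrt{y}|}{|x - y|} = \frac{|\sqrt{x} - \sqrt{y}|}{|\sqrt{x} - \sqrt{y}|\,|\sqrt{x} + \sqrt{y}|} = \frac{1}{|\sqrt{x} + \sqrt{y}|}.$$
Choose $x = \frac{1}{(1+\gamma)^2}$ and $y = 0$ so $x, y \in [0,1]$. Then $\frac{1}{|\sqrt{x} + \sqrt{y}|} = 1 + \gamma$, which contradicts $\frac{|\sqrt{x} - \sqrt{y}|}{|x - y|} \leq \gamma$.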
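The contradiction in Example 305 can be checked numerically. The sketch below is only an illustration under floating-point arithmetic; the helper names `slope_ratio` and `witness` are hypothetical and do not appear in the text.

```python
import math


def slope_ratio(x, y):
    """The difference quotient |sqrt(x) - sqrt(y)| / |x - y| for x != y."""
    return abs(math.sqrt(x) - math.sqrt(y)) / abs(x - y)


def witness(gamma):
    """The point x = 1 / (1 + gamma)^2 chosen in Example 305 (paired with y = 0)."""
    return 1.0 / (1.0 + gamma) ** 2


# For any candidate Lipschitz constant, the quotient at the witness point
# is about 1 + gamma, which exceeds gamma.
for gamma in (0.5, 1.0, 10.0, 1000.0):
    x = witness(gamma)
    print(gamma, slope_ratio(x, 0.0))
```

However large `gamma` is chosen, the witness point moves closer to 0, where the graph of $\sqrt{x}$ becomes arbitrarily steep.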
The next result establishes conditions under which there is a unique fixed point and provides a result on speed of convergence helpful for computational work.
Theorem 306 (Contraction Mapping) If $(X,d)$ is a complete metric space and $f : X \to X$ is a contraction with modulus $\gamma$, then (i) $f$ has a unique fixed point $x \in X$ and (ii) for any $x_0 \in X$, $d(x, f^n(x_0)) \leq \frac{\gamma^n}{1-\gamma} d(f(x_0), x_0)$ where $f^n$ are iterates of $f$.$^{28}$
Proof. Choose $x_0 \in X$ and define $\langle x_n \rangle_{n=0}^{\infty}$ by $x_{n+1} = f(x_n)$ so that $x_n = f^n(x_0)$. Since $f$ is a contraction, $d(x_2, x_1) = d(f(x_1), f(x_0)) \leq \gamma d(x_1, x_0)$. Continuing by induction,
$$d(x_{n+1}, x_n) = d(f(x_n), f(x_{n-1})) \leq \gamma d(x_n, x_{n-1}) \leq \gamma^n d(x_1, x_0), \quad n = 1,2,\ldots \tag{4.18}$$
For any $m > n$,
$$\begin{aligned}
d(x_m, x_n) &\leq d(x_m, x_{m-1}) + \ldots + d(x_{n+2}, x_{n+1}) + d(x_{n+1}, x_n) \\
&\leq \left[\gamma^{m-1} + \ldots + \gamma^{n+1} + \gamma^n\right] d(x_1, x_0) \\
&= \gamma^n \left[\gamma^{m-n-1} + \ldots + \gamma + 1\right] d(x_1, x_0) \\
&\leq \frac{\gamma^n}{1-\gamma} d(x_1, x_0)
\end{aligned} \tag{4.19}$$
where the first line uses the triangle inequality and the second uses (4.18). It follows from (4.19) that $\langle x_n \rangle$ is a Cauchy sequence. Since $X$ is complete, $x_n \to x$. That $x$ is a fixed point follows since
$$\begin{aligned}
d(f(x), x) &\leq d(f(x), f^n(x_0)) + d(f^n(x_0), x) \\
&\leq \gamma d(x, f^{n-1}(x_0)) + d(f^n(x_0), x)
\end{aligned}$$
where the first line uses the triangle inequality and the second simply uses that $f$ is a contraction. Since $f^n(x_0) = x_n \to x$, we have $\lim_{n\to\infty} d(x, f^{n-1}(x_0)) = 0 = \lim_{n\to\infty} d(f^n(x_0), x)$ so that $d(f(x), x) = 0$, or $x$ is a fixed point.
To prove uniqueness, suppose to the contrary there exists another fixed point $x' \neq x$. Then $d(x', x) = d(f(x'), f(x)) \leq \gamma d(x', x)$ implies $\gamma \geq 1$, contrary to $\gamma < 1$ for a contraction.
Finally, the speed of convergence follows since
$$\begin{aligned}
d(x, f^n(x_0)) &\leq d(x, f^m(x_0)) + d(f^m(x_0), f^n(x_0)) \\
&\leq \frac{\gamma^n}{1-\gamma} d(f(x_0), x_0)
\end{aligned}$$
where the first line follows from the triangle inequality and the second from (4.19) and $\lim_{m\to\infty} d(x, f^m(x_0)) = 0$. See Figure 4.8.13.
$^{28}$ The iterates of $f$ (mappings $\{f^n\}$) are defined as $n$-fold compositions $f^0(x) = x$, $f^1(x) = f(x)$, $f^2(x) = f(f^1(x))$, ..., $f^n(x) = f(f^{n-1}(x))$.
Sometimes it is useful to establish a unique fixed point on a given space $X$ and then apply Theorem 306 again on a smaller space to characterize the fixed point more precisely.
Corollary 307 Let $(X,d)$ be a complete metric space and $f : X \to X$ be a contraction with fixed point $x \in X$. If $X'$ is a closed subset of $X$ and $f(X') \subset X'$, then $x \in X'$.
Proof. Let $x_0 \in X'$. Then $\langle f^n(x_0) \rangle$ is a sequence in $X'$ converging to $x$. Since $X'$ is closed, $x \in X'$.
4.8.3 Fixed points of correspondences
In considering fixed points of correspondences we would like to utilize fixed point theorems (particularly Brouwer's fixed point theorem) of functions. How can we reduce the multiple-valued case to the single-valued one? This can be done by means of a selection (i.e. a single-valued function that is selected from a multiple-valued correspondence). Depending on circumstances we might have extra conditions on these choice functions. For instance, we might look for a continuous choice function (called a continuous selection) or for a measurable choice function (called a measurable selection, which we will deal with in the next chapter).
Definition 308 Let $\Gamma : X \rightrightarrows Y$ be a correspondence. Then a single-valued function $\Gamma' : X \to Y$ such that $\Gamma'(x) \in \Gamma(x)$, $\forall x \in X$, is called a selection. See Figure 4.8.14.
The existence of a fixed point of a function proven by Brouwer requires continuity. Hence in this section we will deal with the problem of continuous selection. After proving the existence of a continuous selection, we will use Brouwer's fixed point theorem for functions to show that the selection has a fixed point, which is then obviously a fixed point for the original correspondence. There are two main results of this subsection, Michael's continuous selection theorem and Kakutani's fixed point theorem for correspondences.
First we introduce a new notion, a partition of unity, that will be used in the proof of the selection theorem. Existence of a partition of unity is based on a well-known result from topology.
Lemma 309 (Urysohn) Let $A, B$ be two disjoint, closed subsets of a metric space $X$. Then there exists a continuous function $f : X \to [0,1]$ such that $f(x) = 0$, $\forall x \in A$ and $f(x) = 1$, $\forall x \in B$.
For a proof, see Kelley (1957).
To continue we need to introduce the following topological concept.
Definition 310 Let $X$ be a metric space and let $\{G_i, i \in \Lambda\}$ be an open cover of $X$. Then a partition of unity subordinate to the cover $\{G_i\}$ is a family of continuous real-valued functions $\varphi_i : X \to [0,1]$ such that $\varphi_i(x) = 0$, $\forall x \in X \setminus G_i$, and such that $\forall x \in X$, $\sum_{i \in \Lambda} \varphi_i(x) = 1$.
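Example 311 Let $X = [0,1]$, $G_1 = [0, \frac{2}{3})$, $G_2 = (\frac{1}{3}, 1]$ be an open cover of $[0,1]$. Let
$$\varphi_1(x) = \begin{cases} 1, & 0 \leq x \leq \frac{1}{3} \\ -3\left(x - \frac{2}{3}\right), & \frac{1}{3} < x \leq \frac{2}{3} \\ 0, & \frac{2}{3} < x \leq 1 \end{cases} \qquad \varphi_2(x) = \begin{cases} 0, & 0 \leq x \leq \frac{1}{3} \\ 3\left(x - \frac{1}{3}\right), & \frac{1}{3} < x \leq \frac{2}{3} \\ 1, & \frac{2}{3} < x \leq 1 \end{cases}$$
See Figure 4.8.15. Then $\{\varphi_1, \varphi_2\}$ is a partition of unity subordinate to $\{G_1, G_2\}$.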
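The two defining properties of a partition of unity can be verified directly for the functions in Example 311. The following is a small sketch using exact rational arithmetic; the function names `phi1` and `phi2` and the grid of test points are illustrative choices, and a check on a grid is of course not a proof.

```python
from fractions import Fraction as F


def phi1(x):
    """Equals 1 on [0, 1/3], falls linearly to 0 on (1/3, 2/3], 0 afterwards."""
    if x <= F(1, 3):
        return F(1)
    if x <= F(2, 3):
        return -3 * (x - F(2, 3))
    return F(0)


def phi2(x):
    """Equals 0 on [0, 1/3], rises linearly to 1 on (1/3, 2/3], 1 afterwards."""
    if x <= F(1, 3):
        return F(0)
    if x <= F(2, 3):
        return 3 * (x - F(1, 3))
    return F(1)


# Grid of rational points in [0, 1] that includes 1/3 and 2/3 exactly.
grid = [F(k, 60) for k in range(61)]

# The functions sum to one everywhere on the grid.
sums_to_one = all(phi1(x) + phi2(x) == 1 for x in grid)

# Each function vanishes outside its own cover element.
subordinate = (all(phi1(x) == 0 for x in grid if x >= F(2, 3))
               and all(phi2(x) == 0 for x in grid if x <= F(1, 3)))
```

Both `sums_to_one` and `subordinate` evaluate to `True` on this grid, matching the requirements $\sum_i \varphi_i(x) = 1$ and $\varphi_i(x) = 0$ on $X \setminus G_i$.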
Lemma 312 (Partition of unity) Let $X$ be a metric space and let $\{G_1, \ldots, G_n\}$ be a finite open cover of $X$. Then there exists a partition of unity subordinate to this cover.
Proof. We begin by constructing a new cover $\{H_1, \ldots, H_n\}$ of $X$ by open sets such that (i) $H_i \subset \overline{H_i} \subset G_i$ for $i = 1,2,\ldots,n$ and (ii) $\{H_i : i \leq j\} \cup \{G_i : i > j\}$ is a cover for each $j$. This is done inductively. Let $F_1 = X \setminus \bigcup_{i=2}^{n} G_i$. Then $F_1$ is closed and $F_1 \subset G_1$. The sets $F_1$ and $X \setminus G_1$ are closed disjoint subsets of the metric space $X$ and hence can be separated by two disjoint open sets (see the separation axioms in Chapter 7), $H_1$ and $X \setminus \overline{H_1}$. We have
$$F_1 \subset H_1 \subset \overline{H_1} \subset G_1.$$
This satisfies (ii) for $j = 1$. Now suppose $H_1, H_2, \ldots, H_{k-1}$ have been constructed. Then since $\{H_i : i \leq k-1\} \cup \{G_i : i > k-1\}$ is a cover for $X$, $F_k = X \setminus \left(\left(\bigcup_{i=1}^{k-1} H_i\right) \cup \left(\bigcup_{i=k+1}^{n} G_i\right)\right) \subset G_k$. Again by separating $F_k$ and $X \setminus G_k$ we get $H_k$ such that $F_k \subset H_k \subset \overline{H_k} \subset G_k$. Clearly the collection $\{H_1, \ldots, H_k\}$ satisfies (ii) with $j = k$. By Urysohn's lemma 309 we can construct real-valued functions $\psi_i$ on $X$ such that $\psi_i(x) = 0$ if $x \in X \setminus G_i$, $\psi_i(x) = 1$ if $x \in \overline{H_i}$, and $0 \leq \psi_i \leq 1$. Finally, let
$$\varphi_i(x) = \frac{\psi_i(x)}{\sum_{j=1}^{n} \psi_j(x)}.$$
Since the collection $\{H_i, i = 1,2,\ldots,n\}$ is a cover, we have $\sum_{j=1}^{n} \psi_j(x) \neq 0$ for each $x$ and hence $\varphi_i(x)$ is well-defined. $\{\varphi_i\}_{i=1}^{n}$ is the partition of unity subordinate to the cover $\{G_i\}_{i=1}^{n}$.
Theorem 313 (Michael) Let $X$ be a metric space and $Y$ be a Banach space, $\Gamma : X \rightrightarrows Y$ be lhc and $\Gamma(x)$ closed and convex for every $x \in X$. Then $\Gamma$ admits a continuous selection.
Proof. We prove the theorem under a stronger assumption than is necessary, namely that $X$ is a compact metric space.$^{29}$ We first show that for each positive real number $\beta$ there exists a continuous function $f_\beta : X \to Y$ such that $f_\beta(x) \in \beta + \Gamma(x)$ for each $x \in X$. The desired selection will then be constructed as a limit of a suitable Cauchy sequence of such functions (that's why we need $Y$ to be a complete normed vector space). For each $y \in Y$, let $U_y = \Gamma^{-1}(B_\beta(y))$ where $B_\beta(y)$ is an open ball around $y$ of diameter $\beta$. Since $\Gamma$ is lhc and $B_\beta(y)$ is open in $Y$, $U_y$ is open in $X$. The collection $\{U_y, y \in Y\}$ is then an open cover of $X$. Since $X$ is compact there exists a finite subcollection $\{U_{y_i}, i = 1,\ldots,n, y_i \in Y\}$ which by Lemma 312 (requiring a finite collection) has a partition of unity $\{\pi_i, i = 1,\ldots,n\}$ subordinate to it. Let $f_\beta$ be defined by $f_\beta(x) = \sum_{i=1}^{n} \pi_i(x) y_i$, where $y_i$ is chosen in such a way that $\pi_i = 0$ in $X \setminus U_{y_i}$. Since $f_\beta(x)$ is the sum of finitely many continuous functions it is a continuous function from $X$ to $Y$. $f_\beta(x)$ is a convex combination of those points $y_i$ for which $\pi_i(x) \neq 0$. But $\pi_i(x) \neq 0$ only if $x \in U_{y_i}$. Thus $\Gamma(x) \cap B_\beta(y_i) \neq \emptyset$ and so $y_i \in \beta + \Gamma(x)$. Thus $f_\beta(x)$ is a convex combination of points $y_i$ which lie in the convex set $\beta + \Gamma(x)$ and so $f_\beta(x)$ is also in that set (i.e. $f_\beta(x) \in \beta + \Gamma(x)$).
Next we construct a sequence of such functions $f_i$ to satisfy the following two conditions:
$$f_i(x) \in \frac{1}{2^{i-2}} + f_{i-1}(x), \quad i = 2,3,4,\ldots \tag{4.20}$$
$$f_i(x) \in \frac{1}{2^{i}} + \Gamma(x), \quad i = 1,2,3,\ldots \tag{4.21}$$
For $f_1$ we take the function $f_\beta$ already constructed with $\beta = \frac{1}{2}$. Suppose that $f_1, f_2, \ldots, f_n$ have already been constructed. Let $\Gamma_{n+1}(x) = \Gamma(x) \cap \left(\frac{1}{2^n} + f_n(x)\right)$. Then since $f_n(x)$ satisfies condition (4.21), $\Gamma_{n+1}(x)$ is non-empty and, being the intersection of two convex sets, is convex. Moreover $\Gamma_{n+1}(x)$ is lhc (see Lemma 293). Therefore by the first part of the proof and with $\beta = \frac{1}{2^{n+1}}$, there exists a function $f_{n+1}$ with the property that $f_{n+1}(x) \in \frac{1}{2^{n+1}} + \Gamma_{n+1}(x)$. Since $\Gamma_{n+1}(x) \subset \Gamma(x)$, we have $f_{n+1}(x) \in \frac{1}{2^{n+1}} + \Gamma(x)$ so that condition (4.21) is satisfied. Furthermore, since $\Gamma_{n+1}(x) \subset \frac{1}{2^n} + f_n(x)$ we have $f_{n+1}(x) \in \left(\frac{1}{2^n} + \frac{1}{2^{n+1}}\right) + f_n(x) \subset \frac{1}{2^{n-1}} + f_n(x)$, which means that condition (4.20) is satisfied. We constructed the sequence $\langle f_i, i \in \mathbb{N} \rangle$ of functions for which $\|f_{n+1}(x) - f_n(x)\|_Y < \frac{1}{2^{n-1}}$ for all $n$ and all $x$. Therefore $\sup_{x \in X} \|f_m(x) - f_n(x)\|_Y < \frac{1}{2^{n-2}}$ for all $m, n$ with $m > n$. Thus the sequence $\langle f_i \rangle$ is a Cauchy sequence in the space of bounded continuous functions from $X$ to $Y$, which is complete because $Y$ is complete (which we will see in Theorem 452 in Chapter 6). Then there exists a continuous function $f : X \to Y$ such that $\langle f_i \rangle \to f$ (with respect to the sup norm). Since (4.21) states that $\|f_n(x) - \Gamma(x)\| < \frac{1}{2^n}$ for all $n$, it follows that the limit function $f$ has the property that $f(x) \in \overline{\Gamma(x)}$ (the closure of $\Gamma(x)$). By the assumption that $\Gamma(x)$ is closed we have $f(x) \in \Gamma(x)$ since $\overline{\Gamma(x)} = \Gamma(x)$.
Note that a correspondence that is uhc does not guarantee a continuous selection. See Example 272.
$^{29}$ To prove the more general version we would need to use the concept of paracompactness, which goes beyond the scope of this book. For the more general result, see Aubin-Frankowska (1990).
Combining Brouwer's fixed point theorem 302 with Michael's selection theorem 313 we immediately get the existence of a fixed point for lhc correspondences.
Corollary 314 Let $K$ be a non-empty, compact, convex subset of a finite dimensional space $\mathbb{R}^m$ and let $\Gamma : K \rightrightarrows K$ be a lhc, closed, convex valued correspondence. Then $\Gamma$ has a fixed point.
Kakutani's theorem is usually stated with the condition that the correspondence $\Gamma$ be uhc and closed valued. However, since we are dealing with a compact set $K$, by Theorems 285 and 287, uhc together with the closed valued property are equivalent to having a closed graph. It seems that this condition is somewhat easier to check.
In order to make the switch from uhc (or equivalently from closedness of graph) to lhc, we use the following lemma.
Lemma 315 Let $X$ and $Y$ be compact subsets of a finite dimensional normed vector space $\mathbb{R}^m$ and let $\Gamma : X \rightrightarrows Y$ be a convex-valued correspondence which has a closed graph (or equivalently is closed-valued and uhc). Then given $\beta > 0$, there exists a lhc, convex-valued correspondence $F : X \rightrightarrows Y$ such that $Gr(F) \subset \beta + Gr(\Gamma)$.
Proof. Consider first the new correspondences $\hat{F}_\varepsilon$ defined for all $\varepsilon > 0$ by $\hat{F}_\varepsilon(x) = \bigcup_{\hat{x} \in X, \|x - \hat{x}\| < \varepsilon} \Gamma(\hat{x})$. To see that $\hat{F}_\varepsilon$ is lhc at $x_0$, consider an open set $G$ such that $\hat{F}_\varepsilon(x_0) \cap G \neq \emptyset$. Then there exists $\hat{x} \in X$ with $\|\hat{x} - x_0\| < \varepsilon$ and $\Gamma(\hat{x}) \cap G \neq \emptyset$. If $\mu$ is sufficiently small ($\mu < \varepsilon - \|\hat{x} - x_0\|$) and if $\|x_0 - x\| < \mu$, then $\|\hat{x} - x\| < \varepsilon$, and so $\hat{F}_\varepsilon(x) \cap G \neq \emptyset$ because $\Gamma(\hat{x}) \subset \hat{F}_\varepsilon(x)$. Thus $\hat{F}_\varepsilon$ is lhc at an arbitrary $x_0 \in X$ and hence lhc on $X$. It follows from Lemma 292 that $F_\varepsilon = co(\hat{F}_\varepsilon)$ is also lhc. Since $F_\varepsilon$ is certainly convex-valued, the proof is finished by showing that $Gr(F_\varepsilon) \subset \beta + Gr(\Gamma)$ if $\varepsilon$ is sufficiently small. Suppose that it is not so. That is, for some $\beta > 0$ and all $n \in \mathbb{N}$, $Gr(F_{1/n})$ is not contained in $\beta + Gr(\Gamma)$. Then there exists a sequence $\langle (x_n, y_n), n \in \mathbb{N} \rangle$ in $X \times Y$ such that $(x_n, y_n) \in Gr(F_{1/n})$ but $d((x_n, y_n), Gr(\Gamma)) \geq \beta$. To say that $(x_n, y_n) \in Gr(F_{1/n})$ means that $y_n = \sum_{i=1}^{m+1} \lambda_{n,i} y_{n,i}$ with $\lambda_{n,i} \geq 0$, $\sum_{i=1}^{m+1} \lambda_{n,i} = 1$, and $y_{n,i} \in \Gamma(\hat{x}_{n,i})$ where $\|\hat{x}_{n,i} - x_n\| < \frac{1}{n}$. Here we used Caratheodory's theorem 226, saying that in $\mathbb{R}^m$ if $y_n$ is a convex combination of certain points, it can always be expressed as a convex combination of $m+1$ of them. Since $X$ and $Y$ are compact and $\lambda_{n,i} \in [0,1]$ (which is also compact), all the above sequences contain subsequences (we will use the same indexes for subsequences) that converge, that is $\langle x_n \rangle \to x$, $\langle y_n \rangle \to y$, $\langle y_{n,i} \rangle \to y_i$, $\langle \lambda_{n,i} \rangle \to \lambda_i$ and $\langle \hat{x}_{n,i} \rangle \to \hat{x}_i$. Since $\|\hat{x}_{n,i} - x_n\| < \frac{1}{n}$, $\hat{x}_i = x$ for $i = 1,\ldots,m+1$. We also have that $\sum_{i=1}^{m+1} \lambda_i = 1$ and $y_n = \sum_{i=1}^{m+1} \lambda_{n,i} y_{n,i} \to \sum_{i=1}^{m+1} \lambda_i y_i = y$. Now $(\hat{x}_{n,i}, y_{n,i}) \in Gr(\Gamma)$ and so $(\hat{x}_i, y_i) = (x, y_i) \in \overline{Gr(\Gamma)} = Gr(\Gamma)$ (since $Gr(\Gamma)$ is closed). Thus $y_i \in \Gamma(x)$ and, since $\Gamma(x)$ is convex, $y \in \Gamma(x)$ (being a convex combination of the $y_i$). Hence $(x, y) \in Gr(\Gamma)$. But since $d((x_n, y_n), Gr(\Gamma)) \geq \beta$ for all $n$, this is not possible. This contradiction completes the proof.
Corollary 316 In Lemma 315 we may also take $F$ to be closed-valued.
Proof. Let $F = \overline{F_\varepsilon}$ for sufficiently small $\varepsilon$. Then by Lemma 291, $F$ is lhc. It is of course still convex-valued and if $Gr(F_\varepsilon) \subset \frac{\beta}{2} + Gr(\Gamma)$, then $Gr(F) \subset \beta + Gr(\Gamma)$.
Theorem 317 (Kakutani) Let $K$ be a non-empty, compact, convex subset of the finite-dimensional space $\mathbb{R}^m$ and let $\Gamma : K \rightrightarrows K$ be a closed, convex valued, uhc correspondence (or convex valued with closed graph). Then $\Gamma$ has a fixed point.
Proof. By Lemma 315 and Corollary 316, for each $n \in \mathbb{N}$ there exists a lhc correspondence $F_n : K \rightrightarrows K$ such that $Gr(F_n) \subset \frac{1}{n} + Gr(\Gamma)$ and $F_n$ has values which are closed and convex. Then by Michael's selection theorem 313, there is a continuous selection $f_n$ for $F_n$. The function $f_n$ is a continuous mapping of $K$ into itself and so, by Brouwer's fixed point theorem 302, there exists $x_n \in K$ with $f_n(x_n) = x_n$. The compactness of $K$ means that there exists a convergent subsequence of the sequence $\langle x_n \rangle$ such that $\langle x_{g(n)} \rangle \to x^*$. Since $(x_n, x_n) \in Gr(F_n) \subset \frac{1}{n} + Gr(\Gamma)$, it follows that $(x^*, x^*) \in \overline{Gr(\Gamma)} = Gr(\Gamma)$. Thus $x^* \in \Gamma(x^*)$ is a fixed point of $\Gamma$.
We now use an example to illustrate an important result due to Nash (1950). Nash's result says that every finite strategic form game has a mixed strategy equilibrium.
Example 318 Reconsider the finite action coordination game in Example 282. We say that the mixed strategy profile $(p^*, q^*)$ is a Nash Equilibrium if $\pi_1(p^*, q^*) \geq \pi_1(p, q^*)$ and $\pi_2(p^*, q^*) \geq \pi_2(p^*, q)$, $\forall p, q \in [0,1]$. In Example 282, we showed
$$p^*(q) = \begin{cases} 0 & \text{if } q < \frac{1}{2} \\ [0,1] & \text{if } q = \frac{1}{2} \\ 1 & \text{if } q > \frac{1}{2} \end{cases} \quad \text{and} \quad q^*(p) = \begin{cases} 0 & \text{if } p < \frac{1}{2} \\ [0,1] & \text{if } p = \frac{1}{2} \\ 1 & \text{if } p > \frac{1}{2} \end{cases}.$$
Given that the two agents are symmetric, to prove that the above game has a mixed strategy equilibrium, it is sufficient to show that $p^* : [0,1] \rightrightarrows [0,1]$ has a fixed point $p \in p^*(p)$. From Kakutani's theorem, it is sufficient to check that $p^*(p)$ is a non-empty, convex-valued, uhc correspondence, all of which was shown in Example 282. See Figure 4.8.16.
Exercise 4.8.2 Using Kakutani's theorem, prove Nash's result generally. See Fudenberg and Tirole p.29.
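The fixed points of the best response correspondence in Example 318 can be enumerated on a grid. The sketch below is illustrative only: it encodes each value $p^*(q)$ as a closed interval `(lo, hi)`, a representation chosen here for convenience, and the function names are hypothetical.

```python
from fractions import Fraction as F


def best_response(q):
    """The correspondence p*(q) from Example 318, returned as an interval (lo, hi)."""
    half = F(1, 2)
    if q < half:
        return (F(0), F(0))
    if q > half:
        return (F(1), F(1))
    return (F(0), F(1))


def is_fixed_point(p):
    """True if p lies in p*(p), i.e. p is a fixed point of the correspondence."""
    lo, hi = best_response(p)
    return lo <= p <= hi


def fixed_points_on_grid(n):
    """All grid points k/n in [0, 1] that are fixed points."""
    return [F(k, n) for k in range(n + 1) if is_fixed_point(F(k, n))]


print(fixed_points_on_grid(10))
```

On this grid the fixed points are $p = 0$, $p = \frac{1}{2}$ and $p = 1$: the two pure strategy profiles and the mixed one. The point $p = \frac{1}{2}$ is a fixed point only because the correspondence is set-valued there, which is why a theorem for correspondences is needed.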
4.9 Appendix - Proofs in Chapter 4
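Proof of Caratheodory Theorem 226. $x \in co(X)$ implies $x = \sum_{i=1}^{m} \lambda_i x_i$, $(x_1, \ldots, x_m) \in X$, $\lambda_i > 0$ $\forall i$, and $\sum_{i=1}^{m} \lambda_i = 1$ by Theorem ??. Suppose $m \geq n+2$. Then the vectors
$$\begin{pmatrix} x_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ 1 \end{pmatrix}, \ldots, \begin{pmatrix} x_m \\ 1 \end{pmatrix} \in \mathbb{R}^{n+1}$$
are linearly dependent. Hence there exist $\mu_1, \ldots, \mu_m$, not all zero, such that
$$\sum_{i=1}^{m} \mu_i \begin{pmatrix} x_i \\ 1 \end{pmatrix} = 0$$
(i.e. $\sum_{i=1}^{m} \mu_i x_i = 0$ and $\sum_{i=1}^{m} \mu_i = 0$). Since the $\mu_i$ sum to zero and are not all zero, $\mu_j > 0$ for some $j$, $1 \leq j \leq m$. Define $\alpha = \frac{\lambda_j}{\mu_j} = \min\left\{\frac{\lambda_i}{\mu_i} : \mu_i > 0\right\}$ so that $\lambda_j - \alpha\mu_j = 0$. If we define $\theta_i \equiv \lambda_i - \alpha\mu_i$, then $\theta_i \geq 0$ for all $i$ (by the choice of $\alpha$), $\theta_j = 0$,
$$\sum_{i=1}^{m} \theta_i = \sum_{i=1}^{m} \lambda_i - \alpha \sum_{i=1}^{m} \mu_i = 1 - \alpha \cdot 0 = 1,$$
and
$$\sum_{i=1}^{m} \theta_i x_i = \sum_{i=1}^{m} \lambda_i x_i - \alpha \sum_{i=1}^{m} \mu_i x_i = \sum_{i=1}^{m} \lambda_i x_i = x.$$
Hence we expressed $x$ as a convex combination of $m-1$ points of $X$ with $\theta_j = 0$ for some $j$, reducing it from $m$ points. If $m-1 > n+1$, then the process can be repeated until $x$ is expressed as a convex combination of $n+1$ points of $X$.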
Proof of Theorem 248. (i $\Rightarrow$ ii) Let $a \in f^{-1}(V)$. Then $\exists y \in V$ such that $f(a) = y$. Since $V$ is open, $\exists \varepsilon > 0$ such that $B_\varepsilon(y) \subset V$. Since $f$ is continuous, for this $\varepsilon$, $\exists \delta(\varepsilon, a) > 0$ such that $\forall x \in X$ with $d_X(x, a) < \delta(\varepsilon, a)$ we have $d_Y(f(x), f(a)) < \varepsilon$. Hence $f(B_\delta(a)) \subset B_\varepsilon(f(a))$ or equivalently $B_\delta(a) \subset f^{-1}(B_\varepsilon(f(a))) \subset f^{-1}(V)$.
(ii $\Rightarrow$ iii) Let $\langle x_n \rangle \to x$. Take any open $\varepsilon$-ball $B_\varepsilon(f(x)) \subset V$. Then $x \in f^{-1}(B_\varepsilon(f(x)))$ and $f^{-1}(B_\varepsilon(f(x)))$ is open (by assumption ii). Now $\exists \delta > 0$ such that $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$. Since $\langle x_n \rangle \to x$, $\exists N$ such that for $n \geq N$, $x_n \in B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$. Hence $f(x_n) \in B_\varepsilon(f(x))$ $\forall n \geq N$ so $f(x_n) \to f(x)$. See Figure 4.6.5.
(iii $\Rightarrow$ i) It is sufficient to prove the contrapositive. Thus, suppose $\exists \varepsilon > 0$ such that $\forall \delta = \frac{1}{n}$, $\exists x_n$ such that $d_X(x_n, x) < \frac{1}{n}$ and $d_Y(f(x_n), f(x)) \geq \varepsilon$. Thus we have a sequence $\langle x_n \rangle \to x$ but none of the elements of $\langle f(x_n) \rangle$ is in an $\varepsilon$-ball around $f(x)$. Hence $\langle f(x_n) \rangle$ doesn't converge to $f(x)$.$^{30}$
$^{30}$ Munkres p.127 Th 10.1; for sequences see Munkres p.128, Th 10.3.
Proof of Theorem 268. In the first step, for a given $\varepsilon > 0$, we construct $\delta$ that depends only on $\varepsilon$. Thus, take $\varepsilon > 0$. Since $f$ is continuous on $X$, for any $x \in X$ there is a number $\delta(\frac{1}{2}\varepsilon, x) > 0$ such that if $x' \in X$ and $d_X(x, x') < \delta(\frac{1}{2}\varepsilon, x)$, then $d_Y(f(x), f(x')) < \frac{1}{2}\varepsilon$. The collection of open balls $G = \{B_{\frac{1}{2}\delta(\frac{1}{2}\varepsilon, x)}(x), x \in X\}$ is an open covering of $X$. Since $X$ is compact there exists a finite subcollection, say $\{B(x_1), \ldots, B(x_n)\}$, of these balls that covers $X$. Then define $\delta(\varepsilon) = \frac{1}{2}\min\{\delta(\frac{1}{2}\varepsilon, x_1), \ldots, \delta(\frac{1}{2}\varepsilon, x_n)\}$, which is obviously independent of $x$.
In the second step, we use the $\delta(\varepsilon)$ constructed above to establish uniform continuity. Suppose that $x, x' \in X$ and $d_X(x', x) < \delta(\varepsilon)$. Because $\{B(x_1), \ldots, B(x_n)\}$ covers $X$, $x \in B(x_k)$ for some $k$. That is
$$d_X(x, x_k) < \frac{1}{2}\delta\left(\tfrac{1}{2}\varepsilon, x_k\right). \tag{4.22}$$
By the triangle inequality it follows that
$$d_X(x', x_k) \leq d_X(x', x) + d_X(x, x_k) < \delta(\varepsilon) + \frac{1}{2}\delta\left(\tfrac{1}{2}\varepsilon, x_k\right) \leq \delta\left(\tfrac{1}{2}\varepsilon, x_k\right). \tag{4.23}$$
Then (4.22) and continuity of $f$ at $x_k$ imply $d_Y(f(x), f(x_k)) < \frac{1}{2}\varepsilon$, while (4.23) and continuity of $f$ at $x_k$ imply $d_Y(f(x'), f(x_k)) < \frac{1}{2}\varepsilon$. Again by the triangle inequality it follows that
$$d_Y(f(x), f(x')) \leq d_Y(f(x), f(x_k)) + d_Y(f(x'), f(x_k)) < \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon.$$
Thus, we have shown that if $x, x' \in X$ for which $d_X(x', x) < \delta(\varepsilon)$, then $d_Y(f(x), f(x')) < \varepsilon$.
Proof of Brouwer's Fixed Point Theorem 302 (in $\mathbb{R}^2$). Let $f : S \to S$ be continuous, where $S$ is a fixed nondegenerate simplex with vertices $v_0, v_1, v_2$. $x^* = f(x^*)$ implies that
$$\alpha_i^* = \beta_i^*, \quad i = 0,1,2 \tag{4.24}$$
where $\alpha_i^*$ and $\beta_i^* = f_i(x^*)$ are barycentric coordinates of $x^*$ and $f(x^*)$. See Figure 4.8.4. In the case of barycentric coordinates, instead of equality (4.24) it suffices to show the following inequalities.
$$\alpha_i^* \geq \beta_i^*, \quad i = 0,1,2 \tag{4.25}$$
(4.24) and (4.25) are equivalent because $\alpha_i^* \geq 0$, $\beta_i^* \geq 0$ and $\sum_{i=0}^{2} \alpha_i^* = 1 = \sum_{i=0}^{2} \beta_i^*$. To see this, note that
$$\alpha_0 \geq \beta_0 \geq 0, \quad \alpha_1 \geq \beta_1 \geq 0, \quad \alpha_2 \geq \beta_2 \geq 0,$$
$$\alpha_0 + \alpha_1 + \alpha_2 = \beta_0 + \beta_1 + \beta_2 \ (= 1),$$
$$\text{and } (\alpha_0 - \beta_0) + (\alpha_1 - \beta_1) + (\alpha_2 - \beta_2) = 0$$
implies $\alpha_0 = \beta_0$, $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$.
Let $y = f(x)$. If $y \neq x$, then some coordinates $\beta_i \neq \alpha_i$. Therefore, since $\sum \alpha_i = \sum \beta_i$ we must have both some $\alpha_k > \beta_k$ and some $\alpha_i < \beta_i$. Focus on the first inequality, $\alpha_k > \beta_k$; this cannot occur when $\alpha_k = 0$. In other words this cannot occur on the boundary segment opposite vertex $v_k$ (see the calculations in Example 228). For example, on the boundary line segment joining $v_0$ and $v_1$ opposite $v_2$ (we will denote this line segment $(v_0, v_1)$), the inequality $\alpha_2 > \beta_2$ cannot occur because $\alpha_2 = 0$ for all these points but $0 > \beta_2 \geq 0$ is false. See Figure 4.8.5.
Now we introduce an indexing scheme for points in the simplex as follows. Given the function $y = f(x)$ ($f : S \to S$), for each $x \in S$ such that $x \neq y = f(x)$ we have seen that $\alpha_i > \beta_i$ for some $i$. Now define $I(x)$ as the smallest such $i$ (that is, $I(x) = \min\{i : \alpha_i > \beta_i\}$). Hence $I(x)$ can take the values 0, 1, 2 (in our case for $\mathbb{R}^2$). These values depend on the function $y = f(x)$ of course, but on the boundary we know that $I(x)$ is restricted. For example on the boundary $(v_0, v_1)$ where $\alpha_2 = 0$ we can't have $\alpha_2 > \beta_2$ so that $I(x)$ can't take the value 2. Thus $I(x) = 0$ or 1 on the line segment $(v_0, v_1)$. In general, $I(x)$ satisfies the same set of restrictions as $I(x)$ in (4.3) in Section 4.5.2 and hence we can use the results of Sperner's Lemma 229.
Why are we doing this? We are looking for a fixed point of $y = f(x)$. That is a point $x$ whose barycentric coordinates satisfy all the inequalities $\alpha_0 \geq \beta_0$, $\alpha_1 \geq \beta_1$, $\alpha_2 \geq \beta_2$. To do so, for $m = 2,3,4,\ldots$, we form the $m$th barycentric subdivision of our simplex $S$. For example, see Figure 4.8.7 for $m = 2$. The vertices in the subdivision are points $z = \frac{1}{2}(\mu_0, \mu_1, \mu_2)$ where the $\mu_i$ are integers (and $\frac{\mu_i}{2}$, $i = 0,1,2$ are barycentric coordinates with respect to the $m$th subdivision) with all $\mu_i \geq 0$ and $\sum_{j=0}^{2} \mu_j = 2$. In general for the $m$-th subdivision, the vertices are the points $x = \frac{1}{m}(\mu_0, \mu_1, \mu_2)$, where the $\mu_i$ are integers satisfying all $\mu_i \geq 0$, $\sum_{i=0}^{2} \mu_i = m$. We will call a little shaded triangle a cell. The original simplex is the whole body. For $m = 5$ we see 25 cells in Figure 4.8.8. Each cell is small; the diameter of each cell is $\frac{1}{5}$ of the diameter of the body. In general, in the $m$-th subdivision of a simplex, the number of cells $m^2$ tends to infinity as $m \to \infty$ and the diameter of each cell tends to zero. If $\Delta$ is the diameter of the body, then the diameter of each cell is $\frac{\Delta}{m}$.
We are given a continuous function $y = f(x)$ that maps the simplex into itself. We assume that $f(x)$ has no fixed point and we show that this leads to a contradiction. Since we assume $x \neq y = f(x)$ (i.e. no fixed point) we may use the indexing function $I(x)$ for each point $x \in S$. The index takes one of the values 0, 1, 2 at each point of the body and on the boundary of the simplex. The index satisfies the restrictions (??). For example in Figure 4.8.9 there are 21 vertices. Label each vertex $x$ with an index $I(x) = 0,1,2$ arbitrarily except that this indexing has to obey the restrictions (??) on the boundary. That means you must use $I = 0$ or 1 on the bottom side, $I = 0$ or 2 on the left and $I = 1$ or 2 on the right. Also $I = 0$ at $v_0$, $I = 1$ at $v_1$ and $I = 2$ at $v_2$. This leaves 6 interior vertices, each to be labeled arbitrarily 0, 1, or 2. Try to label these vertices such that none of the 25 cells has all the labels 0, 1, 2. No matter how hard you try, at least one of the cells must have a complete set of labels. This is guaranteed by Sperner's Lemma 229 which follows immediately after this proof. In particular, the lemma guarantees that for any $m$, in the $m$-th subdivision there is a cell with a complete set of labels, say
$$\begin{aligned}
I &= 0 \text{ at the vertex } x^0(m) \\
I &= 1 \text{ at the vertex } x^1(m) \\
I &= 2 \text{ at the vertex } x^2(m)
\end{aligned} \tag{4.26}$$
What does this mean for the function $y = f(x)$? If $I = j$ then for barycentric coordinates of the points $x$ and $y$ we have $\alpha_j > \beta_j$. Therefore, (4.26) implies
$$\begin{aligned}
\alpha_0 &> \beta_0 \text{ at } x^0(m) \\
\alpha_1 &> \beta_1 \text{ at } x^1(m) \\
\alpha_2 &> \beta_2 \text{ at } x^2(m)
\end{aligned} \tag{4.27}$$
If $m$ is large all the vertices of the cell are close to each other, since the diameter of the cell is $\frac{\Delta}{m}$. Therefore
$$\max_{0 \leq i < j \leq 2} \left| x^i(m) - x^j(m) \right| = \frac{\Delta}{m} \to 0 \text{ as } m \to \infty. \tag{4.28}$$
As $m \to \infty$, what can be said about the vertices (say $x^0(m)$)? This vertex might move unpredictably through the simplex in some bounded infinite sequence. See Figure 4.8.10. Since $S$ is compact, by the Bolzano-Weierstrass Theorem 180 this sequence contains a subsequence that has a limit, say $x^0(m_s) \to x^*$ as $s \to \infty$. The limit point $x^* \in S$ because $S$ is closed.
But because of the closeness of the vertices, (4.28) implies that all tend to $x^*$ as $m_s \to \infty$: $x^p(m_s) \to x^*$ as $s \to \infty$, $p = 0,1,2$. Now the continuity of $f(x)$ implies $f(x^p(m_s)) \to f(x^*) = y^*$ as $s \to \infty$, $p = 0,1,2$. But the barycentric coordinates of a point $x$ depend continuously on $x$. Therefore, if we let $m = m_s \to \infty$ in (4.27) we obtain the limiting inequalities
$$\begin{aligned}
\alpha_0 \geq \beta_0 \text{ at the limit } x^* &\iff \alpha_0^* \geq \beta_0^* = f_0(x^*) \\
\alpha_1 \geq \beta_1 \text{ at the limit } x^* &\iff \alpha_1^* \geq \beta_1^* = f_1(x^*) \\
\alpha_2 \geq \beta_2 \text{ at the limit } x^* &\iff \alpha_2^* \geq \beta_2^* = f_2(x^*)
\end{aligned}$$
But we know by (4.25) that these inequalities imply equalities, thus $x^* = y^* = f(x^*)$.
Figures for Sections ?? to 4.8
Figure ??.1: Open Sets
Figure ??.2: Sup Balls and Open Neighborhoods
Figure ??.3: $(0,1)$ vs $(0,1]$
Figure ??.4: $\{(x,y) \mid 0 < x < 1, y = 2\}$
Figure ??.5: Closure and Boundary Points
Figure 4.1.1: On Cluster points and the Limit of $\langle (-1)^n \rangle$
Figure 4.1.2: On the Limit of $\langle \left(\frac{1}{n}\right) \rangle$
Figure 4.1.3: On the Limit of $\langle \left(\frac{x}{n}\right) \rangle$
Figure 4.1.4: On the Limit of $\langle x^n \rangle$
Figure 4.3.1: Construction of (?)H closed.
Figure 4.3.2: Compactness for General Metric Spaces.
Figure 4.4.1: A disconnected set
Figure 4.6.1: Pointwise continuity in $\mathbb{R}$
Figure 4.7.1: Lower Hemicontinuity
Figure 4.7.2: Upper Hemicontinuity
Figure 4.7.3: Best Response Correspondence
Figure 4.7.4: Budget Sets with $p = 0$ and $p > 0$
Figure 4.7.5: Demand Correspondence with Linear Preferences
Figure 4.8.1: Tarski's Fixed Point Theorem in $[a,b]$
Figure 4.8.2: Brouwer's Fixed Point Theorem in $[a,b]$
Figure 4.8.3: Fixed Point of a Contraction Mapping in $[a,b]$
Figure 4.8.4: Kakutani's Fixed Point Theorem in $[a,b]$
Figure 4.8.5: Existence of Nash Equilibria
Figure 4.5.1: Open Sets
4.10 Bibliography for Chapter 4
Sections ?? to are based on Royden (Chapters 2 and 7) and Bartle (Sections
9,14-16). Section 4.2 is based on Royden (Chapter 7, Section 4) and Munkres
(Chapter 7, Section 1). Section 4.3 is based on Munkres (Chapter 3, Sections
5 and 7, Chapter 7, Section 3), Royden (Chapter 7, Section 7), and Bartle
(Chapter 11). Section 4.6 is based on Munkres (Chapter 3, ). Section 4.5 is
from Bartle (Sec 8).
4.11 End of Chapter Problems
1) The next results (from Definition 319 to Theorem 164) require a total ordering of a set $X$, so we restrict $X$ to be $\mathbb{R}$.
Definition 319 Let $\langle x_n \rangle$ be a bounded sequence in $\mathbb{R}$. The limit superior of $\langle x_n \rangle$, denoted $\limsup x_n$ or $\overline{\lim}\, x_n$, is given by $\inf_n \sup_{k \geq n} x_k$. The limit inferior of $\langle x_n \rangle$, denoted $\liminf x_n$ or $\underline{\lim}\, x_n$, is given by $\sup_n \inf_{k \geq n} x_k$.
That is, $l \in \mathbb{R}$ is the limit superior of $\langle x_n \rangle$ iff given $\varepsilon > 0$, there are at most a finite number of $n \in \mathbb{N}$ such that $l + \varepsilon < x_n$ but there are an infinite number such that $l - \varepsilon < x_n$. The limit superior is just the maximum cluster point and the limit inferior is just the minimum cluster point.
Example 320 Recall Example 141 where we considered the sequence $\langle (-1)^n \rangle$ which had two cluster points. There, $\liminf x_n = -1$ and $\limsup x_n = 1$. To see why the limit inferior is $-1$, consider: $n = 1$ has $\inf_{k \geq 1} x_k = -1$, $n = 2$ has $\inf_{k \geq 2} x_k = -1$; and any given $n$ has $\inf_{k \geq n} x_k = -1$. But then the $\sup\{-1, -1, \ldots\}$ is just $-1$.
Theorem 321 Let $\langle x_n \rangle$ be a bounded sequence of real numbers. Then $\lim x_n$ exists iff $\liminf x_n = \limsup x_n = \lim x_n$.
Exercise 4.11.1 Prove Theorem 321.
While Theorem 321 hinges on the fact that $\mathbb{R}$ is totally ordered, a similar result holds for any totally ordered set.
Example 322 Recall Example 137 where we considered the sequence $\langle \left(\frac{1}{n}\right) \rangle$. It is simple to see that it has a cluster point at 0 since any open ball around 0 of size $\delta$ has an infinite number of elements in the sequence past $N(\delta) = w(\frac{1}{\delta}) + 1$ contained in it. Furthermore, $\liminf x_n = 0 = \limsup x_n$. To see why the limit superior is 0, consider: $n = 1$ has $\sup_{k \geq 1} x_k = 1$, $n = 2$ has $\sup_{k \geq 2} x_k = \frac{1}{2}$; and any given $n$ has $\sup_{k \geq n} x_k = \frac{1}{n}$. But then the $\inf\{1, \frac{1}{2}, \ldots, \frac{1}{n}, \frac{1}{n+1}, \ldots\}$ is just 0.
Example 323 Consider $\left\langle (-1)^n + \frac{1}{n} \right\rangle_{n \in \mathbb{N}}$. Then $\langle x_n \rangle = \left\langle 0, \frac{3}{2}, -\frac{2}{3}, \frac{5}{4}, -\frac{4}{5}, \frac{7}{6}, \ldots \right\rangle$. See Figure 4.1.5. The cluster points of $\langle x_n \rangle$ are $-1$ and 1, which are also the limit inferior and limit superior, respectively. Notice that the subsequence of odd numbered indices $\langle x_{2k-1} \rangle = \left\langle (-1)^{2k-1} + \frac{1}{2k-1} \right\rangle_{k=1}^{\infty} = \left\langle 0, -\frac{2}{3}, -\frac{4}{5}, \ldots \right\rangle \to -1$ and the subsequence of even numbered indices $\langle x_{2k} \rangle = \left\langle (-1)^{2k} + \frac{1}{2k} \right\rangle_{k=1}^{\infty} = \left\langle \frac{3}{2}, \frac{5}{4}, \frac{7}{6}, \ldots \right\rangle \to 1$.
Note that while a limit point is unique, we saw in Example 141 that a sequence can have many cluster points. In that case, the smallest cluster point is called the limit inferior and the largest cluster point is called the limit superior.
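The tail suprema and infima in Definition 319 can be approximated for the sequence $(-1)^n + \frac{1}{n}$. The sketch below is only a finite-horizon illustration: a true supremum or infimum over $k \geq n$ ranges over infinitely many terms, so the helper functions (hypothetical names `tail_sup` and `tail_inf`) truncate at a chosen `horizon`.

```python
from fractions import Fraction as F


def x(n):
    """The n-th term (-1)^n + 1/n, computed exactly for n >= 1."""
    return (-1) ** n + F(1, n)


def tail_sup(n, horizon):
    """Finite approximation of sup_{k >= n} x_k using the terms n, ..., horizon."""
    return max(x(k) for k in range(n, horizon + 1))


def tail_inf(n, horizon):
    """Finite approximation of inf_{k >= n} x_k using the terms n, ..., horizon."""
    return min(x(k) for k in range(n, horizon + 1))


# The tail suprema decrease toward 1 and the tail infima stay near -1.
for n in (1, 10, 100):
    print(n, tail_sup(n, 1000), tail_inf(n, 1000))
```

The truncated supremum starting at an even index $n$ is exactly $1 + \frac{1}{n}$, which decreases to the limit superior 1; the truncated infimum approaches the limit inferior $-1$ as the horizon grows.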
2) We provide another useful criterion in R to establish convergence,
which is true only because R is totally ordered and complete.
Theorem 324 (Monotone Convergence) Let $\langle x_n \rangle$ be a monotone increasing sequence (i.e. $x_1 \le x_2 \le \ldots \le x_i \le x_{i+1} \le \ldots$) in the metric space $(\mathbb{R}, |\cdot|)$.$^{31}$ Then $\langle x_n \rangle$ converges iff it is bounded and its limit is given by $\lim x_n = \sup\{x_n \mid n \in \mathbb{N}\}$.
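Proof. ($\Rightarrow$) Boundedness follows by Lemma 164, so all we must show is $x = \sup\{x_n\}$, where $x = \lim x_n$. Convergence implies $x - \delta < x_n < x + \delta$, $\forall n \ge N(\delta)$ by Definition 136. As a property of the supremum, we know that if $x_n < y_n$, $\forall n \in \mathbb{N}$, then $\sup x_n \le \sup y_n$. This implies $x - \delta \le \sup\{x_n\} \le x + \delta$ or $|\sup\{x_n\} - x| \le \delta$. Since $\delta > 0$ was arbitrary, $x = \sup\{x_n\}$.
($\Leftarrow$) If $\langle x_n \rangle$ is a bounded, monotone increasing sequence of real numbers, then by the Completeness Axiom 3.3 its supremum exists (call it $x_0 = \sup\{x_n\}$). Since $x_0$ is a sup, $x_0 - \delta$ is not an upper bound and $\exists K(\delta) \in \mathbb{N}$ such that $x_0 - \delta < x_{K(\delta)}$ for any $\delta > 0$.$^{32}$ Since $\langle x_n \rangle$ is monotone, $x_0 - \delta < x_n \le x_0 < x_0 + \delta$, $\forall n \ge K(\delta)$, or $|x_n - x_0| < \delta$.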
Example 325 Re-consider Example 137 where the sequence is $\langle \frac{1}{n} \rangle_{n \in \mathbb{N}}$. It is clear that this sequence is monotone decreasing with infimum 0, which is also its limit.
Exercise 4.11.2 Consider the sequence $f : \mathbb{N} \to \mathbb{R}$ given by $\langle (1 + \frac{1}{n})^n \rangle_{n \in \mathbb{N}}$. Show that this sequence is increasing and bounded above so that by the Monotone Convergence Theorem 324, the sequence converges in $(\mathbb{R}, |\cdot|)$.
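Before attempting the proof, it can help to look at the first few hundred terms. The following Python sketch is numerical evidence only, not a proof; the function name `a`, the cutoff of 200 terms, and the candidate upper bound 3 are choices made here for illustration.

```python
from fractions import Fraction


def a(n):
    """n-th term (1 + 1/n)^n, computed exactly as a rational number."""
    return (1 + Fraction(1, n)) ** n


terms = [a(n) for n in range(1, 201)]

# Strictly increasing over the inspected range?
is_increasing = all(s < t for s, t in zip(terms, terms[1:]))
# Bounded above by 3 over the inspected range?
is_bounded = all(t < 3 for t in terms)

print(is_increasing, is_bounded, float(terms[-1]))
```

Exact rational arithmetic is used so that the monotonicity check is not affected by floating-point rounding.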
$^{31}$ That is, $x_1 \le x_2 \le \ldots \le x_i \le x_{i+1} \le \ldots$.
$^{32}$ Existence of this index follows from property (ii) in the footnote to Definition ?? of a supremum.
3)
Exercise 4.11.3 Let (X,d) be totally bounded. Show that X is separable.
Chapter 5
Measure Spaces
Many problems in economics lend themselves to analysis in function spaces. For example, in dynamic programming we define an operator that maps functions to functions. As in the case of metric spaces, we need some way to measure distance between the elements in the function space. Since function spaces are defined on uncountably infinite dimensional sets, the distance measure involves integration.$^{1}$ In this chapter we will focus primarily on Lebesgue integration. Since Lebesgue integration can be applied to a more general class of functions than the more standard Riemann approach, this will allow us to consider, for example, successive approximations to a broader class of functional equations in dynamic programming.
To understand Lebesgue integration we focus on measure spaces. This has the added benefit of introducing us to the building blocks of probability theory. In probability theory, we start with a given underlying set $X$ and assign a probability (just a real valued function) to subsets of $X$. For instance, if the experiment is a coin toss, then $X = \{H, T\}$ and the set of all possible subsets is given by $\mathcal{P}(X) = \{\emptyset, \{H\}, \{T\}, X\}$ described in Definition 9. Then we assign zero probability to the event where the flip of the coin results in neither an $H$ nor a $T$ (i.e. $\mu(\emptyset) = 0$), we assign probability one to the event where the flip results in either $H$ or $T$ (i.e. $\mu(X) = 1$), and we assign probability $\frac{1}{2}$ to the event where the flip of the fair coin results in $H$ (i.e. $\mu(\{H\}) = \frac{1}{2}$).
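The coin-toss assignment can be written out in full as a small Python sketch. The names `X`, `weight`, `power_set` and `mu` are illustrative assumptions introduced here; the point is only that a probability on a finite set is determined by the weights of its points and is additive over disjoint events.

```python
from fractions import Fraction
from itertools import combinations

X = ("H", "T")
# Weight of each outcome of a fair coin.
weight = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def power_set(points):
    """All subsets of a finite set, each returned as a frozenset."""
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]


def mu(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weight[p] for p in event), Fraction(0))


events = power_set(X)
for e in events:
    print(sorted(e), mu(e))
```

Here `power_set(X)` lists the four events $\emptyset$, $\{H\}$, $\{T\}$ and $X$, and `mu` assigns them the probabilities described in the text.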
$^{1}$ In Section 4.5, we saw that in the $\ell_p$ space of (countably) infinite sequences, the distance measure involved countable sums; that is, $d(\langle x_n \rangle, \langle y_n \rangle) = \left( \sum_{n=1}^{\infty} |x_n - y_n|^p \right)^{\frac{1}{p}}$. Integration is just the uncountable analogue of summation.
One of the important results we show in this chapter is that the collection of Lebesgue measurable sets is a $\sigma$-algebra (Theorem 341) and that the collection of Borel sets is a subset of the Lebesgue measurable sets (Theorem 346). Then we introduce the concept of measurability of a function and a correspondence and define the Lebesgue integral of measurable functions. Then we provide a set of convergence theorems for the existence of a Lebesgue integral which are applicable under different conditions. These are the Bounded Convergence Theorem 386, Fatou's Lemma 393, the Monotone Convergence Theorem 396, the Lebesgue Dominated Convergence Theorem 404, and Levi's Theorem 407. Essentially these provide conditions under which a limit can be interchanged with an integral. Then we introduce general and signed measures. Here we have two important results, namely the Hahn Decomposition Theorem 427 of a measurable space with respect to a signed measure and the Radon-Nikodym Theorem 434 where a signed measure can be represented simply by an integral. The chapter is concluded by introducing an example of a function space (which is the subject of Chapter 6). In particular, we focus on the space of integrable functions, denoted $L_1$, and prove it is complete in Theorem 443.
5.1 Lebesgue Measure
Before embarking on the general definition of a measure space, in the context of a simple set $X = \mathbb{R}$ we will introduce the notion of length (again just a real-valued function defined on a subset of $\mathbb{R}$), describe desirable properties of a measure space, and describe a simple measure related to length.
Definition 326 A set function associates an extended real number to each set in some collection of sets. In $\mathbb{R}$, the length $l(I)$ of an interval $I \subset \mathbb{R}$ is the difference of the endpoints of $I$.$^{2}$ Thus, in the case of the set function length, the domain is the collection of all intervals.
We would like to extend the notion of length to more complicated sets than intervals. For instance, we could define the "length" of an open set to be the sum of the lengths of the open intervals of which it is composed. Since the collection of open sets is quite restrictive, we would like to construct a set function $f$ that assigns to each set $E$ in the collection $\mathcal{P}(\mathbb{R})$ a non-negative extended real number $fE$ called the measure of $E$ (i.e. $f : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$).
$^{2}$ That is, $l(I) = b - a$ with $a, b \in \mathbb{R} \cup \{-\infty, \infty\}$, $a < b$, and $I = [a,b], (a,b), [a,b), (-\infty, b]$, etc.
Remark 1 The "ideal" properties of the set function $f : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$ are: (i) $fE$ is defined for every set $E \subset \mathbb{R}$; (ii) for an interval $I$, $fI = l(I)$; (iii) $f$ is countably additive; that is, if $\{E_n\}_{n \in \mathbb{N}}$ is a collection of disjoint sets (for which $f$ is defined), $f(\cup_{n \in \mathbb{N}} E_n) = \sum_{n \in \mathbb{N}} fE_n$; and (iv) $f$ is translation invariant; that is, if $E$ is a set for which $f$ is defined and if $E + y$ is the set $\{x + y : x \in E\}$ obtained by replacing each point $x \in E$ by the point $x + y$, then $f(E + y) = fE$.$^{3}$
Unfortunately, it is impossible to construct a set function having all four
of the properties in Remark 1. As a result at least one of these four properties
must be weakened.
• Following Henri Lebesgue, it is most useful to retain the last three properties (ii)-(iv) and to weaken the property in (i) so that $fE$ need not be defined on all of $\mathcal{P}(\mathbb{R})$.
• It is also possible to weaken (iii) by replacing it with finite additivity (i.e., require that for each finite collection $\{E_n\}_{n=1}^{N}$, we have $f(\cup_{n=1}^{N} E_n) = \sum_{n=1}^{N} fE_n$).
5.1.1 Outer measure
Another possibility is to retain (i), (ii), (iv), and weaken (iii) in Remark 1 to allow countable subadditivity (i.e., $f(\cup_{n \in \mathbb{N}} E_n) \le \sum_{n \in \mathbb{N}} fE_n$). A set function which satisfies this is called the outer measure.
Definition 327 For each set $A \subset \mathbb{R}$, let $\{I_n\}_{n \in \mathbb{N}}$ denote a countable collection of open intervals that covers $A$ (i.e. collections such that $A \subset \cup_{n \in \mathbb{N}} I_n$) and for each such collection consider $\sum_{n \in \mathbb{N}} l(I_n)$. The outer measure $m^* : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$ is given by
$$m^*(A) = \inf_{\{I_n\}_{n \in \mathbb{N}}} \left\{ \sum_{n \in \mathbb{N}} l(I_n) : A \subset \cup_{n \in \mathbb{N}} I_n \right\}.$$
3
For instance, translation invariance simply says the length of a unit interval starting
at 0 should be the same as a unit interval starting at 3.
Thus, the outer measure is the least overestimate of the length of a given set $A$. The outer measure is well defined since each element of $\mathcal{P}(\mathbb{R})$ (i.e. subset $A \subset \mathbb{R}$) can be covered by a countable collection of open intervals, which follows from Theorem 108. We establish the properties of the outer measure in the next series of theorems.
Theorem 328 (i) $m^*(A) \ge 0$. (ii) $m^*(\emptyset) = 0$. (iii) If $A \subset B$, then $m^*(A) \le m^*(B)$ (i.e. monotonicity). (iv) $m^*(A) = 0$ for every singleton set $A$. (v) $m^*$ is translation invariant.
Exercise 5.1.1 Prove Theorem 328. Theorem 2.2, p. 56 of Jain and Gupta.
The next theorem shows that the outer measure, which is defined for any subset of $\mathbb{R}$, extends the notion of length.
Theorem 329 The outer measure of an interval is its length.
Proof. (Sketch) Let $\{I_n\}$ be an open covering of $[a,b]$. Then by the Heine-Borel Theorem 194 there is a finite subcollection of $N$ intervals $(a_i, b_i)$ that also covers $[a,b]$. Arrange them such that their left endpoints form an increasing sequence $a_1 < a_2 < \ldots < a_N$. See Figure 5.1.1. Since $[a,b]$ is connected, the intervals must overlap, which means that $\cup_{i=1}^{N} (a_i, b_i) = (a_1, b_k)$ for some $k$ with $1 \le k \le N$ and $[a,b] \subset (a_1, b_k)$. Thus $b - a \le (b_k - a_1) \le \sum_{n=1}^{\infty} l(I_n)$.
Definition 330 Let $\{A_n\}$ be a countable collection of sets with $A_n \subset \mathbb{R}$. We say $m^*$ is countably subadditive if $m^*(\cup_{n \in \mathbb{N}} A_n) \le \sum_{n \in \mathbb{N}} m^* A_n$.
Theorem 331 Let $\{A_n\}$ be a countable collection of sets with $A_n \subset \mathbb{R}$. Then $m^*$ is countably subadditive.
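Proof. (Sketch) By the infimum property, for a given $\varepsilon > 0$ and each $n$, there is a countable collection of intervals $\{I_{n_k}\}_{k \in \mathbb{N}}$ covering $A_n$ (i.e. $A_n \subset \cup_{k \in \mathbb{N}} I_{n_k}$) such that $\sum_{k \in \mathbb{N}} l(I_{n_k}) \le m^*(A_n) + \frac{\varepsilon}{2^n}$. Notice that $\cup_{n \in \mathbb{N}} A_n$ must be covered by $\cup_{n \in \mathbb{N}} (\cup_{k \in \mathbb{N}} I_{n_k})$, which is a countable union of countable collections and hence countable. By monotonicity of $m^*$ we have
$$m^*(\cup_{n \in \mathbb{N}} A_n) \le \sum_{n \in \mathbb{N}} m^*(A_n) + \varepsilon$$
since $\sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon$. Subadditivity follows since $\varepsilon > 0$ was arbitrary and we can let $\varepsilon \to 0$.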
There are also two important corollaries that follow from Theorem 331. The first important point is that there are unbounded sets with finite outer measure.
Corollary 332 If $A$ is a countable set, then $m^*(A) = 0$.
Proof. Since $A$ is countable, it can be expressed as $\{a_1, a_2, \ldots, a_n, \ldots\}$. Given $\varepsilon > 0$, we can enclose each $a_n$ in an open interval $I_n$ with $l(I_n) = \frac{\varepsilon}{2^n}$ to get
$$m^*(A) \le \sum_{n \in \mathbb{N}} l(I_n) = \sum_{n \in \mathbb{N}} \frac{\varepsilon}{2^n} = \varepsilon.$$
The result follows as we let $\varepsilon \to 0$.
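The covering used in this proof can be made concrete for finitely many rationals in $[0,1]$. The Python sketch below is an illustration under stated assumptions: the enumeration order in `first_rationals`, the helper `cover`, and the choices $\varepsilon = \frac{1}{100}$ and 50 points are all introduced here and are not part of the text. It places an open interval of length $\varepsilon / 2^n$ around the $n$-th point and adds up the lengths.

```python
from fractions import Fraction


def first_rationals(count):
    """Return `count` distinct rationals in [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Open interval of length eps / 2**n centred at the n-th point, n = 1, 2, ..."""
    return [(a - eps / 2 ** (n + 1), a + eps / 2 ** (n + 1))
            for n, a in enumerate(points, start=1)]


eps = Fraction(1, 100)
points = first_rationals(50)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
print(float(total), total < eps)
```

However many points are covered, the total length stays below $\varepsilon$, which is the estimate that drives the corollary.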
One important example of this is to let $A = \mathbb{Q}$ (i.e. the rationals are a set of outer measure zero). The contrapositive of Corollary 332, that a set with outer measure different from zero is uncountable, is obviously true.
Corollary 333 $[0,1]$ is uncountable.
Proof. Suppose, to the contrary, that $[0,1]$ is countable. Then by Corollary 332, $m^*([0,1]) = 0$, in which case $l([0,1]) = 0$ by Theorem 329, which contradicts $l([0,1]) = 1$.
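The converse of Corollary 332, that a set with outer measure zero is countable, is not always true. To see this, consider the Cantor set $F$ constructed in Section 3.4. In particular,
$$F = \bigcap_{n \in \mathbb{N}} F_n \equiv [0,1] \setminus \left( \bigcup_{n \in \mathbb{N}} A_n \right)$$
where
• $A_1 = (\frac{1}{3}, \frac{2}{3})$
• $A_2 = (\frac{1}{9}, \frac{2}{9}) \cup (\frac{3}{9}, \frac{6}{9}) \cup (\frac{7}{9}, \frac{8}{9})$
• $A_3 = (\frac{1}{27}, \frac{2}{27}) \cup (\frac{1}{9}, \frac{2}{9}) \cup (\frac{7}{27}, \frac{8}{27}) \cup (\frac{1}{3}, \frac{2}{3}) \cup (\frac{19}{27}, \frac{20}{27}) \cup (\frac{7}{9}, \frac{8}{9}) \cup (\frac{25}{27}, \frac{26}{27})$,
• etc.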
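But
• $m^*(A_1) = \frac{1}{3}$
• $m^*(A_2) = \frac{1}{9} + \frac{1}{3} + \frac{1}{9} = 2^1 \cdot \left(\frac{1}{3}\right)^2 + 2^0 \cdot \left(\frac{1}{3}\right)^1$
• $m^*(A_3) = \frac{1}{27} + \frac{1}{9} + \frac{1}{27} + \frac{1}{3} + \frac{1}{27} + \frac{1}{9} + \frac{1}{27} = 2^2 \cdot \left(\frac{1}{3}\right)^3 + 2^1 \cdot \left(\frac{1}{3}\right)^2 + 2^0 \cdot \left(\frac{1}{3}\right)^1$
and in general
$$
\begin{aligned}
m^*(A_n) &= 2^{n-1} \cdot \left(\frac{1}{3}\right)^n + 2^{n-2} \cdot \left(\frac{1}{3}\right)^{n-1} + \ldots + 2^1 \cdot \left(\frac{1}{3}\right)^2 + 2^0 \cdot \left(\frac{1}{3}\right)^1 \\
&= \frac{1}{3} \left[ \left(\frac{2}{3}\right)^{n-1} + \left(\frac{2}{3}\right)^{n-2} + \ldots + \left(\frac{2}{3}\right)^1 + 1 \right] \\
&= \frac{1}{3} \cdot \left[ \frac{1 - \left(\frac{2}{3}\right)^n}{1 - \frac{2}{3}} \right] = 1 - \left(\frac{2}{3}\right)^n.
\end{aligned}
$$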
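Since $F_1 \supset F_2 \supset \ldots \supset F_n \supset \ldots$ and $m^*(F_1) = \frac{2}{3} < \infty$, by Theorem 344 we have
$$
\begin{aligned}
m^*(F) &= \lim_{n \to \infty} m^*(F_n) = \lim_{n \to \infty} m^*([0,1] \setminus A_n) = \lim_{n \to \infty} \left[ m^*([0,1]) - m^*(A_n) \right] \\
&= 1 - \lim_{n \to \infty} m^*(A_n) = 1 - \lim_{n \to \infty} \left[ 1 - \left(\frac{2}{3}\right)^n \right] = 1 - 1 = 0.
\end{aligned}
$$
Hence, the Cantor set presents an example of an uncountable set with outer measure zero.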
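The closed form $m^*(A_n) = 1 - (\frac{2}{3})^n$ can be checked by building the removed middle thirds explicitly. The following Python sketch is illustrative; the function names `removed_middle_thirds` and `total_length` are assumptions made here, and the sets $A_n$ are represented simply as lists of open intervals with rational endpoints.

```python
from fractions import Fraction


def removed_middle_thirds(n):
    """Open intervals making up A_n: everything removed from [0, 1] through stage n."""
    kept = [(Fraction(0), Fraction(1))]
    removed = []
    for _ in range(n):
        next_kept = []
        for a, b in kept:
            third = (b - a) / 3
            removed.append((a + third, b - third))            # open middle third
            next_kept += [(a, a + third), (b - third, b)]     # two closed pieces kept
        kept = next_kept
    return removed


def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))


for n in range(1, 6):
    print(n, total_length(removed_middle_thirds(n)), 1 - Fraction(2, 3) ** n)
```

The two printed columns agree at every stage, and the remaining length $(\frac{2}{3})^n$ shrinks to zero as $n$ grows.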
Sets of outer measure zero provide another notion of "small" sets. From the point of view of cardinality, $F$ is big (uncountable) while $\mathbb{Q}$ is small (countable). From the topological point of view, $F$ is small (nowhere dense) while $\mathbb{Q}$ is big (dense). From the point of view of measure, both $F$ and $\mathbb{Q}$ are small (measure zero).
5.1.2 L-measurable sets
While the outer measure has the advantage that it is defined for $\mathcal{P}(\mathbb{R})$, Theorem 331 showed that it is countably subadditive but not necessarily countably additive. In order to satisfy countable additivity, we have to restrict the domain of the function $m^*$ to some suitable subset, call it $\mathcal{L}$ (for Lebesgue), of $\mathcal{P}(\mathbb{R})$. The members of $\mathcal{L}$ are called L-measurable sets.
Definition 334 A set $E \subset \mathbb{R}$ is (Lebesgue) L-measurable if $\forall A \subset \mathbb{R}$ we have $m^*(A) = m^*(A \cap E) + m^*(A \cap E^c)$.
The definition of L-measurability says that the measurable sets are those (bounded or unbounded) which split every set (measurable or not) into two parts that are additive with respect to the outer measure.
Since $A = (A \cap E) \cup (A \cap E^c)$ and $m^*$ is subadditive, we always have
$$m^*(A) \le m^*(A \cap E) + m^*(A \cap E^c).$$
Thus, in order to establish that $E$ is measurable, we need only show, for any set $A$, that
$$m^*(A) \ge m^*(A \cap E) + m^*(A \cap E^c). \qquad (5.1)$$
Inequality (5.1) is often used in practice to determine whether a given set $E$ is measurable, where $A$ is called the test set.
Since Definition 334 is symmetric in $E$ and $E^c$, we have that $E^c$ is L-measurable whenever $E$ is. Clearly, $\emptyset$ and $\mathbb{R}$ are L-measurable.
Lemma 335 If $m^*(E) = 0$, then $E$ is L-measurable.
Proof. Let $A \subset \mathbb{R}$ be any set. Since $A \cap E \subset E$ we have $m^*(A \cap E) \le m^*(E) = 0$. Since $A \cap E^c \subset A$ we have $m^*(A) \ge m^*(A \cap E^c) = m^*(A \cap E^c) + m^*(A \cap E)$, where the equality uses $m^*(A \cap E) = 0$ from above. Hence $E$ is L-measurable.
Corollary 336 Every countable set is L-measurable and its measure is zero.
Proof. From Lemma 335 and Corollary 332.
Exercise 5.1.2 Show that if $m^*(E) = 0$, then $m^*(E \cup A) = m^*(A)$ and that if in addition $A \subset E$, then $m^*(A) = 0$.
Lemma 337 If $E_1$ and $E_2$ are L-measurable, so is $E_1 \cup E_2$.
Proof. Since $E_1$ and $E_2$ are L-measurable, for any set $A$, we have
$$
\begin{aligned}
m^*(A) &= m^*(A \cap E_1) + m^*(A \cap E_1^c) \\
&= m^*(A \cap E_1) + m^*([A \cap E_1^c] \cap E_2) + m^*([A \cap E_1^c] \cap E_2^c) \\
&= m^*(A \cap E_1) + m^*(A \cap E_2 \cap E_1^c) + m^*(A \cap [E_1 \cup E_2]^c) \\
&\ge m^*(A \cap [E_1 \cup E_2]) + m^*(A \cap [E_1 \cup E_2]^c)
\end{aligned}
$$
where the first equality follows from Definition 334 and the fact that $E_1$ is measurable, the second equality follows from the definition, taking the test set to be $A \cap E_1^c$, and the fact that $E_2$ is measurable, the third equality follows from simple set operations like DeMorgan's law, and the inequality follows from the subadditivity of $m^*$ and the fact that $[A \cap E_1] \cup [A \cap E_2 \cap E_1^c] = A \cap [E_1 \cup E_2]$.$^{4}$ But this satisfies (5.1), which is sufficient for L-measurability.
Corollary 338 The collection $\mathcal{L}$ of all L-measurable sets is an algebra of sets in $\mathcal{P}(\mathbb{R})$.
Proof. Follows from Definition 81, the symmetry (w.r.t. complements) in Definition 334, and Lemma 337.
Lemma 339 Let $A \subset \mathbb{R}$ be any set and $\{E_n\}_{n=1}^{N}$ be a finite collection of disjoint L-measurable sets in $\mathbb{R}$. Then $m^*\left(A \cap \left[\cup_{n=1}^{N} E_n\right]\right) = \sum_{n=1}^{N} m^*(A \cap E_n)$.
Proof. The result is clearly true for $N = 1$. Consider an induction on $N$. Suppose the result is true for $N - 1$. Since the $E_n$ are disjoint, $A \cap \left[\cup_{n=1}^{N} E_n\right] \cap E_N = A \cap E_N$ and $A \cap \left[\cup_{n=1}^{N} E_n\right] \cap E_N^c = A \cap \left[\cup_{n=1}^{N-1} E_n\right]$. Then
$$
\begin{aligned}
m^*\left(A \cap \left[\cup_{n=1}^{N} E_n\right]\right) &= m^*(A \cap E_N) + m^*\left(A \cap \left[\cup_{n=1}^{N-1} E_n\right]\right) \\
&= m^*(A \cap E_N) + \sum_{n=1}^{N-1} m^*(A \cap E_n)
\end{aligned}
$$
where the first equality follows from Definition 334 and the second follows since the result is true for $N - 1$.
Corollary 340 If $\{E_n\}_{n=1}^{N}$ is a finite collection of disjoint L-measurable sets in $\mathbb{R}$, then $m^*\left(\cup_{n=1}^{N} E_n\right) = \sum_{n=1}^{N} m^* E_n$.
Proof. Taking $A = \mathbb{R}$, the result follows from Corollary 338 and Lemma 339.
The result in Corollary 340 verifies that $m^*$ restricted to $\mathcal{L}$ is finitely additive. However, we would like to extend it to the more general case of countable additivity. First, we must show that $\mathcal{L}$ is a $\sigma$-algebra (as discussed in Section 2.6) so that $\cup_{n=1}^{\infty} E_n \in \mathcal{L}$ for any countable collection $\{E_n\}$ with $E_n \in \mathcal{L}$, so that the restriction of $m^*$ is well defined on such unions.
$^{4}$ That is, we know $m^*([A \cap E_1] \cup [A \cap E_2 \cap E_1^c]) = m^*(A \cap [E_1 \cup E_2])$ and subadditivity implies $m^*([A \cap E_1] \cup [A \cap E_2 \cap E_1^c]) \le m^*([A \cap E_1]) + m^*([A \cap E_2 \cap E_1^c])$.
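Theorem 341 The collection $\mathcal{L}$ of all L-measurable sets is a $\sigma$-algebra of sets in $\mathcal{P}(\mathbb{R})$.
Proof. (Sketch) Let $E = \cup_{n \in \mathbb{N}} E_n$ with each $E_n \in \mathcal{L}$; since $\mathcal{L}$ is an algebra, the $E_n$ can be taken to be disjoint without loss of generality. First we use the fact that $\mathcal{L}$ is an algebra: i.e. $F_N \equiv \cup_{n=1}^{N} E_n \in \mathcal{L}$ and that $F_N^c = \left(\cup_{n=1}^{N} E_n\right)^c \supset \left(\cup_{n \in \mathbb{N}} E_n\right)^c = E^c$. Hence
$$m^*(A) = m^*(A \cap F_N) + m^*(A \cap F_N^c) \ge m^*(A \cap F_N) + m^*(A \cap E^c) = \sum_{n=1}^{N} m^*(A \cap E_n) + m^*(A \cap E^c),$$
where the first equality follows by Definition 334, the inequality follows since $F_N^c \supset E^c$,$^{5}$ and the last equality follows by Lemma 339. Since the left hand side of this inequality is independent of $N$, letting $N \to \infty$ and using countable subadditivity of $m^*$ we have
$$m^*(A) \ge \sum_{n=1}^{\infty} m^*(A \cap E_n) + m^*(A \cap E^c) \ge m^*(A \cap E) + m^*(A \cap E^c).$$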
Definition 342 The set function $m : \mathcal{L} \to \mathbb{R}_+ \cup \{\infty\}$, obtained by restricting the function $m^*$ to the $\sigma$-algebra $\mathcal{L} \subset \mathcal{P}(\mathbb{R})$, is called the Lebesgue measure. That is, $m = m^*|_{\mathcal{L}}$.$^{6}$
The next result shows that after relaxing point (i) in Remark 1 we can satisfy property (iii) with the Lebesgue measure.
Theorem 343 If $\{E_n\}_{n \in \mathbb{N}}$ is a countable collection of disjoint L-measurable sets in $\mathbb{R}$, then $m(\cup_{n \in \mathbb{N}} E_n) = \sum_{n \in \mathbb{N}} m(E_n)$.
Proof. Since $\cup_{n=1}^{N} E_n \subset \cup_{n \in \mathbb{N}} E_n$, $\forall N \in \mathbb{N}$, and both sets are L-measurable by Corollary 338 and Theorem 341, we have $m(\cup_{n \in \mathbb{N}} E_n) \ge m\left(\cup_{n=1}^{N} E_n\right) = \sum_{n=1}^{N} m(E_n)$, where the equality follows by Corollary 340. Since the left hand side of the inequality is independent of $N$, letting $N \to \infty$ we have $m(\cup_{n \in \mathbb{N}} E_n) \ge \sum_{n \in \mathbb{N}} m(E_n)$. Since the reverse inequality holds by countable subadditivity in Theorem 331, the result follows.
The next property will be useful in proving certain convergence properties
in upcoming sections and can be viewed as a continuity property of the
Lebesgue measure.
$^{5}$ Recall by DeMorgan's Law that $F_N^c = \left[\cup_{n=1}^{N} E_n\right]^c = \cap_{n=1}^{N} E_n^c$.
$^{6}$ This follows from Definition 56.
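Theorem 344 Let $\langle E_n \rangle$ be an infinite decreasing sequence of L-measurable sets (i.e. $E_{n+1} \subset E_n$, $\forall n$). Let $mE_1$ be finite. Then $m(\cap_{i=1}^{\infty} E_i) = \lim_{n \to \infty} m(E_n)$.
Proof. Since $\mathcal{L}$ is a $\sigma$-algebra, $\cap_{n=1}^{\infty} E_n \in \mathcal{L}$. The set $E_1 \setminus \cap_{n=1}^{\infty} E_n$ can be written as the union of mutually disjoint sets $\{E_n \setminus E_{n+1}\}$ (see Figure 5.1.2):
$$E_1 \setminus \cap_{n=1}^{\infty} E_n = (E_1 \setminus E_2) \cup (E_2 \setminus E_3) \cup \ldots \cup (E_n \setminus E_{n+1}) \cup \ldots$$
Then using countable additivity of $m$ (and the finiteness of $mE_1$, which permits the subtractions) we have
$$
\begin{aligned}
m(E_1) - m(\cap_{n=1}^{\infty} E_n) &= m(E_1 \setminus \cap_{n=1}^{\infty} E_n) = m\left(\cup_{n=1}^{\infty} (E_n \setminus E_{n+1})\right) \\
&= \sum_{n=1}^{\infty} m(E_n \setminus E_{n+1}) = \sum_{n=1}^{\infty} \left[ m(E_n) - m(E_{n+1}) \right] \\
&= m(E_1) - \lim_{n \to \infty} m(E_n).
\end{aligned}
$$
Comparing the beginning and end we have $m(\cap_{n=1}^{\infty} E_n) = \lim_{n \to \infty} m(E_n)$.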
5.1.3 Lebesgue meets Borel
Now that we know that L is a σ-algebra, we might ask what type of sets
belong in L? For example, are open and/or closed sets in L?
Lemma 345 The interval $(a, \infty)$ is L-measurable.
Proof. (Sketch) Take any set $A$. The open ray $(a, \infty)$ splits $A$ into two disjoint parts $A_1 = (a, \infty) \cap A$ and $A_2 = (-\infty, a] \cap A$. According to (5.1), it suffices to show $m^*(A) \ge m^*(A_1) + m^*(A_2)$. For $\varepsilon > 0$, there is a countable collection $\{I_n\}$ of open intervals which covers $A$ satisfying $\sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon$ by the infimum property in Definition 327. Again, $(a, \infty)$ splits each interval $I_n \in \{I_n\}$ into two disjoint intervals $I_n'$ and $I_n''$. Clearly $\{I_n'\}$ covers $A_1$, $\{I_n''\}$ covers $A_2$, and $\sum_n l(I_n) = \sum_n l(I_n') + \sum_n l(I_n'')$. By monotonicity and subadditivity of $m^*$ we have
$$
\begin{aligned}
m^*(A_1) + m^*(A_2) &\le m^*(\cup_n I_n') + m^*(\cup_n I_n'') \\
&\le \sum_{n=1}^{\infty} l(I_n') + \sum_{n=1}^{\infty} l(I_n'') \\
&= \sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon.
\end{aligned}
$$
But since $\varepsilon > 0$ was arbitrary, the result follows.
The next result shows that every open or closed set in R is L-measurable.
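Theorem 346 Every Borel set is L-measurable.
Proof. The result follows from Theorem 124, that the collection of all open rays generates $\mathcal{B}$, together with Lemma 345 and the fact that $\mathcal{L}$ is a $\sigma$-algebra (Theorem 341).
Thus, the Lebesgue measure $m$ is defined for Borel sets. Hence we can work with sets we know a lot about. While it is beyond the scope of the book, we note that there are examples of sets that show $\mathcal{B} \subsetneq \mathcal{L}$ and $\mathcal{L} \subsetneq \mathcal{P}(\mathbb{R})$.$^{7}$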
The next theorem gives a useful characterization of measurable sets. It asserts that a measurable set can be "approximated" by open and closed sets. See Figure 5.1.3.
Theorem 347 Let $E$ be a measurable subset of $\mathbb{R}$. Then for each $\varepsilon > 0$, there exists an open set $G$ and a closed set $F$ such that $F \subset E \subset G$ and $m(G \setminus F) < \varepsilon$.
Proof. Since $E$ is measurable, $m(E) = m^*(E)$. We use the infimum property for sets $E$ and $E^c$. Given $\frac{\varepsilon}{2}$, there exist open sets $G$ and $H$ (remember that the union of open intervals is an open set) such that
$$E \subset G \text{ and } m(G) < m(E) + \frac{\varepsilon}{2}, \qquad E^c \subset H \text{ and } m(H) < m(E^c) + \frac{\varepsilon}{2}.$$
Set $F = H^c$. See Figure 5.1.4. Then by the properties of complements, we have that $F$ is closed, $F \subset E$, and $m(E) - m(F) < \frac{\varepsilon}{2}$. Thus we have $F \subset E \subset G$ and
$$m(G \setminus F) = m(G) - m(F) = m(G) - m(E) + m(E) - m(F) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
The first equality is due to additivity and the second is simply an identity.
5.1.4 L-measurable mappings
Before we actually begin to integrate a mapping, we must know that a given
mapping is integrable. We break this topic up into two parts: functions and
correspondences.
7
For an example of a non-measurable set see p. 289 of Carothers (2000).
functions
Roughly speaking, a function is integrable if its behavior is not too irregular
and if the values it takes on are not too large too often. We now introduce
the notion of measurability which gives precisely the conditions required for
integrability, provided the function is not too large.
Definition 348 Let $f : E \to \mathbb{R} \cup \{-\infty, \infty\}$ where $E$ is L-measurable. Then $f$ is L-measurable if the set $\{x \in E : f(x) \le \alpha\} \in \mathcal{L}$, $\forall \alpha \in \mathbb{R}$.
It is clear from the above definition that there is a close relation between measurability of a function and the measurability of the inverse image set. In particular, it can be shown that $f$ is L-measurable iff for any closed set $G \subset \mathbb{R}$, the inverse image $f^{-1}(G)$ is a measurable set. See Figure 5.1.4.1.
As $\alpha$ varies, the behavior of the set $\{x \in E : f(x) \le \alpha\}$ describes how the values of the function $f$ are distributed. The smoother is $f$, the smaller the variety of inverse images which satisfy the restriction on $f$.
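Example 349 Consider an indicator or characteristic function $\chi_A : \mathbb{R} \to \mathbb{R}$ given by
$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$$
with $A \subset \mathbb{R}$. Then $\chi_A$ is L-measurable iff $A \in \mathcal{L}$. To see this, note that
$$\{x \in \mathbb{R} : \chi_A(x) \le \alpha\} = \begin{cases} \emptyset & \text{if } \alpha < 0 \\ \mathbb{R} \setminus A & \text{if } 0 \le \alpha < 1 \\ \mathbb{R} & \text{if } 1 \le \alpha \end{cases}$$
and $\{\emptyset, A^c, \mathbb{R}\} \subset \mathcal{L}$ exactly when $A \in \mathcal{L}$. See Figure 5.1.4.2a.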
Example 350 Let the function $f : [0,1] \to \mathbb{R}$ be given by
$$f(x) = \begin{cases} 1 & \text{if } x = 0 \\ \frac{1}{x} & \text{if } 0 < x < 1 \\ 2 & \text{if } x = 1. \end{cases}$$
Notice that this function is neither continuous nor monotone. To see that $f$ is L-measurable, note that
$$\{x \in [0,1] : f(x) \le \alpha\} = \begin{cases} \emptyset & \alpha < 1 \\ \{0\} & \alpha = 1 \\ [\frac{1}{\alpha}, 1) \cup \{0\} & 1 < \alpha < 2 \\ [\frac{1}{\alpha}, 1] \cup \{0\} & 2 \le \alpha. \end{cases}$$
Again, all these sets are in $\mathcal{L}$. See Figure 5.1.4.2b. This example shows that an L-measurable function need not be continuous.
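The case analysis of the level sets can be cross-checked on a grid. In the Python sketch below, the names `f` and `in_stated_level_set`, the grid of 241 rational points and the handful of test values of $\alpha$ are all assumptions made for illustration; the check compares the condition $f(x) \le \alpha$ directly with membership in the set stated for that $\alpha$.

```python
from fractions import Fraction


def f(x):
    """The function of Example 350 on [0, 1]."""
    if x == 0:
        return Fraction(1)
    if x == 1:
        return Fraction(2)
    return 1 / x


def in_stated_level_set(x, alpha):
    """Membership in the set {x in [0, 1] : f(x) <= alpha} as stated case by case."""
    if alpha < 1:
        return False
    if alpha == 1:
        return x == 0
    if alpha < 2:
        return x == 0 or Fraction(1) / alpha <= x < 1
    return x == 0 or Fraction(1) / alpha <= x <= 1


grid = [Fraction(i, 240) for i in range(241)]
alphas = [Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2), Fraction(5)]
agree = all((f(x) <= alpha) == in_stated_level_set(x, alpha)
            for x in grid for alpha in alphas)
print(agree)
```

A grid check of this kind cannot replace the argument, but it is a quick way to catch a misplaced endpoint in a case split.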
The next result establishes that there are many criteria by which to es-
tablish measurability of a function.
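Theorem 351 Let $f : E \to \mathbb{R} \cup \{-\infty, \infty\}$ where $E$ is L-measurable. Then the following statements are equivalent: (i) $\{x \in E : f(x) \le \alpha\}$ is L-measurable $\forall \alpha \in \mathbb{R}$; (ii) $\{x \in E : f(x) > \alpha\}$ is L-measurable $\forall \alpha \in \mathbb{R}$; (iii) $\{x \in E : f(x) \ge \alpha\}$ is L-measurable $\forall \alpha \in \mathbb{R}$; (iv) $\{x \in E : f(x) < \alpha\}$ is L-measurable $\forall \alpha \in \mathbb{R}$. These statements imply (v) $\{x \in E : f(x) = \alpha\}$ is L-measurable $\forall \alpha \in \mathbb{R} \cup \{-\infty, \infty\}$.
Proof. (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (i) can be established from
$$
\begin{aligned}
\{x \in E : f(x) > \alpha\} &= E \setminus \{x \in E : f(x) \le \alpha\} \\
\{x \in E : f(x) \ge \alpha\} &= \cap_{n=1}^{\infty} \left\{x \in E : f(x) > \alpha - \tfrac{1}{n}\right\} \\
\{x \in E : f(x) < \alpha\} &= E \setminus \{x \in E : f(x) \ge \alpha\} \\
\{x \in E : f(x) \le \alpha\} &= \cap_{n=1}^{\infty} \left\{x \in E : f(x) < \alpha + \tfrac{1}{n}\right\}
\end{aligned}
$$
where each operation follows since $\mathcal{L}$ is a $\sigma$-algebra (which is closed under complementation and countable intersection).
Next, if $\alpha \in \mathbb{R}$, then $\{x \in E : f(x) = \alpha\} = \{x \in E : f(x) \le \alpha\} \cap \{x \in E : f(x) \ge \alpha\}$, which is L-measurable as the intersection of two L-measurable sets. If $\alpha = \infty$, then since $\{x \in E : f(x) = \infty\} = \cap_{n=1}^{\infty} \{x \in E : f(x) \ge n\}$ we have (iii) $\Rightarrow$ (v), again because $\mathcal{L}$ is closed under countable intersection. A similar result holds for $\alpha = -\infty$.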
Next we present some properties of L-measurable functions.
Lemma 352 (i) If $f$ is an L-measurable function on the set $E$ and $E_1 \subset E$ is an L-measurable set, then $f$ is an L-measurable function on $E_1$. (ii) If $f$ and $g$ are L-measurable functions on $E$, then the set $\{x \in E : f(x) < g(x)\}$ is L-measurable.
Proof. (i) follows since $\{x \in E_1 : f(x) > \alpha\} = \{x \in E : f(x) > \alpha\} \cap E_1$ and the intersection of two L-measurable sets is measurable. (ii) For each $q \in \mathbb{Q}$ define $A_q = \{x \in E : f(x) < q < g(x)\}$; whenever $f(x) < g(x)$, the existence of such a rational $q$ is guaranteed by Theorem 100. Then $A_q = \{x \in E : f(x) < q\} \cap \{x \in E : g(x) > q\}$ and $\{x \in E : f(x) < g(x)\} = \cup_{q \in \mathbb{Q}} A_q$, which is a countable union of L-measurable sets.
The next theorem establishes that certain operations performed on L-
measurable functions preserve measurability.
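Theorem 353 Let $f$ and $g$ be L-measurable functions on $E$ and $c$ be a constant. Then the following functions are L-measurable: (i) $f \pm c$; (ii) $cf$; (iii) $f \pm g$; (iv) $|f|$; (v) $f^2$; (vi) $fg$.
Proof. (i) $\{x \in E : f(x) \pm c > \alpha\} = \{x \in E : f(x) > \alpha'\}$ with $\alpha' = \alpha \mp c$, so $f \pm c$ is L-measurable when $f$ is.
(ii) If $c = 0$, then $cf$ is L-measurable since any constant function is L-measurable. Otherwise
$$\{x \in E : cf(x) > \alpha\} = \begin{cases} \{x \in E : f(x) > \alpha'\} & \text{if } c > 0 \\ \{x \in E : f(x) < \alpha'\} & \text{if } c < 0 \end{cases}$$
with $\alpha' = \frac{\alpha}{c}$ is L-measurable since $f$ is L-measurable.
(iii) $\{x \in E : f(x) + g(x) > \alpha\} = \{x \in E : f(x) > \alpha - g(x)\}$. Since $\alpha - g$ is L-measurable by (i) and (ii), $f + g$ is L-measurable by Lemma 352. The case $f - g$ follows by applying this to $f$ and $-g$.
(iv) Follows since
$$\{x \in E : |f(x)| > \alpha\} = \begin{cases} E & \text{if } \alpha < 0 \\ \{x \in E : f(x) > \alpha\} \cup \{x \in E : f(x) < -\alpha\} & \text{if } \alpha \ge 0 \end{cases}$$
and both sets on the right hand side are L-measurable since $f$ is L-measurable.
(v) Follows from
$$\{x \in E : (f(x))^2 > \alpha\} = \begin{cases} E & \text{if } \alpha < 0 \\ \{x \in E : |f(x)| > \sqrt{\alpha}\} & \text{if } \alpha \ge 0 \end{cases}$$
and (iv).
(vi) Follows from the identity $fg = \frac{1}{2}\left[(f + g)^2 - f^2 - g^2\right]$ and (ii), (iii), (v).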
Parts (ii) and (iii) of Theorem 353 imply that scaled linear combinations of indicator functions defined on measurable sets are themselves measurable functions. This type of function, known as a simple function, will play an important role in approximating a given function. As we will see in the next section, unlike the standard (Riemann) way of approximating the integral of $f$ by calculating the area under $f$ based on partitions of the domain, in this chapter we will be approximating the integral of $f$ by calculating the area under $f$ based on partitions of the range. See Figure 5.1.4.2c.
Definition 354 A function $\varphi : E \to \mathbb{R}$ given by
$$\varphi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x) \qquad (5.2)$$
is called a simple function if there is a finite collection $\{E_1, \ldots, E_n\}$ of disjoint L-measurable sets with $\cup_{i=1}^{n} E_i = E$ and a finite set of real numbers $\{a_1, \ldots, a_n\}$ such that $a_i = \varphi(x)$, $\forall x \in E_i$ for $i = 1, \ldots, n$, where $\chi_{E_i}(x)$ is an indicator function introduced in Example 349. The right hand side of (5.2) is called the representation of $\varphi$.
We note that the real numbers $\{a_i\}$ and the sets $\{E_i\}$ in this representation are not uniquely determined, as the next example shows.
Example 355 Let $\{E_1, E_2, E_3\}$ be disjoint subsets of an L-measurable set $E$. Consider the two simple functions $2\chi_{E_1} + 5\chi_{E_2} + 2\chi_{E_3}$ and $2\chi_{E_1 \cup E_3} + 5\chi_{E_2}$.
Clearly these two simple functions are equal. Notice that the coefficients in the first representation are not distinct. Since a simple function obtains only a finite number of values $\{a_1, \ldots, a_n\}$ on $E$ we can construct the inverse image sets $\{A_i\}$ as $A_i = \varphi^{-1}(\{a_i\}) = \{x \in E : \varphi(x) = a_i\}$, $i = 1, \ldots, n$, $n \in \mathbb{N}$. In this example, $A_1 = \varphi^{-1}(\{2\}) = E_1 \cup E_3$ and $A_2 = \varphi^{-1}(\{5\}) = E_2$, both of which are L-measurable, disjoint sets. To avoid such problems with non-uniqueness, we use this construction to define the canonical (or standard) representation of $\varphi : E \to \mathbb{R}$ by $\varphi(x) = \sum_{i=1}^{k} a_i \chi_{A_i}(x)$ where the finite collection $\{A_1, \ldots, A_k\}$ of L-measurable sets are disjoint with $\cup_{i=1}^{k} A_i = E$ and the finite set of real numbers $\{a_1, \ldots, a_k\}$ are distinct and nonzero.
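The passage from an arbitrary representation to the canonical one amounts to merging the sets that carry the same coefficient. The Python sketch below illustrates this on a toy model in which the measurable sets are replaced by small finite sets of integers; that substitution, and the names `canonical` and `evaluate`, are assumptions made here purely for illustration.

```python
from collections import defaultdict


def canonical(representation):
    """Merge the sets sharing a coefficient: A_i is the union of all E_j with a_j = a_i."""
    merged = defaultdict(frozenset)
    for coeff, subset in representation:
        merged[coeff] = merged[coeff] | subset
    return dict(merged)


def evaluate(representation, x):
    """Value at x of the simple function given as a list of (coefficient, set) pairs."""
    return sum(coeff for coeff, subset in representation if x in subset)


# Finite stand-ins for the disjoint sets E_1, E_2, E_3 of the example.
E1, E2, E3 = frozenset({0, 1}), frozenset({2, 3}), frozenset({4})
first = [(2, E1), (5, E2), (2, E3)]          # 2*chi_E1 + 5*chi_E2 + 2*chi_E3
second = list(canonical(first).items())      # 2*chi_(E1 u E3) + 5*chi_E2

for x in sorted(E1 | E2 | E3):
    print(x, evaluate(first, x), evaluate(second, x))
```

Both representations give the same value at every point, while only the second has distinct coefficients.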
The next theorem and exercise provide sufficient conditions for L-measurability.
Theorem 356 A continuous function deTned on an L-measurable set is L-
measurable.
Proof. Let $f$ be a continuous function defined on $E$ (which is L-measurable). Consider the set $A = \{x \in E : f(x) > \alpha\}$, which is the inverse image of the open ray $(\alpha, \infty)$. Since $f$ is continuous, Theorem 248 implies $f^{-1}((\alpha, \infty))$ is open and hence L-measurable.
Exercise 5.1.3 Show that any monotone function $f : \mathbb{R} \to \mathbb{R}$ is L-measurable.
Next we consider how sequences of L-measurable functions behave.
Theorem 357 Let $\langle f_n \rangle$ be a sequence of L-measurable functions on a common domain $E$. Then the functions $\max\{f_1, \ldots, f_n\}$, $\min\{f_1, \ldots, f_n\}$, $\sup_n f_n$, $\inf_n f_n$, $\limsup_n f_n$, and $\liminf_n f_n$ are all L-measurable.
Proof. If $g(x) = \max\{f_1(x), \ldots, f_n(x)\}$, then $\{x \in E : g(x) > \alpha\} = \cup_{i=1}^{n} \{x \in E : f_i(x) > \alpha\}$ and L-measurability of each $f_i$ implies $g$ is L-measurable. Similarly, if $h(x) = \sup_n f_n(x)$, then $\{x \in E : h(x) > \alpha\} = \cup_{i=1}^{\infty} \{x \in E : f_i(x) > \alpha\}$. A similar argument, along with the fact that $\inf_n f_n = -\sup_n(-f_n)$ and (ii) of Theorem 353, establishes the corresponding statements for $\inf$. To establish the last results, note that $\liminf_n f_n = -\limsup_n(-f_n) = \sup_n(\inf_{k \ge n} f_k)$.
Using the above results, for any function $f : E \to \mathbb{R}$, we can construct non-negative functions $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$. The function $f$ is L-measurable iff both $f^+$ and $f^-$ are L-measurable. It is also easy to verify that $f = f^+ - f^-$ and $|f| = f^+ + f^-$. See Figure 5.1.4.3.
Corollary 358 (i) If $\langle f_n \rangle$ is a sequence of L-measurable functions converging pointwise to $f$ on $E$, then $f$ is L-measurable. (ii) The set of points on which $\langle f_n \rangle$ converges is L-measurable.
Proof. (i) Since $f_n \to f$, we have $\limsup_n f_n = \liminf_n f_n = f$ by Theorem 321 and the result thus follows from Theorem 357 above. (ii) From (i), the set $\{x \in E : \limsup_n f_n - \liminf_n f_n = 0\}$ is L-measurable by (v) of Theorem 351.
In some cases, two functions may be "almost" the same in the sense of L-measurability. The next definition helps us make that precise.
Definition 359 A property is said to hold almost everywhere (a.e.) if the set of points where it fails to hold is a set of measure zero.
Example 360 Let $f : [0,1] \to \{0, 1\}$ be given by
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{otherwise} \end{cases}$$
known as the Dirichlet function. While this function is famous since it is everywhere discontinuous (which we will use in Section 5.2.1), here we simply use it to illustrate the concept of almost everywhere. In particular, $f(x) = 0$ a.e. since $\{x \in [0,1] : f(x) \ne 0\} = \{x \in [0,1] : x \in \mathbb{Q}\}$ and $m(\{x \in [0,1] : x \in \mathbb{Q}\}) = 0$, which follows from the countability of the rationals established in Example 77 and Corollary 336.
Theorem 361 Let f and g have domain E and let f be an L-measurable
function. If f = g a.e., then g is measurable.
Proof. Let $D = \{x \in E : f(x) \ne g(x)\}$. Then $mD = 0$ by assumption. Let $\alpha \in \mathbb{R}$, and consider
$$
\begin{aligned}
\{x \in E : g(x) > \alpha\} &= \{x \in E \setminus D : f(x) > \alpha\} \cup \{x \in D : g(x) > \alpha\} \\
&= \left[ \{x \in E : f(x) > \alpha\} \setminus \{x \in D : g(x) \le \alpha\} \right] \cup \{x \in D : g(x) > \alpha\}.
\end{aligned}
$$
Since $f$ is L-measurable, the first set is L-measurable. Furthermore, since the other two sets are contained in $D$, which has measure 0, they are L-measurable by Lemma 335 and Exercise 5.1.2.
Now we consider a weaker notion of continuity than considered in Theo-
rem 356.
Theorem 362 If a function $f$ defined on $E$ (which is L-measurable) is continuous a.e., then $f$ is L-measurable on $E$.
Proof. Follows from Theorem 356.
Theorem 362 thus establishes the sufficient condition such that the discontinuous and non-monotone function in Example 350 is L-measurable.
Now we consider a weaker version of convergence than considered in
Corollary 358.
Definition 363 A sequence $\langle f_n \rangle$ of functions defined on $E$ is said to converge a.e. to a function $f$ if $\lim_{n \to \infty} f_n(x) = f(x)$, $\forall x \in E \setminus E_1$ where $E_1 \subset E$ with $mE_1 = 0$.
Theorem 364 If a sequence $\langle f_n \rangle$ of L-measurable functions converges a.e. to the function $f$, then $f$ is L-measurable.
Proof. Follows from Corollary 358.
Example 365 Let $\langle f_n \rangle$ be given by $f_n(x) = x^n$ on $[0,1]$, which converges pointwise to
$$f(x) = \begin{cases} 0 & \text{if } x \in [0,1) \\ 1 & \text{if } x = 1 \end{cases}$$
and $f$ is L-measurable since it is the constant (zero) function almost everywhere.
The next theorem establishes that if a sequence of functions converges
pointwise, then we can isolate a set of points of arbitrarily small measure
such that on the complement of that set the convergence is uniform.
Theorem 366 Let E be an L-measurable set with $mE < \infty$ and $\langle f_n \rangle$ be a sequence of L-measurable functions defined on E. Let $f : E \to \mathbb{R}$ be such that $\forall x \in E$, $f_n(x) \to f(x)$. Then given $\varepsilon > 0$ and $\delta > 0$, $\exists$ an L-measurable set $A \subset E$ with $m(A) < \delta$ and $\exists N$ such that for $n \ge N$ and $x \notin A$ we have $|f_n(x) - f(x)| < \varepsilon$.
Proof. Let $G_n = \{x \in E : |f_n(x) - f(x)| \ge \varepsilon\}$ and $E_k = \bigcup_{n=k}^{\infty} G_n = \{x \in E : |f_n(x) - f(x)| \ge \varepsilon \text{ for some } n \ge k\}$. Thus $E_{k+1} \subset E_k$ and for each $x \in E$ there must be some set $E_k$ such that $x \notin E_k$; otherwise we would violate the assumption $f_n(x) \to f(x)$, $\forall x \in E$. Thus $\langle E_k \rangle$ is a decreasing sequence of L-measurable sets for which $\bigcap_{k=1}^{\infty} E_k = \emptyset$, so that by Theorem 344 we have $\lim_{k\to\infty} mE_k = 0$. Hence given $\delta > 0$, $\exists N$ such that $mE_N < \delta$; that is, $m\{x \in E : |f_n(x) - f(x)| \ge \varepsilon \text{ for some } n \ge N\} < \delta$. If we write A for this $E_N$, then $mA < \delta$ and
$$E \setminus A = \{x \in E : |f_n(x) - f(x)| < \varepsilon, \ \forall n \ge N\}.$$
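The construction in this proof can be traced on the sequence $f_n(x) = x^n$ of Example 365. The following is only a sketch under stated assumptions: the function names `measure_E_k` and `find_N` are hypothetical, and the closed form for $mE_k$ relies on $0 < \varepsilon < 1$, in which case $G_n = [\varepsilon^{1/n}, 1)$ and $E_k = G_k$.

```python
def measure_E_k(eps: float, k: int) -> float:
    """Measure of E_k = {x in [0,1] : |x**n - f(x)| >= eps for some n >= k}.

    Assumes 0 < eps < 1. Then G_n = [eps**(1/n), 1), and these sets shrink
    as n grows, so the union over n >= k is G_k itself.
    """
    return 1.0 - eps ** (1.0 / k)


def find_N(eps: float, delta: float) -> int:
    """Smallest N with m(E_N) < delta; the exceptional set is A = E_N."""
    N = 1
    while measure_E_k(eps, N) >= delta:
        N += 1
    return N
```

With `N = find_N(eps, delta)`, every point to the left of `eps ** (1 / N)` satisfies $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$, which is the statement of the theorem for this particular sequence.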
The next theorem says that for any L-measurable function f there exists a sequence of "nice" functions (more specifically, simple functions) that converges pointwise to f. Moreover, on the subdomain where f is bounded, this convergence is uniform. This means that a bounded measurable function can be approximated by a simple function.
Theorem 367 Let f be an L-measurable function defined on a set E. Then there exists a sequence $\langle f_n \rangle$ of simple functions which converges pointwise to f on E and converges uniformly to f on any set where f is bounded. Furthermore, if $f \ge 0$, then $\langle f_n \rangle$ can be chosen such that $0 \le f_n \le f_{n+1}$, $\forall n \in \mathbb{N}$.
Proof. (Sketch) We can assume that $f \ge 0$. If not, then let $f = f^+ - f^-$ where $f^+$ and $f^-$ are non-negative. For $n \in \mathbb{N}$, we divide the range of f (which can be unbounded) into two parts: $[0, 2^n)$ and $[2^n, \infty)$. See Figure 5.1.4.4. Then divide $[0, 2^n)$ into $2^{2n}$ equal parts. Let $F_n$ be the inverse image of $[2^n, \infty)$ and $E_{n,k}$ be the inverse images of $[k2^{-n}, (k+1)2^{-n})$ for $k = 0, 1, \dots, 2^{2n} - 1$. Since f is measurable, $F_n$ and $E_{n,k}$ are measurable. Define a simple function
$$\varphi_n = 2^n \chi_{F_n} + \sum_{k=0}^{2^{2n}-1} k 2^{-n} \chi_{E_{n,k}}. \qquad (5.3)$$
Note that $0 \le \varphi_n \le f$ and $0 \le f - \varphi_n \le 2^{-n}$ on $\bigcup_{k=0}^{2^{2n}-1} E_{n,k}$. For any $x \in E$, there exists n large enough such that $f(x) < 2^n$. Hence $x \in \bigcup_k E_{n,k}$ implies that $f(x) - \varphi_n(x) \le 2^{-n}$ and thus $\varphi_n(x) \to f(x)$. Moreover, if f is bounded, there exists an n large enough such that $E = \bigcup_{k=0}^{2^{2n}-1} E_{n,k}$ and $f(x) - \varphi_n(x) < \frac{1}{n}$ for each $x \in E$, thus $\langle \varphi_n \rangle$ converges uniformly to f.
Exercise 5.1.4 Show that $\varphi_n$ increases in (5.3). Hint: $E_{n,k} = E_{n+1,2k} \cup E_{n+1,2k+1}$.
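The dyadic construction in (5.3) depends on x only through the value $y = f(x)$, so it can be sketched as a function of y. This is a minimal illustration, not part of the original proof; the name `phi_n` is hypothetical and $f \ge 0$ is assumed.

```python
import math


def phi_n(y: float, n: int) -> float:
    """Value of the n-th dyadic simple approximation at a point where f equals y >= 0."""
    if y >= 2 ** n:
        return float(2 ** n)       # the point lies in F_n
    k = math.floor(y * 2 ** n)     # the point lies in E_{n,k}
    return k / 2 ** n
```

For a fixed y the values `phi_n(y, 1), phi_n(y, 2), ...` are nondecreasing, never exceed y, and are within $2^{-n}$ of y as soon as $y < 2^n$, which mirrors the claims in the proof sketch and in Exercise 5.1.4.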
The next definition will be useful in Chapter FS.
Definition 368 Let f be an L-measurable function. Then $\inf\{\alpha \in \mathbb{R} : f \le \alpha \text{ a.e.}\}$ is called the essential supremum of f, denoted ess sup f, and $\sup\{\alpha \in \mathbb{R} : f \ge \alpha \text{ a.e.}\}$ is called the essential infimum of f, denoted ess inf f.
Example 369 Let $f : [0,1] \to \{-1, 0, 1\}$ be given by
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}_{++} \\ 0 & \text{if } x \text{ is irrational} \\ -1 & \text{if } x \in \mathbb{Q}_{-} \end{cases}$$
which is a simple generalization of the Dirichlet function. Given the results in Example 360 we have ess sup f = ess inf f = 0.
correspondences
Let $\Gamma : X \twoheadrightarrow Y$ be a correspondence where $X = \mathbb{R}$ or a subset of $\mathbb{R}$ equipped with the Lebesgue measure, $\mathcal{L}(X)$ is the σ-algebra of all L-measurable subsets of X, and Y is a complete, separable metric space. We will introduce the concept of measurability of correspondences the same way we defined measurability of single-valued functions (i.e. through inverse images). We know a function $f : X \to Y$ is L-measurable if $f^{-1}(V)$ is L-measurable for every open set $V \subset Y$ or equivalently $f^{-1}(U)$ is L-measurable for every closed set $U \subset Y$.
Definition 370 Consider a measurable space $(X, \mathcal{L})$ where $X \subset \mathbb{R}$ (or $X = \mathbb{R}$), Y is a complete separable metric space, and $\Gamma : X \twoheadrightarrow Y$ is a closed-valued correspondence. Γ is measurable if the inverse image of each open set is an L-measurable set. That is, for every open subset $V \subset Y$ we have
$$\Gamma^{-1}(V) = \{x \in X : \Gamma(x) \cap V \neq \emptyset\} \in \mathcal{L}.$$
Notice that measurability is defined only for closed-valued correspondences.
Given a correspondence $\Gamma : X \twoheadrightarrow Y$ we can ask under what conditions there exists a measurable selection of Γ (i.e. a single-valued, L-measurable function $f : X \to Y$ such that $f(x) \in \Gamma(x)$ for all $x \in X$). The following theorem says that every L-measurable correspondence has a measurable selection provided the spaces X and Y have certain properties.
Theorem 371 (Measurable Selection) Let $(X, \mathcal{L})$ be a Lebesgue measurable space, let Y be a complete separable metric space, and let $\Gamma : X \twoheadrightarrow Y$ be an L-measurable, closed-valued correspondence. Then there exists a measurable selection of Γ.
Proof. (Sketch) By induction, we will define a sequence of measurable functions $f_n : X \to Y$ such that
(i) $f_n(z)$ is sufficiently close to $\Gamma(z)$ (i.e. $d(f_n(z), \Gamma(z)) < \frac{1}{2^n}$) and
(ii) $f_n(z)$ and $f_{n+1}(z)$ are sufficiently close to each other (i.e. $d(f_{n+1}(z), f_n(z)) \le \frac{1}{2^{n-1}}$ on X for all n).
Then we are done, since from (ii) it follows that $\langle f_n(z) \rangle$ is Cauchy for each z and, due to completeness of Y, there exists a function $f : X \to Y$ such that $f_n(z) \to f(z)$ on X pointwise, and by Corollary 358 the pointwise limit f of a sequence of measurable functions is measurable. Hence we take f
as a measurable selection. Condition (i) guarantees that $f(z) \in \Gamma(z)$, $\forall z \in X$ (here we use the fact that $\Gamma(z)$ is closed and $d(f(z), \Gamma(z)) = 0$ implies $f(z) \in \Gamma(z)$ by Exercise 4.1.3).
Now we construct a sequence $\langle f_n \rangle$ of measurable functions satisfying (i) and (ii). Let $\{y_n, n \in \mathbb{N}\}$ be a dense set in Y (since Y is separable such a countable set exists). Define $f_k(z) = y_p$ where p is the smallest integer such that the ball with center at $y_p$ with radius $\frac{1}{k}$ has non-empty intersection with $\Gamma(z)$. See Figure 5.1.4.5. It can be shown that $f_k$ is measurable and $\langle f_k \rangle$ satisfy (i) and (ii).
How is measurability of a correspondence related to upper or lower hemicontinuity? We would expect that hemicontinuity implies measurability and we now show that this is true (a result similar to that for functions in Theorem 356). In the case of lower hemicontinuity we get the result immediately.
Lemma 372 Under the assumptions of Theorem 371, if $\Gamma : X \twoheadrightarrow Y$ is lhc, then Γ is measurable.
Proof. Since Γ is lhc, $\Gamma^{-1}(V)$ is open for $V \subset Y$ open. Since open sets are L-measurable, $\Gamma^{-1}(V) \in \mathcal{L}$, so that Γ is L-measurable.
To show that uhc implies measurability, we show that open sets can be replaced by closed sets in Definition 370.
Lemma 373 Under the assumptions of Theorem 371, $\Gamma : X \twoheadrightarrow Y$ is measurable iff $\Gamma^{-1}(U)$ is L-measurable for every closed subset $U \subset Y$.
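Proof. ($\Leftarrow$) Let V be an open subset of Y. Define the closed sets $C_n = \{x \in Y : d(x, Y \setminus V) \ge \frac{1}{n}\}$. Then $V = \bigcup_{n \in \mathbb{N}} C_n$. Consequently $\Gamma(x) \cap V \neq \emptyset$ iff $\Gamma(x) \cap C_n \neq \emptyset$ for some n. This yields $\Gamma^{-1}(V) = \bigcup_{n \in \mathbb{N}} \Gamma^{-1}(C_n) \in \mathcal{L}$ because $\Gamma^{-1}(C_n) \in \mathcal{L}$ by assumption (because $C_n$ is closed) and $\bigcup_{n \in \mathbb{N}} \Gamma^{-1}(C_n) \in \mathcal{L}$ (because $\mathcal{L}$ is a σ-algebra).
($\Rightarrow$) We omit this direction since it would require introducing measurability on the Cartesian product $X \times Y$. [See Aubin-Frankowska Section 8.3 pg. 319].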
Lemma 374 Under the assumptions of Theorem 371, if $\Gamma : X \twoheadrightarrow Y$ is uhc then Γ is measurable.
Proof. Since Γ is uhc, $\Gamma^{-1}(U)$ is closed for U closed, and closed sets are L-measurable. Hence $\Gamma^{-1}(U) \in \mathcal{L}$. Thus Γ is L-measurable by Lemma 373.
5.2 Lebesgue Integration
In introductory calculus classes, you were introduced to Riemann integration. While simple, it has many defects. First, the Riemann integral of a function is defined on a closed interval and cannot be defined on an arbitrary set. Second, a function is Riemann integrable if it is continuous or continuous almost everywhere. The set of continuous functions, however, is relatively small. Third, given a sequence of Riemann integrable functions converging to some function, the limit of the sequence of integrals may not be the Riemann integral of the limit function. In fact the Riemann integral of the limit function may not even exist. These defects are absent in Lebesgue integration. To see these problems we begin by briefly reviewing the Riemann integral.
5.2.1 Riemann integrals
Consider a bounded function $f : [a,b] \to \mathbb{R}$ and a partition $P = \{a = x_0 < x_1 < \dots < x_{n-1} < x_n = b\}$ of $[a,b]$. Let $\Upsilon$ be the set of all possible partitions. For each P, define the sums
$$S(P) = \sum_{i=1}^{n} (x_i - x_{i-1}) H_i \quad \text{and} \quad s(P) = \sum_{i=1}^{n} (x_i - x_{i-1}) h_i$$
where $H_i = \sup\{f(x) : x \in (x_{i-1}, x_i]\}$ and $h_i = \inf\{f(x) : x \in (x_{i-1}, x_i]\}$, $\forall i = 1, \dots, n$. The sums S(P) and s(P) are the integrals of step functions lying above and below f. See Figure 5.2.1.1. Then the upper Riemann integral of f over $[a,b]$ is defined by $R_u\int_a^b f(x)\,dx = \inf_{P \in \Upsilon} S(P)$ and the lower Riemann integral of f over $[a,b]$ is defined by $R_l\int_a^b f(x)\,dx = \sup_{P \in \Upsilon} s(P)$. If $R_u\int_a^b f(x)\,dx = R_l\int_a^b f(x)\,dx$, then we say the Riemann integral exists and denote it $R\int_a^b f(x)\,dx$.
We state without proof (since it would take us far afield) the following Proposition which characterizes the "class" of Riemann integrable functions.$^{8}$
Proposition 375 A bounded function is Riemann integrable iff it is continuous almost everywhere.
We next provide explicit examples of functions that are and are not Rie-
mann integrable.
8 See Jain and Gupta (1986), Appendix 1.
Example 376 Consider the Riemann integral of Dirichlet's function introduced in Example 360. Then $R_u\int_0^1 f(x)\,dx = 1$ and $R_l\int_0^1 f(x)\,dx = 0$, so the Riemann integral does not exist. Intuitively, this is because every subinterval of any partition P, however fine, contains both rational and irrational numbers, which follows from the density of both sets established in Example 154. Formally, to see that the Dirichlet function (while bounded) is not continuous anywhere (and hence does not satisfy the requirements of Proposition 375), consider the following argument. If $q \in \mathbb{Q} \cap [0,1]$, let $\langle x_n \rangle$ be a sequence of irrational numbers converging to q (the existence of such a sequence follows from Theorem 102). Since $f(x_n) = 0$, $\forall n \in \mathbb{N}$, the sequence $\langle f(x_n) \rangle$ does not converge to $f(q) = 1$, so f is not continuous at $q \in \mathbb{Q}$. Similarly, if $\iota$ is an irrational number, let $\langle y_n \rangle$ be a sequence of rational numbers converging to $\iota$ (the existence of such a sequence again follows from Theorem 102). Since $f(y_n) = 1$, $\forall n \in \mathbb{N}$, the sequence $\langle f(y_n) \rangle$ does not converge to $f(\iota) = 0$, so f is not continuous at $\iota \in \mathbb{R} \setminus \mathbb{Q}$. See Figure 5.2.1.2.
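Example 377 Next consider the Riemann integral of $f : [0,1] \to \{0,1\}$ given by
$$f(x) = \begin{cases} 1 & \text{if } x = \frac{1}{n} \\ 0 & \text{otherwise.} \end{cases}$$
Hence this function takes on the value 1 on the rationals $\{\frac{1}{n}, n \in \mathbb{N}\}$ rather than the bigger set $\mathbb{Q} = \{\frac{m}{n}, m, n \in \mathbb{N}\}$. We begin by noting that one can show that $\{\frac{1}{n}, n \in \mathbb{N}\}$ is not dense in $[0,1]$. As in the preceding example, it is simple to show that f is discontinuous at $\{\frac{1}{n}, n \in \mathbb{N}\}$. On the other hand, f is continuous on $D \setminus \{0\}$, where $D = [0,1] \setminus \{\frac{1}{n}, n \in \mathbb{N}\}$. To see this, let $x \in D \setminus \{0\}$. Then $\exists n \in \mathbb{N}$ such that $x \in (\frac{1}{n+1}, \frac{1}{n})$. Let $\delta = \frac{1}{2}\min\{|x - \frac{1}{n}|, |x - \frac{1}{n+1}|\}$. Then $\forall x' \in (x - \delta, x + \delta)$, we have $f(x') = 0$. Thus, $\forall \varepsilon > 0$, $\exists \delta$ such that $\forall x' \in (x - \delta, x + \delta)$, we have $|f(x) - f(x')| = |0 - 0| = 0 < \varepsilon$. (At $x = 0$ the function is discontinuous, since $f(\frac{1}{n}) = 1$ for all n while $f(0) = 0$, but this adds only a single point.) Since f is discontinuous at a countable set of points, we know the Riemann integral exists by Proposition 375 and is given by $R\int_0^1 f(x)\,dx = 0$. See Figure 5.2.1.3.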
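The value 0 of this integral can also be seen directly from the upper sums. The sketch below is an illustration added under stated assumptions: it uses the uniform partition of $[0,1]$ into m cells and the hypothetical name `upper_sum`. A cell $(\frac{i-1}{m}, \frac{i}{m}]$ contributes $\frac{1}{m}$ to S(P) exactly when it contains some point $\frac{1}{n}$; the lower sum is 0 for every partition.

```python
def upper_sum(m: int) -> float:
    """Upper sum S(P) of the function in Example 377 on the uniform partition with m cells."""
    cells_hit = 0
    for i in range(1, m + 1):
        n = -(-m // i)           # smallest n with 1/n <= i/m (integer ceiling of m/i)
        if n * (i - 1) < m:      # the same n also satisfies 1/n > (i-1)/m
            cells_hit += 1
    return cells_hit / m
```

Only on the order of $\sqrt{m}$ cells are hit, so `upper_sum(m)` shrinks toward 0 as m grows, in line with $R\int_0^1 f(x)\,dx = 0$.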
Example 378 Finally, let $\{q_i\}$ be the enumeration of all the rational numbers in $[0,1]$ and let $Q_n = \{q_i \in \mathbb{Q} \cap [0,1] : i = 1, 2, \dots, n\}$, $n \in \mathbb{N}$. Define, for each $n \in \mathbb{N}$, the function $f_n : [0,1] \to \{0,1\}$ by
$$f_n(x) = \begin{cases} 1 & \text{if } x \in Q_n \\ 0 & \text{otherwise.} \end{cases}$$
The function $f_n$ is discontinuous only at the n points of $Q_n$ in $[0,1]$. Since $f_n$ is continuous a.e. and is bounded, the Riemann integral exists and $R\int_0^1 f_n(x)\,dx = 0$. Notice however that while $f_n \to f$ (the Dirichlet function of Example 376), $R\int_0^1 f_n(x)\,dx$ does not converge to $R\int_0^1 f(x)\,dx$ since the latter doesn't even exist!
5.2.2 Lebesgue integrals
Now that we've exposed some of the problems with the Riemann integral, we take up a systematic treatment of the Lebesgue integral. As Proposition 375 suggests, the class of Riemann integrable functions is somewhat narrow. On the other hand, the class of Lebesgue integrable functions is (relatively) larger. This is because the Lebesgue integral replaces the class of step functions (used in the construction of the Riemann integral) with the larger class of simple functions that were defined in 354. The essential difference between step functions and simple functions is the class of sets upon which they are defined. In particular the collection of subsets upon which step functions are defined is a strict subset of the collection of subsets upon which simple functions are defined. We will construct Lebesgue integrals under three separate assumptions concerning boundedness of the function over which we are integrating (f) and the finiteness of the measure (m) of the sets (E) upon which the function (f) is defined.
Assumption 1: f is bounded and m(E) < ∞
Consider the representation $\varphi : E \to \mathbb{R}$ defined in Example 355 given by $\varphi(x) = \sum_{i=1}^{k} a_i \chi_{A_i}(x)$ where $A_i \subset E \in \mathcal{L}$ are disjoint and $a_i \in \mathbb{R} \setminus \{0\}$ are distinct. Then we define the elementary integral of this simple function to be $\int \varphi(x)\,dx = \sum_{i=1}^{k} a_i m(A_i)$. This integral is well defined since $m(A_i) < \infty$ $\forall i$ and there are finitely many terms in the sum. In this case, we call the function $\varphi$ an integrable simple function. To economize on notation, let $\int_E \varphi \equiv \int \varphi(x)\,dx$.
Sometimes it is useful to employ representations that are not canonical
and the following lemma asserts that the elementary integral is independent
of its representation.
Lemma 379 Let $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$, with $E_i \cap E_j = \emptyset$ for $i \neq j$. Suppose each $E_i$ is an L-measurable set of finite measure. Then $\int_E \varphi = \sum_{i=1}^{n} a_i m(E_i)$.
Proof. The set $A_a = \{x \in E : \varphi(x) = a\} = \bigcup_{a_i = a} E_i$. Hence $a\,mA_a = \sum_{a_i = a} a_i m(E_i)$ by additivity of m. Thus, $\int_E \varphi = \sum_a a\,m(A_a) = \sum_i a_i m(E_i)$.
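Lemma 379 can be checked on a small example. The sketch below is illustrative only: a simple function is represented, by assumption, as a list of pairs $(a_i, m(E_i))$ with the $E_i$ disjoint, and the names `elementary_integral` and `canonical` are hypothetical. Exact rational arithmetic avoids rounding questions.

```python
from collections import defaultdict
from fractions import Fraction


def elementary_integral(terms):
    """Sum of a_i * m(E_i) over a list of (a_i, m(E_i)) pairs with disjoint E_i."""
    return sum(a * m for a, m in terms)


def canonical(terms):
    """Merge pieces sharing a value a, so that A_a is the union of the E_i with a_i = a."""
    merged = defaultdict(Fraction)
    for a, m in terms:
        merged[a] += m
    # The canonical representation uses distinct nonzero values only.
    return [(a, m) for a, m in merged.items() if a != 0]
```

Applying `elementary_integral` to a representation and to its canonical form gives the same number, which is the content of the lemma.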
Next we establish two basic properties of the elementary integral.
Theorem 380 Let $\varphi$ and $\psi$ be simple functions which vanish outside a set of finite measure.$^{9}$ Then (i) integration preserves linearity: $\int_E (a\varphi + b\psi) = a\int_E \varphi + b\int_E \psi$ and (ii) integration preserves monotonicity: if $\varphi \ge \psi$ a.e., then $\int_E \varphi \ge \int_E \psi$.
Exercise 5.2.1 Prove Theorem 380.
Let $f : E \to \mathbb{R}$ be any bounded function and E an L-measurable set with $mE < \infty$. In analogy with the Riemann integral, we define the upper Lebesgue integral of f over E by $L_u\int_E f(x)\,dx = \inf_{\psi \ge f} \int_E \psi$ and the lower Lebesgue integral of f over E is defined by $L_l\int_E f(x)\,dx = \sup_{\varphi \le f} \int_E \varphi$ where $\psi$ and $\varphi$ range over the set of all simple functions defined on E. Notice that $L_u\int_E f(x)\,dx$ and $L_l\int_E f(x)\,dx$ are well defined since f is bounded and E has finite measure. See Figure 5.2.2.1.
Definition 381 If $L_u\int_E f(x)\,dx = L_l\int_E f(x)\,dx$, then we say the Lebesgue integral exists and denote it $\int_E f(x)\,dx$.
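Notice that if f is a simple function, then f itself is admissible in both the infimum and the supremum, so $\inf_{\psi \ge f} \int_E \psi = \int_E f$ and $\sup_{\varphi \le f} \int_E \varphi = \int_E f$, and hence $L_u\int_E f(x)\,dx = L_l\int_E f(x)\,dx$. Hence simple functions are Lebesgue integrable.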
The next question is what other functions are Lebesgue integrable? The next theorem provides necessary and sufficient conditions for integrability. In particular, sufficiency shows that if one can establish that the function is L-measurable (as well as the conditions under which this section is based), then we know it is integrable. This is another theorem like Heine-Borel where sufficiency makes one's life simple.
Theorem 382 A bounded function f defined on an L-measurable set E of finite measure is Lebesgue integrable iff f is L-measurable.
9 We say a function f vanishes outside a set of measure zero if $m(\{x \in E : f(x) \neq 0\}) = 0$ or outside a set of finite measure if $m(\{x \in E : f(x) \neq 0\}) < \infty$.
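Proof. (Sketch) ($\Leftarrow$) Since $f : E \to \mathbb{R}$ is bounded, $-M \le f(x) \le M$. Divide $[-M, M]$ into slices of width $\frac{M}{n}$. Construct sets
$$E_k = f^{-1}\left(\left[\tfrac{k-1}{n}M, \tfrac{k}{n}M\right)\right)$$
(i.e. $E_k$ is the set of all $x \in E$ such that $f(x)$ belongs to the slice $\left[\frac{k-1}{n}M, \frac{k}{n}M\right)$). See Figure 5.2.2.2. $E_k$ is measurable because it is the inverse image of an interval under a measurable function. Define two simple functions $\psi_n(x) = \frac{M}{n}\sum_{k=-n}^{n} k\chi_{E_k}(x)$ and $\varphi_n(x) = \frac{M}{n}\sum_{k=-n}^{n} (k-1)\chi_{E_k}(x)$. Then $\varphi_n$ approximates f from below and $\psi_n$ approximates f from above. Because $\varphi_n$ and $\psi_n$ are simple functions, $\int_E \varphi_n$ and $\int_E \psi_n$ are well defined. The upper (lower) Lebesgue integrals of f on E, being the infimum (supremum), satisfy
$$\int_E \varphi_n \le L_l\int_E f \le L_u\int_E f \le \int_E \psi_n.$$
As n gets larger, $\varphi_n$ and $\psi_n$ get closer to each other and hence so do their integrals: since $\psi_n - \varphi_n = \frac{M}{n}$ on each $E_k$, we have $\int_E \psi_n - \int_E \varphi_n \le \frac{M}{n}mE$. Thus for $n \to \infty$, $L_l\int_E f = L_u\int_E f$, which means $\int_E f$ exists.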
($\Rightarrow$) Let f be integrable. Then
$$\inf_{\psi \ge f} \int \psi(x)\,dx = \sup_{\varphi \le f} \int \varphi(x)\,dx$$
where $\varphi$ and $\psi$ are simple functions. Then by the property of infimum and supremum, for any n, there are simple functions $\varphi_n$ and $\psi_n$ such that $\varphi_n(x) \le f(x) \le \psi_n(x)$ and
$$\int \psi_n(x)\,dx - \int \varphi_n(x)\,dx < \frac{1}{n}. \qquad (5.4)$$
Define $\psi^* = \inf_n \psi_n$ and $\varphi^* = \sup_n \varphi_n$, which are measurable by Theorem 357 and satisfy $\varphi^*(x) \le f(x) \le \psi^*(x)$. But the set of x for which $\varphi^*(x)$ differs from $\psi^*(x)$ (i.e. $\Delta = \{x \in E : \varphi^*(x) < \psi^*(x)\}$) has measure zero due to (5.4). Thus $\varphi^* = \psi^*$ except on a set of measure zero, so $f = \varphi^*$ a.e. Thus f is measurable by Theorem 361.
Notice that the assumptions on boundedness and finite measure imply $M\,mE < \infty$, upon which the proof rests.
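The slicing of the range used in the sufficiency argument can be carried out explicitly for a concrete function. The sketch below assumes $f(x) = x^2$ on $E = [0,1]$ with $M = 1$, so that each $E_k$ with $k \ge 1$ is an interval whose length is known in closed form and the slices with $k \le 0$ are empty; the name `lebesgue_sums` is hypothetical.

```python
import math


def lebesgue_sums(n: int):
    """Integrals of phi_n and psi_n for f(x) = x**2 on [0, 1], slicing the range [0, 1)."""
    lower = upper = 0.0
    for k in range(1, n + 1):
        # E_k = {x : (k-1)/n <= x**2 < k/n} has length sqrt(k/n) - sqrt((k-1)/n).
        m_Ek = math.sqrt(k / n) - math.sqrt((k - 1) / n)
        lower += (k - 1) / n * m_Ek   # contribution to the integral of phi_n
        upper += k / n * m_Ek         # contribution to the integral of psi_n
    return lower, upper
```

The two values bracket $\frac{1}{3}$ and differ by $\frac{M}{n}mE = \frac{1}{n}$, which is the bound that forces the upper and lower Lebesgue integrals to agree.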
Next we establish that the Lebesgue integral is a generalization of the
Riemann integral.
Theorem 383 Let f be a bounded function on $[a,b]$. If f is Riemann integrable over $[a,b]$, then it is Lebesgue integrable and $R\int_a^b f(x)\,dx = \int_{[a,b]} f(x)\,dx$.
Proof. The proof rests on the fact that every step function (upon which Riemann integrals are defined) is also a simple function (upon which Lebesgue integrals are defined), while the converse is not true. Then
$$R_l\int_a^b f(x)\,dx \le \sup_{\varphi \le f} \int_{[a,b]} \varphi(x)\,dx \le \inf_{\psi \ge f} \int_{[a,b]} \psi(x)\,dx \le R_u\int_a^b f(x)\,dx$$
where the first and third inequalities follow from the above fact and the second follows from the fact that $\varphi \le f \le \psi$ and (ii) of Theorem 380.
Of course, the converse is not true.
Example 384 In Example 376 we showed that the Dirichlet function was not Riemann integrable. However, it is Lebesgue integrable by Theorem 382 since it is L-measurable (which is clear since it is a simple function). Hence $\int_{[0,1]} f(x)\,dx = 1 \cdot m(\mathbb{Q} \cap [0,1]) + 0 \cdot m([0,1] \setminus \mathbb{Q}) = 1 \cdot 0 + 0 \cdot 1 = 0$.
Now we establish the following properties of Lebesgue integrals which follow as a consequence of the fact that Lebesgue integrals are defined on simple functions and elementary integrals preserve linearity and monotonicity by Theorem 380.
Theorem 385 If f and g are bounded L-measurable functions defined on a set E of finite measure, then: (i) $\int_E (af + bg) = a\int_E f + b\int_E g$; (ii) if $f = g$ a.e., then $\int_E f = \int_E g$; (iii) if $f \le g$ a.e., then $\int_E f \le \int_E g$ and hence $\left|\int_E f\right| \le \int_E |f|$; (iv) if $c \le f(x) \le d$, then $c\,m(E) \le \int_E f \le d\,m(E)$; and (v) if A and B are disjoint L-measurable sets of finite measure, then $\int_{A \cup B} f = \int_A f + \int_B f$.
Exercise 5.2.2 Prove Theorem 385.
We now prove a very important result concerning the interchange of limit
and integral operations of a convergent sequence of bounded L-measurable
functions.
Theorem 386 (Bounded Convergence) Let $\langle f_n \rangle$ be a sequence of L-measurable functions defined on a set E of finite measure and suppose $|f_n(x)| \le M$, $\forall n \in \mathbb{N}$ and $\forall x \in E$. If $f_n \to f$ a.e. on E, then f is integrable and $\int_E f = \lim_{n\to\infty} \int_E f_n$.
Proof. (Sketch) Since $f_n \to f$, f is measurable by Theorem 364. Since $\langle f_n \rangle$ is uniformly bounded, f is bounded. Given $\varepsilon$, it is possible to split E (by Theorem 366) into two parts: $E \setminus A$, where $f_n \to f$ uniformly, and A with $m(A) < \varepsilon$. Then,
$$\lim_{n\to\infty} \int_E f_n = \int_E \lim_{n\to\infty} f_n = \int_E f$$
if
$$\left|\int_E f_n - \int_E f\right| = \left|\int_E (f_n - f)\right|$$
is sufficiently small. Split this integral into two parts:
$$\left|\int_E (f_n - f)\right| \le \int_E |f_n - f| \le \int_{E \setminus A} |f_n - f| + \int_A |f_n - f|.$$
The first integral is sufficiently small because $f_n \to f$ uniformly and the second is sufficiently small because $|f_n - f|$ is bounded and $m(A)$ is sufficiently small.
It is important to note that $\int_E f = \lim_{n\to\infty} \int_E f_n$ only requires pointwise convergence with Lebesgue integration. A similar result for Riemann integration (i.e. $R\int_E f = \lim_{n\to\infty} R\int_E f_n$) requires uniform convergence.
Example 387 Here we return to Example 378. There we saw that the bounded function $f_n$ was discontinuous at the n points of the L-measurable set $Q_n = \{q_i \in \mathbb{Q} \cap [0,1] : i = 1, 2, \dots, n\}$, $n \in \mathbb{N}$. While $R\int_0^1 f_n(x)\,dx = 0$ along the sequence and while $f_n \to f$, we saw that $R\int_0^1 \lim_{n\to\infty} f_n(x)\,dx$ did not exist. On the other hand, since the bounded function $f_n$ is L-measurable and $m[0,1] < \infty$, we know $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx$ exists and equals 0 by Example 384.
Assumption 2: f is nonnegative and m(E) ≤∞
In many instances, economists consider functions which are unbounded (e.g. most utility functions we write down are of this variety). Hence, it would be nice to relax the above assumption about boundedness. This section does that, albeit at the cost that f must be nonnegative. Here we also do not require E to be of finite measure.
Definition 388 If $f : E \to \mathbb{R}_+$ on an L-measurable set E is L-measurable, we define $\int_E f = \sup_{h \le f} \int_E h$ where h is a bounded, L-measurable function which vanishes outside a set of finite measure.
Notice that the integral is defined using any function h (not just simple functions) which satisfies the conditions of the previous subsection, and $\sup_{h \le f} \int_E h$ is similar to the definition of the lower Lebesgue integral in the previous subsection. That is, h is bounded and $mH = m(\{x \in E : h(x) \neq 0\}) < \infty$. Then $\int_E h$ and $\sup_{h \le f} \int_E h$ are well defined. Furthermore $\int_E h = \int_H h + \int_{E \setminus H} h = \int_H h$.
Definition 389 A nonnegative L-measurable function f defined on an L-measurable set E is integrable (or summable) if $\int_E f < \infty$. If $\int_E f = \infty$, we say f is not integrable even though it has a Lebesgue integral.
Example 390 Let the function $f : [0,1] \to \mathbb{R}_+$ be given by
$$f(x) = \begin{cases} \frac{1}{x} & \text{if } x \in (0,1] \\ 0 & \text{if } x = 0. \end{cases}$$
While f is unbounded, consider the sequence of functions $h_n : [0,1] \to \mathbb{R}$ given by
$$h_n(x) = \begin{cases} \frac{1}{x} & \text{if } x \in (\frac{1}{n}, 1] \\ n & \text{if } x \in [0, \frac{1}{n}]. \end{cases}$$
In this case $h_n \le f$ (except at $x = 0$, where $h_n(0) = n$ and $f(0) = 0$, but this is a set of measure 0) and $h_n$ is a bounded, L-measurable function which vanishes outside a set of finite measure. See Figure 5.2.2.3. Then
$$\int_{[0,1]} h_n(x)\,dx = \int_{[0,\frac{1}{n}]} n\,dx + \int_{(\frac{1}{n},1]} \frac{1}{x}\,dx = n \cdot \left(\tfrac{1}{n} - 0\right) + \ln(1) - \ln\left(\tfrac{1}{n}\right) = 1 + \ln(n).$$
Since $\{h_n\}$ is contained in the set of all bounded h such that $h \le f$, as we take the sup over all such functions, we know $1 + \ln(n) \to \infty$ as $n \to \infty$, so that f is not integrable on $[0,1]$.
Exercise 5.2.3 Let the function $f : [1, \infty) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$. Is f bounded? Is $m[1, \infty)$ finite? Is f integrable?
Lemma 391 (Chebyshev's Inequality) Let $\varphi$ be an integrable function on A and $\varphi(x) \ge 0$ a.e. on A. Let $c > 0$. Then $c \cdot m\{x \in A : \varphi(x) \ge c\} \le \int_A \varphi(x)\,dm$.
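Proof. Let $\hat{A} = \{x \in A : \varphi(x) \ge c\}$. Then $\int_A \varphi(x)\,dm = \int_{\hat{A}} \varphi(x)\,dm + \int_{A \setminus \hat{A}} \varphi(x)\,dm \ge \int_{\hat{A}} \varphi(x)\,dm \ge c\,m(\hat{A})$. See Figure 5.2.2.4.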
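A quick numerical check of Chebyshev's inequality may help fix ideas. The sketch below assumes the specific choice $\varphi(x) = x^2$ on $A = [0,1]$, for which $\{x : \varphi(x) \ge c\} = [\sqrt{c}, 1]$ and $\int_A \varphi\,dm = \frac{1}{3}$; the name `chebyshev_sides` is hypothetical.

```python
import math


def chebyshev_sides(c: float):
    """Return (c * m{x in [0,1] : x**2 >= c}, integral of x**2 over [0,1])."""
    measure = max(0.0, 1.0 - math.sqrt(c))   # length of [sqrt(c), 1], empty if c > 1
    return c * measure, 1.0 / 3.0
```

For every $c > 0$ the first component stays below the second; the left side is largest at $c = \frac{4}{9}$, where it equals $\frac{4}{27}$.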
As in the previous subsection, there are various linearity and monotonicity
properties associated with Lebesgue integrals of non-negative L-measurable
functions.
Theorem 392 Let f and g be nonnegative L-measurable functions defined on a set E. Then (i) $\int_E cf = c\int_E f$, $c > 0$; (ii) $\int_E (f + g) = \int_E f + \int_E g$; and (iii) if $g \ge f$ a.e., then $\int_E g \ge \int_E f$.
Proof. (ii) Let h and k be bounded, L-measurable functions such that $h \le f$, $k \le g$ and which vanish outside sets of finite measure. Then $h + k \le f + g$ so that $\int_E h + \int_E k = \int_E (h + k) \le \int_E (f + g)$. Then $\sup_{h \le f} \int_E h + \sup_{k \le g} \int_E k \le \int_E (f + g)$, so by Definition 388 we have $\int_E f + \int_E g \le \int_E (f + g)$.
To establish the reverse inequality, let l be a bounded L-measurable function which vanishes outside a set of finite measure and is such that $l \le f + g$. Define h and k by setting $h(x) = \min(f(x), l(x))$ and $k(x) = l(x) - h(x)$. Then $h \le f$ (by construction) and $k \le g$ also follows from $l = h + k \le f + g$. Furthermore, h and k are bounded by the bound for l and vanish where l vanishes. Then $\int_E l = \int_E h + \int_E k \le \int_E f + \int_E g$. But this implies $\sup_{l \le f+g} \int_E l \le \int_E f + \int_E g$, or $\int_E (f + g) \le \int_E f + \int_E g$.
Exercise 5.2.4 Let f be a nonnegative L-measurable function. Show that $f = 0$ a.e. on E iff $\int_E f = 0$.
As in the previous subsection, we now prove some important results concerning the interchange of limit and integral operations. The bounded convergence theorem has one restrictive assumption. It is that the sequence $\langle f_n \rangle$ is uniformly bounded. In the following lemma this assumption is dropped. Instead we assume nonnegativity of $\langle f_n \rangle$ and the result is stated in terms of inequality rather than equality.
Theorem 393 (Fatou's Lemma) Let $\langle f_n \rangle$ be a sequence of nonnegative L-measurable functions and $f_n(x) \to f(x)$ a.e. on E. Then $\int_E f \le \liminf_{n\to\infty} \int_E f_n$.
Proof. (Sketch) Let $\langle f_n \rangle \to f$ pointwise on E. The idea of the proof is to use the Bounded Convergence Theorem 386. To do so, we need a uniformly bounded sequence of functions. Hence, let h be a bounded function such that $h(x) \le f(x)$, obtaining non-zero values only on a subset of E with finite measure. Define a new sequence $\langle h_n \rangle$ by $h_n(x) = \min(f_n(x), h(x))$. Then $\langle h_n \rangle$ is uniformly bounded, $h_n(x) \le f_n(x)$ and $h_n \to h$ pointwise. Thus by the bounded convergence theorem
$$\int_E h = \lim_{n\to\infty} \int_E h_n \le \liminf_{n\to\infty} \int_E f_n \qquad (5.5)$$
where we use the lim inf since the limit of $\langle \int_E f_n \rangle$ may not exist. Since (5.5) holds for any h with the given properties, it also holds for the supremum
$$\sup_{h \le f} \int_E h \le \liminf_{n\to\infty} \int_E f_n. \qquad (5.6)$$
But the left hand side of (5.6) is by definition $\int_E f$.
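The next example shows that the strict inequality may be obtained.
Example 394 Let the functions $f_n : [0,1] \to \mathbb{R}_+$ be given by
$$f_n(x) = \begin{cases} n & \text{if } \frac{1}{n} \le x \le \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
See Figure 5.2.2.5. In this case $\lim_{n\to\infty} f_n(x) = 0$ a.e. and $\liminf_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = \sup_{n} \left[\inf_{k \ge n} \int_{[0,1]} f_k(x)\,dx\right] = \sup\{\langle 1 \rangle\} = 1$,$^{10}$ so that $\int_{[0,1]} f = 0 < 1$.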
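To see that nonnegativity matters for Fatou's lemma, consider the following example.
Example 395 Instead, let the functions $f_n : [0,1] \to \mathbb{R}$ be given by
$$f_n(x) = \begin{cases} -n & \text{if } \frac{1}{n} \le x \le \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
Again $\lim_{n\to\infty} f_n(x) = 0$ a.e. and $\liminf_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = \sup_{n} \left[\inf_{k \ge n} \int_{[0,1]} f_k(x)\,dx\right] = \sup\{\langle -1 \rangle\} = -1$, since $\int_{[0,1]} f_k(x)\,dx = -1$ for every $k \ge 2$. Hence without nonnegativity we may have $\int_E f > \liminf_{n\to\infty} \int_E f_n$.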
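The strict inequality in Example 394 can be verified with exact arithmetic. This is a small sketch, with hypothetical function names, of the two facts used there: each integral equals 1 for $n \ge 2$, while at any fixed point the values $f_n(x)$ are eventually 0.

```python
from fractions import Fraction


def f_n(n: int, x: Fraction) -> int:
    """f_n(x) = n on [1/n, 2/n] and 0 otherwise, as in Example 394."""
    return n if Fraction(1, n) <= x <= Fraction(2, n) else 0


def integral_f_n(n: int) -> Fraction:
    """Integral of f_n over [0, 1]: height n times the length of [1/n, 2/n] inside [0, 1]."""
    right = min(Fraction(2, n), Fraction(1))
    return n * (right - Fraction(1, n))
```

At $x = \frac{1}{7}$, for instance, $f_n(x)$ is nonzero only for $7 \le n \le 14$, so the pointwise limit is 0 even though the integrals stay at 1. Flipping the sign of $f_n$ gives the integrals $-1$ of Example 395.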
The conclusion of Theorem 393 is weak. It is possible to strengthen it by
imposing more structure on the sequence of functions.
Theorem 396 (Monotone Convergence) Let $\langle f_n \rangle$ be an increasing sequence of nonnegative L-measurable functions and $f_n(x) \to f(x)$ a.e. on E. Then $\int_E f = \lim_{n\to\infty} \int_E f_n$.
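Proof. Since $f_n \le f$, $\forall n$, we have $\int_E f_n \le \int_E f$ by (iii) of Theorem 392. This implies $\limsup_{n\to\infty} \int_E f_n \le \int_E f$. The result then follows from Theorem 393, which gives $\int_E f \le \liminf_{n\to\infty} \int_E f_n$.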
10 In this example, it was not necessary to actually take the lim inf.
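Example 397 Let $f : [0,1] \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} \frac{1}{\sqrt{x}} & \text{if } x \in (0,1] \\ 0 & \text{if } x = 0. \end{cases}$$
As in Example 390, while f is unbounded, consider a sequence of functions $h_n : [0,1] \to \mathbb{R}$ given by
$$h_n(x) = \begin{cases} f(x) & \text{if } f(x) \le n \\ n & \text{if } f(x) > n \end{cases}$$
or in other words
$$h_n(x) = \begin{cases} \frac{1}{\sqrt{x}} & \text{if } x \in [\frac{1}{n^2}, 1] \\ n & \text{if } x \in [0, \frac{1}{n^2}). \end{cases}$$
In this case $h_n \le f$ (except at $x = 0$ but this is a set of measure 0) and $h_n$ is a bounded, L-measurable function which vanishes outside a set of finite measure. Furthermore, $\langle h_n \rangle$ is monotone since $h_n(x) \le h_{n+1}(x)$, $\forall x \in [0,1]$. Then
$$\int_{[0,1]} h_n(x)\,dx = \int_{[0,\frac{1}{n^2}]} n\,dx + \int_{(\frac{1}{n^2},1]} \frac{1}{\sqrt{x}}\,dx = n \cdot \left(\tfrac{1}{n^2} - 0\right) + 2\left(1 - \tfrac{1}{n}\right) = 2 - \tfrac{1}{n}.$$
Then as $n \to \infty$, $\int_{[0,1]} h_n(x)\,dx \to 2$. By the Monotone Convergence Theorem, $\int_{[0,1]} f(x)\,dx = 2$.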
Example 397 is known as an improper integral when regarded as a Riemann integral since the integrand is unbounded.$^{11}$ On the other hand, it is perfectly proper when regarded as a Lebesgue integral. In this example, the two integrals are equal. Furthermore, we note that while Example 397 provides a case in which an unbounded nonnegative L-measurable function is integrable, Example 390 provides an instance of a closely related function which is not integrable.
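The contrast between Examples 390 and 397 can be reproduced numerically by integrating the truncations $h_n = \min(f, n)$. The sketch below uses a plain midpoint rule as a stand-in for the integral (an assumption made only for illustration; the closed forms $1 + \ln(n)$ and $2 - \frac{1}{n}$ are the ones derived in the examples), and the name `integral_of_truncation` is hypothetical.

```python
import math


def integral_of_truncation(f, n: int, cells: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral over [0, 1] of min(f(x), n)."""
    h = 1.0 / cells
    # Midpoints are never 0, so f may be singular at the origin.
    return sum(min(f((i + 0.5) * h), n) for i in range(cells)) * h
```

With `f = lambda x: 1 / math.sqrt(x)` the estimates approach 2 as n grows, whereas with `f = lambda x: 1 / x` they track $1 + \ln(n)$ and grow without bound.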
Assumption 3: f is any function and m(E) ≤ ∞ (general Lebesgue integral)
Definition 398 An L-measurable function f is integrable over E if $f^+$ and $f^-$ are both integrable over E. In this case, $\int_E f = \int_E f^+ - \int_E f^-$.
Theorem 399 A function f is integrable over E iff $|f|$ is integrable over E.
11 We also say that a Riemann integral is improper if its interval of integration is unbounded.
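Proof. ($\Rightarrow$) If f is integrable over E, then $f^+$ and $f^-$ are both integrable over E. Thus, $\int_E |f| = \int_E f^+ + \int_E f^-$ by Theorem 392. Hence $|f|$ is integrable.
($\Leftarrow$) If $\int_E |f| < \infty$, then so are $\int_E f^+$ and $\int_E f^-$, since $f^+ \le |f|$ and $f^- \le |f|$.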
Example 400 Consider a version of the Dirichlet function $f : [0,1] \to \{-1, 1\}$ given by
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [0,1] \\ -1 & \text{otherwise.} \end{cases}$$
Observe that $|f| = 1$ and hence $|f|$ is Riemann integrable while f is not.
Lemma 401 Let $A \in \mathcal{L}$, $m(A) = 0$, and f be an L-measurable function. Then $\int_A f = 0$.
Proof. We show it first for a simple function. Let $f = \sum_{i=1}^{n} \alpha_i \chi_{E_i}$ where $\{E_i\}$ is a collection of L-measurable sets that are disjoint. Then $f\chi_A = \sum_{i=1}^{n} \alpha_i \chi_{A \cap E_i}$, where $A \cap E_i$ are disjoint and $m(A \cap E_i) = 0$ (since $m(A \cap E_i) \le m(A) = 0$). Thus $\int f\chi_A = \sum_{i=1}^{n} \alpha_i m(A \cap E_i) = 0$.
If f is a non-negative measurable function, then by Theorem 367 there is a non-decreasing sequence $\langle f_n \rangle$ of simple functions that converges pointwise to f. Then by Theorem 396
$$\int_A f = \lim_{n\to\infty} \int_A f_n = \lim_{n\to\infty} \int f_n \chi_A = 0.$$
Finally, if f is an arbitrary measurable function, then $f\chi_A = f^+\chi_A - f^-\chi_A$ and $\int_A f = \int_A f^+ - \int_A f^- = 0 - 0 = 0$.
Lemma 402 Let f be an L-measurable function over E. If there is an integrable function g such that $|f| \le g$, then f is integrable over E.
Proof. From $f^+ \le g$, it follows that $\int_E f^+ \le \int_E g$, and so $f^+$ is integrable on E. Similarly, $f^- \le g$ implies integrability of $f^-$. Hence f is integrable over E.
Theorem 403 Let f and g be integrable functions defined on a set E. Then (i) the function cf, where c is finite, is integrable over E and $\int_E cf = c\int_E f$; (ii) the function $f + g$ is integrable over E and $\int_E (f + g) = \int_E f + \int_E g$; (iii) if $g = f$ a.e., then $\int_E g = \int_E f$; (iv) if $g \ge f$ a.e., then $\int_E g \ge \int_E f$; and (v) if $E_1$ and $E_2$ are disjoint L-measurable sets in E, then $\int_{E_1 \cup E_2} f = \int_{E_1} f + \int_{E_2} f$.
Exercise 5.2.5 Prove Theorem 403.
In considering when we could interchange limits and integrals, we saw ei-
ther we had to impose bounds on functions (Bounded Convergence Theorem
386) or consider monotone sequences of nonnegative functions (Monotone
Convergence Theorem 396). In the general case, we simply must bound the
sequence of functions by another (possibly unbounded) function.
Theorem 404 (Lebesgue Dominated Convergence Theorem) Let g be an integrable function on E and let $\langle f_n \rangle$ be a sequence of L-measurable functions such that $|f_n| \le g$ on E and $\lim_{n\to\infty} f_n = f$ a.e. on E. Then $\int_E f = \lim_{n\to\infty} \int_E f_n$.
Proof. (Sketch) By Lemma 402, f is integrable. We want to use Fatou's Lemma 393 which requires a sequence of non-negative functions, which is not assumed in this theorem. However we can define two sequences, namely $h_n = f_n + g$ and $k_n = g - f_n$, for which $\langle h_n \rangle \to f + g$ and $\langle k_n \rangle \to g - f$, where both are non-negative. Hence by Fatou's Lemma, we have $\int_E (f + g) \le \liminf_{n\to\infty} \int_E (f_n + g)$ and $\int_E (g - f) \le \liminf_{n\to\infty} \int_E (g - f_n)$. The first inequality implies $\int_E f \le \liminf_{n\to\infty} \int_E f_n$ and the second implies $\int_E f \ge \limsup_{n\to\infty} \int_E f_n$ by Theorem 403. Combining these two we have
$$\liminf_{n\to\infty} \int_E f_n \ge \int_E f \ge \limsup_{n\to\infty} \int_E f_n$$
which implies the desired result.
The above theorem requires that the sequence $\langle f_n \rangle$ be uniformly dominated by a fixed integrable function g. However, the proof does not need such a strong restriction. In fact, the requirements can be weakened to consider a sequence of integrable functions $\langle g_n \rangle$ which converge a.e. to an integrable function g, with $\int_E g_n \to \int_E g$, and such that $|f_n| \le g_n$.
Example 405 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = nx^n$. See Figure 5.2.2.6. Then $\lim_{n\to\infty} f_n(x) = 0$ a.e. and $\int_{[0,1]} f_n(x)\,dx = \int_{[0,1]} nx^n\,dx = \frac{n}{n+1}x^{n+1}\Big|_0^1 = \frac{1}{1 + \frac{1}{n}}$, so that $\lim_{n\to\infty} \int_{[0,1]} f_n(x)\,dx = 1$. On the other hand, $\int_{[0,1]} f(x)\,dx = 0$.
Notice in the above example that the sequence of functions has no integrable dominating function.
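The failure of the limit interchange in Example 405 is easy to tabulate. The sketch below (function names hypothetical) records the exact integral $\frac{n}{n+1}$ as a rational number and evaluates $f_n$ at a fixed interior point.

```python
from fractions import Fraction


def integral_n_x_pow_n(n: int) -> Fraction:
    """Exact integral of n * x**n over [0, 1], namely n / (n + 1)."""
    return Fraction(n, n + 1)


def f_n(n: int, x: float) -> float:
    """f_n(x) = n * x**n from Example 405."""
    return n * x ** n
```

The integrals climb toward 1 while, at any fixed $x < 1$, the values $f_n(x)$ collapse to 0; the mass of $f_n$ concentrates ever closer to $x = 1$, which is why no integrable g can dominate the whole sequence.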
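Example 406 Let $f_n : [0,2] \to \mathbb{R}$ be given by
$$f_n(x) = \begin{cases} \sqrt{n} & \text{if } \frac{1}{n} \le x \le \frac{2}{n} \\ 0 & \text{otherwise.} \end{cases}$$
See Figure 5.2.2.7. Then $\lim_{n\to\infty} f_n(x) = f(x) = 0$ $\forall x \in [0,2]$ so that $\int_{[0,2]} f(x)\,dx = 0$. Note that $|f_n(x)| \le g(x)$ $\forall x \in [0,2]$ where
$$g(x) = \begin{cases} \sqrt{\frac{2}{x}} & \text{if } 0 < x \le 2 \\ 0 & \text{if } x = 0 \end{cases}$$
which is integrable over $[0,2]$. It is also simple to see that
$$\int_{[0,2]} f_n(x)\,dx = \int_{[\frac{1}{n},\frac{2}{n}]} f_n(x)\,dx = \sqrt{n}\left(\frac{2}{n} - \frac{1}{n}\right) = \frac{1}{\sqrt{n}}$$
so that $\lim_{n\to\infty} \int_{[\frac{1}{n},\frac{2}{n}]} f_n(x)\,dx = 0$.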
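The contrast between Examples 405 and 406 can be checked numerically. The following Python sketch is only an illustration: the helper `midpoint_integral`, the names `f405` and `f406`, and the grid size are assumptions introduced here, and a midpoint Riemann sum stands in for the Lebesgue integral (the two agree for these piecewise continuous functions).

```python
import math

def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

def f405(n):
    # Example 405: f_n(x) = n * x**n on [0, 1]; no integrable dominating function
    return lambda x: n * x ** n

def f406(n):
    # Example 406: f_n(x) = sqrt(n) on [1/n, 2/n] and 0 elsewhere on [0, 2]
    return lambda x: math.sqrt(n) if 1 / n <= x <= 2 / n else 0.0

if __name__ == "__main__":
    for n in (1, 10, 100):
        i405 = midpoint_integral(f405(n), 0.0, 1.0)   # approaches 1, not 0
        i406 = midpoint_integral(f406(n), 0.0, 2.0)   # approaches 0
        print(n, round(i405, 4), round(i406, 4))
```

In the first case the integrals approach 1 although the pointwise limit integrates to 0; in the second, where a dominating function exists, the integrals approach the integral of the limit.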
Finally we state a convergence theorem that differs from the previous ones in the sense that we don't assume that $\langle f_n \rangle \to f$. Instead the theorem guarantees the existence of a function $f$ to which $\langle f_n \rangle$ converges a.e., given that $\langle f_n \rangle$ is a non-decreasing sequence of integrable functions whose corresponding sequence of integrals $\langle \int f_n\,dm \rangle$ is bounded.
Theorem 407 (Levi) Let $\langle f_n \rangle$ be a sequence of functions on $A \subset \mathbb{R}$ with $f_1 \le f_2 \le \dots \le f_n \le \dots$, where each $f_n$ is integrable and $\int_A f_n\,dm \le K$. Then there exists $f$ such that $f = \lim_{n\to\infty} f_n$ a.e. on $A$, $f$ is integrable on $A$, and $\int_A f_n\,dm \to \int_A f\,dm$.
Proof. (Sketch) Without loss of generality, assume $f_1 \ge 0$. Define $f(x) = \lim_{n\to\infty} f_n(x)$. Since $\langle f_n \rangle$ is non-decreasing, $f(x)$ is either a number or $+\infty$. Using Chebyshev's inequality (Lemma 391), it is easy to show that $m(\{x \in A : f(x) = +\infty\}) = 0$, which implies $f_n \to f$ pointwise a.e. In order to use the Lebesgue Dominated Convergence Theorem 404, we need to construct an integrable function $\varphi$ on $A$ that dominates $f_n$ (i.e. $f_n \le f \le \varphi$ on $A$). See Figure 5.2.2.8. Let $A_r = \{x \in A : r-1 \le f(x) < r\}$ and $\varphi(x) = \sum_{r=1}^{\infty} r \chi_{A_r}(x)$. Clearly $f_n(x) \le f(x) \le \varphi(x)$. Is $\varphi$ integrable on $A = \cup_{r=1}^{\infty} A_r$ (i.e. is $\int_A \varphi = \sum_{r=1}^{\infty} r\,m(A_r) < \infty$)? For $s \in \mathbb{N}$, define $B_s = \cup_{r=1}^{s} A_r$. Since $\varphi(x) \le f(x) + 1$ and both $f_n$ and $f$ are bounded on $B_s$, we have
$$\sum_{r=1}^{s} r\,m(A_r) = \int_{B_s} \varphi\,dm \le \int_{B_s} f(x)\,dm + m(A) = \lim_{n\to\infty} \int_{B_s} f_n(x)\,dm + m(A) \le K + m(A).$$
Boundedness of the partial sums of the infinite series $\sum_{r=1}^{\infty} r\,m(A_r)$ $\left(= \int_A \varphi\,dm\right)$ guarantees integrability of $\varphi$.
We can also state a "series version" of Levi's theorem. It says that under certain conditions on the series, integration and infinite summation are interchangeable.
Corollary 408 If $\langle g_k \rangle$ is a sequence of non-negative functions defined on $A$ such that $\sum_{k=1}^{\infty} \int_A g_k(x)\,dm < \infty$, then the infinite series $\sum_{k=1}^{\infty} g_k(x)$ converges a.e. on $A$. That is, $\sum_{k=1}^{\infty} g_k(x) \to g(x)$ a.e. and
$$\sum_{k=1}^{\infty} \int_A g_k(x)\,dm = \int_A \sum_{k=1}^{\infty} g_k(x)\,dm \left(= \int_A g(x)\,dm\right).$$
Proof. Apply Levi's Theorem to the functions $f_n(x) = \sum_{k=1}^{n} g_k(x)$.
5.3 General Measure
In the preceding sections, we focused on $\mathbb{R}$ (or subsets thereof) as the underlying set of interest. From this set, we constructed the Lebesgue σ-algebra denoted $\mathcal{L}$. Then we defined the Lebesgue measure $m$ on elements of $\mathcal{L}$. That is, we studied the triple $(\mathbb{R}, \mathcal{L}, m)$ known as the Lebesgue measure space. These ideas can be extended to general measure spaces.
Definition 409 The pair $(X, \mathcal{X})$, where $X$ is any set and $\mathcal{X}$ is a σ-algebra of its subsets, is called a measurable space. Any set $A \in \mathcal{X}$ is called a ($\mathcal{X}$-)measurable set.
Definition 410 Let $(X, \mathcal{X})$ be a measurable space. A measure is an extended real valued function $\mu : \mathcal{X} \to \mathbb{R} \cup \{\infty\}$ such that: (i) $\mu(\emptyset) = 0$; (ii) $\mu(A) \ge 0$, $\forall A \in \mathcal{X}$; and (iii) $\mu$ is countably additive (i.e. if $\{A_n\}_{n\in\mathbb{N}}$ is a countable, disjoint sequence of subsets $A_n \in \mathcal{X}$, then $\mu(\cup_{n\in\mathbb{N}} A_n) = \sum_{n\in\mathbb{N}} \mu(A_n)$).
Definition 411 A measure space is a triple $(X, \mathcal{X}, \mu)$.
Definition 412 Let $(X, \mathcal{X})$ be a measurable space. A measure $\mu$ is called finite if $\mu(X) < \infty$. $\mu$ is called σ-finite if there is a countable collection of sets $\{E_i\}_{i=1}^{\infty}$ in $\mathcal{X}$ with $\mu(E_i) < \infty$ for all $i$ and $X = \cup_{i=1}^{\infty} E_i$.
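Example 413 (i) If $X = \mathbb{R}$, then Lebesgue measure is not finite because $m((-\infty,\infty)) = \infty$, but it is σ-finite because $(-\infty,\infty) = \cup_{n=1}^{\infty} (-n,n)$ and $m((-n,n)) = 2n$. (ii) If $X = [0,100]$, then Lebesgue measure is finite because $m([0,100]) = 100$.
Exercise 5.3.1 Let $X = (-\infty,\infty)$, $\mathcal{X} = P(X)$ and
$$\mu(E) = \begin{cases} \text{\# of elements of } E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite.} \end{cases}$$
Show that $(X, \mathcal{X}, \mu)$ is a measure space and that $\mu$ is not σ-finite.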
Lebesgue measure has one important property that follows from Exercise 5.1.2. That is, if $E$ is a Lebesgue measurable set with measure 0 and if $A \subset E$, then $A$ is also Lebesgue measurable and $m(A) = 0$. In general, however, we can have a situation where $E \in \mathcal{X}$ (is $\mathcal{X}$-measurable) with $\mu(E) = 0$ and $A \subset E$, but $A$ may not be in $\mathcal{X}$.
Example 414 Let $(X, \mathcal{X}, \mu)$ be a measure space defined as follows: Let $X = \{a,b,c\}$, $\mathcal{X} = \{\emptyset, \{a,b,c\}, \{a,b\}, \{c\}\}$, $\mu(\emptyset) = \mu(\{a,b\}) = 0$, and $\mu(\{c\}) = \mu(\{a,b,c\}) = 1$. Then $\{a\} \subset \{a,b\}$ but $\{a\}$ is not $\mathcal{X}$-measurable.
Definition 415 Let $(X, \mathcal{X}, \mu)$ be a measure space. $\mu$ is complete on $\mathcal{X}$ if for any $E \in \mathcal{X}$ with $\mu(E) = 0$ and any $A \subset E$, we have $A \in \mathcal{X}$ and $\mu(A) = 0$.
That is, $\mu$ is complete on $\mathcal{X}$ if any subset of a set of measure zero is measurable and has measure zero.
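Because the space in Example 414 is finite, its properties can be verified by brute force. The Python sketch below is an illustration under stated assumptions: the helper names (`is_sigma_algebra`, `is_additive`, `is_complete`) are introduced here, and on a finite set closure under finite unions and finite additivity stand in for their countable counterparts.

```python
from itertools import combinations

X = frozenset("abc")
# sigma-algebra and measure taken from Example 414
SIGMA = {frozenset(), frozenset("ab"), frozenset("c"), X}
MU = {frozenset(): 0, frozenset("ab"): 0, frozenset("c"): 1, X: 1}

def is_sigma_algebra(space, family):
    """Closure under complement and pairwise union (sufficient on a finite space)."""
    if space not in family:
        return False
    closed_complement = all(space - a in family for a in family)
    closed_union = all(a | b in family for a in family for b in family)
    return closed_complement and closed_union

def is_additive(family, mu):
    """mu(A | B) = mu(A) + mu(B) for all disjoint A, B in the family."""
    return all(mu[a | b] == mu[a] + mu[b]
               for a in family for b in family if not (a & b))

def subsets(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_complete(family, mu):
    """Definition 415: every subset of a measure-zero set must be measurable."""
    return all(sub in family
               for e in family if mu[e] == 0
               for sub in subsets(e))

if __name__ == "__main__":
    print(is_sigma_algebra(X, SIGMA), is_additive(SIGMA, MU), is_complete(SIGMA, MU))
```

The last check fails because $\{a\}$ is a subset of the null set $\{a,b\}$ but does not belong to the σ-algebra.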
If we consider Lebesgue measure restricted to the Borel σ-algebra (i.e. $(\mathbb{R}, \mathcal{B}, m)$) then $m$ is not complete on $\mathcal{B}$. To show this would necessitate more machinery (the Cantor set can be used to illustrate the idea). However, if $\mu$ is not complete on $\mathcal{X}$, then there exists a completion of $\mathcal{X}$ denoted by $\tilde{\mathcal{X}}$. For example, the completion of $(\mathbb{R}, \mathcal{B}, m)$ is $(\mathbb{R}, \mathcal{L}, m)$.
In exactly the same way that we built the theory of Lebesgue measure and the Lebesgue integral in Sections 5.1 to 5.2.2, the theory of general measure and integral can be constructed. The space $(\mathbb{R}, \mathcal{L}, m)$ can be replaced by $(X, \mathcal{X}, \mu)$, and instead of $\mathcal{L}$-measurability and $\mathcal{L}$-integrability we will have $\mathcal{X}$-measurability and $\mathcal{X}$-integrability.
5.3.1 Signed Measures
Although we have introduced a general measure space $(X, \mathcal{X}, \mu)$, the only non-trivial measure space that we have encountered so far is the Lebesgue measure space $(\mathbb{R}, \mathcal{L}, m)$, where Lebesgue measure $m$ was constructed through the outer measure in Section 5.1. Can we construct other non-trivial measures on a general measurable space $(X, \mathcal{X})$? Consider a measure space $(X, \mathcal{X}, \mu)$ and let $f$ be a non-negative $\mathcal{X}$-measurable function. Define $\lambda : \mathcal{X} \to \mathbb{R} \cup \{\infty\}$ by $\lambda(E) = \int_E f\,d\mu$. Then the following theorem establishes that this set function $\lambda$ is a measure.
Theorem 416 If $f$ is a non-negative $\mathcal{X}$-measurable function then
$$\lambda(E) = \int_E f\,d\mu \qquad (5.7)$$
is a measure. Moreover if $f$ is $\mathcal{X}$-integrable, then $\lambda$ is finite.
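Proof. $\lambda(E) = \int_E f\,d\mu = \int_X f \chi_E\,d\mu \ge 0$ for all $E \in \mathcal{X}$ (where $\chi_E$ is the characteristic function of $E$), and $\lambda(\emptyset) = 0$. Let $\{E_i\}_{i=1}^{\infty}$ be a collection of mutually disjoint sets in $\mathcal{X}$ with $\cup_{i=1}^{\infty} E_i = E$. Then $\chi_E = \sum_{i=1}^{\infty} \chi_{E_i}$.
Let $g_n = f \chi_{E_n}$ and $f_n = \sum_{i=1}^{n} g_i$.
Since $g_n \ge 0$, the sequence $\langle f_n \rangle$ is non-decreasing. $f_n$ is also measurable for all $n$ because sums and products of measurable functions are measurable. $f_n$ converges pointwise to $f \chi_E$ because $\sum_{i=1}^{n} \chi_{E_i} \to \chi_E = \sum_{i=1}^{\infty} \chi_{E_i}$. Then according to the Monotone Convergence Theorem 396
$$\lambda(E) = \int_E f\,d\mu = \int f \chi_E\,d\mu = \int f \sum_{i=1}^{\infty} \chi_{E_i}\,d\mu = \sum_{i=1}^{\infty} \int f \chi_{E_i}\,d\mu = \sum_{i=1}^{\infty} \int_{E_i} f\,d\mu = \sum_{i=1}^{\infty} \lambda(E_i).$$
Hence $\lambda$ is σ-additive. If $f$ is integrable, then $\lambda(X) = \int_X f\,d\mu < \infty$ and hence $\lambda$ is finite.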
This theorem provides us with a method for constructing new measures on a measure space $(X, \mathcal{X}, \mu)$. In fact, any non-negative $\mathcal{X}$-integrable function represents a finite measure given by (5.7). Thus, given a measure space $(X, \mathcal{X}, \mu)$, there is a whole set of measures defined on $\mathcal{X}$.
Are all measures on $(X, \mathcal{X}, \mu)$ of the type given by (5.7)? In other words, let $\mathcal{E}$ be the set of all measures on $(X, \mathcal{X}, \mu)$. Can any measure $\nu \in \mathcal{E}$ be represented by an integrable function $g$ such that $\nu(E) = \int_E g\,d\mu$? The answer is contained in a well-known result: the Radon-Nikodym theorem. We could pursue this problem in our current setting; namely we could deal with measures only (i.e. with non-negative σ-additive set functions defined on $\mathcal{X}$). We can, however, work in an even more general setting. Instead of dealing with non-negative σ-additive set functions (measures) we can drop the assumption of non-negativity and work with σ-additive set functions which are positive or negative or both. These functions are called signed measures. This generalization is useful particularly when working with Markov processes. [An example with a Markov process is still needed here.] Let us now define the notion of signed measure rigorously.
Definition 417 Let $(X, \mathcal{X})$ be a measurable space. Let $\mu : \mathcal{X} \to \mathbb{R} \cup \{-\infty, \infty\}$ have the following properties: (i) $\mu(\emptyset) = 0$; (ii) $\mu$ attains at most one of the two values $+\infty$, $-\infty$; (iii) $\mu$ is σ-additive. Then $\mu$ is called a signed measure on $\mathcal{X}$.
In the text that follows, when we refer to a "measure" (without the prefix "signed") we mean a measure in the sense of Definition 410 (i.e. a non-negative set function).
Example 418 Given a measure space $(X, \mathcal{X}, \mu)$, an example of a signed measure is the set function
$$\nu(E) = \int_E f\,d\mu \qquad (5.8)$$
where $f$ is any $\mathcal{X}$-integrable function. In Theorem 416 we showed that if $f$ is a non-negative integrable function then $\nu$ is a finite measure. Here we just assume that $f$ is integrable and put no restriction on its sign. We can also assume only that $f$ is a measurable function for which $\int f\,d\mu$ exists (i.e. at least one of the functions $f^+$, $f^-$ is integrable). Thus, if $f$ in (5.8) is an integrable function, then $\nu$ is a finite signed measure, and if $f$ is an $\mathcal{X}$-measurable function for which $\int f\,d\mu$ exists, then $\nu$ is a signed measure (though not necessarily finite).
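Example 419 Let $(\mathbb{R}, \mathcal{L}, m)$ be the Lebesgue measure space. In the first case, let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \frac{x}{(1+x^2)^2}$. Then we have
$$f^+(x) = \begin{cases} \frac{x}{(1+x^2)^2} & x \ge 0 \\ 0 & x < 0 \end{cases}, \qquad f^-(x) = \begin{cases} 0 & x \ge 0 \\ -\frac{x}{(1+x^2)^2} & x < 0 \end{cases},$$
$\int_{-\infty}^{\infty} f^+\,dm = \int_{-\infty}^{\infty} f^-\,dm = \frac{1}{2}$, $f$ is integrable, and $\nu(E) = \int_E f\,dm = \int_{-\infty}^{\infty} \frac{x}{(1+x^2)^2} \cdot \chi_E(x)\,dm$ is a finite signed measure.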
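Example 420 Let $g : \mathbb{R} \to \mathbb{R}$ be given by
$$g(x) = \begin{cases} x & x \ge 0 \\ \frac{x}{(1+x^2)^2} & x < 0. \end{cases}$$
Then we have
$$g^+(x) = \begin{cases} x & x \ge 0 \\ 0 & x < 0 \end{cases}, \qquad g^-(x) = \begin{cases} 0 & x \ge 0 \\ -\frac{x}{(1+x^2)^2} & x < 0 \end{cases},$$
$\int_{-\infty}^{\infty} g^+\,dm = +\infty$, $\int_{-\infty}^{\infty} g^-\,dm = \frac{1}{2}$, and $g$ is measurable but not integrable. However, the integral exists since $\int_{-\infty}^{\infty} g\,dm = \int_{-\infty}^{\infty} g^+\,dm - \int_{-\infty}^{\infty} g^-\,dm = +\infty - \frac{1}{2} = +\infty$, and $\nu(E) = \int_E g\,dm = \int_{-\infty}^{\infty} g \cdot \chi_E\,dm$ is a signed measure (but not finite).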
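Example 421 Let $h : \mathbb{R} \to \mathbb{R}$ be given by $h(x) = x$. Then we have
$$h^+(x) = \begin{cases} x & x \ge 0 \\ 0 & x < 0 \end{cases}, \qquad h^-(x) = \begin{cases} 0 & x \ge 0 \\ -x & x < 0 \end{cases},$$
$\int_{-\infty}^{\infty} h^+\,dm = +\infty$, and $\int_{-\infty}^{\infty} h^-\,dm = +\infty$. Hence $\int_{-\infty}^{\infty} h\,dm = \int_{-\infty}^{\infty} h^+\,dm - \int_{-\infty}^{\infty} h^-\,dm = \infty - \infty$ is not defined, the Lebesgue integral doesn't exist, and thus this function doesn't define a signed measure.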
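The behavior of the positive and negative parts in Examples 419 to 421 can be explored numerically on growing intervals $[-R, R]$. The Python sketch below is illustrative only: the names `f419`, `g420`, `h421`, `parts`, and `midpoint_integral` are assumptions introduced here, and a midpoint Riemann sum on a bounded interval stands in for the Lebesgue integral over $\mathbb{R}$.

```python
def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

def f419(x):
    # Example 419: integrable, both parts integrate to 1/2
    return x / (1 + x * x) ** 2

def g420(x):
    # Example 420: positive part has infinite integral, negative part integrates to 1/2
    return x if x >= 0 else x / (1 + x * x) ** 2

def h421(x):
    # Example 421: both parts have infinite integral
    return x

def parts(func, radius):
    """Integrals of the positive and negative parts of func over [-radius, radius]."""
    pos = midpoint_integral(lambda x: max(func(x), 0.0), -radius, radius)
    neg = midpoint_integral(lambda x: max(-func(x), 0.0), -radius, radius)
    return pos, neg

if __name__ == "__main__":
    for name, func in (("f", f419), ("g", g420), ("h", h421)):
        for radius in (10.0, 100.0):
            print(name, radius, parts(func, radius))
```

As the radius grows, both parts of the first function settle near $\frac{1}{2}$, only the positive part of the second keeps growing, and both parts of the third grow without bound, matching the three cases above.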
The previous examples show that if a signed measure $\nu$ is defined by expression (5.8) using an integral then it can be written as a difference of two measures:
$$\nu(E) = \nu_1(E) - \nu_2(E)$$
where
$$\nu(E) = \int_E f\,d\mu, \quad \nu_1(E) = \int_E f^+\,d\mu \quad\text{and}\quad \nu_2(E) = \int_E f^-\,d\mu.$$
We now show that such a decomposition is possible for an arbitrary signed measure. This decomposition is known as the Jordan decomposition of a signed measure. First we need to prove some lemmas. However, in order to avoid introducing complicated terminology which does not help in understanding the main ideas, in the remainder of this chapter we will deal only with finite signed measures. All theorems and proofs can be adapted to σ-finite signed measures.
Lemma 422 Let $\nu$ be a finite signed measure on $\mathcal{X}$. Then any collection of disjoint sets $\{E_i\}$ for which $\nu(E_i) > 0$ ($\nu(E_i) < 0$) is countable.
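Proof. Let $\mathcal{E} \subset \mathcal{X}$ be a collection of disjoint sets $\{E_i\}$ for which $\nu(E_i) > 0$. For $n = 1, 2, \dots$ let $\mathcal{E}_n = \{E_i \in \mathcal{E} : \nu(E_i) > \frac{1}{n}\}$. Then $\mathcal{E} = \cup_{n=1}^{\infty} \mathcal{E}_n$. For each $n$, $\mathcal{E}_n$ is finite. If it were not, we would have a sequence $\langle E_k \rangle_{k=1}^{\infty}$ of disjoint sets from $\mathcal{E}_n$ with $\nu(E_k) > \frac{1}{n}$ for $k = 1, 2, \dots$. Then $\nu(\cup_{k=1}^{\infty} E_k) = \sum_{k=1}^{\infty} \nu(E_k) \ge \sum_{k=1}^{\infty} \frac{1}{n} = \infty$, which contradicts the assumption that $\nu$ is finite. Thus each $\mathcal{E}_n$ is finite and hence $\mathcal{E} = \cup_{n=1}^{\infty} \mathcal{E}_n$ is countable.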
Let $\nu$ be a signed measure on $\mathcal{X}$ and let $\nu(E) > 0$. Let $F \subset E$. What can be said about the sign of $\nu(F)$? As the next example shows, not much can be said about the signed measure of a subset of a set whose signed measure is positive.
Example 423 Let $X = \{1, 2, 3, \dots\}$ and $\mathcal{X} = P(X)$. For $E \in \mathcal{X}$, define $\nu(E) = \sum_{n \in E} (-1)^n \frac{1}{2^n}$. For $E = \{1,2,3\}$, $\nu(E) = -\frac{3}{8} < 0$. If $F = \{2\} \subset E$, $\nu(F) = \frac{1}{4} > 0$. But notice that each singleton subset $C$ of the set $B = \{1, 3, 5, \dots\}$ has $\nu(C) < 0$ and each singleton subset $D$ of the set $A = \{2, 4, 6, \dots\}$ has $\nu(D) > 0$. Moreover $A$ and $B$ are disjoint and $A \cup B = X$.
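The values in Example 423 can be reproduced with exact rational arithmetic. The following Python sketch is an illustration; the function name `nu` mirrors the example, and restricting attention to finite sets $E$ is an assumption made so that the sum is a finite one.

```python
from fractions import Fraction

def nu(E):
    """Signed measure of Example 423 on a finite set E: sum of (-1)**n / 2**n over n in E."""
    return sum((Fraction((-1) ** n, 2 ** n) for n in E), Fraction(0))

if __name__ == "__main__":
    print(nu({1, 2, 3}))          # measure of E = {1, 2, 3}
    print(nu({2}))                # measure of the subset F = {2}
    print([nu({n}) for n in range(1, 7)])   # signs alternate: odd n negative, even n positive
```

The sign of a set's measure says nothing about the signs of its subsets' measures, while the singletons of the even numbers are all positive and those of the odd numbers all negative.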
Example 424 Let $X = [-1,1]$ and let $\mathcal{X}$ be all $\mathcal{L}$-measurable subsets of $[-1,1]$. For $E \in \mathcal{X}$, let $\nu(E) = \int_E x\,dx$. For $E = \left[-\frac{1}{2}, 1\right]$, $\nu(E) = \int_{-\frac{1}{2}}^{1} x\,dx = \frac{3}{8}$. For $F = \left[-\frac{1}{2}, 0\right] \subset \left[-\frac{1}{2}, 1\right]$, $\nu(F) = \int_{-\frac{1}{2}}^{0} x\,dx = -\frac{1}{8}$. Thus the signed measure of the set $E$ is positive but its subset $F$ has negative signed measure. But for each measurable subset $C$ of the set $B = [-1,0)$, $\nu(C) \le 0$ and for each measurable subset $D$ of the set $A = [0,1]$, $\nu(D) \ge 0$, where $A \cap B = \emptyset$ and $A \cup B = X$.
Definition 425 An $\mathcal{X}$-measurable set $E$ is positive (negative) with respect to a signed measure $\nu$ if for any $\mathcal{X}$-measurable subset $F$ of $E$, $\nu(F) \ge 0$ ($\nu(F) \le 0$).
Thus the sets $A$ in Examples 423 and 424 are positive while the sets $B$ are negative. Notice that $\nu(G) > 0$ (or $\nu(G) < 0$) doesn't mean that $G$ is positive (negative), as the sets $E$ in Examples 423 and 424 show. We will show that the existence of sets $A$ and $B$ for the signed measure $\nu$ in these examples is not a coincidence.
Definition 426 An ordered pair $(A, B)$, where $A$ is a positive and $B$ is a negative set with respect to a signed measure $\nu$, and $A \cap B = \emptyset$, $A \cup B = X$, is called the Hahn decomposition with respect to $\nu$ of a measurable space $(X, \mathcal{X})$.
Theorem 427 (Hahn Decomposition) Let $\nu$ be a finite signed measure on a measurable space $(X, \mathcal{X})$. Then there exists a Hahn decomposition of $(X, \mathcal{X})$.
Proof. (Sketch) Let $S$ be the family whose elements $\mathcal{A}$ are collections of disjoint measurable negative sets $E \subset X$ with $\nu(E) < 0$. Since "$\subset$" is a partial ordering on $S$ satisfying the assumptions of Zorn's Lemma 46 (namely that every totally ordered subcollection $\{\mathcal{A}_i\}$ has an upper bound $\mathcal{A} = \cup_i \mathcal{A}_i$), there is a maximal element $\mathcal{E}$ of $S$. Moreover $\mathcal{E}$ is countable (by Lemma 422). Let $B = \cup\{E : E \in \mathcal{E}\}$. Then $B$ is measurable and negative (by construction it is a countable union of negative sets). Let $A = X \setminus B$. We have $A \cap B = \emptyset$, $A \cup B = X$, and $A$ is measurable. If we show that $A$ is positive, then we are done since $(A, B)$ is then a Hahn decomposition. Hence we need to show that $A$ (as the complement of a maximal negative set) is positive. The idea is that if we assume that $A$ is not positive, then we can construct a negative set with negative measure outside the set $B$. This would violate the maximality of $B$. The construction of such a set is given in the formal proof of this theorem in the appendix to the chapter; see also Figure 5.3.1.
In the special case where a signed measure $\nu$ is defined by the integral $\nu(E) = \int_E f\,d\mu$, the Hahn decomposition is given by $A = \{x : f(x) \ge 0\}$ and $B = \{x : f(x) < 0\}$, as we have seen in Example 424. It is easily seen that the Hahn decomposition is not unique. We can, for example, set $A_1 = \{x : f(x) > 0\}$, $B_1 = \{x : f(x) \le 0\}$. But the following theorem shows that the choice of a Hahn decomposition doesn't really matter.
Theorem 428 Let $(A_1, B_1)$, $(A_2, B_2)$ be two Hahn decompositions of a measurable space $(X, \mathcal{X})$ with respect to a signed measure $\nu$. Then for each $E \in \mathcal{X}$ we have $\nu(E \cap A_1) = \nu(E \cap A_2)$ and $\nu(E \cap B_1) = \nu(E \cap B_2)$.
Proof. From $E \cap (A_1 \setminus A_2) \subset E \cap A_1$ we have $\nu(E \cap (A_1 \setminus A_2)) \ge 0$ and from $E \cap (A_1 \setminus A_2) \subset E \cap B_2$ we have $\nu(E \cap (A_1 \setminus A_2)) \le 0$. Combining these two inequalities we have $\nu(E \cap (A_1 \setminus A_2)) = 0$. Analogously we can show that $\nu(E \cap (A_2 \setminus A_1)) = 0$. Hence $\nu(E \cap A_1) = \nu((E \cap (A_1 \setminus A_2)) \cup (E \cap A_1 \cap A_2)) = 0 + \nu(E \cap A_1 \cap A_2)$. If we start with $\nu(E \cap A_2)$, we arrive by similar reasoning at $0 + \nu(E \cap A_1 \cap A_2)$. Hence $\nu(E \cap A_1) = \nu(E \cap A_2)$. Similarly we can show that $\nu(E \cap B_1) = \nu(E \cap B_2)$.
Theorem 429 Let $\nu$ be a finite signed measure on $\mathcal{X}$ and let $(A, B)$ be an arbitrary Hahn decomposition with respect to $\nu$. Then $\nu(E) = \nu^+(E) - \nu^-(E)$ for any $E \in \mathcal{X}$, where $\nu^+(E) = \nu(E \cap A)$ and $\nu^-(E) = -\nu(E \cap B)$ are both measures on $\mathcal{X}$ and don't depend on the choice of Hahn decomposition $(A, B)$.
Proof. The independence of $\nu^+$ and $\nu^-$ from the choice of Hahn decomposition follows from Theorem 428. Since $\nu$ is a signed measure, $\nu$ is σ-additive and thus $\nu^+$ and $\nu^-$ are as well. Since $A$ is a positive set and $E \cap A \subset A$, then $\nu(E \cap A) \ge 0$. Since $B$ is a negative set and $E \cap B \subset B$, then $\nu(E \cap B) \le 0$ so that $-\nu(E \cap B) \ge 0$. Thus $\nu^+$ and $\nu^-$ are measures on $\mathcal{X}$. Since $E = E \cap X = E \cap (A \cup B) = (E \cap A) \cup (E \cap B)$, we have $\nu(E) = \nu(E \cap A) + \nu(E \cap B) = \nu^+(E) - \nu^-(E)$.
Definition 430 $\nu = \nu^+ - \nu^-$ is called the Jordan decomposition of a signed measure $\nu$. The measure $\nu^+$ ($\nu^-$) is called the positive (negative) variation of $\nu$. $|\nu|(E) = \nu^+(E) + \nu^-(E)$ is also a measure on $\mathcal{X}$ and is called the total variation of a signed measure $\nu$.
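On a finite set with point weights, the Hahn and Jordan decompositions can be computed directly. The Python sketch below is a hedged illustration: the dictionary `WEIGHTS` (the first four weights of Example 423) and the helper names `nu_plus`, `nu_minus`, and `total_variation` are assumptions introduced here, not part of the text's construction.

```python
from fractions import Fraction

# Assumed point weights w(x); the signed measure is nu(E) = sum of w(x) for x in E.
# These are the weights (-1)**n / 2**n of Example 423, truncated to {1, 2, 3, 4}.
WEIGHTS = {1: Fraction(-1, 2), 2: Fraction(1, 4), 3: Fraction(-1, 8), 4: Fraction(1, 16)}

def nu(E):
    return sum((WEIGHTS[x] for x in E), Fraction(0))

# A Hahn decomposition: A carries the non-negative weights, B the negative ones.
A = {x for x, w in WEIGHTS.items() if w >= 0}
B = set(WEIGHTS) - A

def nu_plus(E):
    """Positive variation: nu restricted to the positive set A."""
    return nu(set(E) & A)

def nu_minus(E):
    """Negative variation: minus nu restricted to the negative set B."""
    return -nu(set(E) & B)

def total_variation(E):
    return nu_plus(E) + nu_minus(E)

if __name__ == "__main__":
    E = {1, 2, 3}
    print(nu(E), nu_plus(E), nu_minus(E), total_variation(E))
```

For $E = \{1,2,3\}$ the difference of the two variations recovers $\nu(E)$, while their sum gives the total variation.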
Exercise 5.3.2 Let $(X, \mathcal{X}, \mu)$ be a measure space and let $f$ be $\mathcal{X}$-integrable. If $\nu(E) = \int_E f\,d\mu$, show that $\nu^+(E) = \int_E f^+\,d\mu$ and $|\nu|(E) = \int_E |f|\,d\mu$.
Exercise 5.3.3 Show that a countable union of positive (negative) sets is a positive (negative) set.
If a signed measure $\nu$ is defined as the integral of an integrable function, $\nu(E) = \int_E f\,d\mu$, then by Lemma 401 it has the following property: if $E \in \mathcal{X}$ and $\mu(E) = 0$, then $\nu(E) = 0$. As we will soon see, this property of a signed measure is very important and we formulate it for any signed measure (not only one given by an integral).
Definition 431 Let $\nu$ be a finite signed measure and let $\mu$ be a measure on $(X, \mathcal{X})$. If for every $A \in \mathcal{X}$, $\mu(A) = 0$ implies $\nu(A) = 0$, then we say that $\nu$ is absolutely continuous with respect to $\mu$, written $\nu \ll \mu$.
Hence by Lemma 401, $\nu(E) = \int_E f\,d\mu$ is absolutely continuous with respect to $\mu$. Now we prove two simple lemmas.
Lemma 432 Let $\nu$ be a finite signed measure and $\mu$ be a measure on $\mathcal{X}$. Then the following are equivalent: (i) $\nu \ll \mu$, (ii) $\nu^+ \ll \mu$ and $\nu^- \ll \mu$, (iii) $|\nu| \ll \mu$.
Proof. (i) $\Rightarrow$ (ii). Let $(A, B)$ be a Hahn decomposition with respect to $\nu$. Let $E \in \mathcal{X}$ and $\mu(E) = 0$. Then $\mu(E \cap A) = 0$ and because $\nu$ is absolutely continuous with respect to $\mu$, we have $\nu(E \cap A) = \nu^+(E) = 0$. This implies $\nu^+ \ll \mu$. Similarly $\nu^- \ll \mu$. The other two implications follow immediately from these equalities:
$$|\nu|(E) = \nu^+(E) + \nu^-(E)$$
$$\nu(E) = \nu^+(E) - \nu^-(E).$$
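Lemma 433 Let $\mu$, $\lambda$ be finite measures on $\mathcal{X}$ with $\lambda \ll \mu$ and $\lambda(E_0) \ne 0$ for at least one set $E_0 \in \mathcal{X}$. Then there exist $\varepsilon > 0$ and a set $E \in \mathcal{X}$ that is positive with respect to the signed measure $\lambda - \varepsilon\mu$ and such that $\lambda(E) > 0$ and $\mu(E) > 0$.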
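Proof. Let $(A_n, B_n)$ for $n \in \mathbb{N}$ be a Hahn decomposition of $X$ with respect to $\lambda - \frac{1}{n}\mu$. Set $A = \cup_{n=1}^{\infty} A_n$, $B = \cap_{n=1}^{\infty} B_n$. Since $B \subset B_n$ and $B_n$ is a negative set with respect to $\lambda - \frac{1}{n}\mu$, then $\left(\lambda - \frac{1}{n}\mu\right)(B) \le 0 \iff 0 \le \lambda(B) \le \frac{1}{n}\mu(B)$ for $n \in \mathbb{N}$. Thus $\lambda(B) = 0$. Since $\lambda(X) \ne 0$, then $\lambda(A) = \lambda(X \setminus B) = \lambda(X) - \lambda(B) = \lambda(X) > 0$. As $\lambda \ll \mu$ we have $\mu(A) > 0$. Since $\lambda(A) > 0$ and $A = \cup_{n=1}^{\infty} A_n$, there is an $n_0$ with $\lambda(A_{n_0}) > 0$, and hence $\mu(A_{n_0}) > 0$ because $\lambda \ll \mu$. Finally set $E = A_{n_0}$ and $\varepsilon = \frac{1}{n_0}$.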
Now we are ready to tackle the main problem of this section, which you can think of as a representation theorem.[12] Given a measure space $(X, \mathcal{X}, \mu)$, consider the set function $\nu(E) = \int_E f\,d\mu$ where $f$ is $\mathcal{X}$-measurable and $\mathcal{X}$-integrable. Under certain conditions, this specific type of signed measure on $\mathcal{X}$ represents all signed measures (i.e. there are no signed measures on $\mathcal{X}$ that cannot be represented as the integral of an $\mathcal{X}$-measurable function $f$). This is established formally in the Radon-Nikodym Theorem, which states that under certain conditions any signed measure on $\mathcal{X}$ can be represented by the integral of a measurable function. The Radon-Nikodym Theorem will be used in the Riesz Representation Theorem in the next chapter.
Theorem 434 (Radon-Nikodym) Let $(X, \mathcal{X}, \mu)$ be a measure space, $\mu$ be a σ-finite measure, $\nu$ be a finite signed measure on $\mathcal{X}$ and $\nu \ll \mu$. Then there exists an $\mathcal{X}$-integrable function $f$ on $X$ such that $\nu(E) = \int_E f\,d\mu$ for any $E \in \mathcal{X}$. Moreover $f$ is unique in the sense that if $g$ is any $\mathcal{X}$-measurable function with this property, then $g = f$ a.e. with respect to $\mu$.
[12] In general, a representation theorem provides a simple way to characterize (or represent) a set of elements using certain properties that actually extend to the entire collection of elements under given assumptions.
Proof. (Sketch) By Theorem 429, a finite signed measure $\nu$ can be decomposed into $\nu^+$ and $\nu^-$, where $\nu^-$, $\nu^+$ are both measures and by (ii) of Lemma 432 they are both absolutely continuous with respect to $\mu$ (if $\nu$ is). Hence it suffices to prove the theorem under the assumption that $\nu$ is a (non-negative) measure. Also, since $\mu$ is σ-finite, $X$ can be decomposed into countably many disjoint sets $\{E_i\}$ for which $\mu(E_i) < \infty$. Hence it suffices to prove the theorem with $\mu$ finite. In summary, we take $\mu$, $\nu$ both finite measures with $\nu \ll \mu$.
Let $G$ be the set of all non-negative $\mathcal{X}$-measurable, integrable functions $g$ satisfying
$$\int_E g\,d\mu \le \nu(E), \quad \forall E \in \mathcal{X}. \qquad (5.9)$$
Among these functions $g$, we want to find a function $f$ which satisfies (5.9) with equality. Since $\int_X g\,d\mu \le \nu(X)$, $\forall g \in G$, and $\nu$ is finite, the set of real numbers $\left\{\int g\,d\mu : g \in G\right\}$ is bounded (by $\nu(X)$) and hence its supremum exists. Let $\alpha = \sup_{g \in G} \int g\,d\mu$. $f$ is constructed (using Levi's Theorem 407) as a limit function of a sequence $\langle f_n \rangle$ that attains this supremum (i.e. $\alpha = \int f\,d\mu$).[13]
Because $f \in G$, we know that $\int_E f\,d\mu \le \nu(E)$, $\forall E \in \mathcal{X}$. We claim that $\int_E f\,d\mu = \nu(E)$, $\forall E \in \mathcal{X}$. If this were not true, then there would exist a set $E$ such that $\int_E f\,d\mu < \nu(E)$. Then by Lemma 433, we could construct a function $g_0 = f + \varepsilon \chi_{E_0}$ belonging to $G$ for which $\int g_0\,d\mu > \alpha$. But this would violate the fact that $\alpha$ is the supremum.
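On a finite set with the power set as σ-algebra, the Radon-Nikodym density can be written down explicitly as a ratio of point masses. The Python sketch below illustrates this special case only; the point masses `MU` and `NU` and all helper names are hypothetical choices made here, and the sketch does not follow the supremum construction used in the proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses on X = {a, b, c, d}; mu({d}) = 0 and nu({d}) = 0, so nu << mu.
MU = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
NU = {"a": Fraction(1, 3), "b": Fraction(-1, 6), "c": Fraction(0), "d": Fraction(0)}

def measure(weights, E):
    return sum((weights[x] for x in E), Fraction(0))

def is_absolutely_continuous(nu, mu):
    """On a finite set it suffices to check points: mu({x}) = 0 must imply nu({x}) = 0."""
    return all(nu[x] == 0 for x in mu if mu[x] == 0)

def density(nu, mu):
    """f(x) = nu({x}) / mu({x}) where mu({x}) > 0; on mu-null points any value works (0 here)."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integral(f, mu, E):
    """Integral of f over E with respect to mu, which is a finite sum here."""
    return sum((f[x] * mu[x] for x in E), Fraction(0))

def all_subsets(points):
    items = sorted(points)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

if __name__ == "__main__":
    f = density(NU, MU)
    print(f)
    print(all(integral(f, MU, E) == measure(NU, E) for E in all_subsets(MU)))
```

The freedom in choosing the density on the $\mu$-null point $d$ reflects the uniqueness statement of the theorem: any two densities agree a.e. with respect to $\mu$.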
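The assumption in the Radon-Nikodym theorem that $\mu$ is σ-finite is important, as the next exercise shows.
Exercise 5.3.4 Let $X = \mathbb{R}$ and let $\mathcal{X}$ be the collection of all subsets of $\mathbb{R}$ that are countable or that have countable complement. Define
$$\mu(E) = \begin{cases} \text{\# of elements of } E & \text{if } E \text{ is finite} \\ \infty & \text{otherwise} \end{cases} \quad\text{and}\quad \nu(E) = \begin{cases} 0 & \text{if } E \text{ is countable} \\ 1 & \text{if } X \setminus E \text{ is countable.} \end{cases}$$
(i) Show that $\mu$, $\nu$ are measures on $\mathcal{X}$ and that $\nu \ll \mu$. (ii) Show that $\mu$ is not σ-finite. (iii) Show that the Radon-Nikodym theorem doesn't hold.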
[13] In particular, by the supremum property, there exists a sequence $\langle g_n \rangle$ from $G$ such that $\lim_{n\to\infty} \int g_n\,d\mu = \alpha$. Define a sequence $\langle f_n \rangle$ by $f_n = \max\{g_1, \dots, g_n\}$, $f_n \in G$. Since $\langle f_n \rangle$ is a non-decreasing sequence of integrable functions with $\int f_n\,d\mu \le \alpha$, by Levi's theorem there exists an integrable function $f = \lim f_n$ a.e. with $\int_E f\,d\mu = \lim_{n\to\infty} \int_E f_n\,d\mu \le \nu(E)$ (because $f_n \in G$) and hence $f \in G$ and $\int f\,d\mu \le \alpha$. On the other hand, because $g_n \le f_n$ we have $\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu \ge \lim_{n\to\infty} \int g_n\,d\mu = \alpha$. Combining these two inequalities gives $\int f\,d\mu = \alpha$.
5.4 Examples Using Measure Theory
5.4.1 Probability Spaces
Definition 435 If $\mu(X) = 1$, then $\mu$ is a probability measure and $(X, \mathcal{X}, \mu)$ is called a probability space. In this case, $X$ is called the sample space, any measurable set $A \in \mathcal{X}$ is called an event, and $\mu(A)$ is called the probability of the event $A$. For a probability space, we say almost surely (a.s.) interchangeably with almost everywhere (a.e.).
We next illustrate measure spaces through some basic properties of probability.
Definition 436 Let $(X, \mathcal{X}, P)$ be a probability space. Let $\Lambda$ be an arbitrary index set and let $A_i$, $i \in \Lambda$, be events in $\mathcal{X}$. The $A_i$ are independent if and only if for all finite collections $\{A_{i_1}, A_{i_2}, \dots, A_{i_k}\}$ we have
$$P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k}).$$
The next definition makes clear that a random variable is nothing other than a measurable function.
Definition 437 A random variable $Y$ on a probability space $(X, \mathcal{X}, P)$ is a Borel measurable function from $X$ to $\mathbb{R}$ (i.e. $Y : (X, \mathcal{X}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$). If $Y$ is a random variable on $(X, \mathcal{X}, P)$, the probability measure induced by $Y$ is the probability measure $P_Y$ on $\mathcal{B}(\mathbb{R})$ given by $P_Y(B) = P(\{x \in X : Y(x) \in B\})$, $B \in \mathcal{B}(\mathbb{R})$.
The numbers $P_Y(B)$, $B \in \mathcal{B}(\mathbb{R})$, completely characterize the random variable $Y$ in the sense that they provide the probabilities of all events involving $Y$. This information can be captured by a single function from $\mathbb{R}$ to $\mathbb{R}$ as the next definition suggests.
Definition 438 The distribution function of a random variable $Y$ is the function $F : \mathbb{R} \to \mathbb{R}$ given by $F(y) = P(\{x \in X : Y(x) \le y\}) = P_Y((-\infty, y])$.
Definition 439 If $Y$ is a random variable on $(X, \mathcal{X}, P)$, the expectation of $Y$ is defined by $E[Y] = \int_X Y\,dP$ provided the Lebesgue integral exists.
The next result gives a good illustration of the use of simple functions and the Monotone Convergence Theorem.
Theorem 440 Let $Y$ be a random variable on $(X, \mathcal{X}, P)$ with distribution function $F$. Let $g : \mathbb{R} \to \mathbb{R}$ be a Borel measurable function. If $Z = g \circ Y$, then
$$E[Z] = \int_{\mathbb{R}} g(y)\,dF(y) \left(= \int_{\mathbb{R}} g\,dP_Y\right).$$
Exercise 5.4.1 Prove Theorem 440 (Theorem 5.10.2 p. 223 in Ash).
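For a random variable on a finite probability space, the identity in Theorem 440 reduces to two finite sums that can be compared directly. The Python sketch below is an illustration under assumptions chosen here: a fair six-sided die as the probability space, the random variable `Y(x) = x % 3`, and the function `g(y) = y * y`; none of these appear in the text.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical finite probability space: a fair six-sided die.
P = {x: Fraction(1, 6) for x in range(1, 7)}

def Y(x):
    return x % 3            # random variable taking the values 0, 1, 2

def g(y):
    return y * y            # a Borel measurable function applied to Y

def induced_measure(P, Y):
    """P_Y({y}) = P({x : Y(x) = y}), the measure induced by Y on its values."""
    PY = defaultdict(Fraction)
    for x, p in P.items():
        PY[Y(x)] += p
    return dict(PY)

def expectation_on_X(P, Z):
    """E[Z] computed as an integral over the sample space X."""
    return sum((Z(x) * p for x, p in P.items()), Fraction(0))

def expectation_on_R(PY, g):
    """Integral of g with respect to the induced measure P_Y."""
    return sum((g(y) * p for y, p in PY.items()), Fraction(0))

if __name__ == "__main__":
    PY = induced_measure(P, Y)
    print(PY)
    print(expectation_on_X(P, lambda x: g(Y(x))), expectation_on_R(PY, g))
```

Both sums give the same value, once by integrating $g \circ Y$ over the sample space and once by integrating $g$ against the induced measure on the values of $Y$.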
One of the most remarkable results in probability is Kolmogorov's strong law of large numbers.
Theorem 441 (Strong Law of Large Numbers) If $Y_1, Y_2, \dots$ are independent and identically distributed random variables and $E[|Y_1|] < \infty$, then
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} Y_i = E[Y_1] \quad \text{a.s.}$$
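A simulation cannot prove an almost-sure statement, but it can illustrate what the theorem predicts along a single sample path. The Python sketch below makes several assumptions of its own: draws are uniform on $[0,1)$ (so $E[Y_1] = \frac{1}{2}$), the pseudo-random generator of the standard library stands in for an i.i.d. sequence, and the names `running_mean` and `simulate` are introduced here.

```python
import random

def running_mean(samples):
    """Return the sequence of sample averages (1/n) * sum of the first n samples."""
    total = 0.0
    means = []
    for i, y in enumerate(samples, start=1):
        total += y
        means.append(total / i)
    return means

def simulate(n, seed=0):
    rng = random.Random(seed)                   # fixed seed so the run is reproducible
    draws = [rng.random() for _ in range(n)]    # pseudo-i.i.d. uniform on [0, 1)
    return running_mean(draws)

if __name__ == "__main__":
    means = simulate(200_000)
    for n in (10, 1_000, 200_000):
        print(n, means[n - 1])
```

Along this one path the sample averages drift toward $\frac{1}{2}$ as $n$ grows, which is the behavior the theorem asserts for almost every path.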
5.4.2 $L_1$
Let us denote the collection of $\mathcal{L}$-integrable functions $f$ defined on $X \subset \mathbb{R}$ by $L_1(X)$. For instance, $X$ can be all of $\mathbb{R}$ (in which case we write $L_1(\mathbb{R})$) or any measurable subset of $\mathbb{R}$. Hence $L_1(X)$ is the collection of all $\mathcal{L}$-measurable functions $f$ defined on $X$ for which $\int_X |f| < \infty$. It is straightforward to see that $L_1(X)$ is a vector space.
Exercise 5.4.2 Show that $L_1(X)$ is a vector space. Hint: Use Theorem 403.
Can $L_1(X)$ be equipped with a norm? Let us define a function $\|\cdot\|_1 : L_1(X) \to \mathbb{R}$ given by $\|f\|_1 = \int_X |f|$. Does this function satisfy the properties of a norm given in Definition 206?
Exercise 5.4.3 Show that $\|\cdot\|_1$ satisfies properties (i) $\|f\|_1 \ge 0$, $\forall f \in L_1(X)$, (iii) $\|\alpha f\|_1 = |\alpha| \|f\|_1$, $\forall \alpha \in \mathbb{R}$, $f \in L_1(X)$, (iv) $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$, $\forall f, g \in L_1(X)$ of the definition of a norm and the part of (ii) that $f = 0 \Rightarrow \|f\|_1 = 0$.
The next example makes it clear that the converse of part (ii) is not true.
Example 442 If $f$ is the Dirichlet function of Example 360, then $\|f\|_1 = 0$ but $f$ is not equal to 0 everywhere.
To overcome this problem, we will define a relation "$\sim$" on the set of all integrable functions. For $f, g \in L_1$, define $f \sim g$ iff $f = g$ a.e. This relation is an equivalence and hence by Theorem 31 $L_1$ can be partitioned into disjoint classes $\tilde{f}$ of equivalent functions (i.e. functions that are equal a.e.). See Figure 5.4.1.
Exercise 5.4.4 Prove that $f \sim g$ iff $f = g$ a.e. is an equivalence relation using Definition 26.
By Theorem 403, any two functions $f$, $g$ from the same equivalence class have the same norm, $\|f\|_1 = \|g\|_1 \equiv \|\tilde{f}\|_1$. Then the space $L_1$ consisting of equivalence classes with the $\|\cdot\|_1$ norm is a normed vector space.
To keep notation and terminology simple in what follows, we will refer to the elements of $L_1$ as functions rather than equivalence classes of functions. But you should keep in mind that when we refer to a function $f$ we are actually referring to all functions that are equal a.e. to $f$.
The most important question we must ask of our new normed vector space is "Is it complete?" The next theorem provides the answer.
Theorem 443 $(L_1, \|\cdot\|_1)$ is a complete normed vector space (i.e. a Banach space).
Before proving completeness of $L_1$, we note that one strategy used in previous sections is to: first, take a Cauchy sequence $\langle f_n \rangle$ in a given function space and note that for a given $x \in X$, $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$ and $\lim_{n\to\infty} f_n(x) = f(x)$ exists for each $x$ since $\mathbb{R}$ is complete; second, prove that $f_n \to f$ with respect to the norm of the normed function space. Unfortunately, this procedure cannot be used in $L_1$, since for a Cauchy sequence $\langle f_n \rangle$ in $L_1$ a pointwise limit of $\langle f_n(x) \rangle$ may not exist for any point $x$, as the following example shows.
Example 444 Let a sequence $\langle f_n \rangle$ of functions on $[0,1]$ be given by
$$f_1 = \chi_{[0,\frac{1}{2}]}, \quad f_2 = \chi_{[\frac{1}{2},1]},$$
$$f_3 = \chi_{[0,\frac{1}{4}]}, \quad f_4 = \chi_{[\frac{1}{4},\frac{2}{4}]}, \quad f_5 = \chi_{[\frac{2}{4},\frac{3}{4}]}, \quad f_6 = \chi_{[\frac{3}{4},1]},$$
$$f_7 = \chi_{[0,\frac{1}{8}]}, \quad f_8 = \chi_{[\frac{1}{8},\frac{2}{8}]}, \quad \dots, \quad f_{13} = \chi_{[\frac{6}{8},\frac{7}{8}]}, \quad f_{14} = \chi_{[\frac{7}{8},1]},$$
$$f_{15} = \chi_{[0,\frac{1}{16}]}, \quad \dots$$
See Figure 5.4.2. This sequence is Cauchy in $L_1([0,1])$ but there is no point $x \in [0,1]$ for which $\lim_{n\to\infty} f_n(x)$ exists. In other words, $\langle f_n \rangle$ doesn't converge pointwise at any point $x \in [0,1]$.
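The indexing in Example 444 can be made explicit. In the Python sketch below, the function `interval` and its closed-form index computation are assumptions introduced here to reproduce the listed intervals: $f_n$ is placed on level $m$ when $2^m - 1 \le n \le 2^{m+1} - 2$, and the $j$-th interval on that level is $[j/2^m, (j+1)/2^m]$.

```python
from fractions import Fraction

def interval(n):
    """Endpoints (left, right) of the interval whose characteristic function is f_n."""
    level = (n + 1).bit_length() - 1       # level m with 2**m - 1 <= n <= 2**(m + 1) - 2
    j = n - (2 ** level - 1)               # position within the level, 0 <= j < 2**level
    return Fraction(j, 2 ** level), Fraction(j + 1, 2 ** level)

def f(n, x):
    left, right = interval(n)
    return 1 if left <= x <= right else 0

def l1_norm(n):
    left, right = interval(n)
    return right - left                    # the integral of an indicator is the interval length

if __name__ == "__main__":
    x = Fraction(1, 3)
    print([interval(n) for n in (1, 2, 3, 6, 7, 14, 15)])
    print([l1_norm(n) for n in (1, 3, 7, 15)])       # norms shrink like 2**(-m)
    print([f(n, x) for n in range(1, 31)])           # keeps returning to both 0 and 1
```

Since $\|f_n - f_k\|_1 \le \|f_n\|_1 + \|f_k\|_1$ and the norms shrink to zero, the sequence is Cauchy in $L_1$, yet on every level the values at a fixed point include both 0 and 1.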
Proof. (Sketch) Let $\langle f_n \rangle$ be a Cauchy sequence in $L_1$. In order to find a function to which the sequence converges, in light of Example 444, we need to take a more sophisticated approach. The fact that $\langle f_n \rangle$ is Cauchy means we can choose a subsequence $\langle f_{n_k} \rangle$ such that consecutive terms are so close to each other (i.e. $\|f_{n_{k+1}} - f_{n_k}\|_1 < \frac{1}{2^k}$) that the series $\sum_{k=1}^{\infty} \int |f_{n_{k+1}} - f_{n_k}|\,dm$ converges (i.e. the sum is finite). Then by Corollary 408 of Levi's Theorem, the infinite sum $\sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|$ also converges a.e. on $X$, and because $f_{n_{k+1}} - f_{n_k} \le |f_{n_{k+1}} - f_{n_k}|$, $\forall k$, the infinite sum $\sum_{k=1}^{\infty} (f_{n_{k+1}} - f_{n_k})$ converges a.e. as well. But the partial sums of the differences of consecutive terms, added to $f_{n_1}$, give the subsequence itself:
$$f_{n_1} + (f_{n_2} - f_{n_1}) + (f_{n_3} - f_{n_2}) + \dots + (f_{n_k} - f_{n_{k-1}}) = f_{n_k}.$$
Thus the subsequence $\langle f_{n_k} \rangle$ converges a.e. on $X$.
Let $f$ be the function to which $\langle f_{n_k} \rangle$ converges a.e. on $X$. We need to show that $\langle f_{n_k} \rangle \to f$ with respect to $\|\cdot\|_1$ and that $f \in L_1(X)$. To prove the former we can use Fatou's Lemma 393 (since $\langle f_{n_k} \rangle \to f$ a.e.) and to prove the latter we can use the estimate $\|f\|_1 \le \|f - f_{n_k}\|_1 + \|f_{n_k}\|_1 < \infty$. The first term in this inequality is bounded by Fatou's Lemma and the second is bounded since $f_{n_k} \in L_1$, $\forall k$.
Then we have a Cauchy sequence $\langle f_n \rangle$ in $L_1(X)$ whose subsequence $\langle f_{n_k} \rangle \to f$ in $L_1(X)$. Then by Lemma 173, the whole sequence $\langle f_n \rangle \to f$ in $L_1(X)$.
1
The next theorem establishes that the simple and continuous functions are
dense in L
1
(X).
Theorem 445 Let $f$ be an $\mathcal{L}$-integrable function on $\mathbb{R}$ and let $\varepsilon > 0$. Then (i) there is an integrable simple function $\varphi$ such that $\int |f - \varphi| < \varepsilon$ and (ii) there is a continuous function $g$ such that $g$ vanishes (i.e. $g = 0$) outside some bounded interval and such that $\int |f - g| < \frac{\varepsilon}{2}$.
Proof. (of i) Without loss of generality, we may assume that $f \ge 0$ (otherwise $f = f^+ - f^-$ where $f^+$ and $f^-$ are non-negative). If $f$ is $\mathcal{L}$-integrable, then using the supremum property in Definition 388 for $\varepsilon > 0$, there exists a bounded $\mathcal{L}$-measurable function $h$ that vanishes outside a set $E$ of finite measure (i.e. $m(E) < \infty$) such that $h \le f$ and $\left(\int f\right) - \frac{\varepsilon}{2} < \int h$, i.e. $\int (f - h) < \frac{\varepsilon}{2}$. Then by Theorem 367 there is a non-decreasing sequence (since $h$ is bounded) of simple functions $\langle h_n \rangle$ converging uniformly to $h$. Then for $\frac{\varepsilon}{2m(E)} > 0$, $\exists N$ such that $|h_N(x) - h(x)| < \frac{\varepsilon}{2m(E)}$ for all $x \in E$. Hence
$$\int |f - h_N| \le \int |f - h| + \int_E |h_N - h| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2m(E)}\, m(E) = \varepsilon.$$
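(of ii) Given $g$, there exists a continuous function $h$ such that $g(x) = h(x)$ except on a set of measure at most $\frac{\varepsilon}{3}$. Since $f$ is integrable,
$$\int_X f\,dx = \inf_{\psi \ge f} \int_X \psi\,dx \qquad (5.10)$$
so $\forall d > 0$, $\exists \psi$ with $\int |f - \psi|\,dx < d$. In equation (5.10), $\psi$ is a simple function, i.e. $\psi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x)$ with $\mu(E_i) < \infty$, $\forall i$, where $\mu$ is Lebesgue measure. If $\mu(M) < \infty$, then for $\varepsilon > 0$, $\exists$ $F_M$ closed and $G_M$ open such that $F_M \subset M \subset G_M$ and $\mu(G_M) - \mu(F_M) < \varepsilon$. Let us define
$$\varphi_\varepsilon(x) = \frac{\rho(x, \mathbb{R} \setminus G_M)}{\rho(x, \mathbb{R} \setminus G_M) + \rho(x, F_M)}.$$
Then $\varphi_\varepsilon(x) = 0$ if $x \in \mathbb{R} \setminus G_M$ and $\varphi_\varepsilon(x) = 1$ if $x \in F_M$. $\varphi_\varepsilon$ is continuous because $\rho(x, \mathbb{R} \setminus G_M)$ and $\rho(x, F_M)$ are continuous and $\rho(x, \mathbb{R} \setminus G_M) + \rho(x, F_M) \ne 0$. The function $\chi_M - \varphi_\varepsilon$ satisfies $|\chi_M - \varphi_\varepsilon| \le 1$ for $x \in G_M \setminus F_M$ and $\chi_M - \varphi_\varepsilon = 0$ for $x \in \mathbb{R} \setminus (G_M \setminus F_M)$. Hence
$$\int |\chi_M(x) - \varphi_\varepsilon(x)|\,d\mu < \varepsilon.$$
Thus $\chi_M$ is approximated by a continuous function $\varphi_\varepsilon$.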
Separability of $L_1(X)$
In the next chapter we will show that if $X$ is compact, then $L_1(X)$ is separable with the countable dense set being the set of all polynomials with rational coefficients. But if $X$ is an arbitrary $\mathcal{L}$-measurable set, including $X$ with $m(X) = \infty$ (e.g. $X = \mathbb{R}$), we need to find a different countable dense set. We will show that the set $M$ of all finite linear combinations of the form
$$\sum_{i=1}^{n} c_i \chi_{I_i} \qquad (5.11)$$
where the numbers $c_i$, $i = 1, \dots, n$ are rational and $I_i$ are intervals (open, closed, and half-open) with rational endpoints is a countable dense set in $L_1(X)$.
Countability of $M$ is obvious. We need to show that $M$ is dense in $L_1(X)$. By Theorem 445 we know that the set of all integrable simple functions is dense in $L_1(X)$. But every such function can be approximated arbitrarily closely by a function of the same type taking only rational values. Thus given $f \in L_1(X)$ and $\varepsilon > 0$ there is an integrable simple function $\varphi = \sum_{i=1}^{n} y_i \chi_{E_i}$, where $y_i$ are rational coefficients, $E_i$ are mutually disjoint $\mathcal{L}$-measurable sets, and $\cup_{i=1}^{n} E_i = X$, such that $\int_X |f - \varphi|\,dm < \varepsilon$. If the function $\varphi$ were of the type (5.11) we would be done. Unfortunately it is not, because (5.11) requires the $E_i$ to be intervals (recall that the collection of all intervals with rational endpoints is countable whereas the collection of $\mathcal{L}$-measurable subsets of $X$ may not be countable). Hence we need to show that every simple integrable function $\varphi$ can be approximated by functions of the form (5.11). Here we use the fact that if a set $E$ is $\mathcal{L}$-measurable then it can be approximated by an interval; that is, given $\varepsilon > 0$, there is an interval $I$ such that[14]
$$m((E \setminus I) \cup (I \setminus E)) < \varepsilon. \qquad (5.12)$$
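Now using (5.12) for the sets $\{E_i\}_{i=1}^{n}$ we can construct $\{I_i\}_{i=1}^{n}$ such that $m((E_i \setminus I_i) \cup (I_i \setminus E_i)) < \varepsilon$ for $i = 1,...,n$. Let $\hat{I}_i = I_i \setminus \cup_{j<i} I_j$, $i = 1,...,n$. Then the $\hat{I}_i$ are mutually disjoint. Define a function
$$\psi(x) = \begin{cases} y_i & \text{if } x \in \hat{I}_i \\ 0 & \text{if } x \in X \setminus \cup_{i=1}^{n} \hat{I}_i. \end{cases}$$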
14
We proved a similar result in Theorem 347 where a measurable set E is approximated
by open and closed sets.
The functions $\varphi$ and $\psi$ differ from each other on a set $B$ with sufficiently small measure, namely $m(B) = m(\{x \in X : \varphi(x) \neq \psi(x)\}) < n\varepsilon$. Hence
$$\|\varphi - \psi\|_1 = \int_X |\varphi(x) - \psi(x)|\,dm = \int_B |\varphi(x) - \psi(x)|\,dm \le \sup_i |y_i|\, m(B) < \left(n \sup_i |y_i|\right)\varepsilon$$
can be made arbitrarily small by choosing $\varepsilon$ sufficiently small. Thus $\psi$ approximates $\varphi$ and $\psi$ is of the form (5.11). Thus we have the following theorem.
Theorem 446 $L^1(X)$ is separable.
Proof. The countable dense set in $L^1(X)$ is the set given by (5.11).
5.5 Appendix - Proofs in Chapter 5
Proof of Theorem 329. Take a closed finite interval $[a,b]$. Since $[a,b] \subset (a - \frac{\varepsilon}{2}, b + \frac{\varepsilon}{2})$, then $m^*([a,b]) \le l(a - \frac{\varepsilon}{2}, b + \frac{\varepsilon}{2}) = b - a + \varepsilon$, $\forall \varepsilon > 0$, so that $m^*([a,b]) \le b - a$. Next we will show that $m^*([a,b]) \ge b - a$. But this is equivalent to showing that if $\{I_n\}_{n \in \mathbb{N}}$ is an open covering of $[a,b]$, then $\sum_{n \in \mathbb{N}} l(I_n) \ge b - a$. By the Heine-Borel Theorem 194 there is a finite subcollection that also covers $[a,b]$. Since the sum of the lengths of the finite subcollection can be no greater than the sum of the lengths of the original collection, it suffices to show $\sum_{n=1}^{N} l(I_n) \ge b - a$ for $N$ finite. It is possible to construct a finite sequence of open intervals $\langle (a_k, b_k) \rangle_{k=1}^{K}$ with $a_k < b_{k-1} < b_k$ such that $a \in (a_1, b_1)$ and $b \in (a_K, b_K)$ (see footnote 15). Thus
$$\sum_{n \in \mathbb{N}} l(I_n) \ge \sum_{k=1}^{K} l(a_k, b_k) = b_K - (a_K - b_{K-1}) - (a_{K-1} - b_{K-2}) - \dots - (a_2 - b_1) - a_1 \ge b_K - a_1 \ge b - a,$$
or $m^*([a,b]) \ge b - a$. Thus $m^*([a,b]) = b - a$.
15 Since $a \in \cup_{n=1}^{N} I_n$, $\exists (a_1, b_1)$ such that $a \in (a_1, b_1)$. If $b_1 \le b$, then since $b_1 \notin (a_1, b_1)$, $\exists (a_2, b_2)$ such that $b_1 \in (a_2, b_2)$. Continue by induction.
To complete the proof we simply need to recognize that if $I$ is any finite interval, then for a given $\varepsilon > 0$, there is a closed interval $[a,b] \subset I$ such that $l(I) - \varepsilon < l([a,b])$. Hence,
$$l(I) - \varepsilon < l([a,b]) = m^*([a,b]) \le m^*(I) \le m^*(\bar{I}) = l(\bar{I}) = l(I)$$
where the first equality follows from the first part of this theorem, the first weak inequality follows from monotonicity in Theorem 328, the second weak inequality follows from the definition of closure, and the next two equalities follow from the definition of length. Since $l(I) - \varepsilon < m^*(I) \le l(I)$ and $\varepsilon > 0$ is arbitrary, taking $\varepsilon \to 0$ gives $m^*(I) = l(I)$. If $I$ is an infinite interval, $m^*(I) = \infty$.
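Proof of Theorem 331. If $m^*(A_n) = \infty$ for any $n$, then the inequality holds trivially. Assume $m^*(A_n) < \infty$, $\forall n$. Then given $\varepsilon > 0$, boundedness implies that for each $n$, we can choose intervals $\{I^n_k\}_{k \in \mathbb{N}}$ such that $A_n \subset \cup_{k \in \mathbb{N}} I^n_k$ (i.e. the intervals cover $A_n$) and $\sum_{k \in \mathbb{N}} l(I^n_k) \le m^*(A_n) + \frac{\varepsilon}{2^n}$ (see footnote 16). But the collection $\{I^n_k\}_{n,k} = \left(\cup_{n \in \mathbb{N}} \{I^n_k\}_{k \in \mathbb{N}}\right)$ is countable, being the union of a countable number of countable collections, and covers $\cup_{n \in \mathbb{N}} A_n$ (i.e. $\cup_{n \in \mathbb{N}} A_n \subset \cup_{n \in \mathbb{N}} \cup_{k \in \mathbb{N}} I^n_k$). Hence
$$m^*(\cup_{n \in \mathbb{N}} A_n) \le \sum_{n \in \mathbb{N}} \sum_{k \in \mathbb{N}} l(I^n_k) \le \sum_{n \in \mathbb{N}} \left( m^*(A_n) + \frac{\varepsilon}{2^n} \right) = \sum_{n \in \mathbb{N}} m^*(A_n) + \varepsilon.$$
Subadditivity follows since $\varepsilon > 0$ was arbitrary and we can let $\varepsilon \to 0$.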
Proof of Theorem 341. Corollary 338 already established that $\mathcal{L}$ is an algebra. Hence it is sufficient to prove that if a set $E = \cup_{n \in \mathbb{N}} E_n$ where each $E_n$ is L-measurable, then $E$ is L-measurable. By Theorem 84, we may assume without loss of generality that the $E_n$ are mutually disjoint sets.
Let $A$ be any set and $F_N = \cup_{n=1}^{N} E_n$. Since $\mathcal{L}$ is an algebra and $E_1, ..., E_N$ are in $\mathcal{L}$, the sets $F_N$ are L-measurable. For any set $A$, we have
$$m^*(A) = m^*(A \cap F_N) + m^*(A \cap F_N^c) \ge m^*(A \cap F_N) + m^*(A \cap E^c) = \sum_{n=1}^{N} m^*(A \cap E_n) + m^*(A \cap E^c) \tag{5.13}$$
16 Existence of the countable collection follows from Theorem 108 and the inequality holds by the property of the infimum that $x = \inf A$ implies that $\forall \varepsilon > 0$, $\exists a \in A$ such that $a < x + \varepsilon$.
where the first equality follows by Definition 334, the inequality follows since $F_N^c \supset E^c$ (see footnote 17), and the last equality follows by Lemma 339. Since the left hand side of (5.13) is independent of $N$, letting $N \to \infty$ we have
$$m^*(A) \ge \sum_{n=1}^{\infty} m^*(A \cap E_n) + m^*(A \cap E^c) \ge m^*(A \cap E) + m^*(A \cap E^c). \tag{5.14}$$
Here the second inequality follows from Theorem 331. But (5.14) is simply the sufficient condition for $E$ to be L-measurable.
Proof of Theorem 345. Let $A$ be any set, $A_1 = A \cap (a, \infty)$, and $A_2 = A \cap (-\infty, a]$. According to (5.1), it is sufficient to show $m^*(A) \ge m^*(A_1) + m^*(A_2)$. If $m^*(A) = \infty$, the assertion is trivially true. If $m^*(A) < \infty$, then for each $\varepsilon > 0$ there is a countable collection $\{I_n\}$ of open intervals which cover $A$ and for which $\sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon$ by the infimum property in Definition 327. Let $I'_n = I_n \cap (a, \infty)$ and $I''_n = I_n \cap (-\infty, a]$. Then $I'_n \cup I''_n = I_n \cap \mathbb{R} = I_n$ and $I'_n \cap I''_n = \emptyset$. Therefore, $l(I_n) = l(I'_n) + l(I''_n) = m^*(I'_n) + m^*(I''_n)$. Since $A_1 \subset (\cup_{n=1}^{\infty} I'_n)$, then $m^*(A_1) \le m^*(\cup_{n=1}^{\infty} I'_n) \le \sum_{n=1}^{\infty} m^*(I'_n)$. Similarly, since $A_2 \subset (\cup_{n=1}^{\infty} I''_n)$, then $m^*(A_2) \le m^*(\cup_{n=1}^{\infty} I''_n) \le \sum_{n=1}^{\infty} m^*(I''_n)$. Thus,
$$m^*(A_1) + m^*(A_2) \le \sum_{n=1}^{\infty} \left[ m^*(I'_n) + m^*(I''_n) \right] \le \sum_{n=1}^{\infty} l(I_n) \le m^*(A) + \varepsilon.$$
But since $\varepsilon > 0$ was arbitrary, the result follows.
Proof of Measurable Selection Theorem 371. By induction, we will define a sequence of measurable functions $f_n : X \to Y$ such that
(i) $d(f_n(z), \Gamma(z)) < \frac{1}{2^n}$ and
(ii) $d(f_{n+1}(z), f_n(z)) \le \frac{1}{2^{n-1}}$ on $X$ for all $n$.
Then we are done, since from (ii) it follows that $\langle f_n \rangle$ is Cauchy and due to completeness of $Y$ there exists a function $f : X \to Y$ such that $f_n(z) \to f(z)$ on $X$, and by Corollary 358 the pointwise limit of a sequence of measurable functions is measurable. Condition (i) guarantees that $f(z) \in \Gamma(z)$, $\forall z \in X$, so that $f$ is a measurable selection (here we use the fact that $\Gamma(z)$ is closed and $d(f(z), \Gamma(z)) = 0$ implies $f(z) \in \Gamma(z)$).
17 Recall by DeMorgan's Law that $F_N^c = \left[ \cup_{n=1}^{N} E_n \right]^c = \cap_{n=1}^{N} E_n^c$.
Now we construct a sequence $\langle f_n \rangle$ of measurable functions satisfying (i) and (ii). Let $\{y_n, n \in \mathbb{N}\}$ be a dense set in $Y$ (since $Y$ is separable such a countable set exists). Define $f_0(z) = y_p$ where $p$ is the smallest integer such that $\Gamma(z) \cap B_1(y_p) \neq \emptyset$ ($f_0(z)$ is well defined because $\{y_n, n \in \mathbb{N}\}$ is dense in $Y$). Since $\Gamma$ is measurable then
$$f_0^{-1}(y_p) = \left[ \Gamma^{-1}(B_1(y_p)) \right] \setminus \left[ \cup_{m<p} \Gamma^{-1}(B_1(y_m)) \right] \in \mathcal{L}.$$
Let $V$ be open in $Y$. Then $f_0^{-1}(V)$ is at most a countable union of such $f_0^{-1}(y_p)$. Hence $f_0^{-1}(V)$ is measurable so that $f_0$ is measurable. Suppose we already have $f_k$ measurable. Then $z \in f_k^{-1}(y_i) \equiv D_i$ implies $f_k(z) = y_i$ and $d(f_k(z), \Gamma(z)) < \frac{1}{2^k}$ (i.e. $\Gamma(z) \cap B_{2^{-k}}(y_i) \neq \emptyset$). Therefore we can define $f_{k+1}(z) = y_p$ for $z \in D_i$ where $p$ is the smallest integer such that $\Gamma(z) \cap B_{2^{-k}}(y_i) \cap B_{2^{-k-1}}(y_p) \neq \emptyset$. Thus $f_{k+1}$ is defined on $X = \cup_{i \ge 1} D_i$, it is measurable, and we have $d(f_{k+1}(z), \Gamma(z)) < \frac{1}{2^{k+1}}$ and $d(f_{k+1}(z), f_k(z)) \le \frac{1}{2^k} + \frac{1}{2^{k+1}} \le \frac{1}{2^{k-1}}$ on $X$.
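Proof of Theorem 382. ($\Leftarrow$) If $f$ is L-measurable and bounded by $M$, then we can construct sets
$$E_k = \left\{ x \in E : \frac{kM}{n} \ge f(x) > \frac{(k-1)M}{n} \right\}, \quad -n \le k \le n,$$
which are measurable, disjoint, and have union $E$. Thus $\sum_{k=-n}^{n} mE_k = mE$. Define simple functions $\psi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} k \chi_{E_k}(x)$ and $\varphi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} (k-1) \chi_{E_k}(x)$. Then $\psi_n(x) \ge f(x) \ge \varphi_n(x)$. Thus
$$L^u\!\int_E f(x)\,dx = \inf_{\psi \ge f} \int_E \psi(x)\,dx \le \int_E \psi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} k\, mE_k \tag{5.15}$$
and
$$L_l\!\int_E f(x)\,dx = \sup_{\varphi \le f} \int_E \varphi(x)\,dx \ge \int_E \varphi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} (k-1)\, mE_k. \tag{5.16}$$
Then (5.15)-(5.16) implies
$$0 \le L^u\!\int_E f(x)\,dx - L_l\!\int_E f(x)\,dx \le \frac{M}{n} \sum_{k=-n}^{n} mE_k = \frac{M}{n}\, mE.$$
Since $mE < \infty$ by assumption, $\lim_{n \to \infty} \frac{M}{n}\, mE = 0$.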
Proof of Bounded Convergence Theorem 386. Since $f$ is the limit of L-measurable functions $f_n$ it is L-measurable by Theorem 364 and hence integrable. By Theorem 366 we know that given $\varepsilon > 0$, $\exists N$ and an L-measurable set $A \subset E$ with $mA < \frac{\varepsilon}{4M}$ such that for $n \ge N$ and $x \in E \setminus A$ we have $|f_n(x) - f(x)| < \frac{\varepsilon}{2mE}$. Furthermore, since $|f_n(x)| \le M$, $\forall n \in \mathbb{N}$ and $\forall x \in E$, then $|f(x)| \le M$, $\forall x \in E$ and $|f_n(x) - f(x)| \le 2M$, $\forall x \in A$. Therefore,
$$\left| \int_E f_n - \int_E f \right| = \left| \int_E (f_n - f) \right| \le \int_E |f_n - f| = \int_{E \setminus A} |f_n - f| + \int_A |f_n - f| < \frac{\varepsilon}{2mE}\, m(E \setminus A) + 2M\, mA < \varepsilon, \quad \forall n \ge N,$$
where the first inequality follows by monotonicity (i.e. (iii) of Theorem 385) and the second equality follows from (v) of Theorem 385. Hence $\int_E f = \lim_{n \to \infty} \int_E f_n$.
Proof of Fatou's Lemma 393. WLOG we may assume the convergence is everywhere since integrals over sets of measure zero are zero. Let $h$ be a bounded L-measurable function such that $h \le f$ and which vanishes outside a set $H = \{x \in E : h(x) \neq 0\}$ of finite measure (i.e. $mH < \infty$). Define a sequence of functions $h_n(x) = \min\{h(x), f_n(x)\}$. Then $h_n$ is bounded (by the bound for $h$) and vanishes outside $H$. Moreover, $\lim_{n \to \infty} h_n(x) = \lim_{n \to \infty} \min\{h(x), f_n(x)\} = \min\{h(x), f(x)\} = h(x)$ on $H$. Since $\langle h_n \rangle$ is a uniformly bounded sequence of L-measurable functions such that $h_n \to h$, then $\lim_{n \to \infty} \int_H h_n = \int_H h$ by the Bounded Convergence Theorem 386. Since $h$ vanishes outside $H$, then $\int_E h = \int_H h$.
While $f_n \to f$ a.e., we do not have that the sequence $\langle \int_E f_n \rangle$ is convergent. However,
$$\int_E h = \lim_{n \to \infty} \int_H h_n = \liminf_{n \to \infty} \int_H h_n \le \liminf_{n \to \infty} \int_E f_n$$
where the second equality follows from the fact that liminf = limsup at a limit point and the inequality follows since $h_n(x) \le f_n(x)$ by construction (see footnote 18).
18
We chose liminf rather than limsup since this gives a tighter bound.
Taking the supremum over all $h \le f$ we have
$$\sup_{h \le f} \int_E h = \int_E f \le \liminf_{n \to \infty} \int_E f_n$$
where the equality uses Definition 388.
Proof of Lebesgue Dominated Convergence Theorem 404. By Lemma 402, $f_n$ is integrable over $E$. Since $\lim_{n \to \infty} f_n = f$ a.e. on $E$ and $|f_n| \le g$, then $|f| \le g$ a.e. on $E$. Hence $f$ is integrable over $E$.
Now consider a sequence $\langle h_n \rangle$ of functions defined by $h_n = f_n + g$, which is nonnegative by construction and integrable for each $n$. Therefore, by Fatou's Lemma 393, we have $\int_E (f + g) \le \liminf_{n \to \infty} \int_E (f_n + g)$, which implies $\int_E f \le \liminf_{n \to \infty} \int_E f_n$ by (ii) of Theorem 403.
Similarly, construct the sequence $\langle k_n \rangle$ of functions defined by $k_n = g - f_n$, which is again nonnegative by construction and integrable for each $n$. Therefore, by Fatou's Lemma 393, we have $\int_E (g - f) \le \liminf_{n \to \infty} \int_E (g - f_n)$, which implies $\int_E f \ge \limsup_{n \to \infty} \int_E f_n$ by (ii) of Theorem 403. Combining the two inequalities, $\limsup_{n \to \infty} \int_E f_n \le \int_E f \le \liminf_{n \to \infty} \int_E f_n$, so the limit exists and equals $\int_E f$.
Proof of Levi's Theorem 407. Assume that $f_1 \ge 0$ (otherwise we would consider $\bar{f}_n = f_n - f_1$). Define $\Omega = \{x \in A : f_n(x) \to +\infty\}$. Then $\Omega = \cap_{r=1}^{\infty} \cup_{n=1}^{\infty} \Omega_n^{(r)}$, where $\Omega_n^{(r)} = \{x \in A : f_n(x) > r\}$. Using Chebyshev's inequality (Lemma ??), $m\left(\Omega_n^{(r)}\right) \le \frac{K}{r}$. Since $\Omega_1^{(r)} \subset \Omega_2^{(r)} \subset \dots \subset \Omega_n^{(r)} \subset \dots$, this implies $m\left(\cup_{n=1}^{\infty} \Omega_n^{(r)}\right) \le \frac{K}{r}$. But for any $r$ we have $\Omega \subset \cup_{n=1}^{\infty} \Omega_n^{(r)}$. Then $m(\Omega) \le \frac{K}{r}$. Since $r$ was arbitrary, we have $m(\Omega) = 0$. Thus we have proved that $\langle f_n \rangle \to f$ a.e. on $A$. Let us define $A_r = \{x \in A : r - 1 \le f(x) < r\}$, $r \in \mathbb{N}$, and let $\varphi(x) = r$ on $A_r$. If we prove that $\varphi$ is integrable on $A$ then using the Lebesgue Dominated Convergence Theorem 404 we can conclude that Levi's theorem holds. Let $B_s = \cup_{r=1}^{s} A_r$. Since on $B_s$, $f_n$ and $f$ are bounded and $\varphi(x) \le f(x) + 1$, we have
$$\int_{B_s} \varphi(x)\,dm \le \int_{B_s} f(x)\,dm + m(A) = \lim_{n \to \infty} \int_{B_s} f_n(x)\,dm + m(A) \le K + m(A)$$
where $\int_{B_s} \varphi(x)\,dm = \sum_{r=1}^{s} r\, m(A_r)$. Hence we have $\sum_{r=1}^{s} r\, m(A_r) \le K + m(A)$ for any $s$. Boundedness of the partial sums of a series with nonnegative terms means that the infinite series $\sum_{r=1}^{\infty} r\, m(A_r)$ exists and equals $\int_A \varphi(x)\,dm$.
Proof of Hahn Decomposition Theorem 427. Due to Zorn's lemma a maximal system $\mathcal{E}$ of disjoint measurable sets $E$ with $\nu(E) < 0$ exists. Moreover $\mathcal{E}$ is countable (by Lemma 422). Put $B = \cup\{E \in \mathcal{E}\}$. Then $B$ is measurable and negative (because all of its subsets have negative signed measure).
Let $A = X \setminus B$. We have $A \cap B = \emptyset$, $A \cup B = X$ and $A$ is measurable. We have to show that $A$ is positive. By contradiction assume that $A$ is not positive. Then there exists
$$E_0 \in \mathcal{X} \text{ such that } E_0 \subset A \text{ and } \nu(E_0) < 0. \tag{5.17}$$
Let $\mathcal{E}_0$ denote the maximal collection of disjoint measurable sets $F \subset E_0$ for which $\nu(F) > 0$ (at least one such set $F$ exists because otherwise $E_0$ would be negative and would contradict maximality of $\mathcal{E}$). Due to Lemma 422 the collection $\mathcal{E}_0$ is countable. Let $F_0 = \cup\{F \in \mathcal{E}_0\}$. We have
$$\nu(F_0) > 0 \text{ and } F_0 \subset E_0. \tag{5.18}$$
It follows that $E_0 \setminus F_0$ is negative because, by construction, it doesn't contain a positive measurable set. Then from the equality $E_0 = (E_0 \setminus F_0) \cup F_0$ we have
$$\nu(E_0) = \nu(E_0 \setminus F_0) + \nu(F_0). \tag{5.19}$$
From (5.17), (5.18), and (5.19) we have $\nu(E_0 \setminus F_0) < 0$. The set $E_0 \setminus F_0$ is then negative, with negative measure, and $(E_0 \setminus F_0) \cap B = \emptyset$, which contradicts the maximality of the set $B$.
Proof of Radon-Nikodym Theorem 434. By (ii) of Lemma 432, it suffices to deal with $\nu$ which is non-negative (i.e. a measure). Also since $\mu$ is $\sigma$-finite, the whole space $X$ can be decomposed into countably many disjoint sets $\{E_i\}$ for which $\mu(E_i) < \infty$. Hence in the proof we can assume that $\mu$ and $\nu$ are both finite measures on $X$. Let $G$ be the set of all non-negative, $\mathcal{X}$-measurable, integrable functions $g$ for which
$$\int_E g\,d\mu \le \nu(E), \quad \forall E \in \mathcal{X}.$$
Setting $E = X$ we have
$$\int_X g\,d\mu \equiv \int g\,d\mu \le \nu(X), \quad \forall g \in G.$$
Hence the set of real numbers $\left\{ \int g\,d\mu,\ g \in G \right\}$ is bounded from above by $\nu(X)$ and thus there exists a real number $\alpha$ such that
$$\alpha = \sup_{g \in G} \int g\,d\mu.$$
Then by the supremum property, there exists a sequence $\langle g_n \rangle_{n=1}^{\infty}$ from $G$ such that
$$\lim_{n \to \infty} \int g_n\,d\mu = \alpha.$$
Let us set, for $n \in \mathbb{N}$, $f_n = \max\{g_1, ..., g_n\}$. Clearly $f_n \le f_{n+1}$ for $n \in \mathbb{N}$. Next we show that $f_n \in G$. It suffices to show that if $g_1, g_2 \in G$, then $\max\{g_1, g_2\} \in G$. Given a set $E \in \mathcal{X}$, define the sets
$$F = \{x \in E : \max(g_1, g_2)(x) = g_1(x)\}$$
$$G = \{x \in E : \max(g_1, g_2)(x) = g_2(x) \neq g_1(x)\}.$$
Then because the sets $F$ and $G$ are disjoint we have $\int_E \max(g_1, g_2)\,d\mu = \int_{F \cup G} \max(g_1, g_2)\,d\mu = \int_F g_1\,d\mu + \int_G g_2\,d\mu \le \nu(F) + \nu(G) = \nu(E)$. Thus $\max(g_1, g_2)$ belongs to the set of functions $G$ and consequently each function $f_n$, $n \in \mathbb{N}$, belongs to it as well.
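Since $\alpha = \sup_{g \in G} \int g\,d\mu$, then $\int f_n\,d\mu \le \alpha$ for $n \in \mathbb{N}$. Let $f : X \to \mathbb{R}$ be defined as $f(x) = \lim_{n \to \infty} f_n(x)$. This is a well defined function because for any $x \in X$, the sequence $\langle f_n(x) \rangle$ is non-decreasing and hence $\lim f_n(x)$ exists. Then by Levi's Theorem 407 $f$ is integrable and $\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu$.
We next show that $f \in G$. For $E \in \mathcal{X}$ and $n \in \mathbb{N}$ we have $\chi_E f_n \le \chi_E f_{n+1}$ and $\lim_{n \to \infty} \chi_E(x) f_n(x) = \chi_E(x) f(x)$ for all $x \in X$. Then
$$\int_E f\,d\mu = \int f \chi_E\,d\mu = \int \lim_{n \to \infty} f_n \chi_E\,d\mu = \lim_{n \to \infty} \int f_n \chi_E\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu \le \nu(E). \tag{5.20}$$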
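This shows that $f \in G$ and hence $\int f\,d\mu \le \alpha$. Because $f_n \ge g_n$ for $n \in \mathbb{N}$ we have $\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu \ge \lim_{n \to \infty} \int g_n\,d\mu = \alpha$. Combining the last two inequalities we have $\int f\,d\mu = \alpha$.
We now show that $\int_E f\,d\mu = \nu(E)$ for all $E \in \mathcal{X}$. By contradiction, assume this equality doesn't hold. Then due to (5.20), $\int_E f\,d\mu < \nu(E)$ for some $E \in \mathcal{X}$ and then the set function
$$\nu_0(E) = \nu(E) - \int_E f\,d\mu \tag{5.21}$$
is a finite measure not identically equal to zero. Since $\nu_0 \ll \mu$ (because $\nu \ll \mu$ and $\int f\,d\mu \ll \mu$), using Lemma 433 there exists $\varepsilon > 0$ and $E_0 \in \mathcal{X}$ such that $\mu(E_0) > 0$ and
$$\varepsilon \mu(E_0 \cap F) \le \nu_0(E_0 \cap F), \quad \forall F \in \mathcal{X}. \tag{5.22}$$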
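Set $g_0 = f + \varepsilon \chi_{E_0}$. We have $\int g_0\,d\mu = \int f\,d\mu + \varepsilon \mu(E_0) > \int f\,d\mu = \alpha$. If we show that $g_0 \in G$ this would contradict the fact that $\alpha = \sup_{g \in G} \int g\,d\mu$. For any $F \in \mathcal{X}$ and using (5.21) and (5.22) we have
$$\int_F g_0\,d\mu = \int_F \left( f + \varepsilon \chi_{E_0} \right) d\mu = \int_F f\,d\mu + \varepsilon \mu(E_0 \cap F) \le \int_F f\,d\mu + \nu_0(E_0 \cap F)$$
$$= \int_F f\,d\mu + \nu(E_0 \cap F) - \int_{E_0 \cap F} f\,d\mu = \int_{F \setminus (E_0 \cap F)} f\,d\mu + \nu(E_0 \cap F) \le \nu(F \setminus (E_0 \cap F)) + \nu(E_0 \cap F) = \nu(F).$$
Hence $g_0 \in G$, which leads to the contradiction. The uniqueness of $f$ (except on a set of measure zero) follows from Exercise 5.2.4.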
Proof of Theorem 443. Let $\langle f_n \rangle$ be a Cauchy sequence in $L^1$ so that $\|f_m - f_n\|_1 \to 0$ as $m, n \to \infty$. Then we can find a sequence of indices $\langle n_k \rangle$ with $n_1 < n_2 < ... < n_k < ...$ such that
$$\left\| f_{n_k} - f_{n_{k-1}} \right\|_1 = \int_X |f_{n_k} - f_{n_{k-1}}|\,dm < \frac{1}{2^k}, \quad k = 1, 2, ... \tag{5.23}$$
Define a sequence $\langle g_k \rangle$ by $g_k = f_{n_k} - f_{n_{k-1}}$ for $k = 2, 3, ...$ with $g_1 = f_{n_1}$. Then (5.23) is simply $\int_X |g_k|\,dm < \frac{1}{2^k}$ and taking the infinite sum of both sides yields
$$\sum_{k=1}^{\infty} \int_X |g_k|\,dm \le \sum_{k=1}^{\infty} \frac{1}{2^k} = 1.$$
Thus by the Corollary of Levi's Theorem 408, $\sum_{k=1}^{\infty} |g_k|$ converges a.e. on $X$. Since $g_k \le |g_k|$, then $\sum_{k=1}^{\infty} g_k$ converges a.e. on $X$ (i.e. there exists a function $f$ such that $\sum_{k=1}^{\infty} g_k \to f$ a.e. on $X$). But
$$f = \sum_{k=1}^{\infty} g_k = \lim_{J \to \infty} \sum_{k=1}^{J} g_k = \lim_{J \to \infty} (g_1 + g_2 + ... + g_J)$$
$$= \lim_{J \to \infty} \left[ f_{n_1} + (f_{n_2} - f_{n_1}) + (f_{n_3} - f_{n_2}) + ... + (f_{n_J} - f_{n_{J-1}}) \right] = \lim_{J \to \infty} f_{n_J}.$$
Hence $\langle f_{n_k} \rangle \to f$ a.e. on $X$.
Now we show that $\langle f_{n_k} \rangle \to f$ with respect to $\|\cdot\|_1$ and that $f \in L^1(X)$. Since $\langle f_{n_k} \rangle$ is Cauchy in $L^1(X)$ (a subsequence of a Cauchy sequence is Cauchy), given $\varepsilon > 0$,
$$\int |f_{n_k} - f_{n_l}|\,dm < \varepsilon \tag{5.24}$$
for sufficiently large $k$ and $l$. Hence by Fatou's Lemma 393 we can take the limit as $l \to \infty$ behind the integral in (5.24), obtaining
$$\int |f_{n_k}(x) - f(x)|\,dm = \|f - f_{n_k}\|_1 \le \varepsilon.$$
Since
$$\|f\|_1 = \|f - f_{n_k} + f_{n_k}\|_1 \le \|f - f_{n_k}\|_1 + \|f_{n_k}\|_1 \le \varepsilon + \|f_{n_k}\|_1 < \infty$$
it follows that $f \in L^1(X)$ and $\langle f_{n_k} \rangle \to f$ in $L^1(X)$. But by Lemma 173, if a Cauchy sequence contains a subsequence converging to a limit, then the sequence itself converges to the same limit. Hence $\langle f_n \rangle \to f$ in $L^1(X)$.
5.6 Bibliography for Chapter 5
This material is based on Royden (Chapters 3, 4, 11, 12) and Jain and Gupta
(1986) Lebesgue Measure and Integration, New York: Wiley (Chapters 3 to
5).
Chapter 6
Function Spaces
In this chapter we will consider applications of functional analysis in economics such as dynamic programming, existence of equilibrium price functionals, and approximation of functions. In the first case, we can represent complicated sequence problems such as the optimal growth model by a simple functional equation. In particular, a representative household's lifetime utility, conditional on an initial level of capital $k$, is denoted by the function $v(k)$ which solves
$$v(k) = \max_{k' \in [0,K]} u(f(k) - k') + \beta v(k')$$
where $k'$ denotes next period's choice of capital which lies in some compact set $X = [0,K]$, $u : \mathbb{R}_+ \to \mathbb{R}$ is an increasing, continuous function representing the household's preferences over consumption, which is just output that is not saved for next period (i.e. $f(k) - k'$), and $\beta \in (0,1)$ represents the fact that households discount the future. We can think of the above equation as defining an operator $T$ which maps continuous functions defined on a compact set (the $v(k')$ on the right hand side of the above equation) into continuous functions (the $v(k)$ on the left hand side). If we let $C(X)$ denote the set of continuous functions defined on the compact set $X$, then we have $T : C(X) \to C(X)$. In this chapter we analyse under what conditions solutions to such functional equations exist. Another simple example of such operators from mathematics is given by differential equations.
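The operator $T$ can be made concrete by restricting capital to a finite grid and applying $T$ repeatedly. The following Python sketch is only an illustration under assumptions that are not in the text: it takes $u(c) = \sqrt{c}$, $f(k) = k^{\alpha}$ with $\alpha = 0.3$, $\beta = 0.95$, and it only allows choices $k'$ with $f(k) - k' \ge 0$ so that consumption is non-negative. All function names are hypothetical.

```python
import math

def bellman_operator(v, grid, beta=0.95, alpha=0.3):
    """Apply T to a function v stored as a list of values on an increasing grid."""
    new_v = []
    for k in grid:
        output = k ** alpha                      # assumed technology f(k)
        best = -math.inf
        for j, k_next in enumerate(grid):
            if k_next > output:                  # consumption would be negative
                break                            # grid is increasing, so stop here
            value = math.sqrt(output - k_next) + beta * v[j]  # assumed u(c) = sqrt(c)
            best = max(best, value)
        new_v.append(best)
    return new_v

def sup_distance(v, w):
    """Sup metric between two functions on the same grid."""
    return max(abs(a - b) for a, b in zip(v, w))

def value_iteration(grid, beta=0.95, tol=1e-8, max_iter=10_000):
    """Iterate T from v = 0 until successive iterates are within tol."""
    v = [0.0] * len(grid)
    for _ in range(max_iter):
        new_v = bellman_operator(v, grid, beta)
        if sup_distance(new_v, v) < tol:
            return new_v
        v = new_v
    return v
```

For instance, `value_iteration([i / 20 for i in range(21)])` returns an approximate fixed point of $T$ on a 21-point grid for $X = [0,1]$. On the grid, successive iterates move closer together in the sup metric by a factor of at most $\beta$, which is the behaviour that the existence results of this chapter formalize.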
Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. In Chapter 4 we studied functions $f$ that took points in a metric space $(X, d_X)$ into points in a metric space $(Y, d_Y)$. Now let $F(X,Y)$ denote the collection of all such functions $f : X \to Y$. Let $B(X,Y)$ be a subset of $F(X,Y)$ with the property that for each pair $f, g \in B(X,Y)$, the set $\{d_Y(f(x), g(x)) : x \in X\}$ is bounded. We say $B(X,Y)$ is the collection of all bounded functions.
Definition 447 Given $B(X,Y)$, define a metric $d : B \times B \to \mathbb{R}$ by
$$d(f,g) = \sup\{d_Y(f(x), g(x)) : x \in X\}.$$
This metric is called the sup (supremum) metric. See Figure 6.1.
Exercise 6.0.1 Show that $d$ is a metric. To do so, see Definition 125.
Example 448 The existence of the metric $d$ (i.e. of the supremum) is guaranteed only if the space is bounded. It should be clear that there are many functions which do not belong to $B(X,Y)$ and hence that there are many functions upon which $d$ cannot be applied. For one example, let $f : (0,1) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$ and $g(x) = 0$, so that $\sup\{|f(x) - g(x)| : x \in (0,1)\} = \infty$.
We saw in Chapter 4 that a fundamental property of a metric space is its completeness. What can be said about the completeness of $(B, d)$?
Theorem 449 Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. If $(Y, d_Y)$ is complete, then the metric space $(B, d)$ of all bounded functions $f : X \to Y$ with sup norm $d$ is complete.
Exercise 6.0.2 Prove Theorem 449. Hint: To prove the theorem, the following method of constructing $f$ is useful. Let $\langle f_n \rangle$ be a Cauchy sequence in $(B, d)$. Then $\forall x \in X$, the sequence $\langle f_n(x) \rangle$ is Cauchy in $Y$ and since $(Y, d_Y)$ is complete we have $\langle f_n(x) \rangle$ converges to $f(x)$ in $Y$, $\forall x \in X$. Show that $f : X \to Y$ defined by $f(x) = \lim_{n \to \infty} f_n(x)$, $\forall x \in X$, is the function that $\langle f_n \rangle$ converges to with respect to the sup norm.
If $Y$ is a vector space, then the metric space $(Y, d_Y)$ is also a normed vector space by Theorem 207. Then $F(X,Y)$ is a vector space where $(f + g)(x) = f(x) + g(x)$ and $(\alpha f)(x) = \alpha f(x)$. Its subspace of all bounded functions $B(X,Y) \subset F(X,Y)$ is a normed vector space $(B, \|\cdot\|)$ with the norm $\|f\| = d(f, 0) = \sup\{\|f(x)\|, x \in X\}$. This norm is called the sup norm. Theorem 449 states that if $(Y, \|\cdot\|_Y)$ is complete, then $(B, \|\cdot\|)$ is also complete.
A sequence $\langle f_n \rangle$ converges to $f : X \to Y$ in $B$ with respect to the sup norm if
$$\|f_n - f\| = \sup\{\|f_n(x) - f(x)\|, x \in X\} \to 0 \text{ as } n \to \infty.$$
In Chapter 4 we introduced two types of convergence of a sequence of functions: pointwise and uniform. These two types of convergence are defined in terms of the metric (norm) of the space $Y$ (e.g. if $(Y, d_Y) = (\mathbb{R}, |\cdot|)$, then in terms of the absolute value). Now we have introduced $(B, d)$ where $d$ is the metric on $B(X,Y)$. As in any metric space we define convergence of elements (in this case functions) with respect to its metric (in this case $d$). The question is, "Is convergence with respect to $d$ related to pointwise or uniform convergence respectively?". The following theorem addresses this question.
Theorem 450 Let $\langle f_n \rangle$ be a sequence of functions in $B(X,Y)$. Then $\langle f_n \rangle \to f \in B(X,Y)$ with respect to the sup norm if and only if $\langle f_n \rangle \to f$ uniformly on $X$.
Exercise 6.0.3 Prove Theorem 450.
In light of Theorem 450, one might wonder if there exists a metric $d$ on $F(X,Y)$ or on a subspace such that convergence of $\langle f_n \rangle$ with respect to this metric would be equivalent to pointwise convergence. Unfortunately, no such metric exists.
Before proceeding, we list the principal results of the chapter. Here we introduce two important function spaces: the space of bounded continuous functions (denoted $C(X)$) and $p$-integrable functions (denoted $L^p(X)$). We give necessary and sufficient conditions for compactness in $C(X)$ in Ascoli's Theorem 458. Then we deal with the problem of approximating continuous functions. The fundamental result is given in a very general set of Theorems by Stone and Weierstrass (the lattice version is 464 and the algebraic version is 468) which provide the conditions for a set to be dense in $C(X)$. Next the Brouwer Fixed Point Theorem 302 of Chapter 4 on finite dimensional spaces is generalized to infinite dimensional Banach spaces in the Schauder Fixed Point Theorem 475. Next we introduce the $L^p(X)$ space and show that it is complete in the Riesz-Fischer Theorem 481. Among $L^p$ spaces, we show that $L^2$ is a Hilbert space (i.e. that it is a complete normed vector space with the inner product) and consider the Fourier series of a function in $L^2$. Then we introduce linear operators and functionals, as well as the notion of a dual space of a normed vector space. We construct dual spaces for most common spaces: Euclidean, Hilbert, $\ell_p$, and most importantly for $L^p$ in the Riesz Representation Theorem 532. Next we show that one can construct bounded linear functionals on a given space $X$ in the Hahn-Banach Theorem 539, which is used to prove certain separation results such as the fact that two disjoint convex sets can be separated by a linear functional in Theorem 549. Such results are used extensively in economics; for instance, they are employed to establish the Second Welfare Theorem. The chapter ends with nonlinear operators. First we introduce the weak topology on a normed vector space and develop a variational method of optimizing nonlinear functions in Theorem 572. Then we consider another method of finding the optimum of a nonlinear functional by dynamic programming.
6.1 The set of bounded continuous functions
Let $C(X,Y)$ denote the set of all continuous functions $f : X \to Y$. In order to define a normed vector space, we need to equip this set with a norm. We first consider the sup norm. Since a continuous function can be unbounded (e.g. $f : (0,1] \to \mathbb{R}$ given by $f(x) = \frac{1}{x}$), the sup norm may not be well defined on the whole set $C(X,Y)$. Hence we will restrict attention to a subset of $C(X,Y)$ that contains only bounded continuous functions, which we denote $BC(X,Y)$. Next we consider important properties of this space $(BC(X,Y), \|\cdot\|_\infty)$ where $\|\cdot\|_\infty$ is the sup norm.
6.1.1 Completeness
Even if $(Y, d_Y)$ is complete, we cannot directly use Theorem 449 to prove that $(BC(X,Y), \|\cdot\|_\infty)$ is complete because $(BC(X,Y), \|\cdot\|_\infty)$ is a subspace of the complete space $(B(X,Y), \|\cdot\|_\infty)$. But if we show that $BC(X,Y)$ is closed in $B(X,Y)$, then the fact that a closed subspace of a complete space is complete by Theorem 177 in Chapter 4 would imply that $(BC(X,Y), \|\cdot\|_\infty)$ is complete.
Lemma 451 $BC(X,Y)$ is closed in $B(X,Y)$; that is, if a sequence $\langle f_n \rangle$ of functions from $BC(X,Y)$ converges to a function $f : X \to Y$, then $f$ is continuous.
Proof. (Sketch) We need to show that $f$ is continuous at any $x_0 \in X$ (i.e. $\forall \varepsilon > 0$, $\exists \delta > 0$ such that $\forall x \in X$ with $d_X(x, x_0) < \delta$ we have $d_Y(f(x), f(x_0)) < \varepsilon$). By the triangle inequality
$$d_Y(f(x), f(x_0)) \le d_Y(f(x), f_n(x)) + d_Y(f_n(x), f_n(x_0)) + d_Y(f_n(x_0), f(x_0)).$$
The first and third terms on the right hand side are arbitrarily small (i.e. less than $\frac{\varepsilon}{4}$) since $f_n \to f$ with respect to the sup norm, and the second term is arbitrarily small (less than $\frac{\varepsilon}{2}$) since $f_n$ is continuous.
The next theorem establishes that $(BC(X,Y), \|\cdot\|_\infty)$ is a Banach space.
Theorem 452 The normed vector space $(BC(X,Y), \|\cdot\|_\infty)$ is complete.
Proof. Follows from Lemma 451 and Theorem 177. That $(BC(X,Y), \|\cdot\|_\infty)$ is a Banach space follows from Definition 208.
In the remainder of this section, we will assume that $X$ is a compact set and $(Y, d_Y) = (\mathbb{R}, |\cdot|)$. In this case $f \in C(X, \mathbb{R})$ is bounded by Theorem 261. Hence, instead of $(BC(X, \mathbb{R}), \|\cdot\|_\infty)$ we will simply use the notation $C(X)$. Just remember, whenever you see $C(X)$ we are assuming that $X$ is compact, $Y$ is $\mathbb{R}$, and we are considering the sup norm.
While uniform convergence implies pointwise convergence, we know that the converse does not hold (e.g. $f_n : [0,1] \to \mathbb{R}$ given by $f_n(x) = x^n$). In $C(X)$, however, there is a sufficient condition for uniform convergence (and hence for convergence with respect to the sup norm) in terms of pointwise convergence.
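The counterexample $f_n(x) = x^n$ can be checked with a few lines of Python. The sketch below is an illustration only; the function names are hypothetical, and the witness point $x_n = (1/2)^{1/n}$ is one convenient choice, not taken from the text. It shows that $f_n(x) \to 0$ at each fixed $x \in [0,1)$, while $f_n(x_n) = \frac{1}{2}$ for every $n$, so the sup distance between $f_n$ and the zero function on $[0,1)$ never falls below $\frac{1}{2}$.

```python
def f_n(n, x):
    """The function f_n(x) = x**n on [0, 1]."""
    return x ** n

def witness_point(n):
    """A point x_n in [0, 1) at which f_n takes the value 1/2."""
    return 0.5 ** (1.0 / n)

# Pointwise convergence at a fixed point: the values shrink toward 0.
values_at_fixed_x = [f_n(n, 0.9) for n in (1, 10, 100, 1000)]

# No uniform convergence: at the moving point x_n the value stays at 1/2.
values_at_witness = [f_n(n, witness_point(n)) for n in (1, 10, 100, 1000)]
```

The first list decreases toward zero, while every entry of the second list is (up to rounding) $\frac{1}{2}$.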
Lemma 453 (Dini's Theorem) Let $\langle f_n \rangle$ be a monotone sequence in $C(X)$ (e.g. $f_{n+1} \le f_n$, $\forall n$). If the sequence $\langle f_n \rangle$ converges pointwise to a continuous function $f \in C(X)$, it also converges uniformly to $f$.
Proof. (Sketch) Let $\langle f_n \rangle$ be decreasing, $f_n \to f$ pointwise, and define $\bar{f}_n = f_n - f$. Then $\langle \bar{f}_n \rangle$ is a decreasing sequence of non-negative functions with $\bar{f}_n \to 0$ pointwise. Given $\varepsilon > 0$, for each $x \in X$, pointwise convergence guarantees we can find an index $N(\varepsilon, x)$ for which $0 \le \bar{f}_{N(\varepsilon,x)}(x) < \varepsilon$. Due to continuity of $\bar{f}_{N(\varepsilon,x)}$ there is a $\delta(x)$ neighborhood around $x$ such that $0 \le \bar{f}_{N(\varepsilon,x)}(x') < \varepsilon$ for each $x'$ of this neighborhood and due to monotonicity of $\langle \bar{f}_n \rangle$ we have $0 \le \bar{f}_n(x') < \varepsilon$ for $n \ge N(\varepsilon, x)$. Since $X$ is compact, there are finitely many points $x_i \in X$ whose neighborhoods $B_{\delta(x_i)}(x_i)$ cover $X$. From the finitely many corresponding indices $N(\varepsilon, x_i)$ we can take the maximum as the $N(\varepsilon)$ required for uniform convergence.
It is clear from the above example $f_n(x) = x^n$ that the requirement that $f$ be continuous is essential. In the above case, $f$ is clearly not continuous since
$$f(x) = \begin{cases} 0 & \text{if } x \in [0,1) \\ 1 & \text{if } x = 1. \end{cases}$$
6.1.2 Compactness
While Theorem 193 established (necessary and) sufficient conditions (completeness and total boundedness) for compactness in a general metric space (and hence a general normed vector space), total boundedness was difficult to establish. As in the case of the Heine-Borel Theorem 194 (which provided simple sufficient conditions for a set in $\mathbb{R}^n$ to be compact), here we develop the notion of equicontinuity which will be included as a sufficient condition for compactness.
Definition 454 Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. Let $D$ be a subset of the function space $BC(X,Y)$. If $x_0 \in X$, the set $D$ of functions is equicontinuous at $x_0$ if $\forall \varepsilon > 0$, $\exists \delta(x_0, \varepsilon)$ such that $\forall x \in X$, $d_X(x, x_0) < \delta$ implies $d_Y(f(x), f(x_0)) < \varepsilon$, $\forall f \in D$. If the set $D$ is equicontinuous at $x_0$ for each $x_0 \in X$, then it is equicontinuous on $X$.
Notice that the primary difference between the definition of equicontinuity and that of continuity in 244 is that here $d_Y(f(x), f(x_0)) < \varepsilon$ must hold for all $f \in D$, while in the latter this condition must hold only for the given function $f$.
Example 455 Let $f_n : [0,1] \to \mathbb{R}$ be given by $f_n(x) = x^n$ and $D = \{f_n\}$. At what points is $D$ equicontinuous and at what points does it fail to be equicontinuous? It fails at $x = 1$. To see this, let $x_0 = 1$. Given $\varepsilon \in (0,1)$ and any $\delta > 0$, for each $x \in (0,1)$ with $d_X(x, 1) < \delta$ there exists $N \in \mathbb{N}$ such that $|f_N(x) - f_N(1)| = 1 - x^N \ge \varepsilon$. Take the logs of both sides of $1 - \varepsilon \ge x^n$ and notice that $\log(x) < 0$ on $(0,1)$ to yield $n \ge \frac{\ln(1-\varepsilon)}{\ln(x)}$ (by the Archimedean property such an $n$ exists) so that we can take $N = \left\lfloor \frac{\ln(1-\varepsilon)}{\ln(x)} \right\rfloor + 1$.
In general $\delta$ in Definition 454 depends on both $\varepsilon$ and $x$. If, however, the choice of $\delta$ is independent of $x$ we say that the set of functions $D$ is uniformly equicontinuous on $X$; that is, $\forall \varepsilon > 0$, $\exists \delta(\varepsilon)$ such that $\forall x, x_0 \in X$, $d_X(x, x_0) < \delta$ implies $d_Y(f(x), f(x_0)) < \varepsilon$, $\forall f \in D$.
If $X$ is compact, then these two notions are equivalent as the following lemma shows (see footnote 1).
Lemma 456 Let $X$ be compact. A subset $D \subset C(X)$ is equicontinuous iff it is uniformly equicontinuous.
Proof. (Sketch) ($\Rightarrow$) Since $D \subset C(X)$ is equicontinuous at each $x \in X$, given $\varepsilon$ we can find $\delta(\varepsilon, x)$. Then the collection $\{B_{\delta(\varepsilon,x)}(x), x \in X\}$ covers $X$ and because $X$ is compact, there exists a finite subcollection covering $X$ and a corresponding finite set $\{\delta(\varepsilon, x_i), i = 1,...,k\}$. Then there exists a smallest $\delta(\varepsilon)$ that doesn't depend on $x$.
Equicontinuity is related to total boundedness when both $X$ and $Y$ are compact as the following lemma shows.
Lemma 457 Let $X$ be compact and $Y \subset \mathbb{R}$ be compact. Let $D$ be a subset of $C(X,Y)$. Then $D$ is equicontinuous iff $D$ is totally bounded in the sup norm.
Proof. (Sketch) ($\Leftarrow$) Let $D$ be totally bounded. Given $\varepsilon > 0$, choose $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ such that $2\varepsilon_1 + \varepsilon_2 < \varepsilon$. Then for the given $\varepsilon_1$, there are finitely many functions $\{f_i, i = 1,...,k\}$ such that the $\varepsilon_1$ balls around them cover $D$. Since any finite collection of continuous functions is equicontinuous (see Exercise 6.1.1), given $x_0$ and $\varepsilon_2$, there exists $\delta$ such that if $d_X(x, x_0) < \delta$, then $d_Y(f_i(x), f_i(x_0)) < \varepsilon_2$ for $i = 1,...,k$. We make a similar "estimate" for any $f \in D$. Because there is an $f_i$ which is "$\varepsilon_1$-close" to $f$, we split $d_Y(f(x), f(x_0))$ into three parts (using the triangle inequality)
$$d_Y(f(x), f(x_0)) \le d_Y(f(x), f_i(x)) + d_Y(f_i(x), f_i(x_0)) + d_Y(f_i(x_0), f(x_0)) \le \varepsilon_1 + \varepsilon_2 + \varepsilon_1 < \varepsilon.$$
The first and third terms are sufficiently small because $f_i$ is "$\varepsilon_1$-close" to $f$. The second term is sufficiently small because $\{f_i, i = 1,...,k\}$ is equicontinuous (see footnote 2).
1
This lemma is similar to the result that a continuous function on a compact set is
uniformly continuous.
2
Notice that we haven't used compactness of $X$ nor $Y$ in this direction. Thus total boundedness always implies equicontinuity.
($\Rightarrow$) Since $X$ is compact and $D$ is equicontinuous, by Lemma 456 $D$ is uniformly equicontinuous. Then given $\varepsilon_1$ there exist $\delta(\varepsilon_1)$ and finitely many points $\{x_i \in X, i=1,\dots,k\}$ such that $\{B_{\delta(\varepsilon_1)}(x_i)\}$ covers $X$ and $d_Y(f(x_i),f(x)) < \varepsilon_1$ for $x \in B_{\delta(\varepsilon_1)}(x_i)$, for all $f \in D$. Since $Y$ is compact, $Y$ is totally bounded by Theorem 198. Then given $\varepsilon_2$ there exist finitely many points $\{y_i \in Y, i=1,\dots,m\}$ such that $\{B_{\varepsilon_2}(y_i), i=1,\dots,m\}$ covers $Y$.
Let $J$ be the set of all functions $\alpha : \{1,\dots,k\} \to \{1,\dots,m\}$. The set $J$ is finite (it contains $m^k$ elements). For $\alpha \in J$, choose (whenever one exists) $f \in D$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for $i=1,\dots,k$ and label it $f_\alpha$ (for example, the index $\alpha$ of the function $f_\alpha$ in Figure 6.1.1 is $(\alpha(1),\alpha(2),\alpha(3),\alpha(4)) = (2,3,1,1)$ because $f(x_1) \in B_{\varepsilon_2}(y_2)$, $f(x_2) \in B_{\varepsilon_2}(y_3)$, $f(x_3) \in B_{\varepsilon_2}(y_1)$, $f(x_4) \in B_{\varepsilon_2}(y_1)$). Then the collection of open balls $\{B_\varepsilon(f_\alpha), \alpha \in J\}$ with $\varepsilon \ge 2\varepsilon_1 + 2\varepsilon_2$ is a finite $\varepsilon$-covering of $D$. To see this, let $f \in D$. Then $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for $i=1,\dots,k$ for some index $\alpha$. Choose this index $\alpha$ and the corresponding $f_\alpha$. One must show that $f \in B_\varepsilon(f_\alpha)$. Let $x \in X$. Then there exists $i$ such that $x \in B_{\delta(\varepsilon_1)}(x_i)$, where
$$d_Y(f(x),f_\alpha(x)) \le d_Y(f(x),f(x_i)) + d_Y(f(x_i),f_\alpha(x_i)) + d_Y(f_\alpha(x_i),f_\alpha(x)) < \varepsilon_1 + 2\varepsilon_2 + \varepsilon_1 \le \varepsilon.$$
The first and third terms are sufficiently small because $D$ is uniformly equicontinuous, and the second term is sufficiently small because $f(x_i), f_\alpha(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$, so that they are less than $2\varepsilon_2$ apart.
Exercise 6.1.1 Show that a set which contains finitely many continuous functions is equicontinuous. Hint: Since the collection of $f_i$ is finite, there are finitely many $\delta_i$, one associated with each function, and hence the minimum of those $\delta_i$ is well defined.
Thus $d_Y\left(f(x),f_{j(i)}(x)\right) < \varepsilon$ holds for any $x \in X$. Hence there are finitely many open balls $\{B_\varepsilon(f_{j(i)}), i=1,\dots,k\}$ covering $D$.
Before stating the main theorem of this subsection, we point out something about boundedness in $C(X)$. In a normed vector space $(X,\|\cdot\|)$ a subset $A$ is said to be bounded if it is contained in a ball (i.e. $\exists M$ such that $\|f\| \le M$, $\forall f \in A$). Since in $C(X)$ we have $\|f\| = \sup_x |f(x)|$, boundedness of $D \subset C(X)$ is equivalent to $\exists M$ such that $|f(x)| \le M$, $\forall f \in D$, $\forall x \in X$. This is sometimes called uniform boundedness of a set of functions $D$. However, in terms of the normed vector space $C(X)$ it is just the normal definition of boundedness.
Analogous to the Heine-Borel theorem in $\mathbb{R}$, we now state necessary and sufficient conditions for compactness in $C(X)$.
Theorem 458 (Ascoli) Let $X$ be a compact space. A subset $D$ of $C(X)$ is compact iff it is closed, bounded, and equicontinuous.
Proof. Step 1: If $D$ is bounded, then $|f(x)| \le M$, $\forall f \in D$, $\forall x \in X$. Then the values of every $f \in D$ lie in the ball $B_M(0) \subset \mathbb{R}$. Let $Y$ be the closure of $B_M(0)$. Then $Y$ is a closed, bounded subset of $\mathbb{R}$ and hence compact. Then $D \subset C(X,Y)$ with both $X$ and $Y$ compact.
Step 2: ($\Rightarrow$) Suppose $D \subset C(X,Y)$ is compact. Then by Lemma 189, $D$ is closed. By Lemma 187, $D$ is bounded. By Theorem 198, $D$ is totally bounded. But with $X$ and $Y$ compact (Step 1), total boundedness is equivalent to equicontinuity by Lemma 457.
Step 3: ($\Leftarrow$) Suppose $D$ is closed, bounded and equicontinuous (or, equivalently by Step 1 and Lemma 457, totally bounded). By Theorem 177, a closed subset of the complete normed vector space $C(X)$ is complete. By Theorem 198, completeness together with total boundedness is equivalent to compactness.
Example 459 Is the unit ball in $C(X)$ a compact set? Without loss of generality we can take $X = [0,1]$. The unit ball in $C([0,1])$ is $B_1(0) = \{f \in C([0,1]) : \|f\| \le 1\}$. $B_1(0)$ is clearly bounded and closed. Is it equicontinuous? In Example 455 we showed that $\{x^n, n \in \mathbb{N}\}$ was not equicontinuous. But since $\|x^n\| = \sup_{x \in [0,1]} |x^n| = 1$ for each $n \in \mathbb{N}$, we have $\{x^n, n \in \mathbb{N}\} \subset B_1(0)$. Thus $B_1(0)$ contains a subset which is not equicontinuous, so that $B_1(0)$ is not equicontinuous. Then by Ascoli's Theorem 458, $B_1(0)$ is not compact.
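The behaviour of the sequence $x^n$ in the sup norm can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `sup_distance` and the grid size are assumptions made for illustration. It approximates $\sup_{x\in[0,1]} |x^n - x^{2n}|$ on a uniform grid. Substituting $t = x^n$ shows that the exact value is $\max_{t\in[0,1]} (t - t^2) = 1/4$ for every $n$, so the sequence is not Cauchy (and hence has no convergent subsequence of this form) in $C([0,1])$.

```python
def sup_distance(n, m, grid_size=100_001):
    """Approximate the sup over [0, 1] of |x**n - x**m| on a uniform grid."""
    xs = (i / (grid_size - 1) for i in range(grid_size))
    return max(abs(x ** n - x ** m) for x in xs)


# The distance between x^n and x^(2n) does not shrink as n grows.
for n in (5, 20, 80):
    print(n, sup_distance(n, 2 * n))
```

The grid only gives a lower bound on the true supremum, which is enough here because the point is that the distance stays bounded away from zero.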
In the previous example, how can the unit ball be closed if it contains a sequence $\langle x^n \rangle$ converging to a function that doesn't belong to $B_1(0)$? It is because $\langle x^n \rangle$ is not convergent in $C([0,1])$: it converges only pointwise, not in the sup norm.
6.1.3 Approximation
For many applications it is convenient to approximate continuous functions
by functions of an elementary nature (e.g. functions which are piecewise
linear or polynomials).
Definition 460 Let $f \in F(X,Y)$ with norm $\|\cdot\|_F$. Given $\varepsilon > 0$, we say $g$ ($\varepsilon$-)approximates $f$ on $X$ with respect to $\|\cdot\|_F$ if $\|f - g\|_F < \varepsilon$. If $f \in C(X)$, then since we are using the sup norm, this is equivalent to $\sup_{x \in X} \{|f(x) - g(x)|\} < \varepsilon$ for the given $\varepsilon > 0$, in which case it is clear that the approximation is uniform. See Figure 6.1.2.
The concept of approximation can be stated in terms of dense sets. Let $H$ be a subset of $C(X)$. Recall from Definition 153 that $H$ is dense in $C(X)$ if the closure of $H$, denoted $\bar{H}$, satisfies $\bar{H} = C(X)$. But by Theorem 148, $f \in \bar{H}$ iff for any $\varepsilon > 0$, there exists $g \in H$ such that $\|f - g\| < \varepsilon$. In other words, every function $f \in C(X)$ can be approximated by a function $g \in H \subset C(X)$ if $\bar{H} = C(X)$.
An alternative way to see this is to suppose we are trying to approximate a continuous function $f : X \to \mathbb{R}$ with $X$ compact and suppose we know that the set of all polynomials is dense (we will prove this later). Then we could think about starting with a large degree of approximation error (say $\varepsilon_1 = 1$) and ask what polynomial function (call it $P_{1,n}(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$) bounds the error within $\varepsilon_1$ (i.e. $\|P_{1,n}(x) - f(x)\| < \varepsilon_1$). If this error is too large, we could choose a smaller one, say $\varepsilon_2 \,(= \tfrac{1}{2}) < \varepsilon_1$, and look for another polynomial function $P_{2,n}(x)$ such that $\|P_{2,n}(x) - f(x)\| < \varepsilon_2$. We could let $H = \{P_1, P_2, \dots\} \subset C(X)$. Approximation is essentially constructing a sequence of polynomials $\langle P_n \rangle$ that converges to $f$ with respect to the sup norm (i.e. uniformly).
The most general approximation theorem is known as the Stone-Weierstrass Theorem, which provides conditions under which a vector subspace of $C(X)$ is dense in $C(X)$. There are two versions of this result: one uses lattices and the other is algebraic.
We begin by noting that the space $C(X)$ has a lattice structure. If $f,g \in C(X)$, so are the "meet" and "join" functions $f \wedge g$ and $f \vee g$ defined as $(f \wedge g)(x) = \min[f(x),g(x)]$ and $(f \vee g)(x) = \max[f(x),g(x)]$. To see that $f \wedge g$ and $f \vee g$ are continuous, note that
$$(f \wedge g)(x) = \tfrac{1}{2}(f+g) - \tfrac{1}{2}|f-g| = \begin{cases} \tfrac{1}{2}(f+g) - \tfrac{1}{2}(f-g) & \text{if } f > g \\ \tfrac{1}{2}(f+g) + \tfrac{1}{2}(f-g) & \text{if } f < g \end{cases}$$
and
$$(f \vee g)(x) = \tfrac{1}{2}(f+g) + \tfrac{1}{2}|f-g|.$$
But linear combinations of continuous functions are continuous by Theorem 251. Recall from Definition 42 that a subset $H$ of $C(X)$ is a lattice if for every pair of functions $f,g \in H$, we also have $f \wedge g$ and $f \vee g$ in $H$.
Definition 461 A subset $H$ of $C(X)$ is called separating (or $H$ separates points) if for any two distinct points $x,y \in X$, $\exists h \in H$ with $h(x) \ne h(y)$.
Example 462 $H_1 = \{\text{all constant functions } f : X \to \mathbb{R}\}$ is a lattice but is not separating. To see this, let $f(x) = \kappa$, $\kappa \in \mathbb{R}$, be a constant function. Then $H_1$ is a totally ordered set (since for any two distinct elements $\kappa_1$ and $\kappa_2$ in $\mathbb{R}$ we have, say, $\kappa_1 < \kappa_2$). Furthermore, these two elements have a maximum and a minimum. However, this set is not separating since $f(x) = f(y) = \kappa$ for $x \ne y$.
The lattice version of the Stone-Weierstrass theorem is the consequence
of the following lemma.
Lemma 463 Suppose $X$ has at least two elements. Let $H$ be a subset of $C(X)$ satisfying: (i) $H$ is a lattice; (ii) given $x_1, x_2 \in X$, $x_1 \ne x_2$, and $\alpha_1, \alpha_2 \in \mathbb{R}$, there exists $h \in H$ such that $h(x_1) = \alpha_1$ and $h(x_2) = \alpha_2$. Then $H$ is dense in $C(X)$.
Proof. (Sketch) Take $f \in C(X)$ and $\varepsilon > 0$. We want to find an element of $H$ that is within $\varepsilon$ of $f$.
First, fix $x \in X$. By assumption (ii), $\forall y \ne x$, $\exists \eta_y \in H$ such that $\eta_y(x) = f(x)$ and $\eta_y(y) = f(y)$. For $y \ne x$, set $O_y = \{x' \in X : \eta_y(x') > f(x') - \varepsilon\}$. This set is open since $\eta_y$ and $f$ are continuous and, as $y$ varies, $\cup_{y \ne x} O_y$ is an open covering of $X$. Since $X$ is compact, there exist finitely many sets $O_y$ such that $X = \cup_{j=1}^{N} O_{y_j}$ with $y_j \ne x$, $\forall j$. Then let $v_x = \max\{\eta_{y_1},\dots,\eta_{y_N}\}$. Since $H$ is a lattice, $v_x \in H$ with the same properties as the $\eta_y$'s; namely, $v_x(x) = f(x)$ and $v_x(x') > f(x') - \varepsilon$, $\forall x' \in X$.
Second, let $x$ vary. For each $x \in X$, let $\Omega_x = \{x' \in X : v_x(x') < f(x') + \varepsilon\}$. By exactly the same argument as in the first step, there exist finitely many sets $\Omega_{x_1},\dots,\Omega_{x_J}$ covering $X$. Set $v = \min\{v_{x_1},\dots,v_{x_J}\}$. Then $v \in H$ and $f(x') - \varepsilon < v(x') < f(x') + \varepsilon$, $\forall x' \in X$. This means $\|f - v\| \le \varepsilon$.
Assumption (ii) in Lemma 463 appears hard to verify. But as we will show, if $H$ is a vector subspace that separates points in $X$ and contains all constant functions, then $H$ satisfies assumption (ii) of Lemma 463.
Theorem 464 (Stone-Weierstrass L) If: (i) $H$ is a separating vector subspace of $C(X)$; (ii) $H$ is a lattice; (iii) $H$ contains all constant functions. Then $H$ is dense in $C(X)$.
Proof. To apply the previous lemma, we must show that assumptions (i) and (iii) of the Theorem imply assumption (ii) of Lemma 463.
Let $x_1, x_2 \in X$ with $x_1 \ne x_2$. Since $H$ is separating, $\exists h \in H$ such that $h(x_1) \ne h(x_2)$. Let $\alpha_1, \alpha_2 \in \mathbb{R}$; then the system of linear equations
$$\alpha_1 = \mu + \lambda h(x_1)$$
$$\alpha_2 = \mu + \lambda h(x_2)$$
has a unique solution $(\mu,\lambda) \in \mathbb{R}^2$ since
$$\operatorname{rank}\begin{bmatrix} h(x_1) & 1 \\ h(x_2) & 1 \end{bmatrix} = 2$$
because $h(x_1) \ne h(x_2)$. Set $g(x) = \mu + \lambda h(x)$. Since $H$ is a vector subspace containing constant functions, $g \in H$. Moreover, we see that $g(x_1) = \alpha_1$ and $g(x_2) = \alpha_2$, so that assumption (ii) of Lemma 463 is satisfied.
3
The Stone-Weierstrass theorem is very general and covers many classes
of elementary functions to approximate continuous functions. We now state
the algebraic version of the Stone-Weierstrass theorem. Following this, we
will apply whichever version is more suitable to some concrete examples.
Definition 465 We call a vector subspace $H \subset C(X)$ an algebra of functions (not to be confused with an algebra of sets) if it is closed under multiplication.
Hence $H \subset C(X)$ is an algebra of functions if: (i) $\forall f,g \in H$ and $\alpha,\beta \in \mathbb{R}$, we have $\alpha f + \beta g \in H$; (ii) $\forall f,g \in H$, we have $f \cdot g \in H$ (where $f \cdot g$ is defined as $(f \cdot g)(x) = f(x) \cdot g(x)$, $\forall x \in X$).
Before stating the algebraic version of the Stone-Weierstrass Theorem,
we prove the following set of lemmas.
Lemma 466 A vector subspace $H \subset C(X)$ is a lattice iff for every element $h \in H$, the function $|h| \in H$ as well.
Proof. ($\Rightarrow$) Let $h \in H$. Then $|h| = \max(h,0) - \min(h,0)$ and, since $H$ is a lattice as well as a vector subspace, the r.h.s. is from $H$. Thus $|h| \in H$.
($\Leftarrow$) We can write $\max(f,g) = \tfrac{1}{2}\left[(f+g) + |f-g|\right]$ and $\min(f,g) = \tfrac{1}{2}\left[(f+g) - |f-g|\right]$. The right hand sides belong to $H$ since $f,g \in H$, the absolute value is from $H$, and $H$ is a vector space.
3
Instead of assuming that $H$ contains all constants, it is sufficient to assume that $H$ contains just the constant $c = 1$. Since $H$ is a vector space, it contains all scalar multiples of 1.
Note that Lemma 466 provides a very convenient way of checking that a subset of functions is a lattice. In the next lemma we construct a sequence of polynomials $\langle P_n \rangle$ converging uniformly to $|x|$ on $[-1,1]$.
Lemma 467 There exists a sequence of polynomials $\langle P_n \rangle$ that converges uniformly to $f(x) = |x|$ on $[-1,1]$.
Proof. (Sketch) We construct the sequence $\langle P_n(x) \rangle$ on $[-1,1]$ by induction: for $n = 1$, $P_1(x) = 0$; given $P_n(x)$, we define $P_{n+1}(x) = P_n(x) + \tfrac{1}{2}\left(x^2 - P_n^2(x)\right)$, $\forall n \in \mathbb{N}$. Then show that (i) $P_n(x) \le P_{n+1}(x)$, $\forall x \in [-1,1]$ (i.e. $\langle P_n \rangle$ is nondecreasing) and (ii) $P_n(x) \to |x|$ pointwise on $[-1,1]$. Since the limit function $|x|$ is continuous, by Dini's Theorem 453 $\langle P_n(x) \rangle$ converges to $|x|$ uniformly on $[-1,1]$.
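The recursion in the proof of Lemma 467 is easy to run. Below is a minimal sketch, not taken from the text: the helper names `p_sequence` and `sup_error` and the grid size are illustrative assumptions. It evaluates $P_n(x)$ pointwise from $P_1 = 0$ and $P_{k+1} = P_k + \tfrac{1}{2}(x^2 - P_k^2)$ and reports the largest gap $|x| - P_n(x)$ on a grid in $[-1,1]$.

```python
def p_sequence(x, n):
    """Return P_n(x) for the recursion P_1 = 0, P_{k+1} = P_k + (x^2 - P_k^2) / 2."""
    p = 0.0
    for _ in range(n - 1):
        p = p + 0.5 * (x * x - p * p)
    return p


def sup_error(n, grid_size=2001):
    """Largest value of | |x| - P_n(x) | over a uniform grid on [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(abs(x) - p_sequence(x, n)) for x in xs)


# The sup error shrinks as n grows, consistent with uniform convergence.
for n in (2, 10, 50, 200):
    print(n, sup_error(n))
```

Writing $e_n = |x| - P_n(x)$, the recursion gives $e_{n+1} = e_n\left(1 - \tfrac{1}{2}(|x| + P_n(x))\right)$, which is why the error is nonnegative and decreasing at every point.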
Theorem 468 (Stone-Weierstrass A) Every separating algebra of functions $H \subset C(X)$ containing all the constant functions is dense in $C(X)$.
Proof. If $H$ is a separating subalgebra of $C(X)$ containing constant functions, then so is its closure $\bar{H}$. Therefore it suffices to show that $\bar{H}$ is a lattice and apply Theorem 464.
Let $f \in \bar{H}$ be nonzero. By Lemma 467, $\exists \langle P_n \rangle$ of polynomials that converges uniformly on $[-1,1]$ to $|x|$. Since $-1 \le \frac{f}{\|f\|} \le 1$, the sequence of functions $\left\langle P_n\left(\frac{f}{\|f\|}\right) \right\rangle$ converges uniformly to $\left|\frac{f}{\|f\|}\right| = \frac{|f|}{\|f\|}$. But
$$P_n\left(\frac{f}{\|f\|}\right) \to \frac{|f|}{\|f\|} \iff \|f\|\, P_n\left(\frac{f}{\|f\|}\right) \to |f|.$$
Since $\bar{H}$ is an algebra, all terms in this sequence are in $\bar{H}$ (because an algebra is closed under linear combination and multiplication). Since $\bar{H}$ is closed, $|f| \in \bar{H}$. By Lemma 466, $\bar{H}$ is a lattice.
Exercise 6.1.2 Prove that if $H$ is a separating subalgebra of $C(X)$, then $\bar{H}$ is as well.
Both versions of the Stone-Weierstrass Theorem are very general statements about the density of a subset $H$ in $C(X)$ or, equivalently, about approximation in $C(X)$. As some of the next examples show, it covers all known approximation theorems of continuous functions.
Example 469 Let $H_2$ be the set of Lipschitz functions $h : X \to \mathbb{R}$, i.e. functions satisfying $|h(x) - h(y)| \le C\, d_X(x,y)$, $\forall x,y \in X$, for some $C \in \mathbb{R}$. First, we must establish that $H_2$ is a vector subspace of $C(X)$ containing constant functions. That is, we must establish that $H_2$ is closed under addition and scalar multiplication. To see this, suppose $f,g \in H_2$ so that $|f(x) - f(y)| \le C_1 d_X(x,y)$ and $|g(x) - g(y)| \le C_2 d_X(x,y)$. Then
$$|(f+g)(x) - (f+g)(y)| = |f(x) - f(y) + g(x) - g(y)| \le |f(x) - f(y)| + |g(x) - g(y)| \le (C_1 + C_2)\, d_X(x,y)$$
so that $H_2$ is closed under addition. Similarly, for $\alpha \in \mathbb{R}$,
$$|\alpha f(x) - \alpha f(y)| = |\alpha||f(x) - f(y)| \le |\alpha| C_1 d_X(x,y)$$
so that $H_2$ is closed under scalar multiplication. Second, we must establish that $H_2$ is a lattice. To see this, notice
$$\big||h(x)| - |h(y)|\big| \le |h(x) - h(y)|$$
by the triangle inequality, so that $|h| \in H_2$ whenever $h \in H_2$ and Lemma 466 applies. Finally, we must establish that $H_2$ is separating. To see this, for $x \ne y$, the function $h(z) = d_X(x,z)$ is Lipschitz with constant 1, satisfying $h(x) = d_X(x,x) = 0$ and $h(y) = d_X(x,y) > 0$. Thus $H_2$ is dense in $C(X)$ by Theorem 464.
Example 470 Let $H_3$ be the set of continuous piecewise linear functions $h : [a,b] \to \mathbb{R}$ given by $h(x) = b_k + a_k x$ for $c_k \le x < c_{k+1}$, $k = 0,1,\dots,n-1$, with $a = c_0 < c_1 < \dots < c_n = b$, and where $a_{k-1} c_k + b_{k-1} = a_k c_k + b_k$, $\forall k \ge 1$, keeps $h$ continuous. It is easy to show that $H_3$: is a vector subspace of $C([a,b])$ containing constant functions; is a lattice because $|g| \in H_3$ whenever $g \in H_3$; and is separating since $g(x) = x \in H_3$. Thus $H_3$ is dense in $C([a,b])$.
Example 471 Let $H_4$ be the set of all polynomials $h : X \to \mathbb{R}$ where $X$ is a compact subset of $\mathbb{R}^n$. It is easy to show that $H_4$ is a subalgebra of $C(X)$ containing the constants and is separating. Thus $H_4$ is dense in $C(X)$. To see that $H_4$ is a subalgebra, note that if we multiply two polynomials, the product is still a polynomial.
A special case of Example 471 is $X = [a,b]$, known as the Weierstrass Approximation Theorem. Notice that all the previous examples guarantee the existence of a dense set in $C(X)$ but don't present a constructive method of approximating a continuous function. The next example shows how to find a sequence of polynomials $h_n(x;f)$ converging uniformly to $f(x)$ on $[0,1]$.
4
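Example 472 Let $H_5$ be the set of Bernstein polynomials $b_n(f) : [0,1] \to \mathbb{R}$ for a function $f : [0,1] \to \mathbb{R}$, where
$$b_n(x;f) = \sum_{k=0}^{n} f\left(\frac{k}{n}\right) \cdot \binom{n}{k} x^k (1-x)^{n-k}$$
with $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ and $n! = n \cdot (n-1) \cdot \ldots \cdot 2 \cdot 1$.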
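The Bernstein formula can be evaluated directly. The following is a minimal sketch under stated assumptions: the function names `bernstein` and `sup_error`, the test function $f(x) = |x - \tfrac{1}{2}|$ and the grid size are all chosen here for illustration and do not come from the text. It evaluates $b_n(x;f)$ from the definition and measures the sup error on a grid for increasing $n$.

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate the n-th Bernstein polynomial b_n(x; f) for x in [0, 1]."""
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(f, n, grid_size=201):
    """Largest value of |f(x) - b_n(x; f)| over a uniform grid on [0, 1]."""
    xs = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in xs)


def kink(x):
    """A continuous but non-differentiable test function."""
    return abs(x - 0.5)


for n in (10, 40, 160):
    print(n, sup_error(kink, n))
```

For this test function the error shrinks slowly (roughly like $1/\sqrt{n}$ at the kink), which illustrates that Bernstein polynomials are constructive but not fast. Affine functions are reproduced exactly, which gives a quick sanity check of the implementation.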
Example 473 Let $H_6$ be the set of all continuous functions differentiable to order $p = \infty$ on $X \subset \mathbb{R}^n$ (denoted $C^\infty(X)$). It is easy to show that $C^\infty(X)$ is a separating algebra containing constant functions. Thus $C^\infty(X)$ is dense in $C(X)$.
6.1.4 Separability of C(X)
To see that $C(X)$ is separable we must show that there exists a countable subset $S \subset C(X)$ that is dense in $C(X)$ (i.e. $\bar{S} = C(X)$). Consider the set $S$ of all polynomials defined on $X$ with rational coefficients. From Example 471, we know that the set of all polynomials is dense in $C(X)$. But any polynomial can be uniformly approximated by polynomials with rational coefficients since $\mathbb{Q}$ is dense in $\mathbb{R}$.
Corollary 474 If $X$ is compact, the set $S$ of all polynomials in $X$ with rational coefficients (which is a countable set) is dense in $C(X)$. Hence, $C(X)$ is separable.
6.1.5 Fixed point theorems
In Chapter 4 we proved Brouwer's fixed point Theorem 302 for continuous functions defined on a compact subset of $\mathbb{R}^n$. But this theorem holds true in a more general setting. In particular, we don't need to restrict it to a finite dimensional vector space; it can be extended to infinite dimensional vector spaces (i.e. function spaces).
4
See Carothers p. 164 for a proof of uniform convergence.
Theorem 475 (Schauder) Let $K$ be a non-empty, compact, convex subset of a normed vector space and let $f : K \to K$ be a continuous function. Then $f$ has a fixed point.
Proof. (Sketch) Since $K$ is compact, $K$ is totally bounded. Hence, given $\varepsilon > 0$, there exists a finite set $\{y_i, i=1,\dots,n\}$ such that the collection $\{B_\varepsilon(y_i), i=1,\dots,n\}$ covers $K$. Let $K_\varepsilon = co(\{y_1,\dots,y_n\})$. Since $K$ is convex and $y_i \in K$ for all $i=1,\dots,n$, we have $K_\varepsilon \subset K$ (by Exercise 4.5.4). Note that $K_\varepsilon$ is finite dimensional and, since it is also closed and bounded, it is compact (by Heine-Borel).
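Define the "projection" function $P_\varepsilon : K \to K_\varepsilon$ by $P_\varepsilon(y) = \sum_{i=1}^{n} \theta_i(y) y_i$ such that the functions $\theta_i : K \to \mathbb{R}$ are continuous for $i=1,\dots,n$, $\theta_i(y) \ge 0$, and $\sum_{i=1}^{n} \theta_i = 1$. The construction of $\theta_i$ is given in the proof in the appendix to this chapter. By construction, $P_\varepsilon(y)$ is an $\varepsilon$-approximation of $y$ (i.e. $\|P_\varepsilon(y) - y\| < \varepsilon$, $\forall y \in K$). Now for the function $f : K \to K$ define $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ by $f_\varepsilon(x) = P_\varepsilon(f(x))$, for all $x \in K_\varepsilon$. The function $f_\varepsilon$ satisfies all the assumptions of Brouwer's Fixed Point Theorem 302. Hence there exists $x_\varepsilon \in K_\varepsilon$ such that $x_\varepsilon = f_\varepsilon(x_\varepsilon)$.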
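Set $f(x_\varepsilon) = y_\varepsilon$ and choose a sequence $\langle \varepsilon_i \rangle$ converging to zero. We must show that the approximating sequences $\langle x_\varepsilon \rangle$ and $\langle y_\varepsilon \rangle$ converge to the same point. By construction $\langle y_{\varepsilon_i} \rangle$ is a sequence in $K$ and, since $K$ is compact, there exists a convergent subsequence
$$\langle y_{g(\varepsilon_i)} \rangle \to y \in K. \qquad (6.1)$$
All that's left is to show that $\langle x_\varepsilon \rangle = \langle f_\varepsilon(x_\varepsilon) \rangle$ also converges to $y$ along this subsequence.
$$\|x_\varepsilon - y\| = \|y_\varepsilon + x_\varepsilon - y_\varepsilon - y\| = \|y_\varepsilon + f_\varepsilon(x_\varepsilon) - y_\varepsilon - y\| = \|y_\varepsilon + P_\varepsilon(y_\varepsilon) - y_\varepsilon - y\| \le \|P_\varepsilon(y_\varepsilon) - y_\varepsilon\| + \|y_\varepsilon - y\|.$$
The first term is sufficiently small because $P_\varepsilon(y)$ approximates $y$ and the second term is sufficiently small since $\langle y_\varepsilon \rangle \to y$. Hence $\langle x_\varepsilon \rangle \to y$. Since $f$ is continuous, $f(x_\varepsilon) \to f(y)$. Combining this with (6.1) and $f(x_\varepsilon) = y_\varepsilon$, we have $f(y) = y$, or that $y$ is a fixed point of $f$.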
Schauder's Fixed Point Theorem requires compactness of a subset $K$ of the function space $C(X)$. We will now state it in a slightly different form that is more suitable for applications in function spaces (i.e. the assumptions of the following theorem are easier to verify).
6.2. CLASSICAL BANACH SPACES: L
P
229
Theorem 476 Let $F \subset C(X)$ be a nonempty, closed, bounded, and convex set with $X$ compact. If the mapping $T : F \to F$ is continuous and if the family $T(F)$ is equicontinuous, then $T$ has a fixed point in $F$.
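Proof. $T(F) \subset F$ by assumption. Set $H = co\left(\overline{T(F)}\right)$ (i.e. $H$ is the convex hull of the closure of $T(F)$). By definition $H$ is closed and convex, and $H \subset F$ since $F$ is closed and convex, so $H$ is bounded and $T(H) \subset T(F) \subset H$. If we show that $H$ is equicontinuous we are done, since then by Ascoli's Theorem 458 $H$ is compact and $T$ is continuous, so that by Schauder's Theorem 475 $T : H \to H$ has a fixed point.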
We need to show that if $T(F)$ is equicontinuous, then $co\left(\overline{T(F)}\right)$ is equicontinuous. Let $f \in co\left(\overline{T(F)}\right)$. Then $f = \sum_{i=1}^{k} \lambda_i f^i$ such that $f^i \in \overline{T(F)}$, $\lambda_i > 0$, and $\sum_{i=1}^{k} \lambda_i = 1$ (which obviously implies $\lambda_i \le 1$ for $i=1,\dots,k$). $f^i \in \overline{T(F)}$ implies that $\exists \langle f^i_n \rangle_{n=1}^{\infty} \to f^i$ where $f^i_n \in T(F)$. Now
$$\|f(x) - f(y)\| = \left\|\sum_{i=1}^{k} \lambda_i f^i(x) - \sum_{i=1}^{k} \lambda_i f^i(y)\right\| \le \sum_{i=1}^{k} \left\|f^i(x) - f^i(y)\right\|$$
$$\le \sum_{i=1}^{k} \left[\left\|f^i(x) - f^i_n(x)\right\| + \left\|f^i_n(x) - f^i_n(y)\right\| + \left\|f^i_n(y) - f^i(y)\right\|\right].$$
This expression is arbitrarily small for $x,y$ close because it is the sum of finitely ($k$) many expressions which are arbitrarily small. In particular, the first and third terms are arbitrarily small because $\langle f^i_n \rangle \to f^i$, and the second term is sufficiently small because $f^i_n \in T(F)$ and $T(F)$ is uniformly equicontinuous.
6.2 Classical Banach spaces: $L^p$
In the previous section we analysed the space of all bounded continuous functions $f : X \to \mathbb{R}$ equipped with the (sup) norm $\|f\|_\infty = \sup\{|f(x)|, x \in X\}$. There we showed that $(BC(X,\mathbb{R}), \|\cdot\|_\infty)$ is complete.
There are some potential problems using this normed vector space. Convergence with respect to the sup norm in the set $BC(X,\mathbb{R})$ is uniform convergence (by Theorem 450), which is quite restrictive. For example, the sequence $\langle f_n(x) \rangle = \langle x^n \rangle$ on $X = [0,1]$ is not convergent in the space $C([0,1])$. That is, in Example 455 we showed that $\langle x^n \rangle$ does not converge uniformly. We also mentioned that a metric (and hence a norm) that would induce pointwise convergence does not exist.
Does there exist a norm on the set $C([0,1])$ for which $\langle x^n \rangle$ would be convergent? Since $x \in [0,1]$, $\langle x^n \rangle$ is bounded and $x^n \to 0$ pointwise a.e. (i.e. except at $x = 1$). The sequence $\left\langle \int_{[0,1]} x^n \right\rangle$ also converges (to 0) since
$$\lim_{n \to \infty} \|x^n - 0\|_1 = \lim_{n \to \infty} \int_{[0,1]} x^n = \int_{[0,1]} \lim_{n \to \infty} x^n = \int_{[0,1]} 0 = 0$$
where the second equality follows from the Bounded Convergence Theorem 386. Thus $\langle x^n \rangle$ on $[0,1]$ converges with respect to the norm $\|\cdot\|_1$ to $f = 0$.
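The contrast between the two norms on $\langle x^n \rangle$ can be seen with a short computation. The following is a minimal sketch: the helper name `l1_norm`, the midpoint rule, and the number of cells are assumptions made for illustration. It approximates $\|x^n\|_1 = \int_{[0,1]} x^n$, whose exact value is $\frac{1}{n+1}$, while the sup norm of $x^n$ equals 1 for every $n$.

```python
def l1_norm(f, a=0.0, b=1.0, cells=100_000):
    """Midpoint-rule approximation of the integral of |f| over [a, b]."""
    h = (b - a) / cells
    return sum(abs(f(a + (i + 0.5) * h)) for i in range(cells)) * h


# Compare the numerical 1-norm of x^n with the exact value 1 / (n + 1).
for n in (1, 10, 100):
    approx = l1_norm(lambda x, n=n: x ** n)
    print(n, approx, 1.0 / (n + 1))
```

The 1-norm tends to zero like $1/(n+1)$, so $x^n \to 0$ in $\|\cdot\|_1$ even though the sup-norm distance to the zero function stays equal to 1.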
While we have defined a norm on $C(X)$ that does not require strong convergence restrictions on a given sequence, we must establish whether $C(X)$ equipped with $\|\cdot\|_1$ is complete. The next example shows this is not the case.
Example 477 Take $C([-1,1])$ with $f_n : [-1,1] \to \mathbb{R}$ given by
$$f_n(x) = \begin{cases} 1 & \text{if } x \in [-1,0] \\ 1 - nx & \text{if } x \in (0,\tfrac{1}{n}) \\ 0 & \text{if } x \in [\tfrac{1}{n},1] \end{cases}.$$
See Figure 6.2.1. The sequence $\langle f_n(x) \rangle$ is Cauchy. To see this, we must show $\|f_n(x) - f_m(x)\|_1 \to 0$, with $n \ge m$ and $m$ sufficiently large. Is it convergent in $C([-1,1])$ with respect to the norm $\|\cdot\|_1$? Let $f(x)$ be its limit. Then we must show that
$$\|f_n - f\|_1 = \int_{[-1,1]} |f_n(x) - f(x)|\,dx = \int_{[-1,0]} |1 - f(x)|\,dx + \int_{(0,\frac{1}{n})} |1 - nx - f(x)|\,dx + \int_{[\frac{1}{n},1]} |0 - f(x)|\,dx$$
vanishes as $n \to \infty$. Since all the integrands on the right hand side are nonnegative, so is each integral. Hence $\|f_n - f\|_1 \to 0$ would imply that each integral on the right hand side approaches zero as $n \to \infty$. Consequently
$$\lim_{n \to \infty} \int_{[-1,0]} |1 - f(x)|\,dx = 0 \quad \text{and} \quad \lim_{n \to \infty} \int_{[\frac{1}{n},1]} |0 - f(x)|\,dx = 0$$
implies
$$f(x) = \begin{cases} 1 & \text{if } x \in [-1,0] \\ 0 & \text{if } x \in (0,1] \end{cases}.$$
But then $f(x)$ is not continuous on $[-1,1]$ and hence $f(x) \notin C([-1,1])$, so that $\langle f_n(x) \rangle$ is not convergent. This proves that $(C([-1,1]), \|\cdot\|_1)$ is not a complete normed vector space.
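The Cauchy property claimed in Example 477 can be made explicit. For $n \ge m$ the functions differ only on $(0,\tfrac{1}{m})$, where $f_m \ge f_n$, and each ramp has area $\int_0^{1/n} (1 - nx)\,dx = \frac{1}{2n}$, so $\|f_n - f_m\|_1 = \frac{1}{2m} - \frac{1}{2n}$. The following minimal sketch checks this numerically; the helper names `f` and `l1_distance`, the midpoint rule, and the number of cells are assumptions made for illustration.

```python
def f(n, x):
    """The function f_n from Example 477, defined on [-1, 1]."""
    if x <= 0.0:
        return 1.0
    if x < 1.0 / n:
        return 1.0 - n * x
    return 0.0


def l1_distance(n, m, cells=200_000):
    """Midpoint-rule approximation of the integral of |f_n - f_m| over [-1, 1]."""
    h = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * h
        total += abs(f(n, x) - f(m, x))
    return total * h


# Numerical distance next to the closed form 1/(2m) - 1/(2n).
for n, m in ((40, 10), (400, 100)):
    print(n, m, l1_distance(n, m), 1.0 / (2 * m) - 1.0 / (2 * n))
```

Since $\frac{1}{2m} - \frac{1}{2n} \le \frac{1}{2m}$, the distance is small once $m$ is large, uniformly in $n \ge m$.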
To summarize, we have seen that the function space $C(X)$ can be equipped with two norms: the sup norm and $\|\cdot\|_p$ with $p = 1$. In the former case, $(C(X), \|\cdot\|)$ is complete but in the latter case, $(C(X), \|\cdot\|_1)$ is not complete. Now we will introduce the space of all $\mathcal{L}$-measurable functions $f : X \to \mathbb{R}$ that are $p$-integrable. We will show that this space, known as the $L^p$ space, is the completion of $C(X)$ with respect to the $\|\cdot\|_p$ norm (just as, for instance, $\mathbb{R}$ was the completion of $\mathbb{Q}$). Consider, then, the measure space $(\mathbb{R}, \mathcal{L}, m)$ where $\mathcal{L}$ is the $\sigma$-algebra of all Lebesgue measurable sets and $m$ is the Lebesgue measure. While we will work here with $(\mathbb{R}, \mathcal{L}, m)$, the analysis can be extended to more general measure spaces $(X, \mathcal{X}, \mu)$.
Definition 478 For any $p \in [1,\infty)$, we define $L^p(X)$ with $X \subset \mathbb{R}$ to be the space of all $\mathcal{L}$-measurable functions $f : X \to \mathbb{R}$ such that $\int_X |f(x)|^p\,dx < \infty$, and $L^\infty(X)$ to be the space of all essentially bounded $\mathcal{L}$-measurable functions (i.e. functions which are bounded almost everywhere; see Figure 6.2.2). Furthermore, define the function $\|\cdot\|_p : L^p(X) \to \mathbb{R}$ as
$$\|f\|_p = \begin{cases} \left(\int_X |f|^p\right)^{\frac{1}{p}} & p \in [1,\infty) \\ \operatorname{ess\,sup} |f| & p = \infty \end{cases}.$$
We shall establish that $\|\cdot\|_p$ defines a norm on $L^p(X)$. For $p \in [1,\infty)$, this norm is called the $L^p$-norm or simply the $p$-norm or Lebesgue norm. To show that $\|\cdot\|_p$ satisfies the triangle inequality property required of a norm, we use the same procedures as we used in $\ell^p$ spaces in Chapter 4.
Theorem 479 (Riesz-Hölder Inequality) Let $p,q$ be nonnegative conjugate real numbers (i.e. $\frac{1}{p} + \frac{1}{q} = 1$). If $f \in L^p(X)$ and $g \in L^q(X)$, then $fg \in L^1$ and $\int_X |fg| \le \|f\|_p \|g\|_q$, with equality iff $\alpha |f|^p = \beta |g|^q$ a.e., where $\alpha,\beta$ are nonzero constants.
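Proof. When $p = 1$, then $q = \infty$. Take $f \in L^1$ and $g \in L^\infty$. Then $g$ is bounded a.e. so that $|g| \le M$ a.e. with $M = \|g\|_\infty$. Now $|fg| \le M|f|$ a.e. so that $fg \in L^1$. Integrating we have
$$\int_X |fg| \le M \int_X |f| = \|f\|_1 \|g\|_\infty.$$
Next assume $p,q \in (1,\infty)$. If either $f = 0$ a.e. or $g = 0$ a.e., we have equality. Suppose instead that neither $f$ nor $g$ is zero a.e. Substituting $a = \frac{|f(x)|}{\|f\|_p}$ and $b = \frac{|g(x)|}{\|g\|_q}$ in Lemma 235, we have
$$\frac{|f(x)g(x)|}{\|f\|_p \|g\|_q} \le \left(\frac{1}{p}\right) \frac{|f(x)|^p}{\left(\|f\|_p\right)^p} + \left(\frac{1}{q}\right) \frac{|g(x)|^q}{\left(\|g\|_q\right)^q}.$$
Then $fg \in L^1$ and by integrating we get
$$\frac{\int_X |f(x)g(x)|}{\|f\|_p \|g\|_q} \le \left(\frac{1}{p}\right) \frac{\int_X |f(x)|^p}{\left(\|f\|_p\right)^p} + \left(\frac{1}{q}\right) \frac{\int_X |g(x)|^q}{\left(\|g\|_q\right)^q} = \frac{1}{p} + \frac{1}{q} = 1$$
or $\int |fg| \le \|f\|_p \|g\|_q$. By Lemma 235 equality holds when $a^p = b^q$, which means
$$\left(\|g\|_q\right)^q |f|^p = \left(\|f\|_p\right)^p |g|^q.$$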
Now we need to show that $\|\cdot\|_p$ satisfies the triangle inequality property of a norm.
Theorem 480 (Riesz-Minkowski) For $p \in [1,\infty]$ and $f,g \in L^p$, $\|f + g\|_p \le \|f\|_p + \|g\|_p$.
Proof. For $p = 1$ and $p = \infty$, it follows trivially from $|f + g| \le |f| + |g|$. Let $p \in (1,\infty)$ and let $h = |f + g|^{p-1}$. Since $p - 1 = \frac{p}{q}$, it follows that $h \in L^q$ and $\left(\|h\|_q\right)^q = \int |f + g|^p = \left(\|f + g\|_p\right)^p$. Now
$$\left(\|f + g\|_p\right)^p = \int |f + g|^{1 + \frac{p}{q}} = \int |f + g||f + g|^{\frac{p}{q}} \le \int |f|h + \int |g|h \le \left(\|f\|_p + \|g\|_p\right) \|h\|_q = \left(\|f\|_p + \|g\|_p\right) \left(\|f + g\|_p\right)^{\frac{p}{q}}$$
where the second inequality follows from Theorem 479. Since $p - \frac{p}{q} = 1$, dividing both sides by $\left(\|f + g\|_p\right)^{\frac{p}{q}} \ne 0$, we have $\|f + g\|_p \le \|f\|_p + \|g\|_p$. Finally, if $\|f + g\|_p = 0$, then the inequality holds trivially.
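Both inequalities can be sanity-checked numerically. The following is a minimal sketch under stated assumptions: the integrals over $X = (0,1)$ are replaced by midpoint sums on a uniform grid, and the helper names `lp_norm`, `holder_gap` and `minkowski_gap` as well as the sample functions are chosen here for illustration. Because weighted finite sums are themselves integrals with respect to a discrete measure, the inequalities hold exactly for the discretised quantities (up to rounding).

```python
def lp_norm(values, p, h):
    """Discrete L^p norm: (sum of |v|^p * h)^(1/p) on a grid with cell width h."""
    return (sum(abs(v) ** p for v in values) * h) ** (1.0 / p)


def holder_gap(f, g, p, h):
    """||f||_p * ||g||_q minus the integral of |f g|, with q conjugate to p."""
    q = p / (p - 1.0)
    integral = sum(abs(a * b) for a, b in zip(f, g)) * h
    return lp_norm(f, p, h) * lp_norm(g, q, h) - integral


def minkowski_gap(f, g, p, h):
    """||f||_p + ||g||_p minus ||f + g||_p."""
    total = [a + b for a, b in zip(f, g)]
    return lp_norm(f, p, h) + lp_norm(g, p, h) - lp_norm(total, p, h)


cells = 10_000
h = 1.0 / cells
xs = [(i + 0.5) * h for i in range(cells)]
f_vals = [x ** 0.5 for x in xs]        # f(x) = sqrt(x)
g_vals = [1.0 - 2.0 * x for x in xs]   # g(x) = 1 - 2x
g_equal = [x for x in xs]              # |g|^q = |f|^p when p = 3, q = 3/2

print(holder_gap(f_vals, g_vals, 3.0, h))    # nonnegative
print(holder_gap(f_vals, g_equal, 3.0, h))   # approximately zero (equality case)
print(minkowski_gap(f_vals, g_vals, 3.0, h)) # nonnegative
```

The second call illustrates the equality condition of Theorem 479: with $p = 3$ and $q = \tfrac{3}{2}$, $|g|^q = x^{3/2} = |f|^p$, so the gap vanishes.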
As in $L^1$ considered in section 5.4, we stress that the function $\|\cdot\|_p$ satisfies all properties of a norm except the zero property (i.e. $\|f\|_p = 0$ does not imply $f = 0$ everywhere). Using equivalence classes of functions rather than functions themselves, it can be shown as in the previous section that $\|f\|_p$ is a norm on $L^p$.
Again, the most important question we must ask of our new normed vector space is "Is it complete?" The next theorem provides the answer.
Theorem 481 (Riesz-Fischer) For $p \in [1,\infty]$, $(L^p, \|\cdot\|_p)$ is a complete normed vector space (i.e. a Banach space).
Proof. (Sketch) The proof for $p = 1$ was already given in Theorem 443. The proof for $p \in (1,\infty)$ is virtually identical. Finally, let $p = \infty$ and let $\langle f_n \rangle$ be a Cauchy sequence in $L^\infty$. For $x \in X$,
$$|f_k(x) - f_n(x)| \le \|f_k - f_n\|_\infty \qquad (6.2)$$
except on a set $A_{k,n} \subset X$ with $mA_{k,n} = 0$ by Definition 368 of the essential supremum. If $A = \cup_{k,n} A_{k,n}$, then $mA = 0$ and $|f_k(x) - f_n(x)| \le \|f_k - f_n\|_\infty$, $\forall k,n \in \mathbb{N}$ with $k > n$ and $\forall x \in X \backslash A$. Since $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$, there exists a bounded function $f(x)$ that $\langle f_n(x) \rangle$ converges to, $\forall x \in X \backslash A$. Moreover this convergence is uniform outside $A$, as (6.2) indicates.
Now we would like to establish how $L^p$ spaces are related to one another and also how they are related to the set of continuous functions $C(X,Y)$. Before doing that, however, we present an example which shows that continuity does not guarantee that a function is an element of $L^p$.
Example 482 Let $f : (0,1) \to \mathbb{R}$ be given by $f(x) = \frac{1}{x}$. The function $f$ is continuous on $(0,1)$ but is not $p$-integrable for any $p \ge 1$. Hence $f \in C((0,1),\mathbb{R})$ but $f \notin L^p((0,1))$.
Lemma 483 $BC(X,\mathbb{R}) \subset L^\infty(X)$.
Proof. If a function is bounded and continuous, it must be essentially bounded.
Note that from Definition 478, if the function is continuous, then the ess sup is just the sup. That is, if $f \in BC(X)$, then $\|f\|_\infty$ is the extension (in the sense of Definition 56) from $BC(X)$ to $L^\infty(X)$ of the sup norm. This also justifies why we used the prior notation $\|\cdot\|_\infty$ for the sup norm. Thus convergence in $L^\infty(X)$ is equivalent to uniform convergence outside a set of measure zero. Note that if $X$ is compact, then $C(X) \subset L^\infty(X)$.
If we make additional assumptions about the domain $X$, however, there are inclusion relations among the $L^p(X)$ spaces and their associated norms.
Theorem 484 If $m(X) < \infty$, then for $1 < p < q < \infty$ we have $L^\infty(X) \subset L^q(X) \subset L^p(X) \subset L^1(X)$ and $\|f\|_1 \le c_1 \|f\|_p \le c_2 \|f\|_q \le c_3 \|f\|_\infty$, where $c_i$ are constants which are independent of $f$.
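Proof. $L^\infty(X) \subset L^q(X)$ for $1 < q < \infty$ and $m(X) < \infty$ since, if $f$ is bounded a.e. (i.e. $f \in L^\infty$) and measurable, then since $mX$ is finite we have that $|f|^q$ is integrable by Theorem 382.
Assume $1 < p < q < \infty$. Let $f \in L^q$. Then $|f|^p \in L^{\frac{q}{p}}$. Set $\lambda = \frac{q}{p}$. Since $q > p$, we have $\lambda > 1$. Choose $\mu$ such that $\frac{1}{\lambda} + \frac{1}{\mu} = 1$. Then
$$\int |f|^p = \int |f|^p \cdot 1 \le \left(\int |f|^{p\lambda}\right)^{\frac{1}{\lambda}} \cdot \left(\int 1\right)^{\frac{1}{\mu}} = \left(\int |f|^q\right)^{\frac{p}{q}} \cdot (mX)^{\frac{1}{\mu}} < \infty$$
where the first inequality follows from Hölder's Inequality (Theorem 479). Taking the $p$-th root of both sides of the above inequality we obtain $\|f\|_p \le [m(X)]^{\frac{1}{p\mu}} \|f\|_q$. Hence $f \in L^p(X)$.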
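The constant produced by the proof can be checked on an example. Since $\lambda = \frac{q}{p}$ and $\frac{1}{\mu} = 1 - \frac{p}{q}$, the exponent is $\frac{1}{p\mu} = \frac{1}{p} - \frac{1}{q}$. The following is a minimal sketch under stated assumptions: $X = (0,4)$, the test function $x^2 - 3$, the exponents, and the helper names `lp_norm` and `embedding_constant` are all chosen here for illustration, and integrals are replaced by midpoint sums.

```python
def lp_norm(values, p, h):
    """Discrete L^p norm: (sum of |v|^p * h)^(1/p) on a grid with cell width h."""
    return (sum(abs(v) ** p for v in values) * h) ** (1.0 / p)


def embedding_constant(measure, p, q):
    """[m(X)]^(1/(p*mu)), where lambda = q/p and 1/lambda + 1/mu = 1."""
    lam = q / p
    mu = lam / (lam - 1.0)
    return measure ** (1.0 / (p * mu))


measure = 4.0                      # X = (0, 4), so m(X) = 4
cells = 20_000
h = measure / cells
values = [((i + 0.5) * h) ** 2 - 3.0 for i in range(cells)]

p, q = 1.5, 4.0
lhs = lp_norm(values, p, h)
rhs = embedding_constant(measure, p, q) * lp_norm(values, q, h)
print(lhs, rhs)                    # lhs should not exceed rhs
```

Equality would require $|f|$ to be constant a.e., which is why the gap is strictly positive for this test function.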
Note, for instance, that the proof gives a constructive way to obtain the constant $[m(X)]^{\frac{1}{p\mu}}$ relating $\|f\|_p$ and $\|f\|_q$. Thus stating that for $p < q$, $L^q(X) \subset L^p(X)$ means that if $f$ is $q$-integrable, then $f$ is also $p$-integrable and $\|f\|_p \le c\|f\|_q$ where $c$ is some constant. This inequality implies that if a sequence $\langle f_n \rangle$ converges in $L^q(X)$, then $\langle f_n \rangle \subset L^p(X)$ and converges also in $L^p(X)$. Note also that in this theorem we compare normed vector spaces with different norms.
Putting the two previous results together we have the following result.
Corollary 485 If $m(X) < \infty$, $BC(X) \subset L^p(X)$.
Example 486 The inclusions in Theorem 484 are strict. For instance, let $1 \le q < \infty$ and take $f : (0,1) \to \mathbb{R}$ given by $f(x) = \frac{1}{x^{1/q}}$. Then $f \in L^p((0,1))$ for $p < q$ but $f \notin L^q((0,1))$. In particular, take $f(x) = \frac{1}{\sqrt{x}}$. Then $\int_{(0,1)} \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\big|_0^1 = 2$ so $f \in L^1((0,1))$, but $\int_{(0,1)} \left(\frac{1}{\sqrt{x}}\right)^2 dx = \int_{(0,1)} \frac{1}{x}\,dx = \ln(x)\big|_0^1 = \infty$ so $f \notin L^2((0,1))$.
Example 487 The assumption that $m(X) < \infty$ is important. Take $f : [1,\infty) \to \mathbb{R}$ given by $f(x) = \frac{1}{x^{1/p}}$ with $1 \le p < \infty$. Then $f \in L^q([1,\infty))$ if $q > p$ but $f \notin L^p([1,\infty))$. In particular, $f(x) = \frac{1}{x} \in L^2([1,\infty))$ but $f \notin L^1([1,\infty))$.
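Comparing Theorem 239 (in $\ell^p$) and Theorem 484 (in $L^p$), one may wonder why the order of $\ell^p$ spaces is exactly opposite that of $L^p(X)$ spaces with $m(X) < \infty$. That is, for $1 < p < q < \infty$
$$\ell^1 \subset \ell^p \subset \ell^q \subset \ell^\infty$$
$$L^1 \supset L^p \supset L^q \supset L^\infty.$$
$\ell^p$ spaces are spaces of sequences and we know that a sequence is just a function $f : \mathbb{N} \to \mathbb{R}$; that is, a function defined on an unbounded set. If $\langle x_i \rangle \in \ell^p$, then $\sum_{i=1}^{\infty} |x_i|^p < \infty$. This infinite sum can only be finite if $|x_i|^p$ decreases "rapidly enough" to zero. Now if $p < q$, then $|x_i|^q$ decreases "more rapidly" than $|x_i|^p$ (i.e. $|x_i|^q < |x_i|^p$ once $|x_i| < 1$). Hence if $\langle x_i \rangle \in \ell^p$, then $\langle x_i \rangle \in \ell^q$. In the case of $L^p(X)$ with $mX < \infty$, while $X$ is bounded, $f : X \to \mathbb{R}$ may not be bounded. If $f \in L^q(X)$, then $\int_X |f|^q < \infty$. For $p < q$, on the set where $|f| > 1$ we have $|f|^p < |f|^q$, while on the set where $|f| \le 1$ we have $|f|^p \le 1$, which is integrable because $mX < \infty$. Hence $\int_X |f|^p < \infty$ and $f \in L^p(X)$.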
6.2.1 Additional Topics in $L^p(X)$
Approximation in $L^p(X)$
For $L^p(X)$, $p \in (1,\infty)$, we have the following result which is similar to Theorem 445 in $L^1(X)$.
Theorem 488 Let $1 < p < \infty$, $X \subset \mathbb{R}$, $f \in L^p(X)$ and $\varepsilon > 0$. Then (i) there is an integrable simple function $\varphi$ such that $\|f - \varphi\|_p < \varepsilon$; and (ii) there is a continuous function $g$ such that $g$ vanishes ($g = 0$) outside some bounded interval and such that $\|f - g\|_p < \varepsilon$.
Exercise 6.2.1 Prove Theorem 488. Hint: See Carothers p. 350.
Note that here $X$ can be equal to $\mathbb{R}$ so that the theorem also covers sets of infinite measure.
Corollary 489 The set of all integrable simple functions is dense in $L^p(X)$. The set of all continuous functions vanishing outside a bounded interval is dense in $L^p(X)$.
Now consider the case where $p = \infty$. Let $f \in L^\infty(X)$, in which case $f$ is bounded a.e. on $X$ (i.e. there is a set $E$ such that $m(E) = 0$ and $f$ is bounded on $X \backslash E$). Then by Theorem 367, there exists a sequence of simple functions $\langle \varphi_n \rangle$ converging uniformly to $f$ on $X \backslash E$. In other words, $\varphi_n \to f$ uniformly a.e. on $X$. Thus, $\varphi_n \to f$ in $L^\infty(X)$.
Corollary 490 The simple functions are dense in $L^\infty(X)$.
If $m(X) < \infty$, then any simple function is integrable. Thus we have:
Corollary 491 If $m(X) < \infty$, then the integrable simple functions are dense in $L^\infty(X)$.
Notice that the condition $m(X) < \infty$ is critical here. For example, $f = 1 \in L^\infty(\mathbb{R})$ cannot be approximated by an integrable simple function.
Separability of $L_p(X)$
If $X$ is compact, then the set $P_{\mathbb{Q}}(X)$ of all polynomials with rational coefficients is dense in $C(X)$, and because $C(X)$ is dense in $L_p(X)$ (by Corollary 474), $P_{\mathbb{Q}}(X)$ is also dense in $L_p(X)$. Thus $L_p(X)$ with $X$ compact is separable.
If $X$ is not compact, then as we showed for $L_1(X)$, the set $M$ of all finite linear combinations of the form $\sum_{i=1}^{n} c_i \chi_{I_i}$, where the $c_i$ are rational numbers and the $I_i$ are intervals with rational endpoints, is a countable dense set in $L_p(X)$.
Theorem 492 Corollary 493 $L_p(X)$ is separable for $1 < p < \infty$.
Corollary 494 $L_\infty(X)$ is not separable for any $X$ (either compact or not).
Proof. Take two bounded functions $\chi_{[a,c]}$ and $\chi_{[a,d]}$. Since $\|\chi_{[a,c]} - \chi_{[a,d]}\|_\infty = 1$ for $c \ne d$, then
$$B_{1/2}(\chi_{[a,c]}) \cap B_{1/2}(\chi_{[a,d]}) = \emptyset \quad \text{for } c \ne d,$$
where
$$B_{1/2}(\chi_{[a,c]}) = \left\{ f \in L_\infty : \|f - \chi_{[a,c]}\|_\infty < \tfrac{1}{2} \right\}.$$
Let $F$ be an arbitrary set which is dense in $L_\infty([a,b])$. Then for each $c$ with $a < c < b$ there is a function $f_c \in F$ such that $\|\chi_{[a,c]} - f_c\|_\infty < \frac{1}{2}$, since $\chi_{[a,c]} \in L_\infty(X)$ and $F$ is dense in $L_\infty(X)$. Because the balls are disjoint, $f_c \ne f_d$ for $c \ne d$, and since there are uncountably many real numbers in $(a,b)$, $F$ must be uncountable.
6.2.2 Hilbert Spaces ($L_2(X)$)
As we mentioned in Chapter 4, a Hilbert space is a Banach space equipped with an inner product. Hence, a Hilbert space is a special type of Banach space which possesses an additional structure: an inner product. This additional structure allows us, apart from measuring the length of vectors (norms), to measure angles between vectors. In particular, it enables us to introduce the notion of orthogonality for two vectors.
Definition 495 We say that two vectors $x$ and $y$ of $H$ are orthogonal (perpendicular) if their inner product $\langle x, y \rangle = 0$, and we denote it $x \perp y$. The set $N \subset H$ is called an orthogonal set (or orthogonal system) if any two different elements $\varphi$ and $\psi$ of $N$ are orthogonal, that is $\langle \varphi, \psi \rangle = 0$. An orthogonal set $N$ is called orthonormal if it is orthogonal and $\|\varphi\| = 1$ for each $\varphi$ in $N$.
Example 496 $\mathbb{R}^n$ with the inner product defined by $\langle x, y \rangle = x_1 y_1 + \dots + x_n y_n = \sum_{i=1}^{n} x_i y_i$ is a Hilbert space. The set $N = \{e_i, i = 1, \dots, n\}$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, is orthonormal.
Example 497 $\ell_2$ with the inner product defined by $\langle x, y \rangle = \sum_{i=1}^{\infty} x_i y_i$, where $x = \langle x_i \rangle$, $y = \langle y_i \rangle$, is a Hilbert space. The set $N = \{e_i, i \in \mathbb{N}\}$ is orthonormal.
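Example 498 $L_2([0,2\pi])$ with inner product defined by $\langle f, g \rangle = \int_0^{2\pi} f(x) g(x)\,dx$, where $f, g \in L_2([0,2\pi])$, is a Hilbert space. The set
$$N = \left\{ \frac{1}{\sqrt{2\pi}},\ \frac{\cos nx}{\sqrt{\pi}},\ \frac{\sin nx}{\sqrt{\pi}},\ n \in \mathbb{N} \right\}$$
is an orthonormal set.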
Exercise 6.2.2 Show that N in Example 498 is an orthonormal system.
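Notice that the distance between any two distinct elements of an orthonormal system is $\sqrt{2}$. That is, $\|\varphi - \psi\|^2 = \langle \varphi - \psi, \varphi - \psi \rangle = \langle \varphi, \varphi \rangle - \langle \varphi, \psi \rangle - \langle \psi, \varphi \rangle + \langle \psi, \psi \rangle = \|\varphi\|^2 + \|\psi\|^2 = 1 + 1 = 2$.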
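The orthonormality of the trigonometric system and the $\sqrt{2}$ distance can be checked numerically. The following is only an illustrative sketch, not a proof: it approximates the $L_2([0,2\pi])$ inner product with a midpoint rule, and the helper names (`inner`, `cos_n`, `sin_n`, `dist`) as well as the choice of frequencies 1 to 3 are assumptions made for the illustration.

```python
import math

def inner(f, g, n=2000):
    """Midpoint-rule approximation of <f, g> = integral_0^{2 pi} f(x) g(x) dx."""
    h = 2 * math.pi / n
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(n))

def const(x):
    # The constant element 1 / sqrt(2 pi) of the system.
    return 1 / math.sqrt(2 * math.pi)

def cos_n(n):
    return lambda x: math.cos(n * x) / math.sqrt(math.pi)

def sin_n(n):
    return lambda x: math.sin(n * x) / math.sqrt(math.pi)

# A finite piece of the system N (frequencies 1, 2, 3 chosen arbitrarily).
system = [const] + [cos_n(n) for n in (1, 2, 3)] + [sin_n(n) for n in (1, 2, 3)]

# Gram matrix: should be (approximately) the identity for an orthonormal set.
gram = [[inner(f, g) for g in system] for f in system]

def dist(f, g):
    """L_2 distance ||f - g|| computed from the inner product."""
    diff = lambda x: f(x) - g(x)
    return math.sqrt(inner(diff, diff))

# Distance between two distinct elements, here cos(x)/sqrt(pi) and sin(x)/sqrt(pi).
d = dist(system[1], system[4])
```

Under these assumptions `gram` should agree with the identity matrix up to rounding, and `d` should be close to $\sqrt{2}$.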
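Lemma 499 If $H$ is a separable Hilbert space, then each orthonormal set is countable.
Proof. Suppose, to the contrary, that $U = \{e_\alpha, \alpha \in A\}$, where $A$ is an index set, is an uncountable orthonormal set in $H$. Since distinct elements of $U$ are at distance $\sqrt{2}$ from each other, the balls of radius $\frac{1}{2}$ around the elements $e_\alpha$ (i.e. $\{B_{1/2}(e_\alpha), \alpha \in A\}$) would be an uncountable collection of disjoint balls. Any dense subset of $H$ must contain a point of each ball, so it would be uncountable, and hence $H$ could not be separable.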
Definition 500 An orthonormal set $\{e_\alpha, \alpha \in A\}$ is said to be complete if it is maximal.
In other words, it is not possible to adjoin an additional element $e \in H$ with $e \ne 0$ to $\{e_\alpha, \alpha \in A\}$ such that $\{e, e_\alpha, \alpha \in A\}$ is an orthonormal set in $H$. The existence of a complete orthonormal set in any Hilbert space $H$ is guaranteed by Zorn's lemma, because the collection $\{N\}$ of all orthonormal sets in $H$ is partially ordered by set inclusion and the union of any chain of orthonormal sets is again orthonormal. Thus we have the following.
Theorem 501 Every separable Hilbert space contains a countable complete
orthonormal system.
The following theorem can be used to check if the orthonormal set is
complete.
Theorem 502 $\{e_\alpha, \alpha \in A\}$ is a complete orthonormal set in $H$ iff $x \perp e_\alpha$, $\forall \alpha \in A$, implies $x = 0$.
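Proof. By contradiction. ($\Rightarrow$) Let $\{e_\alpha, \alpha \in A\}$ be complete and suppose $\exists x \ne 0$ in $H$ such that $x \perp e_\alpha$, $\forall \alpha \in A$. Define $e = \frac{x}{\|x\|}$ so that $\|e\| = 1$. Hence $\{e, e_\alpha, \alpha \in A\}$ is orthonormal, which contradicts the assumption that $\{e_\alpha, \alpha \in A\}$ is complete (maximal).
($\Leftarrow$) Assume that $x \perp e_\alpha$, $\forall \alpha \in A \Rightarrow x = 0$ and that $\{e_\alpha, \alpha \in A\}$ is not complete. Then $\exists e \in H$ such that $\{e, e_\alpha, \alpha \in A\}$ is an orthonormal system and $e \notin \{e_\alpha, \alpha \in A\}$. Since $e \perp e_\alpha$, $\forall \alpha \in A$, and $e \ne 0$ (because $\|e\| = 1$), the assumption is contradicted.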
Exercise 6.2.3 Show that the orthonormal systems in $\mathbb{R}^n$, $\ell_2$, and $L_2([0,2\pi])$ defined in Examples 496 to 498 are complete.
Consider now a separable Hilbert space $H$ and let $\{e_i\}$ be an orthonormal system in $H$. We know that $\{e_i\}$ is either a finite or a countably infinite set. We define the Fourier coefficients with respect to $\{e_i\}$ of an element $x \in H$ to be $a_i = \langle x, e_i \rangle$.
Theorem 503 (Bessel's Inequality) Let $\{e_i\}$ be an orthonormal system in $H$ and let $x \in H$. Then
$$\sum_{i=1}^{I} a_i^2 \le \|x\|^2,$$
where $a_i = \langle x, e_i \rangle$ are the Fourier coefficients of $x$ and $I = N$ if $\{e_i\}$ is finite (with $N$ elements) or $I = \infty$ otherwise.
Proof. For any $n$,
$$0 \le \Big\| x - \sum_{i=1}^{n} a_i e_i \Big\|^2 = \|x\|^2 - 2 \sum_{i=1}^{n} a_i \langle x, e_i \rangle + \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \langle e_i, e_j \rangle = \|x\|^2 - \sum_{i=1}^{n} a_i^2.$$
Thus $\sum_{i=1}^{n} a_i^2 \le \|x\|^2$, and since $n$ was arbitrary, we have $\sum_{i=1}^{\infty} a_i^2 \le \|x\|^2$.
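Bessel's inequality can be illustrated in $\mathbb{R}^3$ with an orthonormal system that is deliberately incomplete. The sketch below is a hedged illustration: the vectors `e1`, `e2`, `e3` and the point `x` are arbitrary choices made for the example, and `dot` is a helper defined here, not notation from the text.

```python
import math

def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Two orthonormal vectors in R^3: an orthonormal system that is NOT complete.
e1 = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
e2 = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
incomplete = (e1, e2)

x = (3.0, -1.0, 2.0)

# Fourier coefficients a_i = <x, e_i> and the two sides of Bessel's inequality.
coeffs = [dot(x, e) for e in incomplete]
bessel_lhs = sum(a * a for a in coeffs)   # sum of a_i^2
norm_sq = dot(x, x)                       # ||x||^2

# The gap ||x||^2 - sum a_i^2 equals the squared norm of the residual
# x - sum a_i e_i, exactly as in the proof of the inequality.
proj = tuple(sum(a * e[k] for a, e in zip(coeffs, incomplete)) for k in range(3))
resid = tuple(xk - pk for xk, pk in zip(x, proj))
gap = dot(resid, resid)

# Adjoining e3 completes the system; the inequality then becomes an equality.
e3 = (0.0, 0.0, 1.0)
parseval_lhs = bessel_lhs + dot(x, e3) ** 2
```

With these particular choices the coefficient sum is strictly smaller than $\|x\|^2$ for the incomplete system, and the two agree once `e3` is included.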
Now let $a_i$ be the Fourier coefficients of $x$ with respect to $\{e_i\}$ and let $\sum_{i=1}^{\infty} a_i^2 < \infty$ (i.e. $\sum_{i=1}^{\infty} a_i^2$ converges). Then consider a sequence $\langle z_n \rangle$ defined by
$$z_n = \sum_{i=1}^{n} a_i e_i.$$
For $m > n$,
$$z_m - z_n = \sum_{i=n+1}^{m} a_i e_i,$$
and we have $\|z_m - z_n\|^2 = \sum_{i=n+1}^{m} \sum_{j=n+1}^{m} a_i a_j \langle e_i, e_j \rangle = \sum_{i=n+1}^{m} a_i^2$. This term can be made sufficiently small for $m, n$ large enough because $\sum_{i=1}^{\infty} a_i^2$ is convergent. Hence $\langle z_n \rangle$ is a Cauchy sequence. Because $H$ is a Hilbert space (thus complete), there exists $y \in H$ such that $y = \sum_{i=1}^{\infty} a_i e_i$. Since the inner product is continuous (by the Cauchy-Schwarz inequality), we have
$$\langle y, e_i \rangle = \lim_{n \to \infty} \langle z_n, e_i \rangle = \Big\langle \sum_{j=1}^{\infty} a_j e_j, e_i \Big\rangle = a_i.$$
Thus the $a_i$ are Fourier coefficients of $y$, as well as of $x$ (which we started with). When does $x$ equal $y$? In other words, when are elements with the same Fourier coefficients equal? Let $a_i$ be the Fourier coefficients of two elements $x$ and $y$ (i.e. $a_i = \langle y, e_i \rangle = \langle x, e_i \rangle$). But this is equivalent to $0 = \langle y - x, e_i \rangle$, $\forall i = 1, 2, \dots$ This implies $x = y$ iff the orthonormal system $\{e_i\}$ is complete, by Theorem 502. Hence we proved the following:
Theorem 504 (Parseval Equality) If $\{e_i\}$ is a complete orthonormal system in a Hilbert space $H$, then for each $x \in H$, $x = \sum_{i=1}^{\infty} a_i e_i$ $\left(\text{or } x = \sum_{i=1}^{N} a_i e_i\right)$, where $a_i = \langle x, e_i \rangle$. Moreover $\|x\|^2 = \sum_{i=1}^{\infty} a_i^2$.
To summarize the previous findings, let $\{e_i\}$ be a complete orthonormal system of a Hilbert space $H$ and let $a_i = \langle x, e_i \rangle$, $i = 1, 2, \dots$ be the Fourier coefficients of $x$ with respect to $\{e_i\}$. Then the Fourier series $\sum_{i=1}^{\infty} a_i e_i$ converges to $x$ (with respect to the norm of $H$). That is,
$$\lim_{n \to \infty} \sum_{i=1}^{n} a_i e_i = \lim_{n \to \infty} \sum_{i=1}^{n} \langle x, e_i \rangle e_i = x$$
or equivalently
$$\Big\| \sum_{i=1}^{n} \langle x, e_i \rangle e_i - x \Big\| \to 0.$$
This implies that if $x, y \in H$ have the same Fourier coefficients, then $\|x - y\|_H = 0$, which means $x = y$. Depending on the space we deal with, this may mean that $x = y$ a.e.
Example 505 Add $L_2([0,2\pi])$
6.3 Linear operators
In the previous two sections on $C(X)$ and $L_p(X)$ we studied normed vector (linear) spaces whose elements were functions. In this section, we study functions that operate between two normed vector spaces. We call these functions operators (to distinguish them from the functions that are elements of the normed vector spaces).
We will focus primarily on operators that preserve the algebraic structure
of vector (linear) spaces. These functions are called linear operators. Because
normed vector spaces are also metric spaces, we will also address the issue
of how linearity relates to continuity.
Definition 506 Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed vector spaces. A function $T : X \to Y$ is called a linear operator if $T(\alpha x + \beta x') = \alpha T x + \beta T x'$, $\forall x, x' \in X$ and $\alpha, \beta \in \mathbb{R}$.
Example 507 Consider the Banach space $(C([0,1]), \|\cdot\|_\infty)$. Assume that a function $g : [0,1] \times [0,1] \to \mathbb{R}$ is continuous. Define $T : C([0,1]) \to C([0,1])$ by $(Tx)(t) = \int_{[0,1]} g(t,s) x(s)\,ds$. For instance, $g(t,s)$ could be a joint density function and $x(s) = s$. Then $Tx(t)$ is the mean of $s$ conditional on $t$. It is easy to show that $T$ is linear (due to the linearity of the integral).
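The linearity of the integral operator in Example 507 can be seen concretely on a discretized version. The sketch below is illustrative only: the kernel $g(t,s) = t + s$, the test functions, the scalars, and the midpoint grid are all assumptions chosen for the example, and the integral is replaced by a Riemann sum.

```python
import math

N = 200
# Midpoints of N equal subintervals of [0, 1], used for the Riemann sum.
grid = [(k + 0.5) / N for k in range(N)]

def g(t, s):
    # Hypothetical continuous kernel on [0,1] x [0,1].
    return t + s

def T(x):
    """Return t -> (1/N) * sum_s g(t, s) x(s), a midpoint approximation of (Tx)(t)."""
    return lambda t: sum(g(t, s) * x(s) for s in grid) / N

x = lambda s: s
y = lambda s: math.cos(s)
alpha, beta = 2.0, -3.0

# Compare T(alpha x + beta y) with alpha Tx + beta Ty at every grid point.
combo = T(lambda s: alpha * x(s) + beta * y(s))
Tx, Ty = T(x), T(y)
lin_err = max(abs(combo(t) - (alpha * Tx(t) + beta * Ty(t))) for t in grid)
```

Here `lin_err` should be at the level of floating-point rounding, reflecting that the discretized operator inherits linearity from the (finite) sum just as $T$ inherits it from the integral.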
We would like to characterize continuous linear operators. First we prove
an important fact about continuity of linear operators.
Theorem 508 Let X,Y be normed vector spaces and T : X → Y be a linear
operator. Then T is continuous on X i? T is continuous at any one element
in X.
Proof. ($\Rightarrow$) By definition.
($\Leftarrow$) Let $T$ be continuous at $x_0 \in X$ and let $x \in X$ be arbitrary. Let $\langle x_n \rangle \subset X$ and $x_n \to x$. Then $\langle x_n - x + x_0 \rangle \to x_0$. Therefore, by Theorem 248, $T(x_n - x + x_0) \to T x_0$ (because $T$ is continuous at $x_0$). But if $T(x_n - x + x_0) = T x_n - T x + T x_0 \to T x_0$ (where the equality follows from linearity of $T$), then $T x_n - T x \to 0 \Leftrightarrow T x_n \to T x$. Hence $T$ is continuous at $x$.
Here we stress that all one needs to establish continuity is that $T$ is continuous at one point. The result is a simple consequence of the linearity of the operator $T$ (just as we proved in earlier chapters that a linear function is continuous). But one should not be confused; it is not the case that all linear operators are continuous, since a linear operator may fail to be continuous at every point of $X$ (see Example 514).
Just as we considered restricting the space of all functions $F(X,Y)$ from a metric space $X$ to a metric space $Y$ to the subset $B(X,Y)$ of all bounded functions in the introduction to this chapter, now we introduce a bounded linear operator and define a new norm.
Definition 509 Let $X, Y$ be normed vector spaces and $T : X \to Y$ be a linear operator. $T$ is said to be bounded on $X$ if $\exists K \in \mathbb{R}_{++}$ such that $\|Tx\|_Y \le K \cdot \|x\|_X$, $\forall x \in X$.
We note that this type of boundedness is different from that in Definition 163. In that case, we would say $\exists M$ such that $\|f(x)\|_Y \le M$, $\forall x \in X$. The next example shows how different they are.
Example 510 Let $(X, \|\cdot\|_X) = (\mathbb{R}, |\cdot|) = (Y, \|\cdot\|_Y)$ and $T : X \to Y$ be given by $Tx = 2x$. $T$ as a linear function is not bounded on $\mathbb{R}$ with respect to $\|\cdot\|_Y$ since $2x$ can be arbitrarily large. But $T$ as a linear operator is bounded in the sense of Definition 509, since with $K = 2$ we have $\|Tx\|_Y \le K \cdot \|x\|_X \Leftrightarrow |2x| \le 2|x|$, $\forall x \in X$.
In the remainder of the book, when we say that a linear operator is bounded, we mean it in the sense of Definition 509.
The following result shows that a bounded linear operator is equivalent to a continuous operator.
Theorem 511 Let $T : X \to Y$ be a linear operator. Then $T$ is continuous iff $T$ is bounded.
Proof. ($\Leftarrow$) Assume that $T$ is bounded and let $\|x_n\|_X \to 0$. Then $\exists K$ such that $\|T x_n\|_Y \le K \|x_n\|_X \to 0$ as $n \to \infty$. But this implies $\|T x_n\|_Y \to 0$, so that $T$ is continuous at zero and hence continuous on $X$ by Theorem 508.
($\Rightarrow$) By contraposition. In particular, we will prove that if $T$ is not bounded, then $T$ is not continuous. If $T$ is not bounded, then $\forall n \in \mathbb{N}$, $\exists x_n \in X$ with $x_n \ne 0$ such that $\|T x_n\|_Y > n \|x_n\|_X$. But this implies
$$\left\| \frac{T x_n}{n \|x_n\|_X} \right\|_Y > 1.$$
Setting $y_n = \frac{x_n}{n \cdot \|x_n\|_X}$, we know $\|y_n\|_X = \frac{1}{n} \to 0$ as $n \to \infty$. But $\|T y_n\|_Y > 1$, $\forall n \in \mathbb{N}$. Thus $T y_n$ cannot converge to $0$ and $T$ is not continuous at $0$ (and hence not continuous).
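Example 512 Consider the linear operator $T : C([0,1]) \to C([0,1])$ defined in Example 507 by $(Tx)(t) = \int_{[0,1]} g(t,s) x(s)\,ds$. Since $g : [0,1] \times [0,1] \to \mathbb{R}$ is a continuous function on a compact domain, it is bounded, or $|g(x_1, x_2)| \le M_1$, $\forall (x_1, x_2) \in [0,1] \times [0,1]$. Also, $x : [0,1] \to \mathbb{R}$ is bounded by virtue of being in $C([0,1])$, or $|x(t)| \le M_2$, $\forall t \in [0,1]$, where we may take $M_2 = \|x\|_\infty$. Thus,
$$|(Tx)(t)| = \left| \int_{[0,1]} g(t,s) x(s)\,ds \right| \le M_1 \int_{[0,1]} |x(s)|\,ds \le M_1 M_2 = M_1 \|x\|_\infty,$$
so that $\|Tx\|_\infty \le M_1 \|x\|_\infty$ and $T$ is bounded in the sense of Definition 509.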
Definition 513 Let $L(X,Y)$ be the set of all linear operators $T : X \to Y$ where $X, Y$ are normed vector spaces. Let $BL(X,Y)$ be the set of all bounded linear operators in $L(X,Y)$.
The next example shows that $BL(X,Y)$ is a proper subset of $L(X,Y)$. Coupled with Theorem 511 it also shows that not all linear operators are continuous.
Example 514 Consider the normed vector space of all polynomials $P : [0,1] \to \mathbb{R}$ with the sup norm $\|\cdot\|_\infty$. Define $T : P([0,1]) \to P([0,1])$ by $(Tx)(t) = \frac{dx(t)}{dt}$, $t \in [0,1]$. $T$ is called the differentiation operator. It is easy to check that $T$ is linear (since the derivative of a sum is equal to the sum of the derivatives). But $T$ is not bounded. To see why, let $x_n(t) = t^n$, $\forall n \in \mathbb{N}$. Then $\|x_n\|_\infty = \sup\{|t^n|, t \in [0,1]\} = 1$ and $(T x_n)(t) = \frac{dx_n(t)}{dt} = n \cdot t^{n-1}$. Therefore, $\|T x_n\|_\infty = \sup\{n|t|^{n-1}, t \in [0,1]\} = n$, $\forall n \in \mathbb{N}$. Then $T$ is not bounded, since there is no fixed number $K$ such that $\frac{\|T x_n\|_\infty}{\|x_n\|_\infty} = n \le K$ for all $n$. The sequence of functions $x_n(t) = t^n$ converges pointwise to
$$x_0(t) = \begin{cases} 0 & t \in [0,1) \\ 1 & t = 1 \end{cases}$$
but the sequence of their derivatives $x_n'(t) = n t^{n-1}$ doesn't converge to the derivative of $x_0(t)$ (which actually doesn't exist).
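The growth of the ratio $\|T x_n\|_\infty / \|x_n\|_\infty$ for the differentiation operator can be tabulated directly. The following is a small sketch under stated assumptions: polynomials are stored as coefficient lists (`coeffs[k]` multiplies $t^k$), and the sup norm is approximated by a maximum over a uniform grid on $[0,1]$ that includes the endpoint $t = 1$; the helper names are hypothetical.

```python
def derivative(coeffs):
    """Coefficients of the derivative of sum_k coeffs[k] * t**k."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0.0]

def evaluate(coeffs, t):
    """Horner evaluation of the polynomial at t."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result

def sup_norm(coeffs, samples=1001):
    """Grid approximation of sup{|x(t)| : t in [0, 1]}; the grid contains t = 1."""
    return max(abs(evaluate(coeffs, k / (samples - 1))) for k in range(samples))

ratios = []
for n in (1, 2, 5, 10, 50):
    x_n = [0.0] * n + [1.0]          # the monomial t**n
    ratios.append(sup_norm(derivative(x_n)) / sup_norm(x_n))
```

For these monomials both suprema are attained at $t = 1$, so `ratios` reproduces the values $n$ themselves: no single constant $K$ can bound them all.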
In the introduction to this Chapter we defined the sup norm on $B(X,Y) \subset F(X,Y)$. What would be the consequences of equipping $BL(X,Y)$ with the sup norm? More specifically, how large would $(BL(X,Y), \|\cdot\|_\infty)$ be? The next example shows it would be very, very small.
Example 515 Take $X = Y = \mathbb{R}$.$^5$ All linear functions $f : \mathbb{R} \to \mathbb{R}$ are of the form $Tx = ax$, but unless $a = 0$ these are not bounded with respect to the sup norm. Hence the only element that would belong to $BL(X,Y)$ of all bounded linear operators equipped with the sup norm would be $Tx = 0$, $\forall x \in \mathbb{R}$.
Definition 516 Let $T \in BL(X,Y)$. Then $T$ is bounded by assumption, so $\exists K$ such that $\|Tx\|_Y \le K \cdot \|x\|_X$, $\forall x \in X$. We call the least such $K$ the (operator) norm of $T$ and denote it $\|T\|$, where
$$\|T\| = \inf\{K : K > 0 \text{ and } \|Tx\|_Y \le K \|x\|_X,\ \forall x \in X\}. \quad (6.3)$$
Exercise 6.3.1 Prove that the function $\|T\|$ in (6.3) is a norm on $BL(X,Y)$.
What is the relation between the sup norm and this new operator norm? In the introduction to this Chapter we defined the sup norm on $B(X,Y) \subset F(X,Y)$ of all bounded (linear and nonlinear) functions $f : X \to Y$. Now we have defined the operator norm on $BL(X,Y)$ of all bounded linear operators (functions) $T : X \to Y$. We show in the next example that these two norms are very different.
Example 517 In Example 510 we had $(X, \|\cdot\|_X) = (\mathbb{R}, |\cdot|) = (Y, \|\cdot\|_Y)$ and $T : X \to Y$ given by $Tx = 2x$. $T$ is not bounded on $\mathbb{R}$ with respect to the sup norm since $\|2x\|_\infty = \sup\{|2x|, x \in \mathbb{R}\} = \infty$. However, $T$ is bounded in the sense of Definition 509 and its operator norm is finite, since $\|T\| = \inf\{K : K > 0 \text{ and } |2x| \le K|x|,\ x \in \mathbb{R}\} = 2$.
In the remainder of this section and the next, when we refer to the norm of a linear operator we mean the norm given in (6.3) if not specified otherwise.
In the following theorem, we show that the norm of a linear operator can be expressed in many different ways.
Theorem 518 The norm of a bounded linear operator $T : X \to Y$ can be expressed as:
(i) $\|T\| = \inf\{K : K > 0 \text{ and } \|Tx\|_Y \le K \|x\|_X,\ x \in X\}$;
(ii) $\|T\| = \sup\{\|Tx\|_Y,\ x \in X,\ \|x\|_X \le 1\}$;
(iii) $\|T\| = \sup\{\|Tx\|_Y,\ x \in X,\ \|x\|_X = 1\}$;
(iv) $\|T\| = \sup\left\{\frac{\|Tx\|_Y}{\|x\|_X},\ x \in X,\ x \ne 0\right\}$.
$^5$ We cannot take $[-3,3] \subset \mathbb{R}$ since $X$ is supposed to be a vector subspace but $[-3,3]$ is not because, for instance, it is not closed under scalar multiplication (e.g. if we take the scalar 4 we have $[-12,12] \not\subset [-3,3]$).
Proof. Denote the right hand sides of expressions (i), (ii), (iii), and (iv) as $M_1, M_2, M_3, M_4$. We want to show that $M_1 = M_2 = M_3 = M_4$.
From (i), we have $\|Tx\|_Y \le M_1 \|x\|_X$, $\forall x \in X$. Now if $\|x\|_X \le 1$, then $\|Tx\|_Y \le M_1$. Since $M_2$ is the supremum of such a set, then $M_2 \le M_1$.
Since $\sup\{\|Tx\|_Y, x \in X, \|x\|_X = 1\} \le \sup\{\|Tx\|_Y, x \in X, \|x\|_X \le 1\}$, then $M_3 \le M_2$.
Next, since $\frac{\|Tx\|_Y}{\|x\|_X} = \left\| T\left(\frac{x}{\|x\|_X}\right) \right\|_Y$ for $x \ne 0$, if we let $z = \frac{x}{\|x\|_X}$, then $\|z\|_X = \left\| \frac{x}{\|x\|_X} \right\|_X = \frac{\|x\|_X}{\|x\|_X} = 1$ and hence $M_3 = M_4$.
From the definition of $M_4$, it follows that if $\|x\|_X \ne 0$, then $\frac{\|Tx\|_Y}{\|x\|_X} \le M_4$ or $\|Tx\|_Y \le M_4 \|x\|_X$. Since $M_1$ is the infimum, we have $M_1 \le M_4$.
Thus we have $M_1 \le M_4 = M_3 \le M_2 \le M_1$, which implies the desired result.
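The four expressions for the operator norm can be compared numerically for a simple operator on $\mathbb{R}^2$ with the Euclidean norm. The sketch below is only a sampling-based illustration: the matrix `A`, the angular grid, and the sampled radii are assumptions chosen so that the supremum happens to be attained on the grid; it is not a general algorithm for computing operator norms.

```python
import math

# Hypothetical operator on R^2: T(x1, x2) = (3 * x1, x2).
A = ((3.0, 0.0), (0.0, 1.0))

def apply(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(xi * xi for xi in x))

def scale(r, x):
    return tuple(r * xi for xi in x)

# Sample points on the unit circle (angle 0 is included, where the sup is attained).
unit = [(math.cos(2 * math.pi * k / 3600), math.sin(2 * math.pi * k / 3600))
        for k in range(3600)]

# (iii) sup over ||x|| = 1.
M3 = max(norm(apply(A, x)) for x in unit)
# (ii) sup over ||x|| <= 1, sampled at a few radii.
M2 = max(norm(apply(A, scale(r, x))) for x in unit for r in (0.25, 0.5, 1.0))
# (iv) sup of ||Tx|| / ||x|| over nonzero x, sampled on a circle of radius 5.
M4 = max(norm(apply(A, scale(5.0, x))) / norm(scale(5.0, x)) for x in unit)

def is_bound(K):
    """Check ||Tx|| <= K ||x|| on the sampled unit vectors, as in (i)."""
    return all(norm(apply(A, x)) <= K * norm(x) + 1e-12 for x in unit)
```

For this particular `A`, the sampled values of (ii), (iii) and (iv) coincide, and `is_bound(K)` holds for `K = 3.0` but fails for any smaller constant such as `2.9`, in line with (i).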
Corollary 519 Let $X, Y$ be normed vector spaces and let $T : X \to Y$ be a bounded linear operator. Then $\|Tx\|_Y \le \|T\| \cdot \|x\|_X$.
The next theorem establishes the most important result of this section; namely that $BL(X,Y)$ is a complete normed vector space provided that $(Y, \|\cdot\|_Y)$ is complete. We cannot use the previous result on completeness in function spaces (Theorem 449) because $BL(X,Y)$ is equipped with a different norm. However, the proof is similar to that used to establish that $B(X,Y)$ is complete whenever $(Y, \|\cdot\|_Y)$ is complete.
Theorem 520 The space $BL(X,Y)$ of all bounded linear operators from a normed vector space $X$ to a complete normed vector space $Y$ is itself a complete normed vector space.
Proof. (Sketch) Let $\langle T_n \rangle$ be a Cauchy sequence in $BL(X,Y)$. For fixed $x \in X$, $\langle T_n(x) \rangle$ is Cauchy in $Y$. Since $Y$ is complete, $\langle T_n(x) \rangle$ converges to an element in $Y$, call it $Tx$. Thus we can define an operator $T : X \to Y$ by $Tx = \lim_{n \to \infty} T_n(x)$. It is easy to show that $T$ is bounded and that $\langle T_n \rangle \to T$ in $BL(X,Y)$.
6.4 Linear Functionals
In this section we study the special case of linear operators that map elements
(in this case functions) from a normed vector space to R.
Definition 521 Let $(X, \|\cdot\|_X)$ be a normed vector space. A linear operator $F : X \to \mathbb{R}$ is called a linear functional. That is, a linear functional is a real-valued function $F$ on $X$ such that $F(\alpha x + \beta x') = \alpha F(x) + \beta F(x')$, $\forall x, x' \in X$ and $\alpha, \beta \in \mathbb{R}$.
We note that if $X$ is a finite dimensional vector space (e.g. $\mathbb{R}^n$), then $F$ is usually called a function. The functional nomenclature is typically used when $X$ is an infinite dimensional vector space (e.g. $\ell_p$, $C(X)$, $L_p$).
Definition 522 $F : X \to \mathbb{R}$ is said to be bounded on $X$ if $\exists K \in \mathbb{R}_{++}$ such that $|F(x)| \le K \cdot \|x\|_X$, $\forall x \in X$.
Since a bounded linear functional is a special case of a bounded linear
operator, everything we proved in the previous Section 6.3 is also valid for
linear functionals. We summarize it in the following Theorem.
Theorem 523 Let $F$ be a linear functional on a normed vector space $X$. Then: (i) $F$ is continuous iff $F$ is continuous at any one point in $X$; (ii) $F$ is continuous iff $F$ is bounded; (iii) the set of all bounded linear functionals is a complete normed vector space with the norm of $F$ defined by $\|F\| = \sup\{|F(x)|, x \in X, \|x\|_X \le 1\}$ or by any other equivalent formula from Theorem 518.
Proof. Follows the proofs in the previous section. Part (iii) uses the fact that $(\mathbb{R}, |\cdot|)$ is complete (so that the set of all bounded linear functionals is always complete).
We note that the set of all bounded linear functionals on X has a special
name.
Definition 524 Given a normed vector space $X$, the set of all bounded linear functionals on $X$ is called the dual of $X$, denoted $X^*$.
The next set of examples illustrates functionals on finite and infinite dimensional vector spaces.
Example 525 Let $\mathbb{R}^n$ be $n$-dimensional Euclidean space with the Euclidean norm. Let $a = (a_1, \dots, a_n)$ be a fixed non-zero vector in $\mathbb{R}^n$. Define the "inner (or dot) product" functional $F_1 : \mathbb{R}^n \to \mathbb{R}$ by $F_1(x) = \langle a, x \rangle = a_1 x_1 + \dots + a_n x_n$.$^6$ It is clear that $F_1$ is linear since
$$F_1(\alpha x + \beta x') = \langle a, \alpha x + \beta x' \rangle = \alpha \langle a, x \rangle + \beta \langle a, x' \rangle = \alpha F_1(x) + \beta F_1(x').$$
It is also easily established that $F_1$ is bounded, since by the Cauchy-Schwarz inequality we have $|F_1(x)| = |\langle a, x \rangle| \le \|a\|_X \|x\|_X$, $\forall x \in \mathbb{R}^n$. Finally, since $\|F_1\| = \sup\{|F_1(x)|, x \in X, \|x\|_X \le 1\} \le \|a\|_X$ and $\|F_1\| \ge \frac{|F_1(a)|}{\|a\|_X} = \frac{\|a\|_X^2}{\|a\|_X} = \|a\|_X$, we have $\|F_1\| = \|a\|_X$. Figure 6.4.1 illustrates such functionals in $\mathbb{R}^2$.
Example 526 Consider the Banach space $(\ell_1, \|\cdot\|_1)$. Define the linear functional $F_2 : \ell_1 \to \mathbb{R}$ by $F_2(x) = \sum_{i=1}^{\infty} x_i$, where $x = \langle x_i \rangle_{i=1}^{\infty}$. Then $|F_2(x)| \le \sum_{i=1}^{\infty} |x_i| = \|x\|_1$, $\forall x \in \ell_1$. This implies that $F_2$ is bounded and that $\|F_2\| \le 1$. Also, for $x = e_1 = (1, 0, \dots) \in \ell_1$ we have $\|F_2\| \ge \frac{|F_2(e_1)|}{\|e_1\|_1} = \frac{1}{1} = 1$. Combining these two inequalities yields $\|F_2\| = 1$.
Example 527 Let $X = (C([a,b]), \|\cdot\|_\infty)$. Define the functional $F_3 : X \to \mathbb{R}$ by $F_3(x) = \int_{[a,b]} x(\omega)\,d\omega$, $x \in X$. We can interpret this as the expectation of a random variable drawn from a uniform distribution on support $[a,b]$. It is clear that $F_3$ is linear. To see that $F_3$ is bounded, note that
$$|F_3(x)| = \left| \int_{[a,b]} x(\omega)\,d\omega \right| \le \int_{[a,b]} |x(\omega)|\,d\omega \le \sup_{\omega \in [a,b]} |x(\omega)| \cdot (b - a) = (b - a) \cdot \|x\|_\infty, \quad \forall x \in X.$$
On the other hand, if $x = x_0$ where $x_0(\omega) = 1$, $\forall \omega \in [a,b]$, then $\|x_0\|_\infty = 1$ and $|F_3(x_0)| = \int_{[a,b]} 1\,d\omega = b - a$. Hence $\|F_3\| \ge \frac{|F_3(x_0)|}{\|x_0\|_\infty} = b - a$, and combining these inequalities, $\|F_3\| = b - a$.
$^6$ We introduced this notation in Definition 209.
Example 528 Reconsider Example 526 with a different norm. In particular, let $X = (\ell_1, \|\cdot\|_\infty)$ and let the linear functional $F_4 : \ell_1 \to \mathbb{R}$ be given by $F_4(x) = \sum_{i=1}^{\infty} x_i$, where $x = \langle x_i \rangle_{i=1}^{\infty} \in \ell_1$. In this case, $F_4$ is unbounded. To see this, define the sequence $x_n \in \ell_1$ as a sequence of 1's in the first $n$ places and zeros otherwise (i.e. $x_n = \langle 1, \dots, 1, 0, 0, \dots \rangle$, where the last 1 occurs in the $n$th place). Then $\|x_n\|_\infty = \sup\{|x_i|, i \in \mathbb{N}\} = 1$ and $F_4(x_n) = n$. Thus, $|F_4(x_n)| = n \cdot \|x_n\|_\infty$, with $\|x_n\|_\infty = 1$, $\forall n \in \mathbb{N}$. Therefore, $\|F_4\| = \infty$.
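The contrast between Examples 526 and 528 can be made concrete by computing the ratio $|F(x_n)| / \|x_n\|$ under both norms. The snippet below is a minimal sketch in which finitely supported sequences are stored as Python lists (the trailing zeros are simply omitted); the function names are assumptions introduced for the illustration.

```python
def F(x):
    """The summation functional F(x) = sum_i x_i on a finitely supported sequence."""
    return sum(x)

def norm_1(x):
    """The l_1 norm: sum of absolute values."""
    return sum(abs(t) for t in x)

def norm_sup(x):
    """The sup norm: largest absolute value."""
    return max(abs(t) for t in x)

ratios_1, ratios_sup = [], []
for n in (1, 10, 100, 1000):
    x_n = [1.0] * n                      # n ones followed by zeros
    ratios_1.append(abs(F(x_n)) / norm_1(x_n))
    ratios_sup.append(abs(F(x_n)) / norm_sup(x_n))
```

Under the $\ell_1$ norm the ratio stays at 1 for every $n$, while under the sup norm it equals $n$ and grows without bound, which is why the same functional is bounded on one normed space and unbounded on the other.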
6.4.1 Dual spaces
As you may have noticed from Examples 525 to 528, it is quite simple to determine whether "something" is a bounded linear functional. But now we move on to tackle the converse. Given a normed vector space $X$, is it possible to represent (or characterize) all bounded linear functionals on $X$? In other words, we want to determine the dual of $X$. Here we simply consider the dual of some of the most common normed vector spaces.
The dual of the Euclidean space $\mathbb{R}^n$
In Example 525 of this section, we showed that a functional $F_1 : X \to \mathbb{R}$ defined by $F_1(x) = \langle a, x \rangle$, where $x \in X = \mathbb{R}^n$ and $a \in \mathbb{R}^n$, is a bounded linear functional with $\|F_1\| = \|a\|_X$. The functional $F_1$ is represented by the point $a$; that is, if we vary $a$, we vary $F_1$. Let $\mathcal{F}$ be the set of all such $F_1$. In the case where $X = \mathbb{R}^2$, the graphs of the $F_1$ are just planes through the origin and $\mathcal{F}$ is the set of all such planes. Obviously, $\mathcal{F} \subset X^*$. Now we show that there are no others. That is, $\mathcal{F} \supset X^*$ so that $\mathcal{F} = X^*$.
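Theorem 529 The dual space of $\mathbb{R}^n$ is $\mathbb{R}^n$ itself. That is, each bounded linear functional $G$ on $\mathbb{R}^n$ can be represented by an element $b \in \mathbb{R}^n$ such that $G(x) = \langle b, x \rangle$ for all $x \in \mathbb{R}^n$.
Proof. (Sketch) Let $G \in (\mathbb{R}^n)^*$ (i.e. $G$ is a bounded linear functional on $\mathbb{R}^n$). Let $\{e_1, \dots, e_n\}$ be the natural basis in $\mathbb{R}^n$. Define $b_i = G(e_i)$ for $i = 1, \dots, n$. Then the point $b = (b_1, \dots, b_n) \in \mathbb{R}^n$ represents $G$. That is, for $x = \sum_{i=1}^{n} x_i e_i \in \mathbb{R}^n$, linearity gives $G(x) = \sum_{i=1}^{n} x_i G(e_i) = \sum_{i=1}^{n} x_i b_i$, so we have
$$G(x) = \langle x, b \rangle. \quad (6.4)$$
By the Cauchy-Schwarz inequality we have $\|G\| \le \|b\|_X$, and by plugging $x = b$ into (6.4) we obtain $\|G\| \ge \|b\|_X$, so that $\|G\| = \|b\|_X$.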
This equality establishes that an operator $T : X^* \to X$ defined by $T(G) = (G(e_1), \dots, G(e_n)) = b$ is an isometry (see Definition 171). This means that $T$ preserves distances and hence it preserves topological properties of the spaces $(X, \|\cdot\|_X)$ and $(X^*, \|\cdot\|)$.
It is easy to verify that $T$ is a bijection and that $T$ is a linear operator. Hence $T$ preserves the algebraic (in this case linear) structure of these two spaces. In this case we say that $T$ is an isomorphism.
Putting these two together, we have that $T$ is an isometric isomorphism between $(X, \|\cdot\|_X)$ and $(X^*, \|\cdot\|)$, and hence these two spaces are indistinguishable from the point of view of the number of elements, as well as the algebraic and topological structure. Hence they are effectively the same space with differently named elements.
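The construction in the proof of Theorem 529, reading off $b_i = G(e_i)$ from the natural basis, is easy to carry out explicitly. The following sketch assumes a particular linear functional `G` on $\mathbb{R}^3$ (chosen arbitrarily for illustration) and checks that the recovered vector `b` reproduces `G` and that $|G(b)| / \|b\| = \|b\|$.

```python
import math

def G(x):
    # Hypothetical linear functional on R^3, treated as a black box below.
    return 2.0 * x[0] - x[1] + 0.5 * x[2]

n = 3
# Natural basis e_1, ..., e_n of R^n.
basis = [tuple(1.0 if i == j else 0.0 for j in range(n)) for i in range(n)]

# Representing vector: b_i = G(e_i).
b = tuple(G(e) for e in basis)

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * c for a, c in zip(u, v))

# G(x) should agree with <b, x> at any point; x is an arbitrary test vector.
x = (1.5, -4.0, 2.0)
value_direct = G(x)
value_via_b = inner(b, x)

# The ratio |G(b)| / ||b|| attains the bound ||b|| from Cauchy-Schwarz.
norm_b = math.sqrt(inner(b, b))
attained = abs(G(b)) / norm_b
```

Nothing about `G` other than its values on the basis is used to build `b`, which mirrors how the map $T(G) = (G(e_1), \dots, G(e_n))$ identifies $(\mathbb{R}^n)^*$ with $\mathbb{R}^n$.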
The dual of a separable Hilbert space
Since Euclidean space is a separable, complete inner product space, one might
like to know if there is a similar result to Theorem 529 for any separable
Hilbert space. The answer is yes.
Theorem 530 The dual of a separable Hilbert space $H$ is $H$ itself. That is, for every bounded linear functional $F$ on a separable, complete inner product space $H$, there is a unique element $y \in H$ such that: (i) $F(x) = \langle x, y \rangle$, $\forall x \in H$; and (ii) $\|F\| = \|y\|$.
Proof. (Sketch) By Theorem 501 a separable Hilbert space contains a countable, complete orthonormal basis $\{e_i, i \in \mathbb{N}\}$. Let $F$ be a bounded linear functional on $H$. Set $b_i = F(e_i)$, $i = 1, 2, \dots$ It is easy to show that $\sum_{i=1}^{\infty} b_i^2 \le \|F\|^2 < \infty$ (apply $F$ to $\sum_{i=1}^{n} b_i e_i$ and use $|F(x)| \le \|F\| \|x\|$). Hence, as in the discussion leading to Parseval's Theorem 504, there exists a $b \in H$ such that $b = \sum_{i=1}^{\infty} b_i e_i$, where the $b_i$ are the Fourier coefficients of $b$. Moreover, $\|b\|_H \le \|F\|$. Let $x = \sum_{i=1}^{\infty} x_i e_i$, where the $x_i$ are the Fourier coefficients of $x$. Then by Parseval's equality $x_n \left(= \sum_{i=1}^{n} x_i e_i\right) \to x \left(= \sum_{i=1}^{\infty} x_i e_i\right)$ as $n \to \infty$. Furthermore, because $F$ is continuous and linear,
$$F(x) = \lim_{n \to \infty} F\left( \sum_{i=1}^{n} x_i e_i \right) = \lim_{n \to \infty} \sum_{i=1}^{n} x_i F(e_i) = \lim_{n \to \infty} \sum_{i=1}^{n} x_i b_i = \sum_{i=1}^{\infty} x_i b_i = \langle x, b \rangle,$$
so that $\|F\| \le \|b\|_H$ by the Cauchy-Schwarz inequality. Combining the two inequalities gives $\|F\| = \|b\|_H$.
A similar result can be proven for a nonseparable, complete inner product space. Thus we can conclude that the dual space of any Hilbert space is a Hilbert space itself (i.e. $H^* = H$).
Since $\ell_2$ and $L_2([a,b])$ are separable Hilbert spaces, we have $\ell_2^* = \ell_2$ and $L_2^*([a,b]) = L_2([a,b])$. In the case of $L_2([a,b])$, we can claim that for any bounded linear functional $F : L_2([a,b]) \to \mathbb{R}$ there exists a unique function $g \in L_2([a,b])$ such that $F(f) = \int_a^b g f\,dx$, $\forall f \in L_2([a,b])$. This will be shown in Theorem 532.
The dual space of $\ell_p$
While the previous section applied to inner product spaces, what about the dual space of a complete normed vector space that is not a Hilbert space? In this section, we consider the dual of $\ell_p$ for $p \ne 2$.
Let $p \in [1, \infty)$ and let $z \in \ell_q$, where $p, q$ are conjugate. Then $F : \ell_p \to \mathbb{R}$ given by
$$F(x) = \sum_{i=1}^{\infty} x_i z_i \quad \text{for } x = \langle x_i \rangle_{i=1}^{\infty} \in \ell_p \quad (6.5)$$
is a bounded linear functional on $\ell_p$. This follows immediately from Hölder's inequality (Theorem 479).
We now show that all bounded linear functionals on $\ell_p$ are of the form (6.5).
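Theorem 531 Let $p \in [1, \infty)$ and $q$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. If $F \in \ell_p^*$, there exists an element $z = \langle z_i \rangle \in \ell_q$ such that
$$F(x) = \sum_{i=1}^{\infty} x_i z_i$$
for all $x = \langle x_i \rangle_{i=1}^{\infty} \in \ell_p$ and $\|F\| = \|z\|_q$.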
Proof. (Sketch) Let $F$ be a bounded linear functional on $\ell_p$. Let $\{e_i, i \in \mathbb{N}\}$ be the set of vectors having the $i$-th entry equal to one and all other entries equal to zero. Set $F(e_i) = z_i$, $i \in \mathbb{N}$. Given $x = \langle x_1, x_2, \dots \rangle \in \ell_p$, let $s_n$ be the vector consisting of the first $n$ coordinates of $x$ (i.e. $s_n = \sum_{i=1}^{n} x_i e_i$). Then $s_n \in \ell_p$ and $\|x - s_n\|_p^p = \sum_{i=n+1}^{\infty} |x_i|^p \to 0$ as $n \to \infty$. Due to linearity and continuity of $F$, $F(x) = \sum_{i=1}^{\infty} x_i F(e_i) = \sum_{i=1}^{\infty} x_i z_i$ and $\|F\| \le \|z\|_q$. By plugging in $x = \langle x_i \rangle$ where
$$x_i = \begin{cases} |z_i|^{q-2} z_i & \text{when } z_i \ne 0 \\ 0 & \text{when } z_i = 0 \end{cases}$$
we get that $\|z\|_q \le \|F\|$. This shows that $\|F\| = \|z\|_q$.
Theorem 531 establishes that $\ell_p^* = \ell_q$, where $p$ and $q$ are conjugate. Thus, the dual space of $\ell_1$ is $\ell_\infty$. However, the reverse is not true. That is, the dual space of $\ell_\infty$ is not $\ell_1$, or $\ell_\infty^* \supsetneq \ell_1$. We show this in the next section.
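The extremal choice $x_i = |z_i|^{q-2} z_i$ used in the proof of Theorem 531 can be checked on a finite vector. The sketch below is illustrative only: the exponent $p = 3$ (so $q = 3/2$), the vector `z`, and the comparison vectors are arbitrary assumptions, and finite lists stand in for sequences with finitely many nonzero entries.

```python
p = 3.0
q = p / (p - 1.0)          # conjugate exponent: 1/p + 1/q = 1, so q = 1.5

z = [1.0, -2.0, 0.0, 0.5]  # hypothetical representing sequence (finitely supported)

def F(x):
    """The functional F(x) = sum_i x_i z_i from (6.5)."""
    return sum(xi * zi for xi, zi in zip(x, z))

def norm(x, r):
    """The l_r norm of a finite vector."""
    return sum(abs(t) ** r for t in x) ** (1.0 / r)

# Extremal vector from the proof: x_i = |z_i|^(q-2) z_i, and 0 where z_i = 0.
x_star = [abs(zi) ** (q - 2.0) * zi if zi != 0 else 0.0 for zi in z]

# At x_star the ratio |F(x)| / ||x||_p equals ||z||_q, so Hoelder's bound is attained.
ratio_star = abs(F(x_star)) / norm(x_star, p)
z_norm_q = norm(z, q)

# For other vectors the ratio can only be smaller (Hoelder's inequality).
others = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 0.0, 0.0], [2.0, -1.0, 3.0, 0.5]]
other_ratios = [abs(F(x)) / norm(x, p) for x in others]
```

The equality at `x_star` relies on $(q-1)p = q$, which is what makes $\|x^*\|_p^p = \sum |z_i|^q$; the other sampled ratios stay below $\|z\|_q$.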
The dual space of $L_p$
An important theorem, known as the Riesz representation theorem, establishes a result similar to Theorem 531 for $L_p$. Let $X \subset \mathbb{R}$, $1 \le p < \infty$ and $q$ conjugate to $p$ (i.e. $\frac{1}{p} + \frac{1}{q} = 1$). Let $g \in L_q(X)$. Define a functional $F : L_p \to \mathbb{R}$ by
$$F(f) = \int_X f g\,dm$$
for all $f \in L_p(X)$. It is easy to see that $F$ is a bounded linear functional on $L_p(X)$. Linearity follows from linearity of the integral and boundedness follows from the Hölder inequality. Then we have the result that each bounded linear functional on $L_p(X)$ can be obtained in this manner (i.e. $L_p^* = L_q$).
Theorem 532 (Riesz Representation) Let $F$ be a bounded linear functional on $L_p(X)$ and $1 \le p < \infty$. Then there is a function $g \in L_q(X)$ such that
$$F(f) = \int_X f g\,dm$$
and $\|F\| = \|g\|_q$.
Proof. (Sketch) Let $F$ be a bounded linear functional on $L_p$. In all the previous cases, in finding the element $b$ that represents a given functional we used the same procedure; we set $b_i = F(e_i)$ where $\{e_i\}$ is a basis. In $L_p$, we use indicator functions.
First assume that $m(X) < \infty$ (later we relax this assumption). For any $E \subset X$ which is $\mathcal{L}$-measurable (i.e. $E \in \mathcal{L}$), $\chi_E \in L_p(X)$. Thus, given $F$, we define a set function $\nu : \mathcal{L} \to \mathbb{R}$ by $\nu(E) = F(\chi_E)$ for $E \in \mathcal{L}$. $\nu$ is a finite signed measure which is absolutely continuous with respect to $m$. Then by the Radon-Nikodym Theorem 434 there is an $\mathcal{L}$-integrable function $g$ that represents $\nu$ (i.e. $\nu(E) = \int_E g\,dm = \int_X \chi_E\, g\,dm$). By linearity of $F$ we have
$$F(\varphi) = \int_X g \varphi\,dm$$
for all simple functions $\varphi \in L_p(X)$, and $|F(\varphi)| \le \|F\| \|\varphi\|_p$. Then it can be shown that $g \in L_q(X)$. Because the set of simple functions is dense in $L_p(X)$, $F(f) = \int_X g f\,dm$ for all $f \in L_p(X)$.
If $m(X) = \infty$, since $m$ is $\sigma$-finite, there is an increasing sequence of $\mathcal{L}$-measurable sets $\langle X_n \rangle$ with finite measure whose union is $X$. Thus we apply the result proven in the first part of the proof to define $\langle g_n \rangle$ on $X_n$. Then show that $g_n \to g$ and $F(f) = \int_X f g\,dm$ for all $f \in L_p$.
Here we note that the dual of $L_\infty(X)$ is not $L_1(X)$. That is, not all bounded functionals on $L_\infty([a,b])$ can be represented by $F(f) = \int_X f g$, where $g \in L_1(X)$. The proof of this result is easier to see after some future results on separation, so we wait until then.
6.4.2 Second Dual Space
In the previous section we showed that the dual space $X^*$ of all bounded linear functionals defined on a normed linear space $X$ is a normed linear space itself. Then it is possible to speak of the space $(X^*)^*$ of bounded linear functionals defined on $X^*$, which is called the second dual space $X^{**}$ of $X$. Of course $X^{**}$ is also a normed vector space.
Let us try to define some elements of $X^{**}$. Given a fixed element $x_0$ in $X$ we can define a functional $\psi_{x_0} : X^* \to \mathbb{R}$ by $\psi_{x_0}(f) = f(x_0)$, where $f$ runs through all of $X^*$. Notice that $\psi_{x_0}$ assigns to each element $f \in X^*$ its value at a certain fixed element of $X$. We have $\psi_{x_0}(\alpha f_1 + \beta f_2) = (\alpha f_1 + \beta f_2)(x_0) = \alpha f_1(x_0) + \beta f_2(x_0) = \alpha \psi_{x_0}(f_1) + \beta \psi_{x_0}(f_2)$ (by the definition of addition and scalar multiplication of functionals) and $|\psi_{x_0}(f)| = |f(x_0)| \le \|f\| \|x_0\|$ (since $f$ is bounded). Hence $\psi_{x_0}$ is a bounded linear functional on $X^*$.
Besides the notation $f(x)$, which indicates the value of the functional $f$ at a point $x$, we will find it useful to employ the symmetric notation $f(x) \equiv \langle f, x \rangle$. It is not a coincidence that a value of a functional is denoted the same way as the scalar product, because any bounded linear functional $g$ defined on a Hilbert space can be represented by a scalar product (i.e. $\exists y \in H$ such that $g(x) = \langle y, x \rangle$, $\forall x \in H$, by Theorem 530).
For fixed $f \in X^*$ we can consider $\langle f, x \rangle$ as a functional on $X$, and for fixed $x \in X$ as a functional on $X^*$ (i.e. as an element of $X^{**}$). Let us define a new norm $\|\cdot\|_2$ on $X$ by the following:
$$\|x\|_2 = \sup\left\{ \frac{|\langle f, x \rangle|}{\|f\|},\ f \in X^*,\ f \ne 0 \right\}.$$
How is this norm related to the original norm $\|\cdot\|_X$ on $X$? We shall show that $\|x\|_X = \|x\|_2$. Let $f$ be an arbitrary non-zero element in $X^*$. Then
$$|\langle f, x \rangle| \le \|f\| \|x\|_X \ \Rightarrow\ \|x\|_X \ge \frac{|\langle f, x \rangle|}{\|f\|}.$$
Since this inequality is true for any $f$, then
$$\|x\|_X \ge \sup\left\{ \frac{|\langle f, x \rangle|}{\|f\|},\ f \in X^*,\ f \ne 0 \right\} = \|x\|_2. \quad (6.6)$$
Now to the converse inequality. By Theorem 540 of the next section, for any element $x \in X$, $x \ne 0$, there is a bounded linear functional $f_0$ such that
$$|\langle f_0, x \rangle| = \|f_0\| \|x\|_X \ \Rightarrow\ \frac{|\langle f_0, x \rangle|}{\|f_0\|} = \|x\|_X.$$
Consequently
$$\|x\|_2 = \sup\left\{ \frac{|\langle f, x \rangle|}{\|f\|},\ f \in X^*,\ f \ne 0 \right\} \ge \|x\|_X. \quad (6.7)$$
Combining inequalities (6.6) and (6.7), we have that $\|x\|_2 = \|x\|_X$.
Since $\langle f, x \rangle$ for fixed $x \in X$ is a linear functional on $X^*$, then by (iv) of Theorem 518, the expression $\sup\left\{ \frac{|\langle f, x \rangle|}{\|f\|},\ f \in X^*,\ f \ne 0 \right\}$ is the norm of this functional. But this expression is identical to $\|x\|_2$. If we now define a mapping $J : X \to X^{**}$ by $J(x)(f) = \langle f, x \rangle$, $f \in X^*$, then by virtue of the identity $\|x\|_X = \|x\|_2 = \|J(x)\|$, the space $X$ is isometric with some subset $F$ of $X^{**}$. See Figure 6.4.2.1. Thus $X$ and $F \subset X^{**}$ are isometrically isomorphic (i.e. they are indistinguishable, so we may write $X = F$ and $X \subset X^{**}$).
There is a class of normed vector spaces $X$ for which the mapping $J : X \to X^{**}$ is onto (i.e. $X = X^{**}$).
Definition 533 The space $X$ is said to be reflexive if $X = X^{**}$.
As we will see later, this property plays a very important role in optimization theory. Let us check some known vector spaces for reflexivity.
Example 534 The Euclidean space $\mathbb{R}^n$ is reflexive. Why? We showed in the previous section (Theorem 529) that even the first dual of $\mathbb{R}^n$ is $\mathbb{R}^n$ (i.e. $(\mathbb{R}^n)^* = \mathbb{R}^n$). Hence $(\mathbb{R}^n)^{**} = ((\mathbb{R}^n)^*)^* = (\mathbb{R}^n)^* = \mathbb{R}^n$.
Example 535 Infinite dimensional Hilbert spaces ($\ell_2$, $L_2$) are reflexive. By Theorem 530 we have that $H^* = H$, which means that any Hilbert space is reflexive. That is, $\ell_2^{**} = \ell_2$, $L_2^{**} = L_2$.
What about $\ell_p$, $L_p$ when $p \neq 2$?
Example 536 If $1 < p < \infty$, then by Theorem 531, $\ell_p^* = \ell_q$ where $\frac{1}{p} + \frac{1}{q} = 1$. Thus $\ell_p^{**} = (\ell_p^*)^* = \ell_q^* = \ell_p$ because $p, q$ are mutually conjugate. Similarly $L_p^{**} = L_p$.
Thus if $1 < p < \infty$, then $\ell_p$, $L_p$ are reflexive. If $p = 1$, by Theorem 531 $\ell_1^* = \ell_\infty$. But $\ell_\infty^* \supsetneq \ell_1$. Hence $(\ell_1^*)^* = \ell_\infty^* \supsetneq \ell_1$ so that $\ell_1$ is not reflexive. $\ell_\infty$ is also not reflexive. Similarly $L_1$, $L_\infty$ are not reflexive. We will show this in the next section.
Example 537 It can be shown that the space $C([a,b])$ of all continuous functions on $[a,b]$ is not reflexive.
Exercise 6.4.1 Show that $X$ is reflexive iff $X^*$ is reflexive.
6.5 Separation Results
In this section we state and prove probably the most important theorem in
functional analysis; the Hahn-Banach theorem. It has numerous applications.
We will concentrate on a geometric application and will formulate it as a
separation result for convex sets. Using this theorem we prove the existence
of a competitive equilibrium allocation in a general setting.
First we define a new notion.
Definition 538 Let $X$ be a normed vector space. A functional $P : X \to \mathbb{R}$ is called sublinear if: (i) $P(x + x') \leq P(x) + P(x')$, $\forall x, x' \in X$; and (ii) $P(\alpha x) = \alpha P(x)$, $\forall x \in X$ and $\alpha \in \mathbb{R}_{++}$.
Exercise 6.5.1 Let $X$ be a normed vector space. Show that the norm $\|\cdot\|_X : X \to \mathbb{R}$ is a sublinear functional.
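As a numerical sanity check of Definition 538 (not a proof, and not a substitute for Exercise 6.5.1), the following sketch tests conditions (i) and (ii) on random vectors. The helper names `euclidean_norm` and `is_sublinear_on_samples`, the dimension, and the sampling ranges are illustrative assumptions, not part of the text.

```python
import math
import random


def euclidean_norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))


def is_sublinear_on_samples(P, dim=3, trials=1000, tol=1e-9, seed=0):
    """Test (i) subadditivity and (ii) positive homogeneity on random samples.

    Returning True is only numerical evidence; returning False exhibits
    a sample on which one of the two conditions fails.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        alpha = rng.uniform(0.01, 10)  # a strictly positive scalar
        x_plus_y = [a + b for a, b in zip(x, y)]
        # (i) P(x + x') <= P(x) + P(x')
        if P(x_plus_y) > P(x) + P(y) + tol:
            return False
        # (ii) P(alpha x) = alpha P(x)
        scaled = [alpha * a for a in x]
        if abs(P(scaled) - alpha * P(x)) > tol * max(1.0, alpha * P(x)):
            return False
    return True


norm_passes = is_sublinear_on_samples(euclidean_norm)
# The squared norm is not positively homogeneous of degree one, so it fails.
squared_norm_passes = is_sublinear_on_samples(lambda x: sum(a * a for a in x))
```

The squared norm is included only as a contrast: it violates (ii) because scaling by $\alpha$ multiplies it by $\alpha^2$.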
The Hahn-Banach theorem provides a method of constructing bounded linear functionals on $X$ with certain properties. One first defines a bounded linear functional on a subspace of a normed vector space where it is easy to verify the desired properties. Then the theorem guarantees that this functional can be extended to the whole space while retaining the desired properties.
Theorem 539 (Hahn-Banach) Let $X$ be a vector space and $P : X \to \mathbb{R}$ be a sublinear functional on $X$. Let $M$ be a subspace of $X$ and let $f : M \to \mathbb{R}$ be a linear functional on $M$ satisfying
\[ f(x) \leq P(x), \quad \forall x \in M. \tag{6.8} \]
Then there exists a linear functional $F : X \to \mathbb{R}$ on the whole of $X$ such that
\[ F(x) = f(x), \quad \forall x \in M \tag{6.9} \]
and
\[ F(x) \leq P(x), \quad \forall x \in X. \tag{6.10} \]
Proof. (Sketch) Choose $x_1 \in X \setminus M$. Define a linear subspace $M_1 = \{x : x = \alpha x_1 + y,\ \alpha \in \mathbb{R},\ y \in M\}$. Let $F$ be an extension of $f$ to $M_1$. Since $F$ is linear, $F(\alpha x_1 + y) = \alpha F(x_1) + F(y) = \alpha F(x_1) + f(y)$. Thus $F$ is completely determined by $F(x_1)$. Next we derive lower and upper bounds for $F(x_1)$ in order for $F$ to satisfy (6.9) and (6.10) for $x \in M_1$. Thus $F$ is an extension of $f$ from $M$ to $M_1$ where $M \subset M_1$. This process can be repeated and Zorn's lemma guarantees that $F$ can be extended to the whole space $X$.
In order to apply Zorn's lemma we define a partial order on the set $S = \{$all linear functionals $g : D \to \mathbb{R}$ where $D$ is a subspace of $X$ and $g(x) \leq P(x)$, $\forall x \in D\}$ in the following way. Let $g_1, g_2 \in S$. Then $g_1 < g_2$ if $D(g_1) \subset D(g_2)$ and $g_1(x) = g_2(x)$, $\forall x \in D(g_1)$. Then we must check that every totally ordered subset of $S$ has an upper bound (in which case the assumptions of Zorn's lemma are satisfied).
At first sight the Hahn-Banach Theorem 539 doesn't look like a "big deal". Its significance in functional analysis, however, becomes apparent through its wide range of applications, many of them involving a clever choice of the sublinear functional $P$. We will state just three propositions.
The first result says that a bounded linear functional defined on a vector subspace can be extended to the whole vector space.
Theorem 540 Let $M$ be a vector subspace of a normed vector space $X$. Let $f$ be a bounded linear functional on $M$. Then there exists a bounded linear functional $F$ on $X$ s.t. $F(x) = f(x)$, $\forall x \in M$ and $\|F\| = \|f\|$.
Proof. The function $P(x) = \|f\|\,\|x\|$ is a sublinear functional on $X$ and $|f(x)| \leq \|f\|\,\|x\| = P(x)$ for $x \in M$ (see Exercise 6.5.1). Then by the Hahn-Banach Theorem, $\exists F$ (an extension of $f$ to $X$) with $F(x) \leq P(x)$, $\forall x \in X$. Applying this inequality to both $x$ and $-x$ gives $|F(x)| \leq P(x) = \|f\|\,\|x\|$, $\forall x \in X$. This means that $F$ is bounded on $X$ and also $\|F\| \leq \|f\|$. Because $F$ is an extension of $f$, then $\|f\| \leq \|F\|$. Hence $\|F\| = \|f\|$.
Exercise 6.5.2 Carefully compare the assumptions of the Hahn-Banach Theorem 539 and Theorem 540.
Theorem 540 can be used to show that the dual of $L_\infty([a,b])$ is not $L_1([a,b])$.
Lemma 541 Not all bounded functionals on $L_\infty([a,b])$ can be represented by $F(f) = \int_{[a,b]} f g$, where $g \in L_1([a,b])$. That is, $(L_\infty([a,b]))^* \supsetneq L_1([a,b])$.
Proof. $C([a,b])$ is a vector subspace of $L_\infty([a,b])$. Let $F_1 : C([a,b]) \to \mathbb{R}$ be a linear functional which assigns to each $f \in C([a,b])$ the value $f(a)$ (i.e. $F_1(f) = f(a)$). Since
\[ \|F_1\| = \sup\left\{ \frac{|F_1(f)|}{\|f\|_{C([a,b])}} : \|f\| \neq 0 \right\} = \sup\left\{ \frac{|f(a)|}{\sup\{|f(x)| : x \in [a,b]\}} \right\} \leq 1, \]
$F_1$ is bounded and by Theorem 540 $F_1$ can be extended to a bounded linear functional $F$ on the whole set $L_\infty([a,b])$. Let's assume, by contradiction, that there is $g \in L_1([a,b])$ such that $F$ can be represented by $F(f) = \int_a^b f g\,dx$, $\forall f \in C([a,b])$. Let $\langle h_n\rangle$ be a sequence of continuous functions on $[a,b]$ which are bounded by 1, have $h_n(a) = 1$, and such that $h_n(x) \to 0$ for all $x \neq a$. For example set
\[ h_n(x) = \left[ \frac{1}{b-a}(b - x) \right]^n. \]
Then for each $g \in L_1$, $\int_a^b h_n g \to 0$ by the Bounded Convergence Theorem 386. Since $F(h_n) = \int_a^b g h_n$ by assumption, we have $F(h_n) \to 0$. But $F(h_n) = h_n(a) = 1$ for all $n$, which is a contradiction.
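The key step in the proof of Lemma 541 is that $\int_a^b h_n g \to 0$ even though $h_n(a) = 1$ for every $n$. The sketch below illustrates this numerically for one particular choice, $[a,b] = [0,1]$ and $g \equiv 1$, for which the integral equals $1/(n+1)$ exactly. The function names and the midpoint rule are assumptions made for illustration only.

```python
def h(n, x, a=0.0, b=1.0):
    """h_n(x) = ((b - x) / (b - a)) ** n from the proof of Lemma 541."""
    return ((b - x) / (b - a)) ** n


def integral_h_times_g(n, g, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of h_n * g over [a, b]."""
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * width
        total += h(n, x, a, b) * g(x)
    return total * width


def constant_one(x):
    """An integrable g chosen for illustration."""
    return 1.0


# The integrals shrink like 1 / (n + 1) while h_n(a) stays equal to 1.
integrals = [integral_h_times_g(n, constant_one) for n in (1, 9, 99)]
values_at_a = [h(n, 0.0) for n in (1, 9, 99)]
```

Any functional that agrees with evaluation at $a$ on $C([a,b])$ must return 1 on every $h_n$, which no such integral representation can do.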
Corollary 542 $L_1(X)$ and $L_\infty(X)$ are not reflexive.
Proof. We know by Theorem 532 that $L_1^* = L_\infty$ and by Lemma 541 that $L_\infty^* \supsetneq L_1$. Combining these two results we have that $(L_1^*)^* = L_\infty^* \supsetneq L_1$. Furthermore, since $L_1$ is not reflexive, neither is $L_\infty$ by Exercise 6.4.1.
The second result states that given a normed vector space $X$, its dual $X^*$ has "sufficiently" many elements (i.e. at least as many elements as $X$ itself).
Theorem 543 Let $X$ be a normed vector space and let $x_0 \neq 0$ be any element of $X$. Then there exists a bounded linear functional $F$ on $X$ such that $\|F\| = 1$ and $F(x_0) = \|x_0\|$.
Proof. Let $M$ be the subspace consisting of all multiples of $x_0$ (i.e. $M = \{\alpha x_0, \alpha \in \mathbb{R}\}$). Define $f : M \to \mathbb{R}$ by $f(\alpha x_0) = \alpha\|x_0\|$. Then $f$ is a linear functional on $M$. Define $P : X \to \mathbb{R}$ by $P(y) = \|y\|$. $P$ is a sublinear functional on $X$ satisfying $f(x) \leq P(x)$ for $x \in M$. Then by the Hahn-Banach theorem there exists a linear functional $F : X \to \mathbb{R}$ that is an extension of $f$ and $F(x) \leq P(x) = \|x\|$, $\forall x \in X$. For $-x$, we have $F(-x) \leq \|-x\| = \|x\|$, in which case $|F(x)| \leq \|x\|$, $\forall x \in X$. Thus $F$ is bounded and
\[ \|F\| = \sup\left\{ \frac{|F(x)|}{\|x\|} : x \in X,\ x \neq 0 \right\} \leq 1. \]
Also, since $F$ is an extension of $f$, $F(x_0) = F(1 \cdot x_0) = 1 \cdot \|x_0\| = \|x_0\|$, so the supremum is attained at $x_0$ and $\|F\| = 1$.
The third proposition is a geometric version of the Hahn-Banach theorem.
It is a separation result for convex sets. Before stating it we have to introduce
a few geometric concepts.
Definition 544 Let $K \subset X$ be convex. A point $x \in K$ is an internal point of a convex set $K$ if given any $y \in X$, $\exists \varepsilon > 0$ such that $x + \delta y \in K$ for all $\delta$ satisfying $|\delta| < \varepsilon$.
Geometrically, the statement that x is an internal point of K means that
the intersection of K with any line L through x contains a segment with x
as a midpoint. See Figure 6.5.1.
Definition 545 Let $0$ (the zero vector) be an internal point of a convex set $K$. Then the support function $P : X \to \mathbb{R}_{+}$ of $K$ (with respect to $0$) is given by
\[ P(x) = \inf\left\{ \lambda : \frac{x}{\lambda} \in K,\ \lambda > 0 \right\}. \]
The support function has a simple geometric interpretation. Let $x \in X$. Draw the ray from $0$ through $x$. There is a point $y$ on this ray that is a boundary point of $K$. Then the scalar $\lambda$ for which $\lambda y = x$ is $P(x)$, so that $P(x)y = x$. See Figure 6.5.2.
We have the following properties for this support function.
Lemma 546 If $K$ is a convex set containing $0$ as an internal point then the support function $P$ has the following properties: (i) $P(\alpha x) = \alpha P(x)$ for $\alpha \geq 0$; (ii) $P(x + y) \leq P(x) + P(y)$; (iii) $\{x : P(x) < 1\} \subset K \subset \{x : P(x) \leq 1\}$.
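Proof. (i) Let $\alpha > 0$. Then
\[ P(\alpha x) = \inf\left\{ \lambda : \frac{\alpha x}{\lambda} \in K,\ \lambda > 0 \right\} = \inf\left\{ \alpha\frac{\lambda}{\alpha} : \frac{x}{\lambda/\alpha} \in K,\ \frac{\lambda}{\alpha} > 0 \right\} = \inf\left\{ \alpha\beta : \frac{x}{\beta} \in K,\ \beta > 0 \right\} = \alpha \inf\left\{ \beta : \frac{x}{\beta} \in K,\ \beta > 0 \right\} = \alpha P(x), \]
where $\beta = \frac{\lambda}{\alpha} > 0$.
(ii) Let $\alpha = \inf\left\{ \lambda : \frac{x}{\lambda} \in K,\ \lambda > 0 \right\} = P(x)$ and $\beta = \inf\left\{ \mu : \frac{y}{\mu} \in K,\ \mu > 0 \right\} = P(y)$. Take $\lambda$ and $\mu$ such that $\frac{x}{\lambda} \in K$ and $\frac{y}{\mu} \in K$. Then $\alpha \leq \lambda$ and $\beta \leq \mu$. Since $K$ is convex,
\[ \frac{\lambda}{\lambda + \mu}\left(\frac{x}{\lambda}\right) + \frac{\mu}{\lambda + \mu}\left(\frac{y}{\mu}\right) \in K \]
because $\frac{\lambda}{\lambda + \mu} + \frac{\mu}{\lambda + \mu} = 1$. Then $\frac{x + y}{\lambda + \mu} \in K$. Thus $P(x + y) \leq \lambda + \mu$ (because $P(x + y)$ is the infimum of such scalars). Since this holds for every admissible $\lambda$ and $\mu$, taking the infimum over $\lambda$ and over $\mu$ gives
\[ P(x + y) \leq \alpha + \beta = P(x) + P(y). \]
(iii) It follows from the definition of $P$.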
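Example 547 Let $X = \mathbb{R}^2$ with the Euclidean norm. Let $K = \{(x_1, x_2) \in \mathbb{R}^2 : \|(x_1, x_2)\| \leq 1\}$. Obviously $K$ (the unit ball) is convex. Consider a point $x^1 = (2, 2)$ outside the ball. Then $P((2,2)) = \inf\left\{ \lambda : \left(\frac{2}{\lambda}, \frac{2}{\lambda}\right) \in K,\ \lambda > 0 \right\}$. But $\left(\frac{2}{\lambda}, \frac{2}{\lambda}\right) \in K$ iff $\left\| \left(\frac{2}{\lambda}, \frac{2}{\lambda}\right) \right\| \leq 1 \iff \frac{4}{\lambda^2} + \frac{4}{\lambda^2} \leq 1$, or $\lambda \geq \sqrt{8}$, so $P((2,2)) = \sqrt{8} > 1$. Now consider a point $x^2 = \left(\frac{1}{2}, \frac{1}{2}\right)$ inside the ball. Then $P\left(\left(\frac{1}{2}, \frac{1}{2}\right)\right) = \sqrt{\frac{1}{2}} < 1$. See Figure 6.5.3.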
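The two values in Example 547 can be reproduced numerically. The sketch below evaluates the infimum in Definition 545 by bisection, which is valid under the assumption that $K$ is convex with $0$ as an internal point (so the admissible $\lambda$ form an upper interval) and that $x/\lambda \in K$ for the starting upper bound. The names `gauge` and `in_unit_ball` are hypothetical helpers, not notation from the text.

```python
import math


def in_unit_ball(point):
    """Membership test for the closed Euclidean unit ball in R^2."""
    return math.hypot(point[0], point[1]) <= 1.0


def gauge(point, contains, upper=1e6, iterations=200):
    """Approximate inf{lam > 0 : point / lam in K} by bisection.

    Assumes K is convex with 0 as an internal point and that
    point / upper already lies in K.
    """
    lower = 0.0
    for _ in range(iterations):
        mid = (lower + upper) / 2.0
        if mid > 0 and contains(tuple(c / mid for c in point)):
            upper = mid  # mid is admissible, so the infimum is at most mid
        else:
            lower = mid  # mid is not admissible, so the infimum exceeds mid
    return upper


outside = gauge((2.0, 2.0), in_unit_ball)   # compare with sqrt(8)
inside = gauge((0.5, 0.5), in_unit_ball)    # compare with sqrt(1/2)
```

For the unit ball the support function is just the Euclidean norm, which is why the two values agree with $\|(2,2)\|$ and $\|(\frac{1}{2},\frac{1}{2})\|$.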
Definition 548 Two convex sets $K_1, K_2$ are separated by a linear functional $F$ if $\exists \alpha \in \mathbb{R}$ such that $F(x) \leq \alpha$, $\forall x \in K_1$ and $F(x) \geq \alpha$, $\forall x \in K_2$.
Theorem 549 (Separation) Let $K_1, K_2$ be two convex sets of a normed vector space $X$. Assume that $K_1$ has at least one internal point and that $K_2$ contains no internal point of $K_1$. Then there is a nontrivial linear functional separating $K_1$ and $K_2$.
Proof. (Sketch) Let $K_1$ and $K_2$ be two convex subsets of $X$ and without loss of generality let $0 \in K_1$ and $x_0 \in K_2$. Define $K = x_0 + K_1 - K_2$. See Figure 6.5.4. $0$ is an internal point of $K$ and $x_0$ is not an internal point of $K$ (this latter fact follows since $K_2$ contains no internal points of $K_1$). Thus by (iii) of Lemma 546 $P(x) \leq 1$ for all $x \in K$ and $P(x_0) \geq 1$, where $P$ is the support function of $K$.
Let $M$ be the vector subspace spanned by $x_0$ (i.e. $M = \{x : x = \alpha x_0, \alpha \in \mathbb{R}\}$). Define $f : M \to \mathbb{R}$ by $f(x) = f(\alpha x_0) = \alpha P(x_0)$. $f$ is a linear functional that satisfies $f(x) \leq P(x)$ $\forall x \in M$. Hence by the Hahn-Banach Theorem 539 there exists an extension of $f$ (i.e. a linear functional $F : X \to \mathbb{R}$ satisfying $F(x) \leq P(x)$ $\forall x \in X$). This functional $F$ separates $K_1$ and $K_2$. Why? Take $x \in K$ with $x = x_0 + y - z$ where $y \in K_1$ and $z \in K_2$. Then $F(x) \leq P(x) \leq 1$ for $x \in K$. Since $F$ is linear
\[ F(x_0) + F(y) - F(z) \leq 1 \iff F(y) + (F(x_0) - 1) \leq F(z). \tag{6.11} \]
Since $x_0 \in M$,
\[ F(x_0) = f(x_0) = P(x_0) \geq 1 \iff F(x_0) - 1 \geq 0. \tag{6.12} \]
Combining (6.11) and (6.12) we have $F(y) \leq F(z)$ for any $y \in K_1$ and $z \in K_2$. Hence
\[ \sup_{y \in K_1} F(y) \leq \inf_{z \in K_2} F(z). \]
Thus $F$ separates $K_1, K_2$ and $F$ is a non-zero functional (since $F(x_0) \geq 1$).
There are several corollaries and modifications of this important separation theorem.
Corollary 550 (Separation of a point from a closed set) If $K$ is a nonempty, closed, convex set and $x_0 \notin K$, then there exists a continuous linear functional $F$ not identically zero such that $F(x_0) < \inf_{x \in K} F(x)$.
Proof. By translating by $-x_0$, we reduce the Corollary to the case where $x_0 = 0$. Since $x_0 \notin K$ and $K$ is closed, then by Exercise 4.1.3 we have $0 < d = \inf_{x \in K} \|x_0 - x\|$. Let $B_{d/2}(x_0)$ be the open ball around $x_0$ with radius $\frac{d}{2}$. By the Separation Theorem 549 there exists a nontrivial linear functional $f$ such that
\[ \sup_{x \in B_{d/2}(0)} f(x) \leq \inf_{y \in K} f(y) = \alpha. \]
Thus $f(x) \leq \alpha$ for $x \in B_{d/2}(x_0)$. If $x \in B_{d/2}(x_0)$, then $-x \in B_{d/2}(x_0)$, which implies that $f(-x) = -f(x) \leq \alpha$. Hence $|f(x)| \leq \alpha$ for all $x \in B_{d/2}(x_0)$. This implies continuity at $0$ and by Theorem 508 continuity everywhere.
To show strict inequality, take $x \in X$ with $f(x) > 0$ (such an $x$ exists because $f$ is not identically zero) and $\lambda > 0$ such that $\lambda x \in B_{d/2}(x_0)$ (this is possible since $0$ is an internal point of $B_{d/2}(x_0)$). We have $0 < \lambda f(x) = f(\lambda x) \leq \alpha$. Thus we have $f(0) = 0 < \alpha = \inf_{y \in K} f(y)$. See Figure 6.5.5.
Corollary 551 (Strict Separation) Suppose that a nonempty, closed, convex set $K_1$ and a nonempty, compact, convex set $K_2$ are disjoint. Then there exists a continuous linear functional $F$, not identically zero, that strictly separates them (i.e. $\sup_{x \in K_1} F(x) < \inf_{x \in K_2} F(x)$).
Proof. If $K_1, K_2$ are convex, then $K_1 - K_2$ is convex. Since $K_1$ is closed and $K_2$ is compact, then $K_1 - K_2$ is closed. Since $K_1 \cap K_2 = \emptyset$, then $0 \notin K_1 - K_2$. Now apply Corollary 550 with $x_0 = 0$ and $K = K_1 - K_2$. See Figure 6.5.6.
It doesn't suffice to assume both sets $K_1$ and $K_2$ are closed. One of them has to be compact. For an example of this, see Aliprantis and Border Example 5.51. This doesn't contradict the Separation Theorem as it might seem because Theorem 549 requires the additional assumption of the existence of an internal point of at least one of the sets.
Exercise 6.5.3 Show that if $K_1$ is closed and $K_2$ is compact, then $K_1 - K_2$ is closed.
6.5.1 Existence of equilibrium
Let $S$ be a finite dimensional Euclidean space with norm $\|x\| = \left(\sum_{i=1}^{n} |x_i^2|\right)^{\frac{1}{2}}$. There are $I$ consumers, indexed by $i = 1, \ldots, I$. Consumer $i$ chooses among commodity points in a set $X_i \subset S$ and maximizes utility given by $u_i : X_i \to \mathbb{R}$. There are $J$ firms, indexed by $j = 1, \ldots, J$. Firm $j$ chooses among points in a set $Y_j \subset S$ describing its technological possibilities and maximizes profits.
We say that an $(I + J)$-tuple $(\{x_i\}_{i=1}^{I}, \{y_j\}_{j=1}^{J})$ describing the consumption $x_i$ of each consumer and the production $y_j$ of each producer is an allocation for this economy. An allocation is feasible if: $x_i \in X_i$, $\forall i$; $y_j \in Y_j$, $\forall j$; and $\sum_{i=1}^{I} x_i - \sum_{j=1}^{J} y_j \leq 0$ (where there is free disposal). An allocation is Pareto Optimal if it is feasible and if there is no other feasible allocation $(\{x_i'\}_{i=1}^{I}, \{y_j'\}_{j=1}^{J})$ such that $u_i(x_i') \geq u_i(x_i)$, $\forall i$ and $u_i(x_i') > u_i(x_i)$ for some $i$. An allocation $(\{x_i^*\}_{i=1}^{I}, \{y_j^*\}_{j=1}^{J})$ together with a continuous linear functional $\phi : S \to \mathbb{R}$ is a competitive equilibrium if: (a) $(\{x_i^*\}_{i=1}^{I}, \{y_j^*\}_{j=1}^{J})$ is feasible; (b) for each $i$, $x \in X_i$ and $\phi(x) \leq \phi(x_i^*)$ implies $u_i(x) \leq u_i(x_i^*)$; and (c) for each $j$, $y \in Y_j$ implies $\phi(y) \leq \phi(y_j^*)$.
Theorem 552 (Second Welfare Theorem) Let: (A1) $X_i$ is convex for each $i$; (A2) if $x, x' \in X_i$, $u_i(x) > u_i(x')$ and $\alpha \in (0,1)$, then $u_i(\alpha x + (1 - \alpha)x') > u_i(x')$ for each $i$; (A3) $u_i : X_i \to \mathbb{R}$ is continuous for each $i$; (A4) the set $Y = \sum_{j=1}^{J} Y_j$ is convex.$^{7}$ Under (A1)-(A4), let $(\{x_i^*\}_{i=1}^{I}, \{y_j^*\}_{j=1}^{J})$ be a Pareto Optimal allocation. Assume that for some $h \in \{1, \ldots, I\}$, $\exists \hat{x}_h$ such that $u_h(\hat{x}_h) > u_h(x_h^*)$. Then there exists a continuous linear functional $\phi : S \to \mathbb{R}$, not identically zero on $S$, such that:
\[ \forall i,\ x \in X_i \text{ and } u_i(x) \geq u_i(x_i^*) \implies \phi(x) \geq \phi(x_i^*) \tag{6.13} \]
and
\[ \forall j,\ y \in Y_j \implies \phi(y) \leq \phi(y_j^*). \tag{6.14} \]
If
\[ \forall i,\ \exists x_i' \text{ such that } \phi(x_i') < \phi(x_i^*), \tag{6.15} \]
then $\left[ (\{x_i^*\}_{i=1}^{I}, \{y_j^*\}_{j=1}^{J}), \phi \right]$ is a competitive equilibrium.
Proof. (Sketch) Since $S$ is finite dimensional and the aggregate technological possibilities set is convex (A4), for the existence of $\phi$ it is sufficient to show that the set of allocations preferred to $\{x_i^*\}_{i=1}^{I}$ given by $A = \sum_{i=1}^{I} A_i$ is convex, where $A_i = \{x \in X_i : u_i(x) \geq u_i(x_i^*)\}$, $\forall i$, and that $A$ does not contain any interior points of $Y$. Then apply Theorem 549. To complete the proof, it is sufficient to show (b) holds in the definition of a competitive equilibrium, which follows from contraposition of (6.13).
You should recognize that $\phi(x) = \langle p, x\rangle$ can be considered an inner product representation of prices.
$^{7}$ The assumption that $S$ is finite dimensional is also important, but can be weakened in the infinite dimensional case to assume that $Y$ has an interior point.
6.6 Optimization of Nonlinear Operators
In this chapter we have dealt with linear operators and functionals. While we showed very deep results in linear functional analysis - the Riesz Representation Theorem and the Hahn-Banach Theorem to name just a few - there are many problems in economics that involve nonlinear operators. For instance, the operator in most dynamic programming problems, such as the growth example suggested in the introduction to this chapter, does not satisfy the linearity property of an operator. In particular, an operator $T : X \to Y$ as simple as $T(x) = a + bx$ with $a \neq 0$ does not possess the linearity property since, in general, $T(\alpha x + \beta x') = a + b(\alpha x + \beta x') \neq \alpha Tx + \beta Tx'$. Such a function does possess a monotonicity property when $b \geq 0$ (i.e. if $x \leq x'$, then $Tx \leq Tx'$).
Nonlinear functional analysis is a very broad area covering topics such as fixed points of nonlinear operators (which we touched on in a subsection of Section 6.1), nonlinear monotone operators, variational methods and optimization of nonlinear operators. In this section, we show how variational methods and fixed point theory (in the form of dynamic programming) can be used to prove the existence of an optimum of a nonlinear operator.
6.6.1 Variational methods on infinite dimensional vector spaces
Most books of economic analysis dealing with optimization focus on finding necessary conditions for a function defined on a given set which is a subset of a finite dimensional Euclidean space $\mathbb{R}^n$. These conditions are called first order conditions (in the case of inequality or mixed constraints they are called Kuhn-Tucker conditions). Our main focus in this section is the optimization of functions defined on an infinite dimensional vector space (i.e. optimization of functionals).
While we have already encountered linear functionals in Section 6.4, in this section we will consider a broader class of functionals than linear ones; we will consider continuous functionals which are concave (or convex as the case may warrant). Our main concern is existence theory (i.e. given an optimization problem consisting in maximizing (minimizing) a concave (convex) functional over some feasible set, usually defined by constraints, we want to know whether an optimal solution can be found). Hence we will deal with sufficient conditions. In the second part of the section we also touch upon the problem of finding this optimal solution, which means stating the necessary conditions for an optimum.
Example 553 The types of problems we can consider are: the existence of a Pareto-optimal allocation of an economy with an infinite commodity space; the existence of an optimal solution of an infinite horizon growth model.
Sufficient Conditions for an Optimal Solution
In this subsection we address the fundamental question "Does a functional have a maximum (or minimum) on a given set?" In Chapter 4, we proved a very important result; the Extreme Value Theorem 262 stated that a continuous function defined on a compact subset of a metric space attains its minimum and maximum. Does this theorem apply to functionals (functions whose domain is a subset of an infinite-dimensional vector space)? Clearly the answer is yes since a vector space is a metric space and dimensionality is not mentioned in the theorem at all. Consider the following example.
Example 554 Let a functional $f$ be defined on $C([0,1])$ by $f(x) = \int_0^{\frac{1}{2}} x(t)\,dt - \int_{\frac{1}{2}}^{1} x(t)\,dt$. We want to solve the optimization problem $\max f(x)$ subject to $\|x\| \leq 1$. To establish continuity of $f(x)$, we need only establish boundedness since $f(x)$ is a linear functional and Theorem 511 establishes that boundedness is sufficient (and also necessary) for continuity. Hence
\[ |f(x)| = \left| \int_0^{\frac{1}{2}} x(t)\,dt - \int_{\frac{1}{2}}^{1} x(t)\,dt \right| \leq \int_0^{\frac{1}{2}} |x(t)|\,dt + \int_{\frac{1}{2}}^{1} |x(t)|\,dt = \int_0^1 |x(t)|\,dt \leq \int_0^1 \sup_{t \in [0,1]} |x(t)|\,dt = \sup_{t \in [0,1]} |x(t)| \int_0^1 dt = \sup_{t \in [0,1]} |x(t)| = \|x\|. \]
Suppose now we want to find the maximum of this continuous functional on the closed unit ball in $C([0,1])$. But the maximum cannot be attained. Why? Our problem means maximizing the shaded area. See Figure 6.6.2.??? The steeper the middle part of $x(t)$ is, the larger is the area. But this line cannot be vertical because $x(t)$ wouldn't be a function. The steepest line clearly doesn't exist.
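The geometric argument can be made concrete with one particular family of continuous functions (an illustrative choice, not taken from the text): let $x_n$ equal $+1$ on the left, $-1$ on the right, and fall linearly with slope $-n$ through $t = \frac{1}{2}$. The sketch below computes $f(x_n)$ exactly with rational arithmetic; the function names are assumptions.

```python
from fractions import Fraction


def x_n(n, t):
    """Piecewise-linear x_n: +1 on the left, -1 on the right, slope -n at 1/2."""
    return max(Fraction(-1), min(Fraction(1), n * (Fraction(1, 2) - t)))


def integrate_piecewise_linear(func, nodes):
    """Trapezoid rule; exact when func is linear between consecutive nodes."""
    return sum((b - a) * (func(a) + func(b)) / 2 for a, b in zip(nodes, nodes[1:]))


def f_of_x_n(n):
    """Value of the functional of Example 554 at x_n, for n >= 2."""
    half, kink = Fraction(1, 2), Fraction(1, n)
    left = integrate_piecewise_linear(
        lambda t: x_n(n, t), [Fraction(0), half - kink, half])
    right = integrate_piecewise_linear(
        lambda t: x_n(n, t), [half, half + kink, Fraction(1)])
    return left - right


values = [f_of_x_n(n) for n in (2, 10, 100)]  # equals 1 - 1/n for each n
```

Each $x_n$ lies in the closed unit ball and $f(x_n) = 1 - \frac{1}{n}$, so the values approach the bound $\|x\| \leq 1$ derived above without reaching it along this family.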
Exercise 6.6.1 Show the non-existence of a maximum in Example 554 rig-
orously. Hint: Use the geometric insight provided above.
What went wrong with establishing a maximum in Example 554? We have a continuous functional on a closed unit ball that doesn't attain its maximum. Recall however by Example 459 that a closed unit ball is not a compact set in $C([0,1])$. In fact there is a theorem saying that a closed unit ball in a normed vector space $X$ is compact if and only if $X$ is finite dimensional.$^{8}$ If a "nice" set like a closed unit ball is not compact, then compactness must be an extremely restrictive assumption in infinite dimensional vector spaces. And it really is. Compact sets in infinite dimensional vector spaces don't contain interior points. Thus the Extreme Value Theorem is practically unusable in optimizing functionals.
$^{8}$ See Rudin ????
If Example 554 were formulated in $L_\infty(0,1)$, the maximum would be attained by the discontinuous function
\[ x(t) = \begin{cases} 1 & \text{for } 0 \leq t < \frac{1}{2} \\ -1 & \text{for } \frac{1}{2} \leq t \leq 1. \end{cases} \]
$L_\infty(0,1)$ is also infinite dimensional. Is this a contradiction to what we claimed above? No. There are many optimization problems in infinite dimensional vector spaces that attain optima, but an optimum cannot be guaranteed by the assumptions of continuity and compactness. To this end, we will introduce a new type of convergence in a vector space $X$. In terms of this new convergence we'll define a new type of continuity and compactness so that the collection of these "new types" of compact sets is much broader than the collection of original compact sets. In particular, we will identify a class of such vector spaces in which the closed unit ball is "weakly" compact.
Semicontinuous and concave functionals
Before introducing this "new type" of convergence, we define certain properties of functionals. The concept of convexity and concavity for functionals is analogous to the one for functions.
Definition 555 Let $K$ be a convex subset of a normed vector space $X$. A functional $f : K \to \mathbb{R}$ is called: (i) Concave if for any $u, v \in K$ and for any $\alpha \in [0,1]$, $f(\alpha u + (1 - \alpha)v) \geq \alpha f(u) + (1 - \alpha)f(v)$; (ii) Convex if $f(\alpha u + (1 - \alpha)v) \leq \alpha f(u) + (1 - \alpha)f(v)$.
Exercise 6.6.2 Show that $f$ is concave iff $-f$ is convex.
Exercise 6.6.3 Verify that the functional $f(x) = \int_0^1 (x^2(t) + |x(t)|)\,dt$ defined on $L_2[0,1]$ is convex.
Next we introduce the concept of semicontinuity of functionals (or functions as the case may be). Why don't we simply use continuity? Recall we used the assumption of continuity in the Extreme Value Theorem to guarantee the existence of both a maximum and a minimum. Here we will show that the assumption can be weakened at the cost of guaranteeing either a maximum or a minimum.
Definition 556 A functional $f$ defined on a normed vector space $X$ is said to be: (i) upper semicontinuous at $x_0$ if given $\varepsilon > 0$, there is a $\delta > 0$ such that $f(x) - f(x_0) < \varepsilon$ for $\|x - x_0\| < \delta$; (ii) lower semicontinuous at $x_0$ if given $\varepsilon > 0$, there is a $\delta > 0$ such that $f(x_0) - f(x) < \varepsilon$ for $\|x - x_0\| < \delta$.
Exercise 6.6.4 Show that $f$ is usc iff $-f$ is lsc.
Exercise 6.6.5 Show that $f$ is continuous at $x_0$ if $f$ is both usc and lsc at $x_0$.
Exercise 6.6.6 A sequence version of the definition of semicontinuity is the following: (i) $f$ is usc at $x_0$ if for any sequence $\langle x_n\rangle$ converging to $x_0$, $\limsup_{n \to \infty} f(x_n) \leq f(x_0)$; (ii) $f$ is lsc at $x_0$ if $\liminf_{n \to \infty} f(x_n) \geq f(x_0)$. Show that the sequence definition is equivalent to that in Definition 556.
Now the Extreme Value Theorem can be reformulated:
Theorem 557 An upper (lower) semicontinuous functional $f$ on a compact subset $K$ of a normed vector space $X$ achieves a maximum (minimum) on $K$.
Proof. Let $M = \sup_{x \in K} f(x)$ ($M$ may be $\infty$). There is a sequence $\langle x_n\rangle$ from $K$ such that $f(x_n) \to M$. Since $K$ is compact there is a convergent subsequence $\langle x_{n_k}\rangle \to x_0 \in K$. Clearly, $f(x_{n_k}) \to M$ and since $f$ is usc, $f(x_0) \geq \limsup_{k \to \infty} f(x_{n_k}) = \lim_{k \to \infty} f(x_{n_k}) = M$. Because $x_0 \in K$, $f(x_0)$ must be finite and because $M$ is the supremum on $K$, $f(x_0) = M$. Hence $f$ attains a maximum at $x_0 \in K$.
Hereafter, we will formulate our optimization problem in terms of maximization (i.e. given a functional $f$ defined on a subset $K$ of a normed vector space, find $\max_{x \in K} f(x)$). In this case the underlying assumptions for $f$ are upper semicontinuity and concavity. The problem of finding $\min_{x \in K} g(x)$ where $g$ is lower semicontinuous and convex can be transformed into a maximization problem by the substitution $f = -g$, because if $g$ is lsc and convex, then $-g$ is usc and concave (see Exercises 6.6.2 and 6.6.4).
Weak convergence
We assume that $X$ is a complete normed vector space (i.e. a Banach space).
Definition 558 Let $\langle x_n\rangle$ be a sequence of elements in $X$. We say that $\langle x_n\rangle$ converges weakly to $x_0 \in X$ if for every continuous linear functional $f \in X^*$ we have $\langle f, x_n\rangle \to \langle f, x_0\rangle$. We denote weak convergence with "$\rightharpoonup$" (rather than the standard "$\to$") and use the notation $\langle x_n\rangle \rightharpoonup x_0$.
Since $\langle f, x_n\rangle$ is the value of the functional $f$ at the point $x_n$, $\{\langle f, x_n\rangle\}_{n=1}^{\infty}$ is a sequence of real numbers. It is easy to prove that weak convergence has the usual properties, namely:
Exercise 6.6.7 Show that: (i) If $\langle x_n\rangle \rightharpoonup x_0$ and $\langle y_n\rangle \rightharpoonup y_0$, then $\langle x_n + y_n\rangle \rightharpoonup x_0 + y_0$; (ii) Let $\langle \lambda_n\rangle$ be a sequence of real numbers. If $\lambda_n \to \lambda_0$ and $\langle x_n\rangle \rightharpoonup x_0$, then $\lambda_n x_n \rightharpoonup \lambda_0 x_0$; (iii) If $x_n \to x_0$, then $\langle x_n\rangle \rightharpoonup x_0$; (iv) a weakly convergent sequence has a unique limit. Hint for (iv): Apply the corollary to the Hahn-Banach Theorem to $\langle f, x - y\rangle = 0$ for each $f \in X^*$.
Since there are now two types of convergence defined on $X$, the original one (i.e. with respect to its norm) is sometimes called strong convergence as opposed to (the newly introduced) weak convergence. Property (iii) of the exercise states that if a sequence converges strongly, then it also converges weakly. This statement cannot be reversed in general. That means there are sequences that converge weakly but not strongly. We will see this in the following set of examples where we demonstrate weak convergence in some Banach spaces.
Example 559 In the finite dimensional vector space $\mathbb{R}^n$, strong and weak convergence coincide (i.e. for $x_n \in \mathbb{R}^n$, $\langle x_n\rangle \to x_0$ iff $\langle x_n\rangle \rightharpoonup x_0$). To see this, let $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, $\ldots$, $e_n = (0, 0, \ldots, 1)$, and $\langle x_n\rangle \rightharpoonup x_0$. Then by definition $\langle f, x_n\rangle \to \langle f, x_0\rangle$ where $f$ is a continuous linear functional on $\mathbb{R}^n$. By Theorem 529 we know that each functional $f$ is represented by a scalar product, i.e. given $f$ there exists an element $b$ of $\mathbb{R}^n$ s.t. $\langle f, x\rangle = \langle b, x\rangle$, $\forall x \in X$. (Just a reminder that $\langle f, x\rangle$ denotes the value of the functional $f$ at $x$ (i.e. $f(x)$) and $\langle b, x\rangle$ is the scalar product of $b$ and $x$.) If we substitute $e_i$ for $b$ we have $\langle e_i, x_n\rangle = 0 \cdot x_n^1 + \ldots + 1 \cdot x_n^i + \ldots + 0 \cdot x_n^n = x_n^i \to x_0^i$, $\forall i = 1, 2, \ldots, n$ (i.e. the $i$th component of the vector $x_n$ tends to the $i$th component of $x_0$). Thus weak convergence in $\mathbb{R}^n$ means convergence by components. But Theorem 223 says that then $\langle x_n\rangle \to x_0$ with respect to the norm, that is, strongly.
Example 560 Let $X = \ell_2$ and $\langle x_n\rangle \rightharpoonup x_0$. Then as in Example 559 $\langle x_n, e_i\rangle = x_n^i \to \langle x_0, e_i\rangle = x_0^i$, $\forall i = 1, 2, \ldots$. Thus weak convergence in $\ell_2$ implies that the $i$-th component of $\langle x_n\rangle$ converges to the $i$-th component of $x_0$. But as Example 234 shows this doesn't imply strong convergence in $\ell_2$.
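One standard illustration of componentwise convergence without norm convergence in $\ell_2$ (not necessarily the one used in Example 234) is the sequence of unit vectors $e_n$: for any square-summable $y$ the pairing $\langle e_n, y\rangle = y_n$ tends to $0$, while $\|e_i - e_j\| = \sqrt{2}$ for $i \neq j$. The sketch below checks this on truncated vectors; the truncation length `N` and the test vector $y_k = 1/k$ are assumptions made so the computation is finite.

```python
import math

N = 2000  # truncation length; an assumption so the vectors fit in memory


def basis_vector(i, length=N):
    """The unit vector e_i (1-indexed), truncated to `length` coordinates."""
    v = [0.0] * length
    v[i - 1] = 1.0
    return v


def inner(u, v):
    """Scalar product of two truncated sequences."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """The l_2 norm of a truncated sequence."""
    return math.sqrt(inner(u, u))


# y_k = 1/k is square summable, so it defines a functional on l_2.
y = [1.0 / k for k in range(1, N + 1)]
pairings = [inner(basis_vector(n), y) for n in (1, 10, 100, 1000)]

# Distinct unit vectors stay a fixed distance apart in norm.
gap = norm([a - b for a, b in zip(basis_vector(3), basis_vector(7))])
```

The pairings shrink like $1/n$ while the gap stays at $\sqrt{2}$, so the sequence cannot converge in norm.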
Example 561 Let $X = C([a,b])$. It can be shown that weak convergence of a sequence of continuous functions $\langle x_n\rangle \rightharpoonup x_0$ means that: (i) $\langle x_n\rangle$ is uniformly bounded (i.e. $\exists B$ such that $|x_n(t)| \leq B$ for all $n = 1, 2, \ldots$ and all $t \in [a,b]$); and (ii) $\langle x_n\rangle \to x_0$ pointwise on $[a,b]$ (i.e. $\forall t \in [a,b]$, $\langle x_n(t)\rangle \to x_0(t)$ as a sequence of real numbers). Thus weak convergence in $C([a,b])$ is pointwise convergence (we can say convergence by components) whereas strong convergence (convergence with respect to the sup norm) is uniform. As Examples 166 and 167 show these two don't always coincide.
Using weak convergence allows us to define weak closedness, weak compactness, and weak continuity (or semicontinuity). We do it the same way we did in Chapter 4 where all these notions were defined in terms of sequences.
Definition 562 A subset $K \subset X$ is weakly closed if for any sequence $\langle x_n\rangle$ of elements from $K$ that converges weakly to $x_0$ (i.e. $\langle x_n\rangle \rightharpoonup x_0$), we have $x_0 \in K$.
What is the relation between strong and weak closedness? While one would expect that if a set is strongly closed then it is weakly closed, actually the reverse is true.
Theorem 563 If $K$ is weakly closed then $K$ is (strongly) closed.
Proof. Let $\langle x_n\rangle \subset K$ and $\langle x_n\rangle \to x_0$. Then $\langle x_n\rangle \rightharpoonup x_0$ (strong convergence implies weak convergence) and because $K$ is weakly closed $\implies x_0 \in K$. Hence $K$ is (strongly) closed.
To see that Theorem 563 cannot be reversed, we present the following example.$^{9}$
Example 564 Let $M \subset \ell_2$ where $M = \{\langle e_i\rangle_{i=1}^{\infty},\ e_i = (0, 0, \ldots, 1, 0, \ldots),\ i = 1, 2, \ldots\}$. $M$ is closed (why?) but it is not weakly closed because $\langle e_i\rangle \rightharpoonup 0$ and $0 \notin M$.
$^{9}$ We cannot give an example in $\mathbb{R}^n$ because in finite dimensional space weak closedness and closedness, of course, coincide.
Theorem 563 and Example 564 say that weak closedness is a stronger assumption than strong closedness. Thus one should be careful in drawing conclusions.
Definition 565 A set $K \subset X$ is weakly compact if every infinite sequence from $K$ contains a weakly convergent subsequence.
This definition mirrors Definition 192 of sequential compactness, which is equivalent to the standard definition of compactness by Theorem 193 in metric spaces. After our experience with closedness, one may wonder how weak and strong compactness are related. But if weak compactness were a stronger assumption than strong compactness (as in the case of closedness) there would be fewer weakly compact sets than compact sets. Then the whole idea of building the weak topology would be useless because the main purpose of the introduction of the weak topology is to make closed unit balls (weakly) compact. Fortunately, it is not the case.
Theorem 566 If $K \subset X$ is (strongly) compact then it is weakly compact.
Proof. Let $\langle x_n\rangle$ be a sequence in $K$. Because $K$ is compact there is a convergent subsequence $\langle x_{n_k}\rangle$ and $x_0 \in K$ such that $\langle x_{n_k}\rangle \to x_0$. But strong convergence implies weak convergence so that $\langle x_{n_k}\rangle \rightharpoonup x_0$.
Theorem 566 cannot be reversed as the next example shows.
Example 567 Let $K \subset \ell_2$ where $K = \{\langle e_i\rangle_{i=0}^{\infty},\ e_0 = (0, 0, \ldots, 0, \ldots),\ e_i = (0, \ldots, 1, 0, \ldots)\}$ (note that $K = M \cup \{0\}$ where $M$ is from Example 564). $K$ is weakly compact because any sequence from $K$ contains a weakly convergent subsequence. (To see this note that $\langle x_n\rangle \rightharpoonup x_0$ requires $x_n^i \to x_0^i$ $\forall i = 1, 2, \ldots$; here $\langle x_n^i\rangle \subset \{0, 1\}$ and $\{0, 1\}$ is compact in $\mathbb{R}$. Then there is $x_0^i$ and a subsequence $\langle x_{n_k}^i\rangle$ such that $\langle x_{n_k}^i\rangle \to x_0^i$. Then $x_0 = (x_0^1, x_0^2, \ldots, x_0^i, \ldots)$ is the point such that $\langle x_{n_k}\rangle \rightharpoonup x_0$.) But $K$ is not compact because the distance between any two distinct elements of $K \setminus \{0\}$ is $\sqrt{2}$. Hence the sequence $\langle e_i\rangle_{i=1}^{\infty}$ doesn't have a convergent subsequence (with respect to the norm $\|\cdot\|_2$).
Theorem 568 If $M$ is weakly compact then $M$ is weakly closed.
Exercise 6.6.8 Prove Theorem 568.
Definition 569 Let $M \subset X$ and $f$ be a functional defined on $M$. We say that $f$ is weakly upper semicontinuous (usc) on $M$ if for any $x_0 \in M$ and any $\langle x_n\rangle_{n=1}^{\infty} \subset M$ such that $x_n \rightharpoonup x_0$, we have $f(x_0) \geq \limsup_{n \to \infty} f(x_n)$.
One can define weak lower semicontinuity and weak continuity of functionals in an analogous manner.
Again, there is a question about whether the assumption of weak upper semicontinuity of a functional is more restrictive than (strong) upper semicontinuity.
Theorem 570 If $f$ is a weakly usc functional on $M$, then $f$ is usc.
Proof. Let $x_0 \in M$ and $\langle x_n\rangle \subset M$ such that $\langle x_n\rangle \to x_0$. Then $x_n \rightharpoonup x_0$ and because $f$ is weakly usc, $\limsup_{n \to \infty} f(x_n) \leq f(x_0)$, so that $f$ is usc.
The converse is not true as the next example shows.
Example 571 Let $X = L_2[0,1]$ and $f(a) = 1 + \int_0^1 a^2(x)\,dx$. This functional is continuous (and hence usc) but is not weakly usc.
Exercise 6.6.9 Show the functional in Example 571 is continuous but not weakly usc.
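A numerical hint toward Exercise 6.6.9, under one particular choice of sequence (an assumption, not the text's own construction): the functions $a_n(x) = \sqrt{2}\sin(n\pi x)$ are orthonormal in $L_2[0,1]$, so they converge weakly to $0$, yet $f(a_n) = 2$ for every $n$ while $f(0) = 1$. The sketch approximates the integrals by the midpoint rule and tests the pairing against a single $g(x) = x$ only, so it is evidence rather than proof.

```python
import math


def integrate(func, steps=20000):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    width = 1.0 / steps
    return sum(func((k + 0.5) * width) for k in range(steps)) * width


def a_n(n):
    """The oscillating function a_n(x) = sqrt(2) * sin(n * pi * x)."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)


def f(a):
    """The functional of Example 571: f(a) = 1 + integral of a^2."""
    return 1.0 + integrate(lambda x: a(x) ** 2)


def pairing(a, g):
    """The L_2 inner product of a and g on [0, 1]."""
    return integrate(lambda x: a(x) * g(x))


value_along_sequence = f(a_n(50))           # stays near 2
value_at_weak_limit = f(lambda x: 0.0)      # equals 1
sample_pairing = pairing(a_n(50), lambda x: x)  # small, of order 1/n
```

Weak upper semicontinuity at $0$ would require $f(0) \geq \limsup f(a_n)$, which these values contradict.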
Now we are ready to prove an important theorem which is analogous to
the Extreme Value Theorem 262 but uses the concept of the weak topology.
A weak topology on X is a topology built in terms of weak convergence
instead of (strong) convergence.
Theorem 572 Let $K$ be a non-empty weakly compact subset of a Banach space $X$. Let $f$ be a weakly upper semicontinuous functional on $K$. Then $f$ attains its maximum on $K$. That is, $\exists x_0 \in K$ such that
\[ f(x_0) = \sup_{x \in K} f(x). \]
Proof. By the supremum property, $\exists \langle x_n\rangle \subset K$ such that $\lim_{n \to \infty} f(x_n) = \sup_{x \in K} f(x)$. Since $K$ is weakly compact, there exists a subsequence $\langle x_{n_k}\rangle$ and $x_0 \in K$ such that $x_{n_k} \rightharpoonup x_0$. Because $f$ is weakly usc then
\[ f(x_0) \geq \limsup_{k \to \infty} f(x_{n_k}) = \lim_{n \to \infty} f(x_n) = \sup_{x \in K} f(x). \]
Obviously $f(x_0) \leq \sup_{x \in K} f(x)$ because $x_0 \in K$. Thus combining these two inequalities we have
\[ f(x_0) = \sup_{x \in K} f(x). \]
Comparing the assumptions of Theorem 572 with the Extreme Value
Theorem 262 we see that in Theorem 572 one assumption is weaker (that
being compactness) and one is stronger (that being semicontinuity). The
problem with this theorem is that its assumptions are hard to verify directly. How should one check weak compactness and weak semicontinuity in an infinite dimensional space? Our next step is to find sufficient and at the same time verifiable assumptions that would guarantee weak compactness of a set $K$ and weak usc of a functional $f$.
Let's start with weak compactness. We already know that (strong) compactness is sufficient for weak compactness but we also know that it is too restrictive in infinite dimensional vector spaces. In order for a set $K$ to be weakly compact it has to be weakly closed (see Theorem 568). First we examine the conditions for a set $K$ to be weakly closed. Theorem 563 says that it must be closed but that's not sufficient (see Example 564). There are, however, quite simple assumptions that guarantee weak closedness of a set $K$.
Theorem 573 If $K \subset X$ is closed and convex then it is weakly closed.
Proof. Let $\langle x_n \rangle \subset K$ and $\langle x_n \rangle \rightharpoonup x_0$. We need to show that $x_0 \in K$. Assume the contrary, that is $x_0 \notin K$. Then by Corollary 550 of the Separation Theorem, there exists a non-zero continuous linear functional $f$ such that $\langle f, x_0 \rangle < \inf_{x \in K} \langle f, x \rangle$. Let $\langle f, x_0 \rangle = c$ and $\inf_{x \in K} \langle f, x \rangle = d$, in which case $c < d$. Because $f$ is a linear continuous functional we have $d \le \lim_{n\to\infty} \langle f, x_n \rangle = \langle f, x_0 \rangle = c < d$. Hence $d < d$, which is the desired contradiction.
Theorem 573 says that closedness and convexity are sufficient assumptions for weak closedness. However, we are looking for sufficient assumptions for weak compactness. To make further progress, we have to restrict attention to certain classes of normed vector spaces. In Section 6.4.2 we defined a reflexive space as a space for which $X^{**} = X$ (see Definition 533). We showed that, for example, $\mathbb{R}^n$, $\ell_p$, $L_p$ for $1 < p < \infty$ are reflexive whereas $\ell_1$, $\ell_\infty$, $L_1$, $L_\infty$, $C([a,b])$ are not reflexive. From here on we will consider only reflexive, normed vector spaces. Our next result is basically a Heine-Borel theorem for infinite dimensional spaces.
Theorem 574 (Eberlein-Šmuljan) A Banach space $X$ is reflexive iff any bounded weakly closed set $K \subset X$ is weakly compact.
Proof. Can be found in Aliprantis (1985, Theorem 10.13, p.156, Positive Operators).
Thus in a reflexive Banach space, weak closedness and boundedness are sufficient assumptions for weak compactness, and in any Banach space closedness and convexity are sufficient assumptions for weak closedness. Putting all these together we have sufficient assumptions for a set $K$ to be weakly compact in a reflexive Banach space $X$.
Theorem 575 Let $X$ be a reflexive Banach space and $K \subset X$. If $K$ is closed, bounded and convex, then $K$ is weakly compact.
Proof. Combine Theorems 573 and 574. Notice that all the assumptions of the theorem are verifiable.
Corollary 576 In a reflexive space $X$, the closed unit ball is a weakly compact set.
Proof. $B_1(0) = \{x \in X : \|x\| \le 1\}$ is a closed, bounded, and convex subset of a reflexive space $X$. Hence it is weakly compact by Theorem 575.
Let's turn now to the assumption of a "weakly upper semicontinuous functional" and try to break it into verifiable parts. We first prove a lemma that gives us a necessary and sufficient condition for a functional $f$ to be weakly upper semicontinuous.
Lemma 577 Let $X$ be a Banach space and $K \subset X$ be weakly closed. Let a functional $f$ be defined on $K$. Then $f$ is weakly upper semicontinuous on $K$ iff $\forall a \in \mathbb{R}$, $E(a) = \{\upsilon \in K : f(\upsilon) \ge a\}$ is weakly closed.
Proof. ($\Rightarrow$) Let $f$ be weakly usc on $K$, $a \in \mathbb{R}$, and $\langle x_n \rangle_{n=1}^{\infty} \subset E(a)$ such that $x_n \rightharpoonup x_0 \in K$. Then $f(x_0) \ge \limsup_{n\to\infty} f(x_n) \ge a$ (because $x_n \in E(a)$ $\forall n$). Hence $x_0 \in E(a)$ and thus $E(a)$ is weakly closed.
($\Leftarrow$) By contradiction. Let $E(a)$ be weakly closed for every $a \in \mathbb{R}$, but $f$ not weakly usc. Then there exist $x_0 \in K$ and $\langle x_n \rangle_{n=1}^{\infty} \subset K$ such that $\langle x_n \rangle \rightharpoonup x_0$ and $\limsup_{n\to\infty} f(x_n) > f(x_0)$. Choose $a \in \mathbb{R}$ such that $\limsup_{n\to\infty} f(x_n) > a > f(x_0)$. Then there exists a subsequence $\langle x_{n_k} \rangle$ of $\langle x_n \rangle$ such that $x_{n_k} \in E(a)$, $k = 1, 2, \ldots$ Because $E(a)$ is weakly closed and $\langle x_{n_k} \rangle \rightharpoonup x_0$, we have $x_0 \in E(a)$. Thus $f(x_0) \ge a > f(x_0)$, which is a contradiction.
Thus $f$ is weakly usc if $E(a)$ is weakly closed. But by Theorem 573 we know that if a set is closed and convex, then it is weakly closed. When is $E(a)$ closed? Since $E(a)$ is just the inverse image of the interval $[a,\infty)$ (i.e. $E(a) = \{\upsilon \in K : f(\upsilon) \ge a\} = f^{-1}([a,\infty))$), if $f$ is continuous then the inverse image of a closed set is closed (by a modification of Theorem 200). Thus if $f$ is continuous, then $E(a)$ is closed. When is $E(a)$ convex?
Exercise 6.6.10 Show that if f is concave, then E(a) is convex.
Combining these two results we have sufficient conditions for weak upper semicontinuity.
Theorem 578 A continuous, concave functional $f$ defined on a closed, convex set $K \subset X$ is weakly upper semicontinuous.
Now when we combine Theorems 575 and 578 with Theorem 572 we get a theorem that guarantees the existence of a maximum and all its assumptions are "easily" verifiable.
Theorem 579 Let $K$ be a non-empty, convex, closed and bounded subset of a reflexive Banach space. Let $f$ be a continuous and concave functional defined on $K$. Then $f$ attains its maximum on $K$ (i.e. $\exists x^* \in K$ such that $f(x^*) = \sup_{x \in K} f(x)$).
First we note that minimization requires convexity of the functional f
instead of concavity while all other assumptions are the same. Second, we
want to stress that this is a nonlinear optimization problem. This means the
functional $f$ doesn't have to be linear (which is quite restrictive). The functional $f$ simply has to be continuous and concave in the case of maximization and continuous and convex in the case of minimization.
Let us summarize what we have done in this section. In infinite dimensional vector spaces the original Extreme Value Theorem 262 (requiring semicontinuity of a functional and compactness of a set), which guarantees the existence of an optimum, cannot be used since the assumption of compactness is too stringent (compact sets don't contain interior points). By introducing the weak topology on $X$ we define weak semicontinuity (an assumption that is stronger than semicontinuity) and weak compactness (an assumption that is weaker than compactness). We also must enlist the extra assumptions of concavity (or convexity) of a functional and reflexivity of the space $X$. Then we showed that with these modified assumptions an analogue of the Extreme Value Theorem holds and this version "covers" more optimization problems (e.g. optimizing over unit balls).
6.6.2 Dynamic Programming
An important and frequently used example of operators is dynamic programming. In infinite horizon problems, dynamic programming turns the problem of finding an infinite sequence (or plan) describing the evolution of a vector of (endogenous) state variables into simply choosing a single vector value for the state variables and finding the solution to a functional equation.
More specifically, suppose the primitives of the problem are as follows. Let $X$ denote the set of possible values of (endogenous) state variables with typical element $x$. We will assume that $X \subset \mathbb{R}^n$ is compact and convex. Let $\Gamma : X \twoheadrightarrow X$ be the constraint correspondence describing feasible values for the endogenous state variable. We will assume $\Gamma(x)$ is nonempty, compact-valued, and continuous. Let $G = \{(x,y) \in X \times X : y \in \Gamma(x)\}$ denote the graph of $\Gamma$. Let $r : G \to \mathbb{R}$ denote the per-period objective or return function, which we assume is continuous. Finally let $\beta \in (0,1)$ denote the discount factor. Thus, the "givens" for the problem are $X, \Gamma, r, \beta$. In this section we establish under what conditions solutions to the functional equation (FE)
$$v(x_0) = \max_{y \in \Gamma(x_0)} r(x_0, y) + \beta v(y) \tag{FE}$$
"solve" the sequence problem that is our ultimate objective
$$\max_{\langle x_{t+1} \rangle_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(x_t, x_{t+1}) \tag{SP}$$
$$\text{s.t. } x_{t+1} \in \Gamma(x_t), \ \forall t, \text{ and } x_0 \text{ given.}$$
First we should establish what we mean by "solve". To begin with, we need to know that (SP) is well defined. That is, we must establish conditions under which the feasible set is nonempty and the objective function is well defined for all points in the feasible set. To accomplish this, we need to introduce some more notation. Call the sequence $\langle x_t \rangle$ a plan. Given $x_0 \in X$, let
$$F(x_0) = \{\langle x_t \rangle_{t=0}^{\infty} : x_{t+1} \in \Gamma(x_t), \ t = 0, 1, \ldots\}$$
be the set of feasible plans from $x_0$ with typical element $\chi = (x_0, x_1, \ldots) \in F(x_0)$. Let $\varphi_k : F(x_0) \to \mathbb{R}$ be given by
$$\varphi_k(\chi) = \sum_{t=0}^{k} \beta^t r(x_t, x_{t+1})$$
which is simply the discounted partial sum of returns from any feasible plan $\chi$. Finally, let $\varphi : F(x_0) \to \mathbb{R}$ be given by $\varphi(\chi) = \lim_{k\to\infty} \varphi_k(\chi)$. While $\varphi_k$ is obviously well defined, $\varphi$ may not be since there may be $\chi$ such that $\varphi(\chi) = \pm\infty$.[10] The assumption that $\Gamma(x) \ne \emptyset$, $\forall x \in X$ ensures that $F(x_0)$ is nonempty for all $x_0 \in X$. The assumptions that $X$ is compact and $\Gamma$ is compact-valued and continuous guarantee $|r(x_t, x_{t+1})| \le M < \infty$, so that since $\beta \in (0,1)$ we have $|\varphi(\chi)| \le \frac{M}{1-\beta} < \infty$, $\forall \chi \in F(x_0)$, $\forall x_0$. Hence (SP) is well defined and we can define the function $v^* : X \to \mathbb{R}$ given by
$$v^*(x_0) = \max_{\chi \in F(x_0)} \varphi(\chi) \tag{SP$'$}$$
which is just (SP). Thus by "solve" we mean that $v^*(x_0)$ defined in (SP$'$) is equal to $v(x_0)$ defined in (FE).
Before providing conditions under which a solution to (FE) implies a "solution" to (SP), we note the following consequences of the maximum function defined in (SP$'$). In particular, by Definition 96 we have
$$v^*(x_0) \ge \varphi(\chi), \ \forall \chi \in F(x_0) \tag{6.16}$$
and $\forall \varepsilon > 0$,
$$v^*(x_0) < \varphi(\chi) + \varepsilon, \text{ for some } \chi \in F(x_0). \tag{6.17}$$
Similarly, $v$ satisfies (FE) if
$$v(x) \ge r(x,y) + \beta v(y), \ \forall y \in \Gamma(x) \tag{6.18}$$
and $\forall \varepsilon > 0$,
$$v(x) < r(x,y) + \beta v(y) + \varepsilon, \text{ for some } y \in \Gamma(x). \tag{6.19}$$
Now we are ready to prove our main result that if we have a solution to (FE), then we have a solution to (SP).
[10] More generally, Stokey and Lucas (1989) consider $\varphi$ in the extended reals.
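Theorem 580 If $v$ is a solution to (FE) and satisfies
$$\lim_{k\to\infty} \beta^k v(x_k) = 0, \ \forall \langle x_t \rangle \in F(x_0), \ \forall x_0 \in X, \tag{6.20}$$
then $v = v^*$.
Proof. It suffices to show that if (6.18) and (6.19) hold, then (6.16) and (6.17) are satisfied. Inequality (6.18) implies that $\forall \chi \in F(x_0)$,
$$v(x_0) \ge r(x_0, x_1) + \beta v(x_1)$$
$$\ge r(x_0, x_1) + \beta[r(x_1, x_2) + \beta v(x_2)]$$
$$\ge \varphi_k(\chi) + \beta^{k+1} v(x_{k+1}), \quad k = 1, 2, \ldots$$
Taking the limit as $k \to \infty$ and using (6.20), we have (6.16).
Fix $\varepsilon > 0$ and choose $\langle \delta_t \rangle_{t=1}^{\infty} \subset \mathbb{R}_+$ such that $\sum_{t=1}^{\infty} \beta^{t-1} \delta_t \le \varepsilon$. Inequality (6.19) implies there exist $x_1 \in \Gamma(x_0)$, $x_2 \in \Gamma(x_1)$, $\ldots$ so that
$$v(x_t) \le r(x_t, x_{t+1}) + \beta v(x_{t+1}) + \delta_{t+1}, \quad t = 0, 1, \ldots$$
Then
$$v(x_0) \le r(x_0, x_1) + \beta v(x_1) + \delta_1$$
$$\le r(x_0, x_1) + \beta[r(x_1, x_2) + \beta v(x_2) + \delta_2] + \delta_1$$
$$\le \varphi_k(\chi) + \beta^{k+1} v(x_{k+1}) + \sum_{t=1}^{k+1} \beta^{t-1} \delta_t, \quad k = 1, 2, \ldots$$
Taking the limit as $k \to \infty$ and using (6.20), we have (6.17).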
Next we establish that a feasible plan which satisfies (FE) is an optimal plan in the sense of (SP).
Theorem 581 Let $\chi^* \in F(x_0)$ be a feasible plan from $x_0$ which satisfies the functional equation
$$v^*(x_t^*) = r(x_t^*, x_{t+1}^*) + \beta v^*(x_{t+1}^*) \tag{6.21}$$
with
$$\limsup_{t\to\infty} \beta^t v^*(x_t^*) \le 0. \tag{6.22}$$
Then $\chi^*$ attains the maximum in (SP) for initial state $x_0$.
Proof. It follows by an induction on (6.21) that
$$v^*(x_0) = r(x_0, x_1^*) + \beta v^*(x_1^*)$$
$$= r(x_0, x_1^*) + \beta[r(x_1^*, x_2^*) + \beta v^*(x_2^*)]$$
$$= \varphi_k(\chi^*) + \beta^{k+1} v^*(x_{k+1}^*), \quad k = 1, 2, \ldots$$
Then as $k \to \infty$, using (6.22) we have $v^*(x_0) \le \varphi(\chi^*)$. Since $\chi^* \in F(x_0)$, we have $v^*(x_0) \ge \varphi(\chi^*)$ by (6.16). Thus $\chi^*$ attains the maximum.
Now that we have established that a solution $v$ to (FE) is a solution to the (SP) problem we are interested in, we set out to establish the existence of a solution to (FE). Since $r$ is a real valued, bounded, and continuous function, it makes sense to look for solutions in the space of continuous bounded functions $C(X)$ with the sup norm $\|v\| = \sup\{|v(x)|, x \in X\}$ studied in section 6.1. Furthermore, given a solution $v \in C(X)$, we can define the policy correspondence $\gamma : X \to X$ by
$$\gamma(x) = \{y \in \Gamma(x) : v(x) = r(x,y) + \beta v(y)\}. \tag{6.23}$$
This generates a plan since given $x_0$, we have $x_1 = \gamma(x_0)$, $x_2 = \gamma(x_1)$, $\ldots$
To this end, we define an operator $T : C(X) \to C(X)$ given by
$$(Tf)(x) = \max_{y \in \Gamma(x)} [r(x,y) + \beta f(y)]. \tag{6.24}$$
In this case (FE) becomes $v = Tv$. That is, all we must establish is that $T$ has a unique fixed point in $C(X)$.
Before actually doing that, we provide a simple set of sufficient conditions to establish a given operator is a contraction.
Lemma 582 (Blackwell's sufficient conditions for a contraction) Let $X \subset \mathbb{R}^n$ and $B(X,\mathbb{R})$ be the space of bounded functions $f : X \to \mathbb{R}$ with the sup norm. Let $T : B(X,\mathbb{R}) \to B(X,\mathbb{R})$ be an operator satisfying: (i) (monotonicity) $f, \tilde f \in B(X,\mathbb{R})$ and $f(x) \le \tilde f(x)$ implies $(Tf)(x) \le (T\tilde f)(x)$, $\forall x \in X$; (ii) (discounting) $\exists \rho \in (0,1)$ such that $[T(f+a)](x) \le (Tf)(x) + \rho a$, $a \ge 0$, $x \in X$.[11] Then $T$ is a contraction with modulus $\rho$.
Proof. For any $f, \tilde f \in B(X,\mathbb{R})$, $f \le \tilde f + \|f - \tilde f\|$, where we write $f \le \tilde f$ if $f(x) \le \tilde f(x)$, $\forall x \in X$. Then
$$Tf \le T\left(\tilde f + \|f - \tilde f\|\right) \le T\tilde f + \rho \|f - \tilde f\|$$
where the first inequality follows from (i) and the second from (ii). Reversing the inequality gives $T\tilde f \le Tf + \rho \|\tilde f - f\|$. Combining both inequalities gives $\|Tf - T\tilde f\| \le \rho \|f - \tilde f\|$.
[11] Note $(f+a)(x) = f(x) + a$.
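Blackwell's two conditions, and the contraction property they imply, can be spot-checked on a discretized operator. The sketch below is only an illustration under stated assumptions: the grid, the return function `r`, the feasibility rule $y \le x$, and the value of `BETA` are all hypothetical choices, and random test functions give evidence rather than a proof.

```python
import random

BETA = 0.9
GRID = [i / 20 for i in range(21)]     # hypothetical state grid on [0, 1]

def r(x, y):
    """Hypothetical bounded return function, defined for y <= x."""
    return (x - y) ** 0.5

def T(f):
    """(Tf)(x) = max over grid points y <= x of r(x, y) + BETA * f(y)."""
    return [max(r(x, GRID[j]) + BETA * f[j] for j in range(i + 1))
            for i, x in enumerate(GRID)]

def sup_norm(f, g):
    return max(abs(a - b) for a, b in zip(f, g))

random.seed(0)
f = [random.uniform(-1, 1) for _ in GRID]
g = [random.uniform(-1, 1) for _ in GRID]
f_hi = [v + random.uniform(0, 1) for v in f]   # f <= f_hi pointwise
a = 0.7                                        # nonnegative constant shift

# (i) monotonicity: f <= f_hi implies Tf <= T f_hi.
monotone = all(lo <= hi for lo, hi in zip(T(f), T(f_hi)))
# (ii) discounting: T(f + a) <= Tf + BETA * a.
discounting = all(s <= t + BETA * a + 1e-12
                  for s, t in zip(T([v + a for v in f]), T(f)))
# Conclusion of the lemma: T is a contraction with modulus BETA.
contraction = sup_norm(T(f), T(g)) <= BETA * sup_norm(f, g) + 1e-12
```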
Now we have the second main theorem of this section.
Theorem 583 In $(C(X), \|\cdot\|)$, $T$ given in (6.24) has the following properties: $T(C(X)) \subset C(X)$; $Tv = v$ for a unique $v \in C(X)$; and $\forall v_0 \in C(X)$,
$$\|v - T^n v_0\| \le \frac{\beta^n}{1-\beta} \|Tv_0 - v_0\|, \quad n = 0, 1, 2, \ldots \tag{6.25}$$
Furthermore, given $v$, the optimal policy correspondence $\gamma : X \to X$ defined in (6.23) is compact-valued and u.h.c.
Proof. For each $f \in C(X)$ and $x \in X$, the problem in (6.24) is to maximize a continuous function $[r(x,\cdot) + \beta f(\cdot)]$ on a compact set $\Gamma(x)$. Hence, by the Extreme Value Theorem 262, the maximum is attained. Since both $r$ and $f$ are bounded, $Tf$ is also bounded. Since $r$ and $f$ are continuous and $\Gamma$ is compact-valued and continuous, it follows from the Theorem of the Maximum 295 that $Tf$ is continuous. Hence $T(C(X)) \subset C(X)$.
It is clear that $T$ satisfies Blackwell's sufficient conditions for a contraction (Lemma 582) since: (i) for $f, \tilde f \in C(X)$ with $f(x) \le \tilde f(x)$, $\forall x \in X$, by $T$ given in (6.24) we have $(Tf)(x) \le (T\tilde f)(x) \Leftrightarrow \max_{y \in \Gamma(x)} [r(x,y) + \beta f(y)] \le \max_{y \in \Gamma(x)} [r(x,y) + \beta \tilde f(y)]$; and (ii)
$$T(f+a)(x) = \max_{y \in \Gamma(x)} [r(x,y) + \beta(f(y) + a)]$$
$$= \max_{y \in \Gamma(x)} [r(x,y) + \beta f(y)] + \beta a$$
$$= (Tf)(x) + \beta a.$$
Since $C(X)$ is a complete normed vector space by Theorem 452 and $T$ is a contraction, $T$ has a unique fixed point $v \in C(X)$ by the Contraction Mapping Theorem 306, which satisfies (6.25). The properties of $\gamma$ follow from the Theorem of the Maximum 295.
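The fixed point and the error bound (6.25) can be illustrated by iterating a discretized version of $T$. The following is a minimal sketch, not the authors' procedure: the grid, the return function `r`, the constraint $\Gamma(x) = \{y : y \le x\}$, and `BETA` are hypothetical, and the 400th iterate is used as a numerical stand-in for the true fixed point $v$.

```python
BETA = 0.9
GRID = [i / 40 for i in range(41)]     # hypothetical discretization of X = [0, 1]

def r(x, y):
    """Hypothetical return: consume x - y today, carry y forward."""
    return (x - y) ** 0.5

def T(f):
    """Discretized (6.24) with Gamma(x) = {y in GRID : y <= x}."""
    return [max(r(x, GRID[j]) + BETA * f[j] for j in range(i + 1))
            for i, x in enumerate(GRID)]

def sup_norm(f, g):
    return max(abs(a - b) for a, b in zip(f, g))

v0 = [0.0 for _ in GRID]
iterates = [v0]
for _ in range(400):                   # iterates[n] holds T^n v0
    iterates.append(T(iterates[-1]))

v = iterates[-1]                       # numerical stand-in for the fixed point
residual = sup_norm(T(v), v)           # how far v is from satisfying v = Tv
first_step = sup_norm(iterates[1], v0)

# Check the a priori bound (6.25) for the first 60 iterates.
bound_holds = all(
    sup_norm(v, iterates[n]) <= BETA ** n / (1 - BETA) * first_step + 1e-9
    for n in range(60)
)
```

The bound depends only on the size of the first step $\|Tv_0 - v_0\|$, so it can be evaluated before the fixed point is known.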
If we want to say more about v (and γ), we need to impose more structure
on the primitives. The next theorem illustrates this.
Theorem 584 For each $y$, let $r(\cdot, y)$ be strictly increasing in each of its first $n$ arguments and let $\Gamma$ be monotone in the sense that $x \le x'$ implies $\Gamma(x) \subset \Gamma(x')$. Then $v$ given by the solution to (FE) is strictly increasing.
Proof. Let $\hat C(X) \subset C(X)$ be the set of bounded, continuous, nondecreasing functions and $\tilde C(X) \subset \hat C(X)$ be the set of bounded, continuous, strictly increasing functions. Since $\hat C(X)$ is a closed subset of the Banach space $C(X)$, Corollary 307 and Theorem 583 imply it is sufficient to show $T(\hat C(X)) \subset \tilde C(X)$, which is guaranteed by the assumptions on $r$ and $\Gamma$.
Existence of solutions with unbounded returns
As we mentioned in the introduction, one application of dynamic optimization in infinite dimensional spaces is the growth model. In that case the endogenous state variable is capital, denoted $k_t$ at any point in time $t = 0, 1, 2, \ldots$, with $k_t \in \mathbb{R}_+$ and $k_0 > 0$ given. It is typically not the case that we assume $k_t$ lies in a compact set, which is very different from the assumptions of the previous section. That is, the previous section relied heavily on the fact that the return function was bounded (so that we could work in the space of bounded functions).
To address this problem, here we will consider a specific example. There is a linear production technology where output $y_t = Ak_t$, $A > 0$. Capital depreciates over a period at rate $\delta > 0$. Assume that $\tilde A = A + (1-\delta) > 0$. A household is risk neutral (i.e. $u(c_t) = c_t$ where $c_t$ denotes consumption at time $t$) and discounts the future at rate $\beta$. Assume that $\beta^{-1} > \tilde A$. Since utility is strictly increasing in consumption, there is no free disposal and the budget constraint implies that $c_t = Ak_t + (1-\delta)k_t - k_{t+1}$. Hence the household's reward function is given by $r(k_t, k_{t+1}) \equiv u(c_t) = \tilde A k_t - k_{t+1}$. The problem of the household is to choose a sequence of capital stocks to maximize the present discounted value of future rewards, which is just
$$v^*(k_0) = \sup_{\langle k_{t+1} \rangle} \sum_{t=0}^{\infty} \beta^t [\tilde A k_t - k_{t+1}] \tag{SP}$$
$$\text{s.t. } 0 \le k_{t+1} \le \tilde A k_t \text{ and } k_0 > 0 \text{ given.}$$
We will attack this problem in several steps.
1. Show that the constraint correspondence $\Gamma : \mathbb{R}_+ \to \mathbb{R}_+$ given by $\Gamma(k_t) = \{k_{t+1} \in \mathbb{R}_+ : k_{t+1} \in [0, \tilde A k_t]\}$ is nonempty, compact-valued, continuous, and for any $k_t \in \mathbb{R}_+$, $k_{t+1} \in \Gamma(k_t)$ implies $\lambda k_{t+1} \in \Gamma(\lambda k_t)$ for $\lambda \ge 0$. Furthermore, show that
$$\text{for some } \alpha \in (0, \beta^{-1}), \ k_{t+1} \le \alpha k_t, \ \forall k_t \in \mathbb{R}_+ \text{ and } \forall k_{t+1} \in \Gamma(k_t). \tag{6.26}$$
Nonempty: $k_{t+1} = 0 \in \Gamma(k_t)$. Compact: By Heine-Borel, it is sufficient to check that $\Gamma(k_t)$ is closed and bounded. Boundedness follows since given $k_t$, $k_{t+1} \in [0, \tilde A k_t]$. To see $\Gamma(k_t)$ is closed, suppose $\langle x_n \rangle$ is a sequence such that $x_n \in \Gamma(k_t)$ and $x_n \to x$. But $x_n \in \Gamma(k_t)$ implies $x_n \in [0, \tilde A k_t]$ and $x_n \to x$ implies $x \in [0, \tilde A k_t]$. Thus $x \in \Gamma(k_t)$ so that $\Gamma(k_t)$ is closed. Continuous: One way to establish this is to show $\Gamma$ is uhc and lhc. On the other hand, it is clear that since the upper endpoint is linear in $k_t$, it is continuous in $k_t$ and hence $\Gamma(k_t)$ is continuous. Homogeneity: If $k_{t+1} \in \Gamma(k_t)$, then $0 \le k_{t+1} \le \tilde A k_t$. Multiplying by $\lambda$ implies $0 \le \lambda k_{t+1} \le \lambda \tilde A k_t = \tilde A \lambda k_t$, or $\lambda k_{t+1} \in \Gamma(\lambda k_t)$. Existence of $\alpha$: Since $k_{t+1} \in \Gamma(k_t)$, then $k_{t+1} \le \tilde A k_t$. Hence just take $\alpha = \tilde A$. By assumption, $\tilde A < \beta^{-1}$ so $\alpha \in (0, \beta^{-1})$.
2. Show that the conditions you proved in part 1 imply that for any $k_0$, $k_t \le \alpha^t k_0$, $\forall t$ and for all feasible plans $\langle k_{t+1} \rangle \in F(k_0) = \{\langle k_{t+1} \rangle : k_{t+1} \in \Gamma(k_t), \ t = 0, 1, \ldots\}$, the set of plans feasible from $k_0$. To see this, from part 1, $k_{t+1} \le \tilde A k_t = \alpha k_t \le \alpha(\alpha k_{t-1}) = \alpha^2 k_{t-1} \le \ldots \le \alpha^{t+1} k_0$. Notice that $k_t$ can be growing over time, though at a rate less than $\beta^{-1}$.
3. Let $G = \{(k_t, k_{t+1}) \in \mathbb{R}_+ \times \mathbb{R}_+ : k_{t+1} \in \Gamma(k_t)\}$. Show that $r : G \to \mathbb{R}_+$ is continuous and homogeneous of degree one. Show that $\tilde A k_t - k_{t+1} \ge 0$, $\forall t$ and $\exists B \in (0,\infty)$ such that
$$\tilde A k_t - k_{t+1} \le B(k_t + k_{t+1}), \ \forall (k_t, k_{t+1}) \in G. \tag{6.27}$$
Given that $(k_t, k_{t+1}) \in \mathbb{R}_+ \times \mathbb{R}_+$, (6.27) assures a uniform bound on the ratio of the return function $u$ and the norm of its arguments. Continuous: We must show that $\forall \varepsilon > 0$, $\exists \delta(k_t, k_{t+1}, \varepsilon) > 0$ such that if
$$\sqrt{(k_t' - k_t)^2 + (k_{t+1}' - k_{t+1})^2} < \delta, \tag{6.28}$$
then
$$\left| \left(\tilde A k_t' - k_{t+1}'\right) - \left(\tilde A k_t - k_{t+1}\right) \right| < \varepsilon. \tag{6.29}$$
But
$$\left| \left(\tilde A k_t' - k_{t+1}'\right) - \left(\tilde A k_t - k_{t+1}\right) \right| \le \tilde A \left| k_t' - k_t \right| + \left| k_{t+1}' - k_{t+1} \right|$$
by the triangle inequality. If (6.28) is satisfied, then
$$\tilde A \left| k_t' - k_t \right| \le \tilde A \sqrt{(k_t' - k_t)^2 + (k_{t+1}' - k_{t+1})^2} < \tilde A \delta$$
and
$$\left| k_{t+1}' - k_{t+1} \right| \le \sqrt{(k_t' - k_t)^2 + (k_{t+1}' - k_{t+1})^2} < \delta.$$
Hence let $\delta = \min\left\{\frac{\varepsilon}{2}, \frac{\varepsilon}{2\tilde A}\right\}$, so that both terms are below $\varepsilon/2$. Homogeneity: $r(\lambda k_t, \lambda k_{t+1}) = \tilde A \lambda k_t - \lambda k_{t+1} = \lambda[\tilde A k_t - k_{t+1}] = \lambda r(k_t, k_{t+1})$. Nonnegative returns: Since $k_{t+1} \in [0, \tilde A k_t]$, we know $\tilde A k_t - k_{t+1} \ge 0$, $\forall t$. Boundedness: Inequality (6.27) is established since
$$\tilde A k_t - k_{t+1} \le \tilde A k_t + k_{t+1} \le \max\{\tilde A, 1\}(k_t + k_{t+1}), \ \forall (k_t, k_{t+1}) \in G$$
where $B = \max\{\tilde A, 1\}$.
4. Show that the conditions you proved in the previous parts imply that for any $k_0$ and $\forall \langle k_{t+1} \rangle \in F(k_0)$,
$$\lim_{n\to\infty} \sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}]$$
exists. This, along with the prior conditions you have proven, establishes that a solution to (SP) satisfies the functional equation
$$v(k_t) = \sup_{k_{t+1} \in \Gamma(k_t)} [\tilde A k_t - k_{t+1}] + \beta v(k_{t+1}). \tag{FE}$$
We start by noting
$$\sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}] \le \sum_{t=0}^{n} \beta^t B[k_t + k_{t+1}]$$
$$\le B \sum_{t=0}^{n} \beta^t [\alpha^t k_0 + \alpha^{t+1} k_0]$$
$$= B k_0 (1+\alpha) \sum_{t=0}^{n} (\alpha\beta)^t$$
$$\le B k_0 (1+\alpha)/(1-\alpha\beta)$$
where the first inequality follows from part 3, the second from part 2, and the third since $\alpha\beta < 1$ from part 1. Since $\sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}]$ is increasing in $n$ (each term is nonnegative by part 3) and bounded, the limit exists.
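The bounds in parts 2 and 4 can be checked for a particular feasible plan. In the sketch below the parameter values, the constant-saving-rate plan, and the horizon are hypothetical choices made only to have something concrete to compute; any plan with $0 \le k_{t+1} \le \tilde A k_t$ would do.

```python
A_TILDE = 1.05            # hypothetical value of A + (1 - delta)
BETA = 0.9                # satisfies 1 / BETA > A_TILDE
ALPHA = A_TILDE           # the alpha chosen in part 1
B = max(A_TILDE, 1.0)     # the constant B from part 3

def feasible_plan(k0, saving_rate, horizon):
    """Plan k_{t+1} = saving_rate * A_TILDE * k_t, feasible when 0 <= saving_rate <= 1."""
    plan = [k0]
    for _ in range(horizon):
        plan.append(saving_rate * A_TILDE * plan[-1])
    return plan

def partial_sums(plan):
    """Discounted partial sums of the returns A_TILDE * k_t - k_{t+1}."""
    total, sums = 0.0, []
    for t in range(len(plan) - 1):
        total += BETA ** t * (A_TILDE * plan[t] - plan[t + 1])
        sums.append(total)
    return sums

k0 = 2.0
plan = feasible_plan(k0, saving_rate=0.95, horizon=300)
sums = partial_sums(plan)
upper = B * k0 * (1 + ALPHA) / (1 - ALPHA * BETA)   # bound from part 4

growth_bound_ok = all(plan[t] <= ALPHA ** t * k0 + 1e-9 for t in range(len(plan)))
increasing = all(s1 <= s2 for s1, s2 in zip(sums, sums[1:]))
bounded = sums[-1] <= upper
```

The partial sums settle well below `upper`, which is expected since the bound in part 4 is deliberately crude.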
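5. Show that $v^*$ defined in (SP) is homogeneous of degree one (i.e. $v^*(\theta k_0) = \theta v^*(k_0)$) and that for some $\eta \in (0,\infty)$, $|v^*(k_0)| \le \eta k_0$, for any $k_0$. To see this, as in part 2, consider $\langle k_{t+1} \rangle \in F(k_0)$ and let $u(\langle k_{t+1} \rangle) = \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}]$. Then $v^*(k_0) = \sup_{\langle k_{t+1} \rangle \in F(k_0)} u(\langle k_{t+1} \rangle)$. For $\theta > 0$, $\theta k_0 \in \mathbb{R}_+$ since $\mathbb{R}_+$ is a convex cone. Furthermore, $k_1 \in \Gamma(k_0) \Rightarrow \theta k_1 \in \Gamma(\theta k_0)$ as established in part 1. Continuing in this fashion we can show that $\forall \langle k_{t+1} \rangle \in F(k_0)$, $\langle \theta k_{t+1} \rangle \in F(\theta k_0)$.
Homogeneity: For $\theta > 0$,
$$v^*(\theta k_0) = \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} u(\langle \theta k_{t+1} \rangle)$$
$$= \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} \left\{ \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t [\tilde A \theta k_t - \theta k_{t+1}] \right\}$$
$$= \sup_{\langle \theta k_{t+1} \rangle \in F(\theta k_0)} \left\{ \lim_{n\to\infty} \theta \sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}] \right\}$$
$$= \theta v^*(k_0).$$
Boundedness: Let $D = B(1+\alpha)/(1-\alpha\beta) > 0$. Then $\forall \langle k_{t+1} \rangle \in F(k_0)$, it was shown in part 4 that
$$|u(\langle k_{t+1} \rangle)| = \left| \lim_{n\to\infty} \sum_{t=0}^{n} \beta^t [\tilde A k_t - k_{t+1}] \right| \le D k_0.$$
Thus,
$$|v^*(k_0)| = \left| \sup_{\langle k_{t+1} \rangle \in F(k_0)} u(\langle k_{t+1} \rangle) \right| \le \sup_{\langle k_{t+1} \rangle \in F(k_0)} |u(\langle k_{t+1} \rangle)| \le D k_0.$$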
6. As a converse to the above results, next consider seeking solutions to (FE) in the space of functions (denoted $H(\mathbb{R}_+)$) that are continuous, homogeneous of degree 1, and bounded in the sense that if $f \in H(\mathbb{R}_+)$, then $\frac{|f(k_t)|}{k_t} < \infty$. This notion of boundedness is consistent with Definition 522. Endow the space with the operator norm, which by Theorem 518 can be of the form
$$\|f\| = \sup\left\{ \frac{|f(k_t)|}{k_t}, \ k_t \in \mathbb{R}_+, \ k_t \ne 0 \right\}. \tag{6.30}$$
We must verify that $\|\cdot\|$ on the vector space $H(\mathbb{R}_+)$ satisfies the properties of a norm and that $(H(\mathbb{R}_+), \|\cdot\|)$ is complete. Normed Vector Space: We must show: (a) $\|f\| \ge 0$ with equality iff $f = 0$; (b) $\|af\| = |a| \cdot \|f\|$; and (c) $\|f + f'\| \le \|f\| + \|f'\|$. Complete: Consider any Cauchy sequence $\langle f_n \rangle$ in $H(\mathbb{R}_+)$. For all $x \in \mathbb{R}_+$, $\langle f_n(x) \rangle$ is Cauchy in $\mathbb{R}$ and thus has a limit. Define $f : \mathbb{R}_+ \to \mathbb{R}$ by $f(x) = \lim_{n\to\infty} f_n(x)$, $\forall x \in \mathbb{R}_+$. We must show that $f \in H(\mathbb{R}_+)$ by verifying that (i) $f$ is homogeneous of degree 1; (ii) $f$ is bounded in the sense that $\frac{|f(x)|}{x} < \infty$; and (iii) $f$ is continuous. Starting with (i), for $\lambda > 0$ and $x \in \mathbb{R}_+$, since $f_n \in H(\mathbb{R}_+)$ $\forall n$ we have $f(\lambda x) = \lim_{n\to\infty} f_n(\lambda x) = \lim_{n\to\infty} \lambda f_n(x) = \lambda f(x)$. Next, for (ii) we know that $f$ is bounded since $\langle \|f_n\| \rangle$ is a convergent sequence in $\mathbb{R}$ and thus bounded (i.e. $\|f\| \le \|f_N\| + 1$ for some $N \in \mathbb{N}$). Finally, for (iii) note that $\langle f_n \rangle \to f$ uniformly by the Cauchy criterion and that $H(\mathbb{R}_+) \subset C(\mathbb{R}_+)$. But uniform convergence of continuous functions implies the limit function is continuous.
7. Show that for any $v \in H(\mathbb{R}_+)$, $\lim_{t\to\infty} \beta^t v(k_t) = 0$, which establishes the conditions necessary to prove that a solution to (FE) implies a solution to (SP). It follows directly from parts 2 and 6 that for any $f \in H(\mathbb{R}_+)$,
$$|f(k_t)| \le k_t \cdot \|f\| \le \alpha^t k_0 \|f\|$$
so that since $\alpha\beta < 1$, $\lim_{t\to\infty} \beta^t v(k_t) = 0$.
8. Define an operator $T$ on $H(\mathbb{R}_+)$ by
$$(Tf)(k) = \sup_{k' \in \Gamma(k)} \left[ \tilde A k - k' + \beta f(k') \right]. \tag{6.31}$$
Show that $T$ maps functions in $H(\mathbb{R}_+)$ to functions in $H(\mathbb{R}_+)$. Continuity: Since $r$ and $f$ are continuous by part 3 and $f \in H(\mathbb{R}_+)$, and $\Gamma$ is compact valued, we know $Tf$ is continuous by the Theorem of the Maximum. Boundedness: Since $r$ and $f$ are bounded by part 3 and $f \in H(\mathbb{R}_+)$, $Tf$ is bounded. Homogeneity: For $\theta > 0$,
$$(Tf)(\theta k) = \sup_{\theta k' \in \Gamma(\theta k)} \left[ \tilde A \theta k - \theta k' + \beta f(\theta k') \right]$$
$$= \theta \sup_{\theta k' \in \Gamma(\theta k)} \left[ \tilde A k - k' + \beta f(k') \right]$$
$$= \theta \sup_{k' \in \Gamma(k)} \left[ \tilde A k - k' + \beta f(k') \right]$$
$$= \theta (Tf)(k)$$
since $\theta k' \in \Gamma(\theta k) \Leftrightarrow k' \in \Gamma(k)$ by part 1.
9. Show that $T$ satisfies Blackwell's sufficient conditions for a contraction (and hence there exists a unique fixed point of (FE) $v = Tv$).
6.7 Appendix - Proofs for Chapter 6
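Proof of Dini's Theorem 453. Let $\langle f_n \rangle$ be decreasing, $f_n \to f$ pointwise, and define $\bar f_n = f_n - f$. Then $\langle \bar f_n \rangle$ is a decreasing sequence of non-negative functions with $\bar f_n \to 0$ pointwise. For a given $x \in X$ and $\varepsilon > 0$, $\exists N(\varepsilon, x)$ such that $0 \le \bar f_{N(\varepsilon,x)}(x) < \varepsilon$. Since $\bar f_{N(\varepsilon,x)}$ is continuous, $\exists \delta(x)$ such that $0 \le \bar f_{N(\varepsilon,x)}(x') < \varepsilon$ for all $x' \in B_{\delta(x)}(x)$. Since $\bar f_n$ is decreasing, $0 \le \bar f_n(x') < \varepsilon$ for all $n \ge N(\varepsilon, x)$ and all $x' \in B_{\delta(x)}(x)$. Since the collection $\{B_{\delta(x)}(x), x \in X\}$ is an open covering of $X$, there exists a finite subcovering of $X$ (i.e. $X = \cup_{i=1}^{k} B_{\delta(x_i)}(x_i)$). Define $N(\varepsilon) = \max_{i=1,\ldots,k} \{N(\varepsilon, x_i)\}$, which is well defined since $N(\varepsilon)$ is just the maximum of a finite set. For a given $\varepsilon$, we found $N(\varepsilon)$ such that $0 \le \bar f_n(x) < \varepsilon$ for all $n \ge N(\varepsilon)$ and for all $x \in X$ (i.e. $\bar f_n \to 0$ uniformly so that $f_n \to f$ uniformly).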
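Proof of Lemma 456. ($\Leftarrow$) This direction is apparent. ($\Rightarrow$) Let $D \subset C(X)$ be equicontinuous. Then given $x \in X$ and $\varepsilon > 0$, $\exists \delta(x, \frac{\varepsilon}{2})$ such that $|h(x') - h(x)| < \frac{\varepsilon}{2}$ for all $x'$ such that $d_X(x, x') < \delta(x, \frac{\varepsilon}{2})$ and for all $h \in D$. The collection of open balls $\{B_{\frac{1}{2}\delta(x,\frac{\varepsilon}{2})}(x), x \in X\}$ is an open covering of $X$ and since $X$ is compact there exist finitely many $x_1, \ldots, x_k$ s.t. $\{B_{\frac{1}{2}\delta(x_i,\frac{\varepsilon}{2})}(x_i), i = 1, \ldots, k\}$ covers $X$. Let $\delta \equiv \frac{1}{2} \min\{\delta(x_i, \frac{\varepsilon}{2}), i = 1, \ldots, k\}$. For $x \in X$, $\exists i$ such that $x \in B_{\frac{1}{2}\delta(x_i,\frac{\varepsilon}{2})}(x_i)$. Let $y \in X$ be such that $d_X(x, y) \le \delta$. Then $d_X(y, x_i) \le d_X(y, x) + d_X(x, x_i) \le \delta + \frac{1}{2}\delta(x_i, \frac{\varepsilon}{2}) \le \delta(x_i, \frac{\varepsilon}{2})$. Therefore for any $h \in D$, $|h(x) - h(y)| \le |h(x) - h(x_i)| + |h(x_i) - h(y)| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$ since $D$ is equicontinuous at $x_i$. Hence $D$ is uniformly equicontinuous.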
Proof of Lemma 457. ($\Leftarrow$) Suppose that $D$ is totally bounded. Let $\varepsilon > 0$ be given and choose positive numbers $\varepsilon_1$ and $\varepsilon_2$ such that $2\varepsilon_1 + \varepsilon_2 \le \varepsilon$. Total boundedness of $D$ implies that there exist finitely many functions $f_1, \ldots, f_n$ such that the collection of open balls $\{B_{\varepsilon_1}(f_i), i = 1, \ldots, n\}$ covers $D$. Fix $x_0$. Because $\{f_i, i = 1, \ldots, n\}$ is equicontinuous at $x_0$ (since a finite subset of continuous functions is equicontinuous), there exists $\delta > 0$ such that $d_Y(f_i(x), f_i(x_0)) < \varepsilon_2$ for all $x$ such that $d_X(x, x_0) < \delta$ and for all $i = 1, \ldots, n$. To prove that $D$ is equicontinuous at $x_0$ we need to show that $d_Y(f(x), f(x_0)) < \varepsilon$ for all $x$ such that $d_X(x, x_0) < \delta$ and all $f \in D$. Let $f \in D$. Because $\{B_{\varepsilon_1}(f_i), i = 1, \ldots, n\}$ covers $D$, $\exists f_i$ such that $f \in B_{\varepsilon_1}(f_i)$. By the triangle inequality
$$d_Y(f(x), f(x_0)) \le d_Y(f(x), f_i(x)) + d_Y(f_i(x), f_i(x_0)) + d_Y(f_i(x_0), f(x_0))$$
$$\le \varepsilon_1 + \varepsilon_2 + \varepsilon_1 \le \varepsilon$$
holds true for all $x$ such that $d_X(x, x_0) < \delta$ and for all $f \in D$. Notice that this direction doesn't require compactness of either $X$ or $Y$, hence total boundedness always implies equicontinuity.
($\Rightarrow$) Suppose $D$ is equicontinuous. Given $\varepsilon > 0$ we wish to cover $D$ by finitely many open $\varepsilon$-balls. Choose positive numbers $\varepsilon_1$ and $\varepsilon_2$ such that $2\varepsilon_1 + 2\varepsilon_2 < \varepsilon$. Using equicontinuity of $D$ at $x \in X$, given $\varepsilon_1$, $\exists \delta(\varepsilon_1, x)$ such that for $x' \in B_{\delta(\varepsilon_1,x)}(x)$, $d_Y(f(x'), f(x)) < \varepsilon_1$ for all $f \in D$. The collection $\{B_{\delta(\varepsilon_1,x)}(x), x \in X\}$ is an open covering of $X$. Since $X$ is compact there exist finitely many $x_1, \ldots, x_k$ such that $\{B_{\delta(\varepsilon_1,x_i)}(x_i), i = 1, \ldots, k\}$ covers $X$ and $d_Y(f(x), f(x_i)) < \varepsilon_1$ holds for $x \in B_{\delta(\varepsilon_1,x_i)}(x_i)$ and all $f \in D$. Now cover $Y$ by finitely many open balls $\{B_{\varepsilon_2}(y_j), j = 1, \ldots, m\}$ (which is possible since $Y$ is compact and thus totally bounded). Let $J$ be the set of all functions $\alpha : \{1, \ldots, k\} \to \{1, \ldots, m\}$. The set $J$ is finite. Given $\alpha \in J$, if there exists a function $f \in D$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for each $i = 1, \ldots, k$, choose one such function and label it $f_\alpha$. We claim that the finite collection of open balls $\{B_\varepsilon(f_\alpha), \alpha \in J\}$ covers $D$.
Let $f \in D$. For each $i = 1, \ldots, k$, choose an integer $\alpha(i)$ such that $f(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$, which is possible because $\{B_{\varepsilon_2}(y_j), j = 1, \ldots, m\}$ covers all of $Y$. For this index $\alpha$ the function $f_\alpha$ is defined and satisfies $f_\alpha(x_i) \in B_{\varepsilon_2}(y_{\alpha(i)})$ for $i = 1, \ldots, k$, so that $d_Y(f(x_i), f_\alpha(x_i)) < 2\varepsilon_2$. Let $x \in X$. Choose $i$ such that $x \in B_{\delta(\varepsilon_1,x_i)}(x_i)$. Then
$$d_Y(f(x), f_\alpha(x)) \le d_Y(f(x), f(x_i)) + d_Y(f(x_i), f_\alpha(x_i)) + d_Y(f_\alpha(x_i), f_\alpha(x))$$
$$\le \varepsilon_1 + 2\varepsilon_2 + \varepsilon_1 < \varepsilon.$$
Since this holds for every $x \in X$, the $\varepsilon$-ball around $f_\alpha$ contains $f$ (i.e. $f \in B_\varepsilon(f_\alpha)$).
Proof of Lemma 467. By construction. Define $P_n(x)$ on $[-1,1]$ by induction: $P_1(x) = 0$ and $P_{n+1}(x) = P_n(x) + \frac{1}{2}(x^2 - P_n^2(x))$, $\forall n \in \mathbb{N}$. To prove that $P_n(x) \to |x|$ on $[-1,1]$ uniformly, we will use Dini's Lemma 453. In that case we must check: (i) $P_n(x) \le P_{n+1}(x)$, $\forall x \in [-1,1]$; and (ii) $P_n(x)$ converges to $|x|$ pointwise on $[-1,1]$. We check that $0 \le P_n(x) \le P_{n+1}(x) \le |x|$, $\forall x \in [-1,1]$ by induction.
To show $\langle P_n \rangle$ is non-decreasing, suppose it holds for $n \ge 1$ ($n = 1$ is clear). Then $P_{n+2}(x) = P_{n+1}(x) + \frac{1}{2}\left(x^2 - P_{n+1}^2(x)\right) \ge P_{n+1}(x)$ because $0 \le P_{n+1}(x) \le |x| \Rightarrow P_{n+1}^2(x) \le |x|^2$.
To show $P_{n+2} \le |x|$, use the identity
$$P_{n+2}(x) = |x| - (|x| - P_{n+1}(x)) \left( 1 - \frac{1}{2}[|x| + P_{n+1}(x)] \right).$$
Since $|x| - P_{n+1}(x) \ge 0$ by assumption, $|x| + P_{n+1}(x) \le 2|x| \le 2$ and hence $1 - \frac{1}{2}[|x| + P_{n+1}(x)] \ge 0$.
Thus the sequence $\langle P_n(x) \rangle$ is increasing and bounded $\forall x \in [-1,1]$ and therefore it converges to a function $f(x)$. Taking the limit of $P_{n+1}(x) = P_n(x) + \frac{1}{2}(x^2 - P_n^2(x))$ yields $f = f + \frac{1}{2}(x^2 - f^2)$, which implies $f^2(x) = x^2$ or $f(x) = |x|$ (which we know is continuous). By Dini's lemma 453, $\langle P_n(x) \rangle$ converges to $|x|$ uniformly on $[-1,1]$.
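The recursion in this proof is easy to run. The sketch below evaluates $P_n$ pointwise (rather than expanding it as a polynomial) and tracks the largest gap $|x| - P_n(x)$ on a grid; the grid and the chosen indices are illustrative assumptions, and a grid maximum only approximates the true supremum.

```python
def P(n, x):
    """P_1(x) = 0 and P_{m+1}(x) = P_m(x) + (x^2 - P_m(x)^2) / 2, evaluated at index n."""
    p = 0.0
    for _ in range(n - 1):
        p = p + 0.5 * (x * x - p * p)
    return p

xs = [-1.0 + i / 100 for i in range(201)]      # grid on [-1, 1]

def sup_error(n):
    """Largest gap |x| - P_n(x) over the grid."""
    return max(abs(x) - P(n, x) for x in xs)

# The gap shrinks as n grows, consistent with uniform convergence to |x|.
errors = [sup_error(n) for n in (1, 2, 5, 20, 100, 400)]

# P_n stays below |x| (up to rounding), as the induction in the proof asserts.
below_abs = all(P(50, x) <= abs(x) + 1e-12 for x in xs)
```

Convergence is slow near $x = 0$, which is why the uniform error decays only gradually with $n$.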
Proof of Schauder?s Fixed Point Theorem 475.
Since K is compact, K is totally bounded. Hence, given any ε>0, there
exists a Tnite set{y
i
,i=1,...,n}such that the collection{B
ε
(y
i
),i=1,...,n}
coversK.We now deTne the convexhullK
ε
= {θ
1
y
1
+....... +θ
n
y
n
:
P
θ
i
=1, all θ
i
≥ 0}.
This is a subset of K since K is convex and K contains all the points y
i
. We
will now map all of K into K
ε
by a continuous function P
ε
(y)thatapproxi-
mates y (i.e. kP
ε
(y)?yk <ε,?ydK). To construct this function P
ε
(y), we
must construct n continuous functions θ
i
= θ
i
(y) ≥ 0, with
P
n
i=1
θ
i
=1.
First, for i =1,.....,n, we deTne
?
i
(y)=
?
0if|y
i
?y|≥ ε
ε?|y
i
?y| if |y
i
?y| <ε
,i=1,...,n.
6.7. APPENDIX - PROOFS FOR CHAPTER 6 287
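Each of these $n$ functions $\varphi_i(y)$ is continuous, and the fact that the balls $B_\varepsilon(y_i)$ cover $K$ (i.e. $\{y_1, \dots, y_n\}$ is an $\varepsilon$-net for $K$) guarantees that, for each $y \in K$, $\varphi_i(y) > 0$ for some $i = 1, \dots, n$. Now construct continuous functions
$$\theta_i(y) = \frac{\varphi_i(y)}{\sum_{i=1}^n \varphi_i(y)}, \quad i = 1, \dots, n, \; y \in K.$$
These functions are well-defined since $\sum_{i=1}^n \varphi_i(y) > 0$. The functions $\theta_i(y)$ satisfy $\theta_i \ge 0$, $\sum \theta_i = 1$. Finally we construct the continuous function
$$P_\varepsilon(y) = \theta_1(y) y_1 + \dots + \theta_n(y) y_n.$$
This function maps $K$ into $K_\varepsilon$. From the construction of $\varphi_i$, $\theta_i(y) = 0$ unless $\|y_i - y\| < \varepsilon$. Therefore $P_\varepsilon(y)$ is a convex combination of just those points $y_i$ for which $\|y_i - y\| < \varepsilon$. Hence
$$\|P_\varepsilon(y) - y\| = \Big\|\sum \theta_i(y) y_i - y\Big\| = \Big\|\sum \theta_i(y)(y_i - y)\Big\| \le \sum \theta_i(y) \|y_i - y\| < \varepsilon.$$
This establishes that $P_\varepsilon(y)$ approximates $y$. Now we map the convex set $K_\varepsilon$ continuously into itself by the function $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ where $f_\varepsilon(x) \equiv P_\varepsilon(f(x))$ for all $x \in K_\varepsilon$. Since $K_\varepsilon$ is a convex, compact subset of the finite-dimensional vector subspace spanned by the $n$ points $y_1, \dots, y_n$ and $f_\varepsilon : K_\varepsilon \to K_\varepsilon$ is continuous, there exists a fixed point $x_\varepsilon = f_\varepsilon(x_\varepsilon)$ in $K_\varepsilon$ due to Brouwer's fixed point theorem 302. Now we take the limit as $\varepsilon \to 0$. Set $y_\varepsilon = f(x_\varepsilon)$. Since $K$ is compact, we may let $\varepsilon \to 0$ through some sequence $\varepsilon_1, \varepsilon_2, \dots$ for which $\langle y_\varepsilon \rangle$ converges to a limit in $K$:
$$f(x_\varepsilon) = y_\varepsilon \to y \text{ as } \varepsilon_k \to 0. \qquad (6.32)$$
We now write
$$x_\varepsilon = f_\varepsilon(x_\varepsilon) \equiv P_\varepsilon(f(x_\varepsilon)) = P_\varepsilon(y_\varepsilon)$$
$$x_\varepsilon = y_\varepsilon + [P_\varepsilon(y_\varepsilon) - y_\varepsilon]$$
Then
$$\|x_\varepsilon - y\| = \|y_\varepsilon + P_\varepsilon(y_\varepsilon) - y_\varepsilon - y\| = \|P_\varepsilon(y_\varepsilon) - y\| \le \|P_\varepsilon(y_\varepsilon) - y_\varepsilon\| + \|y_\varepsilon - y\|.$$
The first term vanishes since $P_\varepsilon(y)$ approximates $y$ and the second term vanishes since $y_\varepsilon$ converges to $y$ as $\varepsilon_k \to 0$. Hence $x_\varepsilon \to y$ as $\varepsilon = \varepsilon_k \to 0$.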
Now since $f$ is continuous, $f(x_\varepsilon) \to f(y)$. Combining this with (6.32) we have $f(y) = y$ for some $y \in K$. Hence $y$ is a fixed point of $f$.
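The approximating map $P_\varepsilon$ in the proof above can be sketched numerically. The following is only an illustration under stated assumptions: the set $K$ is taken to be the unit square in $\mathbb{R}^2$, the centers $y_i$ form a hypothetical grid with spacing 0.25, and the function name `schauder_projection` is ours, not the text's.

```python
import math

def schauder_projection(y, centers, eps):
    """Map y to a convex combination of the centers lying within eps of y."""
    # phi_i(y) = max(0, eps - ||y_i - y||): the weights defined in the proof.
    weights = [max(0.0, eps - math.dist(c, y)) for c in centers]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("y is not within eps of any center (not an eps-net)")
    # theta_i(y) = phi_i(y) / sum_i phi_i(y), so theta_i >= 0 and sum theta_i = 1.
    thetas = [w / total for w in weights]
    dim = len(y)
    return tuple(sum(t * c[k] for t, c in zip(thetas, centers)) for k in range(dim))

# Hypothetical example: K = [0,1]^2 with a grid of centers (an assumption).
eps = 0.25
centers = [(i * 0.25, j * 0.25) for i in range(5) for j in range(5)]
y = (0.37, 0.81)
p_y = schauder_projection(y, centers, eps)
print(p_y, math.dist(p_y, y) < eps)
```

Because only centers closer than `eps` to `y` receive positive weight, the image stays within `eps` of `y`, which is the estimate $\|P_\varepsilon(y) - y\| < \varepsilon$ used in the proof.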
Proof of Riesz-Fischer Theorem 481. The case for $p = \infty$ is in the text.
Second, let $p \in [1, \infty)$. Let $\langle f_n \rangle$ be a Cauchy sequence in $L_p$. In order to find a function to which the sequence converges in light of Example 444, we need to take a more sophisticated approach than for $p = \infty$. Since $\langle f_n \rangle$ is Cauchy we can recursively construct a strictly increasing sequence $\langle n_j \rangle$ in $\mathbb{N}$ such that $\|f_k - f_n\|_p < 2^{-j}$, $\forall k, n \ge n_j$ and $\forall j \in \mathbb{N}$. Then
$$\Big\| \sum_{j=1}^J |f_{n_{j+1}} - f_{n_j}| \Big\|_p \le \sum_{j=1}^J \|f_{n_{j+1}} - f_{n_j}\|_p < \sum_{j=1}^J 2^{-j} < 1$$
for each $J \in \mathbb{N}$ by the Minkowski inequality in Theorem 480. Therefore,
$$\begin{aligned}
\int_X \Big( \sum_{j=1}^\infty |f_{n_{j+1}} - f_{n_j}| \Big)^p
&= \int_X \Big( \lim_{J \to \infty} \sum_{j=1}^J |f_{n_{j+1}} - f_{n_j}| \Big)^p
= \int_X \lim_{J \to \infty} \Big( \sum_{j=1}^J |f_{n_{j+1}} - f_{n_j}| \Big)^p \\
&= \lim_{J \to \infty} \int_X \Big( \sum_{j=1}^J |f_{n_{j+1}} - f_{n_j}| \Big)^p
= \lim_{J \to \infty} \Big( \Big\| \sum_{j=1}^J |f_{n_{j+1}} - f_{n_j}| \Big\|_p \Big)^p \le 1
\end{aligned}$$
where the third equality follows from the Monotone Convergence Theorem 396. This implies that the sum $\sum_{j=1}^\infty |f_{n_{j+1}} - f_{n_j}|$ is finite a.e. This means there exists a set $A$ such that $mA = 0$ and $\sum_{j=1}^\infty |f_{n_{j+1}} - f_{n_j}|$ converges on $X \setminus A$. Since $f_{n_{j+1}} - f_{n_j} \le |f_{n_{j+1}} - f_{n_j}|$, the series $\sum_{j=1}^\infty (f_{n_{j+1}} - f_{n_j})$ converges (absolutely) on $X \setminus A$, so that the subsequence $f_{n_{J+1}} = f_{n_1} + \sum_{j=1}^J (f_{n_{j+1}} - f_{n_j})$ converges on $X \setminus A$. Let $f(x)$, $x \in X \setminus A$, be its limit. Then by Theorem 364 (the pointwise limit of measurable functions is measurable), $f$ is measurable.
Finally, we need to show that $f$ is also $p$-integrable. To do so, suppose that $\varepsilon > 0$ is given and let $N_0$ be an integer such that $\|f_k - f_n\|_p < \varepsilon$ for $k, n \ge N_0$. Then
$$\|f_n - f\|_p = \Big( \int_X |f_n - f|^p \Big)^{1/p} = \Big( \int_X \lim_{j \to \infty} |f_n - f_{n_j}|^p \Big)^{1/p} \le \Big( \liminf_{j \to \infty} \int_X |f_n - f_{n_j}|^p \Big)^{1/p} \le \varepsilon, \quad \forall n \ge N_0$$
where the second equality follows since $f_{n_j} \to f$ a.e., the first inequality follows by Fatou's Lemma 393, and the last inequality holds since $\langle f_n \rangle$ is Cauchy. Since $\|f\|_p = \|f - f_n + f_n\|_p \le \|f - f_n\|_p + \|f_n\|_p \le \varepsilon + \|f_n\|_p < \infty$, then $f \in L_p$ and $f_n \to f$ in $L_p$.
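Proof of Theorem 520. Let $\langle T_n \rangle$ be a Cauchy sequence in $BL(X, Y)$. For a fixed $x \in X$ we have
$$\|T_n x - T_m x\|_Y \le \|T_n - T_m\| \, \|x\|_X$$
so that $\langle T_n(x) \rangle$ is a Cauchy sequence in $Y$. Since $Y$ is complete, $\langle T_n(x) \rangle$ converges to an element $y \in Y$. Call this element $Tx$ ($Tx = \lim_{n \to \infty} T_n(x)$, $\forall x \in X$). Thus we can define $T : X \to Y$ by $Tx = \lim_{n \to \infty} T_n(x)$. We must show that $T$ is bounded and that $\langle T_n \rangle \to T$ as $n \to \infty$.
Since $\langle T_n \rangle$ is Cauchy, then given $\varepsilon > 0$ $\exists N$ such that $\forall m, n \ge N$ we have $\|T_n - T_m\| < \varepsilon$. Hence $\|T_n - T_N\| < \varepsilon$, $\forall n \ge N$, or $\|T_n\| < \|T_N\| + \varepsilon$, $\forall n \ge N$. Thus, $\|Tx\|_Y = \lim_{n \to \infty} \|T_n x\|_Y \le \lim_{n \to \infty} (\|T_n\| \, \|x\|_X) \le (\|T_N\| + \varepsilon) \|x\|_X$. Thus $T$ is bounded.
For each $x \in X$ we have $\|T_n x - Tx\|_Y = \lim_{m \to \infty} \|T_n x - T_m x\|_Y \le \lim_{m \to \infty} \|T_n - T_m\| \, \|x\|_X \le \varepsilon \|x\|_X$, $\forall n \ge N$, where the inequality follows from Corollary 519. Thus $\|T_n - T\| = \sup\{\|(T_n - T)x\|_Y, \|x\|_X = 1\} \le \varepsilon \cdot 1 = \varepsilon$. Thus $T_n \to T$ in $BL(X, Y)$.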
Proof of Theorem 529. Let $G : X \to \mathbb{R}$ be any bounded linear functional on $X = \mathbb{R}^n$ (i.e. $G \in X^*$). Let $\{e_1, \dots, e_n\}$ be the natural basis in $\mathbb{R}^n$ $^{12}$ and define $b_i = G(e_i)$ for $i = 1, \dots, n$. For $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ we have
$$\begin{aligned}
G(x) &= G(x_1 e_1 + \dots + x_n e_n) \\
&= x_1 G(e_1) + \dots + x_n G(e_n) \\
&= x_1 b_1 + \dots + x_n b_n = \langle x, b \rangle.
\end{aligned}$$
$^{12}$ Recall that the natural (or canonical) basis in $\mathbb{R}^n$ is defined to be the set of vectors $\{e_1, \dots, e_n\}$ where $e_i = (0, \dots, 1, \dots, 0)$ with '1' in the $i$th place.
Clearly the functional $G$ is represented by the point $b = (b_1, \dots, b_n) \in \mathbb{R}^n$. Next we show that $\|G\| = \|b\|_X$. First,
$$|G(x)| \le \sum_{i=1}^n |x_i G(e_i)| \le \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2} \Big( \sum_{i=1}^n G(e_i)^2 \Big)^{1/2} = \|x\|_X \|b\|_X$$
where the first inequality follows from the triangle inequality and the second inequality follows from the Cauchy-Schwarz inequality (Theorem 210). Hence $\|G\| \le \|b\|_X$ by (iv) of Theorem 518. Second, choose $x_0 = (b_1, \dots, b_n)$. Then we have
$$\|G\| \ge \frac{|G(x_0)|}{\|x_0\|_X} = \frac{|G(b)|}{\|b\|_X} = \frac{\langle b, b \rangle}{\|b\|_X} = \frac{\|b\|_X^2}{\|b\|_X} = \|b\|_X$$
where the inequality follows from Corollary 519. Combining these two inequalities we have $\|G\| = \|b\|_X$.
It is easy to show that each functional $G \in X^*$ is uniquely represented by the point $b \in \mathbb{R}^n$. To see this, it is sufficient to prove that an operator $T : X^* \to X$ defined by $T(G) = (G(e_1), \dots, G(e_n)) = b$ is a bounded, linear bijection such that $G(x) = \langle b, x \rangle$. To see that $T$ is bounded (and hence continuous by Theorem 511) note
$$\|T\| = \sup \Big\{ \frac{\|TG\|_X}{\|G\|_{X^*}}, \|G\|_{X^*} \ne 0 \Big\} = \sup \Big\{ \frac{\|b\|_X}{\|b\|_X}, \|b\|_X \ne 0 \Big\} = 1.$$
To see that $T$ is linear, let $G$ and $G'$ be represented by $b$ and $b'$. Since $\langle (\alpha b + \beta b'), x \rangle = \alpha \langle b, x \rangle + \beta \langle b', x \rangle$, the functional $\alpha G + \beta G'$ is represented by $\alpha b + \beta b'$, so we know $T(\alpha G + \beta G') = \alpha T(G) + \beta T(G')$. To see that $T$ is a bijection, first we establish it is an injection (i.e. one-to-one). Let $G_1 \ne G_2$. Then $\exists x \in X$ such that $G_1(x) \ne G_2(x)$. Since $x = x_1 e_1 + \dots + x_n e_n$ uniquely, we have
$$\begin{aligned}
G_1(x) &= G_1(x_1 e_1 + \dots + x_n e_n) \\
&= x_1 G_1(e_1) + \dots + x_n G_1(e_n) \\
&= x_1 b_1^1 + \dots + x_n b_n^1
\end{aligned}$$
and similarly
$$G_2(x) = x_1 b_1^2 + \dots + x_n b_n^2.$$
Then $G_1(x) \ne G_2(x) \Rightarrow b_1^1 \ne b_1^2$ or ... $b_n^1 \ne b_n^2$ so that $TG_1 \ne TG_2$. To see $T$ is a surjection (onto), we must show that for any $d \in X$, $\exists G \in X^*$ such that $T(G) = d$. But $G(x) = \langle d, x \rangle \in X^*$ and $TG = d$. $^{13}$
$^{13}$ For example, if $X = \mathbb{R}^2$ and we take $d = (3, 4) \in X$, then $G_{(3,4)}(x) = 3x_1 + 4x_2 \in X^*$.
Proof of Theorem 530. Let $\{e_i, i \in \mathbb{N}\}$ be a complete orthonormal system in $H$. Set $b_i = F(e_i)$, $\forall i \in \mathbb{N}$. Then we have
$$\sum_{i=1}^n b_i^2 = F\Big( \sum_{i=1}^n b_i e_i \Big) \le \|F\| \, \Big\| \sum_{i=1}^n b_i e_i \Big\| = \|F\| \Big( \sum_{i=1}^n b_i^2 \Big)^{1/2}.$$
Dividing by $\big( \sum_{i=1}^n b_i^2 \big)^{1/2}$ and taking the square of both sides, we have $\sum_{i=1}^n b_i^2 \le \|F\|^2$ for arbitrary $n$. Then $\sum_{i=1}^\infty b_i^2 \le \|F\|^2 < \infty$, which means that the series $\sum_{i=1}^\infty b_i^2$ is convergent. Then there exists an element $b \in H$ whose Fourier coefficients are $b_i$, $i \in \mathbb{N}$. Since $\{e_i, i \in \mathbb{N}\}$ is a complete orthonormal system (by Parseval's equality) we have $b = \sum_{i=1}^\infty b_i e_i$ and also $\|b\| \le \|F\|$. Let $x$ be any element of $H$ and let $\{x_i; i \in \mathbb{N}\}$ be its Fourier coefficients. Then $\sum_{i=1}^n x_i e_i \to x$ by Parseval's Theorem 504. Since $F$ is linear and continuous,
$$F(x) = \lim_{n \to \infty} F\Big( \sum_{i=1}^n x_i e_i \Big) = \lim_{n \to \infty} \sum_{i=1}^n x_i F(e_i) = \lim_{n \to \infty} \sum_{i=1}^n x_i b_i = \sum_{i=1}^\infty x_i b_i = \langle x, b \rangle.$$
By the Cauchy-Schwarz inequality $|F(x)| \le \|x\| \, \|b\|$, $\forall x \in H$, so that $\|F\| \le \|b\|$. Combining the two inequalities, we have $\|F\| = \|b\|$.
Proof of Theorem 531. Suppose $x = (x_1, \dots, x_n, \dots) \in \ell_p$ and $F \in \ell_p^*$. Set $\{e_i, i \in \mathbb{N}\}$ where $e_i$ is the vector having the $i$-th entry equal to one and all other entries equal to zero.
Let $s_n = \sum_{i=1}^n x_i e_i$. Then $s_n \in \ell_p$ and $\|x - s_n\|_p^p = \sum_{i=n+1}^\infty |x_i|^p \to 0$ as $n \to \infty$. Thus $F(s_n) = F(\sum_{i=1}^n x_i e_i) = \sum_{i=1}^n x_i F(e_i)$ and $|F(x) - F(s_n)| = |F(x - s_n)| \le \|F\| \, \|x - s_n\|_p \to 0$ as $n \to \infty$. Hence $F(x) = \lim_{n \to \infty} F(s_n) = \sum_{i=1}^\infty x_i F(e_i)$. Set $z_i = F(e_i)$, $i \in \mathbb{N}$, and $z = \langle z_1, \dots, z_n, \dots \rangle$. We must show that $z \in \ell_q$. For this purpose choose a particular $x = \langle x_i \rangle$:
$$x_i = \begin{cases} |z_i|^{q-2} z_i & \text{when } z_i \ne 0 \\ 0 & \text{when } z_i = 0. \end{cases}$$
For this case $\|s_n\|_p^p = \sum_{i=1}^n |x_i|^p = \sum_{i=1}^n |z_i|^{p(q-1)} = \sum_{i=1}^n |z_i|^q$, since $p(q-1) = q$. Moreover $F(s_n) = \sum_{i=1}^n x_i z_i = \sum_{i=1}^n |z_i|^q$ and $|F(s_n)| \le \|F\| \, \|s_n\|_p \le \|F\| \big( \sum_{i=1}^n |z_i|^q \big)^{1/p}$. Hence
$$\sum_{i=1}^n |z_i|^q \le \|F\| \Big( \sum_{i=1}^n |z_i|^q \Big)^{1/p} \Rightarrow \Big( \sum_{i=1}^n |z_i|^q \Big)^{1/q} \le \|F\|$$
holds true for arbitrary $n$ (using $1 - \frac{1}{p} = \frac{1}{q}$). Thus $z = \langle z_i \rangle \in \ell_q$ and $\|z\|_q \le \|F\|$. On the other hand by Hölder's inequality we have $|F(x)| = |\sum_{i=1}^\infty x_i z_i| \le \|x\|_p \|z\|_q$ and thus $\|F\| \le \|z\|_q$. This shows that $\|F\| = \|z\|_q$.
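The choice $x_i = |z_i|^{q-2} z_i$ in the proof of Theorem 531 is exactly the sequence for which Hölder's inequality holds with equality. A minimal numerical sketch for a finitely supported $z$ follows; the values of `p` and `z` and the helper names `lp_norm` and `extremal_x` are illustrative assumptions, not part of the text.

```python
def lp_norm(v, p):
    """The l_p norm of a finite sequence v."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def extremal_x(z, q):
    """x_i = |z_i|^(q-2) * z_i, with x_i = 0 wherever z_i = 0."""
    return [abs(t) ** (q - 2) * t if t != 0 else 0.0 for t in z]

p = 3.0
q = p / (p - 1.0)                  # conjugate exponent, 1/p + 1/q = 1
z = [1.0, -2.0, 0.0, 0.5]          # hypothetical values z_i = F(e_i)
x = extremal_x(z, q)

pairing = sum(a * b for a, b in zip(x, z))    # F(x) = sum_i x_i z_i
bound = lp_norm(x, p) * lp_norm(z, q)         # Hoelder bound ||x||_p ||z||_q
sum_zq = sum(abs(t) ** q for t in z)          # sum_i |z_i|^q

print(pairing, bound, sum_zq)   # the three numbers agree up to rounding
```

All three quantities coincide because $p(q-1) = q$, which is the identity the proof uses to compute $\|s_n\|_p^p$.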
Proof of Riesz Representation Theorem 532. Let us first consider $m(X) < \infty$. In that case $\chi_E \in L_p(X)$ for any $E \subset X$ which is $\mathcal{L}$-measurable (i.e. $E \in \mathcal{L}$). Then define a set function $\nu : \mathcal{L} \to \mathbb{R}$ by $\nu(E) = F(\chi_E)$ for $E \in \mathcal{L}$. $\nu$ is a finite signed measure which is absolutely continuous with respect to $m$. (Show it???). Then by the Radon-Nikodym Theorem 434 there is an integrable function $g$ such that $\nu(E) = \int_E g \, dm$ for all $E \in \mathcal{L}$. Thus we take $F(\chi_E) = \int_E g \, dm$ for all $E \in \mathcal{L}$. If $\varphi$ is a simple function (i.e. it is a finite linear combination of characteristic functions), then by linearity of $F$ we have
$$\begin{aligned}
F(\varphi) &= F\Big( \sum_{i=1}^n c_i \chi_{E_i} \Big) = \sum_{i=1}^n c_i F(\chi_{E_i}) = \sum_{i=1}^n c_i \int_{E_i} g \, dm \\
&= \sum_{i=1}^n c_i \int_X \chi_{E_i} g \, dm = \int_X \Big( \sum_{i=1}^n c_i \chi_{E_i} \Big) g \, dm = \int_X g \varphi \, dm.
\end{aligned}$$
Since $|F(\varphi)| \le \|F\| \, \|\varphi\|_p$ we have that $g \in L_q(X)$. (Show it???) Hence there is a function $g \in L_q(X)$ such that $F(\varphi) = \int g \varphi \, dm$ for all simple functions $\varphi$. Since the subset of all simple functions is dense in $L_p(X)$, then $F(f) = \int_X g f \, dm$ for all $f \in L_p(X)$. (Show it???) Also show that $\|F\| = \|g\|_q$. The function $g$ is determined uniquely, for if $g_1$ and $g_2$ determine the same functional $F$, then $g_1 - g_2$ must determine the zero functional; hence $\|g_1 - g_2\|_q = 0$, which implies $g_1 = g_2$ a.e.
Let $m(X) = \infty$. Since $m$ is $\sigma$-finite, there is an increasing sequence of $\mathcal{L}$-measurable sets $\langle X_n \rangle$ with finite measure whose union is $X$. By the first part of the proof, for each $n$ there is a function $g_n \in L_q$ such that $g_n$ vanishes outside $X_n$ and $F(f) = \int f g_n \, dm$ for all $f \in L_p$ that vanish outside $X_n$. Moreover $\|g_n\|_q \le \|F\|$. Since any function $g_n$ is unique on $X_n$ (except on sets of measure zero), $g_{n+1} = g_n$ on $X_n$. Set $g(x) = g_n(x)$ for $x \in X_n$. Then $g$ is a well defined $\mathcal{L}$-measurable function and $|g_n|$ increases pointwise to $|g|$. Thus by the Monotone Convergence Theorem 396
$$\int |g|^q \, dm = \lim_{n \to \infty} \int |g_n|^q \, dm \le \|F\|^q$$
so $g \in L_q$. For $f \in L_p$ define
$$f_n = \begin{cases} f(x) & \text{for } x \in X_n \\ 0 & \text{for } x \in X \setminus X_n. \end{cases}$$
Then $f_n \to f$ pointwise and in $L_p$. By the Hölder Inequality 479 $|fg|$ is integrable and $|f_n g| \le |fg|$ so that by the Lebesgue Dominated Convergence Theorem 404
$$\int f g \, dm = \lim_{n \to \infty} \int f_n g \, dm = \lim_{n \to \infty} \int f_n g_n \, dm = \lim_{n \to \infty} F(f_n) = F(f).$$
Proof of Hahn-Banach Theorem 539. If $M = X$, then there is nothing to prove. Thus assume $M \subset X$. Then $\exists x_1 \in X$ which is not in $M$. Let
$$M_1 = \{w \in X : w = \alpha x_1 + x, \alpha \in \mathbb{R}, x \in M\} \qquad (6.33)$$
One can prove (see Exercise 6.7.1 below) that $M_1$ defined in (6.33) is a subspace of $X$ and that the representation in (6.33) is unique.
Next, extend $f$ to $M_1$ and call this extension $F$. In order for $F : M_1 \to \mathbb{R}$ to be a linear functional, it must satisfy
$$F(\alpha x_1 + x) = \alpha F(x_1) + F(x) = \alpha F(x_1) + f(x) \qquad (6.34)$$
where the second equality follows since $x \in M$. Hence $F$ is completely determined by the choice of $F(x_1)$. Moreover, we must have $\alpha F(x_1) + f(x) \le p(\alpha x_1 + x)$ for all scalars $\alpha$ and $x \in M$. If $\alpha > 0$, this means
$$F(x_1) \le \frac{1}{\alpha} [p(\alpha x_1 + x) - f(x)] = p\Big( x_1 + \frac{x}{\alpha} \Big) - f\Big( \frac{x}{\alpha} \Big) = p(x_1 + z) - f(z)$$
where $z = \frac{x}{\alpha}$. If $\alpha < 0$, we have
$$F(x_1) \ge \frac{1}{\alpha} [p(\alpha x_1 + x) - f(x)] = f(y) - p(-x_1 + y)$$
where $y = -\frac{x}{\alpha}$. Combining these two inequalities we have
$$f(y) - p(y - x_1) \le F(x_1) \le p(x_1 + z) - f(z), \quad \forall y, z \in M \qquad (6.35)$$
Conversely, if we can pick $F(x_1)$ to satisfy (6.35) then $F$, defined by (6.34), will satisfy (6.10) on $M_1$. For if $F(x_1)$ satisfies (6.35) then for $\alpha > 0$ we have
$$\alpha F(x_1) + f(x) = \alpha \Big[ F(x_1) + f\Big( \frac{x}{\alpha} \Big) \Big] \le \alpha p\Big( x_1 + \frac{x}{\alpha} \Big) = p(\alpha x_1 + x)$$
while for $\alpha < 0$ we have
$$\alpha F(x_1) + f(x) = -\alpha \Big[ -F(x_1) + f\Big( -\frac{x}{\alpha} \Big) \Big] \le -\alpha p\Big( -\frac{x}{\alpha} - x_1 \Big) = p(\alpha x_1 + x).$$
So we have now reduced the problem to finding a value $F(x_1)$ to satisfy (6.35). In order for such a value to exist, we must have
$$f(y) - p(y - x_1) \le p(x_1 + z) - f(z), \quad \forall y, z \in M \qquad (6.36)$$
In other words we need
$$f(y + z) \le p(x_1 + z) + p(y - x_1).$$
But this is true by (6.8), since $f(y + z) \le p(y + z) = p((x_1 + z) + (y - x_1)) \le p(x_1 + z) + p(y - x_1)$ by the subadditivity of $p$. Hence (6.36) holds. If we fix $y$ and let $z$ run through all elements of $M$, we have
$$f(y) - p(y - x_1) \le \inf_{z \in M} \{p(x_1 + z) - f(z)\} \equiv C.$$
Since this is true for any $y \in M$, we have
$$c \equiv \sup_{y \in M} \{f(y) - p(y - x_1)\} \le C.$$
We now pick $F(x_1)$ to satisfy $c \le F(x_1) \le C$. Note that the extension is unique only when $c = C$. Thus we have extended $f$ from $M$ to $M_1$. If $M_1 = X$ we are done. Otherwise, there is an element $x_2 \in X$ not in $M_1$. Let $M_2$ be the space spanned by $M_1$ and $x_2$ ($M_2 = \{\alpha x_2 + x, \alpha \in \mathbb{R}, x \in M_1\}$). By repeating the process we can extend $f$ to $M_2$.
If we prove that the collection of all linear bounded functionals defined on subspaces of $X$ satisfies the assumptions of Zorn's lemma we are done (because Zorn's lemma guarantees the existence of a maximal element which we will prove is the desired functional). Consider the collection $\mathcal{L}$ of all linear functionals $g : D(g) \to \mathbb{R}$ defined on a vector subspace $D(g)$ of $X$ such that: (i) $D(g) \supset M$; (ii) $g(x) = f(x)$, $\forall x \in M$; (iii) $g(x) \le p(x)$, $\forall x \in D(g)$. Note that $\mathcal{L}$ is not empty since $F$ belongs there. Introduce a partial ordering '$\preceq$' in $\mathcal{L}$ as follows. If $D(g_1) \subset D(g_2)$ and $g_1(x) = g_2(x)$ $\forall x \in D(g_1)$, then $g_1 \preceq g_2$. One can prove (see Exercise 6.7.2 below) that '$\preceq$' defined above is a partial ordering in $\mathcal{L}$.
We have to check now that every totally ordered subset of $\mathcal{L}$ has an upper bound in $\mathcal{L}$. Let $W$ be a totally ordered subset of $\mathcal{L}$. Define the functional $h$ by
$$D(h) = \bigcup_{g \in W} D(g)$$
$$h(x) = g(x), \quad g \in W, \; x \in D(g).$$
Clearly $h \in \mathcal{L}$ and it is an upper bound for $W$. Note that the definition of $h$ is not ambiguous because if $g_1, g_2$ are any two elements of $W$, then either $g_1 \preceq g_2$ or $g_2 \preceq g_1$ and in either case if $x \in D(g_1) \cap D(g_2)$, then $g_1(x) = g_2(x)$. Hence this shows that the assumptions of Zorn's lemma are met and therefore a maximal element $F$ of $\mathcal{L}$ exists.
We must show that $F$ is the desired functional. That means that $D(F) = X$. Suppose by contradiction that $D(F) \subsetneq X$. Then $\exists x_0 \in X \setminus D(F)$ and by repeating the process that we used at the beginning we would construct an extension $h$ of $F$ such that $F \preceq h$ and $h \ne F$. This would violate the maximality of $F$.
Exercise 6.7.1 Prove that $M_1$ defined in (6.33) is a subspace of $X$ and that the representation in (6.33) is unique.
Exercise 6.7.2 Prove that the relation '$\preceq$' defined in the proof of Theorem 539 is a partial ordering in $\mathcal{L}$.
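The one-step extension in the proof of Theorem 539 can be made concrete in a small case. The sketch below is an illustration only, under assumptions of our own choosing: $X = \mathbb{R}^2$, $p$ is the $\ell_1$ norm, $M = \{(t, 0)\}$, $f((t,0)) = t$ and $x_1 = (0, 1)$. It evaluates the bounds $c$ and $C$ over an integer sample of $M$; for this particular $p$ the supremum and infimum are attained on the sample.

```python
def p(v):
    """Sublinear functional on R^2 (the l1 norm, an illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def f(t):
    """Linear functional on M = {(t, 0)}: f((t, 0)) = t, so f <= p on M."""
    return t

x1 = (0, 1)                # a vector of X that is not in M
grid = range(-10, 11)      # integer sample of M: the points (t, 0)

# c = sup_y { f(y) - p(y - x1) } and C = inf_z { p(x1 + z) - f(z) } on the sample
c = max(f(t) - p((t - x1[0], 0 - x1[1])) for t in grid)
C = min(p((x1[0] + t, x1[1] + 0)) - f(t) for t in grid)

print(c, C)   # any value F(x1) with c <= F(x1) <= C keeps F <= p on M_1
```

Here $c < C$, so the extension is not unique: every choice of $F(x_1)$ in $[c, C]$ yields a linear functional on $M_1$ dominated by $p$.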
Proof of Separation Theorem 549. Suppose without loss of generality that 0 is an internal point of $K_1$. Then $K_1 - K_2 = \{x - y : x \in K_1, y \in K_2\}$ is convex by Theorem 216. Let $x_0 \in K_2$. Since 0 is an internal point of $K_1$, then $0 - x_0 = -x_0$ must be an internal point of $K_1 - K_2$. Let $K = x_0 + K_1 - K_2$. Then $K$ is convex and $0 \in K$ is its internal point. See Figure 6.5.4.
We claim that $x_0$ is not an internal point of $K$. Suppose it was. Then 0 would be an internal point of $K_1 - K_2$. Then for any $y \ne 0$ and some positive number $\alpha$, the point $\alpha y$ would belong to $K_1 - K_2$ (i.e. $\alpha y = k_1 - k_2$ for some $k_1 \in K_1$ and $k_2 \in K_2$). This implies $\frac{\alpha y + k_2}{1 + \alpha} = \frac{k_1}{1 + \alpha}$. If $y$ is a point of $K_2$ then the left-hand side represents a point in $K_2$ because $\frac{\alpha}{1 + \alpha} y + \frac{1}{1 + \alpha} k_2$ is a convex combination of two points of a convex set $K_2$. Furthermore, the right-hand side is an internal point of $K_1$ because $k_1 \in K_1$, $0 \in K_1$ and $\frac{1}{1 + \alpha} < 1$. This contradicts the assumption that $K_2$ contains no internal points of $K_1$. Thus if $P(x)$ is the support function of $K$ and $x_0$ is not an internal point of $K$ we know by (iii) of Lemma 546 that $P(x_0) \ge 1$.
Let $M$ be a one-dimensional linear subspace spanned by $x_0$ (i.e. $M = \{x : x = \alpha x_0, \alpha \in \mathbb{R}\}$). Define a linear functional $f : M \to \mathbb{R}$ by $f(\alpha x_0) = \alpha P(x_0)$. We must check that $f$ satisfies the assumptions of the Hahn-Banach Theorem 539. Is $f(\alpha x_0) \le P(\alpha x_0)$ for all $\alpha$? If $\alpha \le 0$, then $f(\alpha x_0) \le 0$ and hence $f(\alpha x_0) \le P(\alpha x_0)$ since $P$ is non-negative. If $\alpha > 0$, then $f(\alpha x_0) = \alpha P(x_0) = P(\alpha x_0)$ by property (i) of $P$ in Lemma 546. Now by the Hahn-Banach Theorem $f(x)$ can be extended to a linear functional $F : X \to \mathbb{R}$ satisfying $F(x) \le P(x)$ for all $x \in X$. Thus for $x \in K$ we have $F(x) \le 1$, $x = x_0 + y - z$ with $y \in K_1$ and $z \in K_2$. Then $x - y + x_0 \in K$ and we have $F(x - y + x_0) \le 1$ and $F(x) - F(y) + F(x_0) \le 1$. Since $F(x_0) \ge 1$, we have $F(x) \le F(y)$ for any $x \in K_1$ and $y \in K_2$. Then we have
$$\sup_{x \in K_1} F(x) \le \inf_{y \in K_2} F(y)$$
and hence $F$ separates $K_1$, $K_2$ and $F$ is a non-zero functional (since $F(x_0) \ge 1$). ???Check x,y,z ???
Proof of Second Welfare Theorem 552. Since $S$ is finite dimensional (A5) and the aggregate technological possibilities set is convex (A4), for the existence of $\phi$ we must show that the set of allocations preferred to $\{x_i^*\}_{i=1}^I$, given by $A = \sum_{i=1}^I A_i$, is convex, where $A_i = \{x \in X_i : u_i(x) \ge u_i(x_i^*)\}$, $\forall i$. Assumptions (A1)-(A3) are sufficient to guarantee that each $A_i$ is convex and so $A$ is convex. Finally, we show that $A$ does not contain any interior points of $Y$. Suppose to the contrary that $y \in \text{int}\,Y$ and $y \in A$. Thus, for some $\{x_i\}_{i=1}^I$ with $x_i \in A_i$ for all $i$, we have $y = \sum_{i=1}^I x_i$. By assumption, there is some $h \in \{1, \dots, I\}$ and some $\hat{x}_h$ such that $u_h(\hat{x}_h) > u_h(x_h^*)$. Let $x_h^\alpha = \alpha \hat{x}_h + (1 - \alpha) x_h$, $\alpha \in (0, 1)$. By A1 and A2, $x_h^\alpha \in X_h$ and $u_h(x_h^\alpha) > u_h(x_h^*)$. Let $y^\alpha = \sum_{i \ne h} x_i + x_h^\alpha$. Since $y \in \text{int}\,Y$, it follows that for some sufficiently small $\varepsilon$, $y^\varepsilon \in Y$. In this case the allocation $\big( \{x_{i \ne h}\}_{i=1}^I \cup x_h^\varepsilon, y^\varepsilon \big)$ is feasible and satisfies
$$x_i \in X_i, \forall i, \quad u_i(x_i) \ge u_i(x_i^*), \forall i \ne h, \quad \text{and } u_h(x_h^\alpha) > u_h(x_h^*)$$
which contradicts the Pareto Optimality of $(\{x_i^*\}_{i=1}^I, \{y_j^*\}_{j=1}^J)$. Therefore the conditions for Theorem 549 are met.
To complete the proof, it is sufficient to show (b) holds in the definition of a competitive equilibrium. By (6.15), suppose that $x_i \in X_i$ and $\phi(x_i) < \phi(x_i^*)$. Hence it follows by contraposition of (6.13) that $u_i(x_i^\alpha) < u_i(x_i^*)$ $\forall \alpha \in (0, 1)$. By A3, $\lim_{\alpha \to 0} u_i(x_i^\alpha) = u_i(x_i) < u_i(x_i^*)$.
6.8 Bibliography for Chapter 6
This material is based on Royden (Chapters ) and Munkres (Chapters ).
Chapter 7
Topological Spaces
This chapter is a brief overview of topological spaces; it does not go into details or prove theorems. Let's start with an example about fixed points. In Chapter 4, Theorem 257 says that if $f : I \to I$ is a continuous mapping from a closed interval $I$ into itself, then there exists a point $x_0 \in I$ such that $f(x_0) = x_0$. Is the theorem still true if the line segment $I$ is distorted (i.e. if it is an arc or an arbitrary curve or a circle)? See Figure 7.1. Since every concept behind the theorem is a topological one, the theorem remains true as long as the change of the object is homeomorphic.
We will explain the notions of topological properties and homeomorphisms, but at this stage we say that topological properties of an object are those that are invariant with respect to various distortions like bending, increasing (magnifying), or decreasing (reducing); all these transformations are homeomorphic. Topological properties are not invariant, for example, to tearing or welding. Thus the theorem remains true for an arc or an arbitrary curve but not for the circle. The first two objects have two ends but the circle does not have any. Thus there is an 'inside' and 'outside' of the circle but not of the arc or arbitrary curve. It is easy to see that in the case of a circle, the fixed point theorem doesn't hold. Consider a rotation of the circle through an angle that is not a multiple of $2\pi$: it is a continuous mapping of the circle into itself with no point remaining fixed. See Figure 7.2.
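The rotation counterexample can be checked numerically. In the sketch below the circle is modelled as the unit circle in the complex plane and the rotation as multiplication by $e^{i\theta}$; this representation, the angle `theta = 1.0` and the sample of 360 points are our assumptions for illustration.

```python
import cmath
import math

def rotate(z, theta):
    """Rotate the point z of the unit circle through the angle theta."""
    return cmath.exp(1j * theta) * z

theta = 1.0   # any angle that is not a multiple of 2*pi
points = [cmath.exp(2j * math.pi * k / 360) for k in range(360)]

# Each point moves by |e^{i theta} z - z| = |e^{i theta} - 1| = 2 |sin(theta / 2)|.
displacements = [abs(rotate(z, theta) - z) for z in points]
expected = 2 * abs(math.sin(theta / 2))

print(min(displacements), expected)
```

Every point of the circle is displaced by the same positive amount, so no point is fixed, even though the rotation is continuous.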
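Definition 585 A set $X$ together with a collection $\mathcal{O}$ (for open sets) is called a topological space if $\mathcal{O}$ satisfies the following conditions: (i) $\emptyset \in \mathcal{O}$, $X \in \mathcal{O}$; (ii) $(\bigcup_{i \in \Upsilon} A_i) \in \mathcal{O}$ for $A_i \in \mathcal{O}$ (an arbitrary union of elements of $\mathcal{O}$ belongs to $\mathcal{O}$); (iii) $(\bigcap_{i=1}^n A_i) \in \mathcal{O}$ for $A_i \in \mathcal{O}$ (a finite intersection of elements of $\mathcal{O}$ belongs to $\mathcal{O}$). $\mathcal{O}$ is called a topology on $X$ and its elements are called open sets.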
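On a finite set the three conditions of Definition 585 can be checked mechanically. The following is a minimal sketch; the function name `is_topology` and the sample collections are hypothetical. It relies on the fact that, for a finite collection, closure under pairwise unions and intersections already gives closure under arbitrary unions and finite intersections.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check conditions (i)-(iii) of Definition 585 for a finite set X."""
    X = frozenset(X)
    opens = {frozenset(A) for A in opens}
    # (i) the empty set and X itself must be open
    if frozenset() not in opens or X not in opens:
        return False
    # every open set must be a subset of X
    if any(not A <= X for A in opens):
        return False
    # (ii) and (iii): pairwise unions and intersections stay in the collection
    for A, B in combinations(opens, 2):
        if A | B not in opens or A & B not in opens:
            return False
    return True

print(is_topology({"a", "b"}, [set(), {"a"}, {"a", "b"}]))
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))
```

The first collection is a topology on a two-point set; the second is not, because the union $\{1\} \cup \{2\} = \{1, 2\}$ is missing.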
Recall the following facts. A set $B$ is called closed if $X \setminus B$ is open. Also $\emptyset$ and $X$ are both open and closed. By using DeMorgan rules we can show that (i) $\bigcup_{i=1}^n B_i$ is closed for $B_i$ closed and (ii) $\bigcap_{i \in \Upsilon} B_i$ is closed for $B_i$ closed. The intersection of all closed sets containing a set $C$ is called the closure of $C$, written $\bar{C}$. Hence the closure of $C$ is the smallest closed set containing $C$, and $C \subset \bar{C}$.
Exercise 7.0.1 Show that $C$ is closed iff $C = \bar{C}$.
The union of all open sets contained in a set $D$ is called the interior of $D$ (written $\text{int}\,D$) and it is the largest open set contained in $D$.
Exercise 7.0.2 Show that $D$ is open iff $\text{int}\,D = D$.
Example 586 Let $(\mathbb{R}, |\cdot|)$ be a metric space. The collection $\mathcal{O}$ of all open sets (see Def. 104) satisfies all three properties of Theorem 106 and hence $|\cdot|$ defines a topology $\mathcal{O}$ in $\mathbb{R}$. A topology is determined by its metric. You should realize that two equivalent metrics determine the same topology (see notes after the Theorem 221).
Hence any metric space is also a topological space. What about the converse? Consider a topological space $X$ with a topology $\mathcal{O}$. Does a metric $d$ on $X$ exist that would generate the topology $\mathcal{O}$?
Definition 587 If there exists a metric $d$ on $X$ that generates the topology $\mathcal{O}$, we say that this topological space is metrizable.
Using this definition we can rephrase the question: is every topological space metrizable? The answer is no, as we will see.
Example 588 Given a set $X$, let $\mathcal{O}$ be the collection of all subsets of $X$ (i.e. $\mathcal{O}$ is the power set of $X$). This is the largest possible topology on $X$. We call it the discrete topology. The discrete topology is not very interesting. All sets are open (and closed), and any mapping from $X$ is continuous. This topological space is metrizable: put $d(x, x) = 0$ and $d(x, y) = 1$ for $x \ne y$ (i.e. the discrete metric).
Example 589 Let $X$ have at least two elements and let $\mathcal{O}$ contain only $\emptyset$ and $X$. This is the smallest possible topology on $X$, called the trivial topology on $X$. This topological space is not metrizable for the following reason. In any metric space, a set containing just one element is closed. In this topological space the closure of a one element set $\{x\}$ is the whole space $X$ (since this is the only closed set containing $x$) and hence by Exercise 7.0.1 $\{x\}$ is not closed.
Example 590 Let $X$ be an infinite set. Let $\mathcal{O}$ contain $\emptyset$ and all subsets $A \subset X$ such that $X \setminus A$ is finite. This topology is called the topology of finite complements.
Exercise 7.0.3 Show that $\mathcal{O}$ in the preceding Example 590 is a topology on $X$ and that the closure of a set $A$ is
$$\bar{A} = \begin{cases} A & \text{if } A \text{ is finite} \\ X & \text{if } A \text{ is infinite.} \end{cases}$$
This topological space is not metrizable, as we will see in the next subsection on separation axioms.
Now we will define other topological properties the same way we did in metric spaces. Naturally, we cannot use the notion of distance here; all new objects and properties can be defined only in terms of open sets or in terms of other objects and properties originally defined by open sets. A neighborhood of a point $x$ is any open set containing $x$. A point $x$ is a cluster point of a set $A$ if any neighborhood of $x$ contains a point of $A$ different from $x$.
Exercise 7.0.4 Show that $A$ is closed iff $A$ contains all its cluster points.
As in Definition 153 we say that $S \subset X$ is dense in $X$ if $\bar{S} = X$. In many cases it is rather difficult to define the collection of all the open sets; we often use a small subcollection of open sets to define all open sets. This is exactly the method that we used in a metric space, where open sets were defined in terms of open balls (see Definition 106).
Definition 591 A collection $\mathcal{B}_x$ of open sets is called a local basis at $x$ if $x \in B$, $\forall B \in \mathcal{B}_x$, and if for any neighborhood $U$ of $x$ there exists $B \in \mathcal{B}_x$ such that $B \subset U$. $\mathcal{B}$ is called a topological basis of $X$ if for any $x \in X$ there exists $\mathcal{B}_x \subset \mathcal{B}$ that is a local basis at $x$.
Hence $\mathcal{B}$ is a topological basis of $X$ iff for any $x \in X$ and for any neighborhood $U$ of $x$ there exists $B \in \mathcal{B}$ such that $x \in B$ and $B \subset U$.
Exercise 7.0.5 Show that the collection of open balls is a topological basis of $n$ dimensional Euclidean space $\mathbb{R}^n$.
If $\mathcal{B}$ is a topological basis of a topological space $X$ then it satisfies the following: (i) for any $x \in X$ there exists $B \in \mathcal{B}$ such that $x \in B$; and (ii) if $B_1, B_2 \in \mathcal{B}$ and $x \in (B_1 \cap B_2)$ there exists $B_3 \in \mathcal{B}$ such that $x \in B_3 \subset (B_1 \cap B_2)$. Assume now that we have a set $X$ (without a topology) and a collection $\mathcal{B}$ of subsets of $X$ satisfying conditions (i) and (ii). We say that $S \subset X$ is open if for any $x \in S$ there exists $B \in \mathcal{B}$ such that $x \in B \subset S$. The collection of all these open sets is a topology on $X$.
Exercise 7.0.6 Prove the above statement.
The method of defining a topology on $X$ through a basis is very important. This method was used in Chapter 3 in defining a topology on $\mathbb{R}$ where the basis $\mathcal{B}$ was the collection of all open intervals.
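For a finite set the construction of a topology from a basis can be carried out by brute force. The sketch below applies the rule "S is open if every x in S lies in some basis element contained in S" to every subset; the function name `open_sets` and the sample basis are hypothetical.

```python
from itertools import chain, combinations

def open_sets(X, basis):
    """All S within X such that each x in S has a basis element B with x in B and B within S."""
    X = list(X)
    basis = [frozenset(B) for B in basis]
    # enumerate every subset of the finite set X
    subsets = chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))
    return {
        S
        for S in map(frozenset, subsets)
        if all(any(x in B and B <= S for B in basis) for x in S)
    }

# Hypothetical basis on X = {1, 2, 3} satisfying conditions (i) and (ii)
topology = open_sets({1, 2, 3}, [{1}, {2, 3}])
print(sorted(sorted(S) for S in topology))
```

The empty set qualifies vacuously, and the resulting open sets are exactly the unions of basis elements.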
Definition 592 Let $X$ be a topological space with a topology $\mathcal{O}$ and let $X_0 \subset X$. Then we can define a topology $\mathcal{O}_0$ on $X_0$ as the collection of all sets of the form $O \cap X_0$ where $O \in \mathcal{O}$. $\mathcal{O}_0$ is called the relative topology on $X_0$ created by $\mathcal{O}$, and $X_0$ is called a topological subspace of $X$.
Example 593 Let $X = \mathbb{R}$ be a topological space with the usual topology in Definition 104. Let $X_0 = [0, 1)$. Then the relative topology $\mathcal{O}_0$ on $X_0$ is the collection of all sets of the form $O \cap [0, 1)$ where $O$ is open in $\mathbb{R}$. For example, the set $[0, \frac{1}{2})$ is open in $X_0$ because $[0, \frac{1}{2}) = (-1, \frac{1}{2}) \cap [0, 1)$ and $(-1, \frac{1}{2})$ is open in $\mathbb{R}$.
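The relative topology is easy to compute on a finite example: intersect every open set with the subspace. The space, the topology and the function name `relative_topology` below are hypothetical choices for illustration.

```python
def relative_topology(opens, X0):
    """Relative topology on X0: all sets of the form O & X0 with O open."""
    X0 = frozenset(X0)
    return {frozenset(O) & X0 for O in opens}

# Hypothetical topology on X = {1, 2, 3}
opens = [set(), {1}, {1, 2}, {1, 2, 3}]
sub = relative_topology(opens, {2, 3})
print(sorted(sorted(S) for S in sub))
```

As with $[0, \frac{1}{2})$ in the example above, the set $\{2\}$ is open in the subspace $\{2, 3\}$ although it is not open in $X$.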
7.1 Continuous Functions and Homeomorphisms
Let $X, Y$ be topological spaces and $f : X \to Y$ be a function from $X$ to $Y$. We say that $f$ is continuous at $x_0 \in X$ if for any neighborhood $V$ of $f(x_0)$ in $Y$ the inverse image $f^{-1}(V)$ contains a neighborhood of $x_0$ in $X$. $f$ is continuous on $X$ if $f$ is continuous at every $x \in X$. This is similar to Definition 244 where neighborhood has been substituted for open ball.
Exercise 7.1.1 Prove that $f : X \to Y$ is continuous iff $f^{-1}(V)$ is open (closed) in $X$ for any $V$ open (closed) in $Y$.
Definition 594 Let $f : X \to Y$ be a function from $X$ to $Y$. Assume that there exists an inverse function $f^{-1} : Y \to X$ and let both $f$ and $f^{-1}$ be continuous. Then we say that $f$ is a homeomorphism of $X$ onto $Y$ and that $X$ and $Y$ are homeomorphic. Homeomorphic means topologically equivalent (i.e. the same from a topological point of view).
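The open-set characterization of continuity in Exercise 7.1.1 and the definition of a homeomorphism can be tested directly on finite spaces. In the sketch below a function is represented as a dictionary; the function names and the two topologies on $\{a, b\}$ are assumptions made for illustration.

```python
def preimage(f, V, X):
    """Inverse image of V under the mapping f (a dict) defined on X."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, opens_X, opens_Y):
    """f is continuous iff the preimage of every open set of Y is open in X."""
    opens_X = {frozenset(A) for A in opens_X}
    return all(preimage(f, V, X) in opens_X for V in opens_Y)

def is_homeomorphism(f, X, opens_X, Y, opens_Y):
    """f is a bijection of X onto Y with f and its inverse both continuous."""
    if len(set(f.values())) != len(X) or set(f.values()) != set(Y):
        return False
    inverse = {y: x for x, y in f.items()}
    return (is_continuous(f, X, opens_X, opens_Y)
            and is_continuous(inverse, Y, opens_Y, opens_X))

X = {"a", "b"}
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]
coarser = [set(), {"a"}, {"a", "b"}]
identity = {"a": "a", "b": "b"}

print(is_continuous(identity, X, discrete, coarser))        # preimages are all open
print(is_homeomorphism(identity, X, discrete, X, coarser))  # the inverse is not continuous
```

The identity map from the discrete space to the coarser one is a continuous bijection, but its inverse is not continuous, so the two spaces are not identified by this map. This is why Definition 594 requires continuity of both $f$ and $f^{-1}$.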
Example 595 Let $X = \mathbb{R}^2$ with a topology determined by the Euclidean metric $d_2(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$, where $x = (x_1, x_2)$ and $y = (y_1, y_2)$. Let $Y = \mathbb{R}^2$ with a topology determined by the sup metric $d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$. Then $X$ and $Y$ are homeomorphic. Hence from a topological point of view a circle and a square are indistinguishable.
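The reason the Euclidean and sup metrics on the plane give the same topology is that each is bounded by a constant multiple of the other: $d_\infty \le d_2 \le \sqrt{2}\, d_\infty$. A quick numerical check on randomly sampled pairs of points follows; the sample size, the seed and the small rounding tolerance are arbitrary assumptions.

```python
import math
import random

def d2(x, y):
    """Euclidean metric on the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dinf(x, y):
    """Sup metric on the plane."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

random.seed(0)

def random_point():
    return (random.uniform(-5, 5), random.uniform(-5, 5))

pairs = [(random_point(), random_point()) for _ in range(1000)]
tol = 1e-12   # allowance for floating-point rounding
holds = all(
    dinf(x, y) - tol <= d2(x, y) <= math.sqrt(2) * dinf(x, y) + tol
    for x, y in pairs
)
print(holds)
```

Geometrically, every round ball contains a square ball about the same center and vice versa, so a set is open for one metric exactly when it is open for the other.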
We could define other topological properties like compactness, connectedness, and separability but we will only touch upon them. All these notions are defined in Chapter 4. Notice that although they are defined in a metric space, these definitions don't use the notion of distance (they are simply formulated in terms of open sets).
7.2 Separation Axioms
Since the notion of distance is very natural for us, a metric space is more easily envisioned than a topological space. That is why we take many results for granted. For instance, in a metric space, given two different elements $x, y \in X$, $x \ne y$, there exist two disjoint open sets $U, V$, each containing just one of them (i.e. $x \in U$, $y \in V$, and $U \cap V = \emptyset$). See Figure 7.3. But this is not necessarily true in a general topological space. Before we show this we will state the separation axioms.
Definition 596 A topological space $X$ is called: (i) a $T_0$-space if for any two distinct elements $x, y$ at least one of them has a neighborhood not containing the other; (ii) a $T_1$-space if for any two distinct elements $x, y$ there exists a neighborhood $U$ of $x$ not containing $y$ and a neighborhood $V$ of $y$ not containing $x$; (iii) a $T_2$-space (or Hausdorff space) if any two distinct elements $x, y$ have disjoint neighborhoods (i.e. there exist two open sets $U, V$ such that $x \in U$, $y \in V$ and $U \cap V = \emptyset$); (iv) a $T_3$-space if any closed set $A$ and an element $x \notin A$ have disjoint neighborhoods (i.e. any point $x$ and any closed set $A$ not containing $x$ can be separated by disjoint open sets); (v) a $T_4$-space if any two disjoint closed sets have two disjoint neighborhoods (i.e. any two disjoint closed sets can be separated by disjoint open sets). These axioms are pictured in Figure 7.4. A regular space is a $T_3$-space which is also a $T_1$-space. A normal space is a $T_4$-space which is also a $T_1$-space.
Note that there may be slightly different terminology in the literature depending on the book you reference.
Exercise 7.2.1 Show that the following sequence of implications holds true: Normal space $\Rightarrow$ regular space $\Rightarrow$ Hausdorff space $\Rightarrow$ $T_1$-space $\Rightarrow$ $T_0$-space.
Exercise 7.2.2 Show that any metric space is normal. Hint: A positive distance can always be bisected.
Combining the statements of Exercises 7.2.1 and 7.2.2 we get that any metric space satisfies all the separation axioms. Now we are going to show that none of the implications of Exercise 7.2.1 can be reversed.
Example 597 A set $X$ containing at least two distinct elements with the trivial topology (defined in Example 589) is not a $T_0$-space. To see this, let $x \ne y$. Then $x$ cannot be separated by an open set from $y$ because the only open set containing $x$ is the whole set $X$ (that also contains $y$).
Before giving other examples it is useful to state the following theorem that you should prove as an exercise.
Exercise 7.2.3 A topological space $X$ is $T_1$ iff every singleton is closed.
Example 598 Let $X = \{a, b\}$ and let $\mathcal{O} = \{\emptyset, \{a\}, X\}$ be a topology on $X$. Note that by definition, $\{a\}$ is open. To show that $\mathcal{O}$ is a topology we must satisfy the conditions in Definition 585. Obviously $\emptyset, X \in \mathcal{O}$ by construction. On closedness with respect to arbitrary union, $\{a\} \cup \emptyset = \{a\} \in \mathcal{O}$ and $\{a\} \cup X = X \in \mathcal{O}$. On closedness with respect to finite intersection, $\{a\} \cap X = \{a\} \in \mathcal{O}$ and $\{a\} \cap \emptyset = \emptyset \in \mathcal{O}$. Now $(X, \mathcal{O})$ is a $T_0$-space because for $a \ne b$, the open set $\{a\}$ is a neighborhood of the element $a$ not containing $b$. Note that $\{b\}$ is closed. According to Exercise 7.2.3, $(X, \mathcal{O})$ is not a $T_1$-space because $\{a\}$ is not closed.
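On finite spaces the first three separation axioms can be checked by enumeration. The sketch below uses the standard formulation of $T_0$ (some open set contains exactly one of the two points); the function names are hypothetical, and the sample topologies are the two-point spaces of Examples 597 and 598 together with the discrete topology.

```python
from itertools import combinations

def is_T0(X, opens):
    """Some open set contains exactly one of any two distinct points."""
    return all(any((x in U) != (y in U) for U in opens)
               for x, y in combinations(X, 2))

def is_T1(X, opens):
    """Each point has a neighborhood missing any other given point."""
    return all(any(x in U and y not in U for U in opens)
               for x in X for y in X if x != y)

def is_T2(X, opens):
    """Any two distinct points have disjoint neighborhoods (Hausdorff)."""
    return all(any(x in U and y in V and not (U & V)
                   for U in opens for V in opens)
               for x, y in combinations(X, 2))

X = {"a", "b"}
trivial = [set(), {"a", "b"}]
example_598 = [set(), {"a"}, {"a", "b"}]
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]

for name, opens in [("trivial", trivial), ("Example 598", example_598), ("discrete", discrete)]:
    print(name, is_T0(X, opens), is_T1(X, opens), is_T2(X, opens))
```

The three spaces separate the axioms as the examples claim: the trivial topology fails $T_0$, the topology of Example 598 is $T_0$ but not $T_1$, and the discrete topology satisfies all three.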
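Example 599 Let $X = \mathbb{N}$ with the topology of finite complements (defined in Example 590). This is a $T_1$-space because $\{x\}$ is closed for $x \in \mathbb{N}$ (note that $\mathbb{N} \setminus \{x\}$ has the finite complement $\{x\}$ and is thus open). It is not Hausdorff since: an open set $A$ containing 1 has the form $A = \mathbb{N} \setminus \{x_1, \dots, x_m\}$ where $x_i \ne 1$; an open set $B$ containing 2 has the form $B = \mathbb{N} \setminus \{y_1, \dots, y_n\}$ where $y_i \ne 2$; and the sets $A, B$ are not disjoint, because both have finite complements in the infinite set $\mathbb{N}$.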
Example 600 Let $X = \mathbb{R}$ and let the topology $\mathcal{O}$ consist of: (i) all open sets in the usual Euclidean topology (i.e. the topology induced by the Euclidean metric); and (ii) all sets of the form $U \setminus K$ where $K = \{\frac{1}{n}, n \in \mathbb{N}\}$ and $U$ is open in the usual Euclidean topology. This is a Hausdorff space because open sets of type (i) can be used for separating two distinct points. It is not a $T_3$-space because $K$ is closed and $0 \notin K$ cannot be separated from $K$.
None of the topological spaces in Examples 597 to 600 are metrizable. Why? Let $(X, \mathcal{O})$ be a topological space which is metrizable by $d$. Then $(X, d)$ is a metric space. Let $\mathcal{O}_0$ be the collection of all open sets of $(X, d)$. Then $\mathcal{O}_0$ coincides with $\mathcal{O}$ (this is what it means for $(X, \mathcal{O})$ to be metrizable). Then $(X, \mathcal{O})$ and $(X, \mathcal{O}_0)$ are identical topological spaces, yet in Example 597 the first is not $T_0$ while the second is normal (because it is a metric space, by Exercise 7.2.2). Hence, there is a contradiction. The same argument can be used in the other examples. Hence being a normal topological space is a necessary condition for the space to be metrizable. But this condition is not sufficient. For further reading see Kelley (???)
7.3 Convergence and Completeness
In Section 4.1 the notion of the convergence of a sequence in a metric space
was deTned. In DeTnition 143 we characterize a closed set A as the set con-
taining limit points of all convergent sequences from A. DeTning closed sets
we can also deTne open sets as their complements. This means we can deTne
a topology. That is, a topology in a metric space can be deTnedintermsof
convergence of a sequence (as we did in Chapter 4). Can this procedure be
used in a topological space? First, we must address whether convergence of
a sequence can be introduced in topological space? In DeTnition 136 we see
that the concept of distance (metric) is used there (we say that hx
n
i ?→ x
if for any ε>0, ?N such that ?n ≥ N , d(x
n
,x) <ε.In this deTnition, ε
represents an ε -ball around x (i.e. a neighborhood of x). Hence the deTnition
306 CHAPTER 7. TOPOLOGICAL SPACES
can be reformulated as follows: $\langle x_n \rangle \to x$ if for any neighborhood $U$
of $x$ there exists $N$ such that $x_n \in U$ for all $n \ge N$. In this new version only
topological notions are used, and hence convergence of a sequence can be defined
in a topological space.
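The neighborhood formulation of convergence can be checked mechanically in a finite topological space. The following Python sketch is only an illustration under stated assumptions: the function name `converges_to`, the two-point set, and the restriction to eventually periodic sequences (described by the values they repeat forever) are hypothetical choices made for this example and do not come from the text.

```python
def converges_to(cycle, x, open_sets):
    """Decide whether an eventually periodic sequence converges to x.

    cycle     -- the values the sequence repeats forever from some index N on
    x         -- the candidate limit
    open_sets -- all open sets of a finite topological space (hypothetical input)
    """
    tail = set(cycle)
    # <x_n> -> x iff every open neighborhood U of x contains x_n for all n >= N.
    # For an eventually periodic sequence this means U contains the whole cycle.
    return all(tail <= U for U in open_sets if x in U)


# Three topologies on the two-point set {a, b} (illustrative assumptions).
points = frozenset({"a", "b"})
trivial = [frozenset(), points]
discrete = [frozenset(), frozenset({"a"}), frozenset({"b"}), points]
sierpinski = [frozenset(), frozenset({"a"}), points]

print(converges_to(["a"], "b", trivial))        # constant sequence a, a, a, ...
print(converges_to(["a"], "b", discrete))
print(converges_to(["a", "b"], "a", discrete))  # alternating sequence a, b, a, b, ...
print(converges_to(["a"], "b", sierpinski))
```

In the trivial topology the constant sequence a, a, a, ... converges to b as well as to a, because the only neighborhood of b is the whole space. In the discrete topology it converges only to a, and the alternating sequence has no limit.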
You may wonder whether a topology can be built only in terms of convergence
of sequences (the same way it is done in a metric space). The answer is:
not always. Loosely speaking, it is possible in topological spaces that are
separable (i.e. that contain a countable dense subset). Thus separability of a
topological (metric) space is an important property.
Exercise 7.3.1 Show that if X has a countable basis then it is separable.
Among topological spaces that are not separable there exist spaces whose
topology cannot be fully built only in terms of convergence of sequences. If
we want to build a topology in these spaces in terms of convergence, then
the notion of sequence has to be replaced by the more general notion of a
net. We will not deal with it here (again see Kelley).
The last important property of a metric space is completeness. Is this
property topological? That is, can completeness be defined in terms of open
sets? As we know, defining completeness requires the notion of a Cauchy
sequence, and Definition 169 of a Cauchy sequence is based on the concept of
distance. It cannot be defined without a metric. In other words, a Cauchy
sequence cannot be defined in a general topological space. Hence completeness
is not a topological property, as the next example shows.
Example 601 Let $X = ((0,1], |\cdot|)$ and $Y = ([1,\infty), |\cdot|)$ be two metric spaces.
Then $f : X \to Y$ given by $f(x) = \frac{1}{x}$ is a homeomorphism ($f$ is a bijection
and both $f$ and $f^{-1}$ are continuous). Hence these two metric spaces are
topologically equivalent, but $X$ is not complete whereas $Y$ is.
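A small numerical experiment can make the example concrete. The sequence $x_n = 1/n$ is Cauchy in $X$ but has no limit in $X$ (since $0 \notin X$), while its image $f(x_n) = n$ is not Cauchy in $Y$. The Python sketch below is an illustration, not a proof: the helper names `tail_spread`, `x_seq` and `y_seq` are hypothetical, and inspecting a finite window of terms only suggests the limiting behavior.

```python
from fractions import Fraction


def f(x):
    """The homeomorphism f(x) = 1/x from (0, 1] onto [1, oo)."""
    return 1 / x


def tail_spread(seq, start, stop):
    """Largest distance |seq(m) - seq(n)| over start <= m, n < stop."""
    terms = [seq(n) for n in range(start, stop)]
    return max(terms) - min(terms)


def x_seq(n):
    """x_n = 1/n, a sequence in X = (0, 1]."""
    return Fraction(1, n)


def y_seq(n):
    """y_n = f(x_n) = n, the image sequence in Y = [1, oo)."""
    return f(x_seq(n))


# Compare how far apart the terms with indices N, ..., 2N - 1 can be.
for N in (10, 100, 1000):
    print(N, float(tail_spread(x_seq, N, 2 * N)), float(tail_spread(y_seq, N, 2 * N)))
```

The spread of the terms of $\langle x_n \rangle$ shrinks as $N$ grows, whereas the spread of the image sequence grows. A homeomorphism therefore need not map Cauchy sequences to Cauchy sequences.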
Total boundedness is not a topological property either. Example 601
shows this, since $X$ is totally bounded whereas $Y$ is not. Theorem 198 says
that compactness in a metric space is equivalent to completeness together with
total boundedness. Compactness is a topological property, while completeness
and total boundedness are not topological properties individually; but when they
occur simultaneously, their conjunction is a topological property.
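The difference in total boundedness between $X = (0,1]$ and $Y = [1,\infty)$ can also be sketched in code. The Python fragment below makes several assumptions for illustration: the function names are hypothetical, $0 < \varepsilon \le 1$ is assumed, open balls of radius $\varepsilon$ are used, and the net property for $X$ is checked only on a finite sample of points. For $Y$ it computes a lower bound on the size of any $\varepsilon$-net of $Y \cap [1, M]$ by counting points that are pairwise at least $2\varepsilon$ apart, so that no single open $\varepsilon$-ball can contain two of them.

```python
from fractions import Fraction
from math import floor


def net_for_unit_interval(eps):
    """Centers eps, 2*eps, ..., k*eps <= 1: a finite eps-net of (0, 1].

    Assumes 0 < eps <= 1 so that at least one center lies in (0, 1].
    """
    return [k * eps for k in range(1, floor(1 / eps) + 1)]


def is_eps_net(centers, sample, eps):
    """Check that every sampled point lies in some open eps-ball around a center."""
    return all(any(abs(p - c) < eps for c in centers) for p in sample)


def lower_bound_on_net_size(M, eps):
    """Count the points 1, 1 + 2*eps, 1 + 4*eps, ... that lie in [1, M].

    They are pairwise at least 2*eps apart, so any eps-net needs one center each.
    """
    return floor((M - 1) / (2 * eps)) + 1


eps = Fraction(1, 10)
net = net_for_unit_interval(eps)
sample = [Fraction(j, 1000) for j in range(1, 1001)]  # finite sample of (0, 1]

print(len(net), is_eps_net(net, sample, eps))
print([lower_bound_on_net_size(M, eps) for M in (10, 100, 1000)])
```

For a fixed $\varepsilon$ the net for $(0,1]$ is finite, while the lower bound for $Y \cap [1, M]$ grows without bound as $M$ increases, so no finite $\varepsilon$-net can cover all of $Y$.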