Chapter 3 Least Squares Methods for Estimating β
Methods for estimating β:
- Least squares estimation
- Maximum likelihood estimation
- Method of moments estimation
- Least absolute deviation estimation
- ...
3.1 Least squares estimation
The criterion of least squares estimation is
\[
\min_{b_0} \sum_{i=1}^{n} \left( y_i - X_i' b_0 \right)^2,
\]
or, in matrix form,
\[
\min_{b_0} \; (y - Xb_0)'(y - Xb_0).
\]
Let the objective function be
\[
S(b_0) = (y - Xb_0)'(y - Xb_0)
       = y'y - b_0'X'y - y'Xb_0 + b_0'X'Xb_0
       = y'y - 2y'Xb_0 + b_0'X'Xb_0.
\]
The first-order condition for the minimization of this function is
\[
\frac{\partial S(b_0)}{\partial b_0} = -2X'y + 2X'Xb_0 = 0.
\]
The solution of this equation is the least squares estimate of the coefficient vector β:
\[
b = (X'X)^{-1} X'y.
\]
If rank(X) = K, then rank(X'X) = K, so the inverse of X'X exists.
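As a quick numerical check of the formula b = (X'X)⁻¹X'y, the sketch below solves the normal equations on a small simulated dataset (the data and dimensions are illustrative, not from the text) and compares the result with NumPy's built-in least squares routine.

```python
import numpy as np

# Simulated data: n = 100 observations, K = 3 regressors (incl. a constant).
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

# Least squares estimate from the normal equations; np.linalg.solve
# avoids forming the inverse of X'X explicitly.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b, b_lstsq)
```

Solving the linear system X'X b = X'y directly is numerically preferable to computing (X'X)⁻¹ and multiplying.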
Let e = y - Xb. We call e the residual vector. We have
\[
\begin{aligned}
e &= y - Xb & (1) \\
  &= y - X(X'X)^{-1}X'y \\
  &= \left( I - X(X'X)^{-1}X' \right) y \\
  &= (I - P)\,y, & (2)
\end{aligned}
\]
where P = X(X'X)^{-1}X'. The matrix P is called the projection matrix. We also let M = I - P. Then we may write (2) as
\[
y = Xb + e = Py + My.
\]
CHAPTER 3 LEAST SQUARES METHODS FOR ESTIMATING β 2
We often write Py = ŷ. This is the part of y that is explained by X.
Properties of the matrices P and M are:
(i) P' = P, P^2 = P (symmetric and idempotent)
(ii) M' = M, M^2 = M
(iii) PX = X, MX = 0
(iv) PM = 0
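Properties (i)-(iv) are easy to confirm numerically. The sketch below builds P and M from an arbitrary full-rank design matrix (hypothetical simulated data) and checks each property up to floating-point tolerance.

```python
import numpy as np

# Arbitrary full-rank design matrix (hypothetical data, for illustration).
rng = np.random.default_rng(1)
n, K = 50, 4
X = rng.normal(size=(n, K))

# P = X (X'X)^{-1} X' and M = I - P.
P = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - P

assert np.allclose(P, P.T) and np.allclose(P @ P, P)    # (i) symmetric, idempotent
assert np.allclose(M, M.T) and np.allclose(M @ M, M)    # (ii)
assert np.allclose(P @ X, X) and np.allclose(M @ X, 0)  # (iii)
assert np.allclose(P @ M, 0)                            # (iv)
```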
Using (1) and (iii), we have
\[
X'e = X'My = 0.
\]
If the first column of X is 1 = (1, \cdots, 1)', this relation implies
\[
1'e = \sum_{i=1}^{n} e_i = 0.
\]
In addition, (iv) gives
\[
y'y = y'P'Py + y'M'My = \hat{y}'\hat{y} + e'e.
\]
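Both facts — residuals summing to zero when the model contains a constant, and the decomposition y'y = ŷ'ŷ + e'e — can be verified on simulated data (the dataset below is illustrative, not from the text):

```python
import numpy as np

# Simulated data with a constant term in the first column of X.
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b   # Py, the fitted values
e = y - y_hat   # My, the residuals

assert np.isclose(e.sum(), 0.0)                  # 1'e = sum of residuals = 0
assert np.isclose(y @ y, y_hat @ y_hat + e @ e)  # y'y = y_hat'y_hat + e'e
```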
3.2 Partitioned regression and partial regression
Consider
\[
y = X\beta + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon.
\]
The normal equations for b_1 and b_2 are
\[
\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
=
\begin{pmatrix} X_1'y \\ X_2'y \end{pmatrix}.
\]
The first block of these equations is
\[
(X_1'X_1)\,b_1 + (X_1'X_2)\,b_2 = X_1'y,
\]
which gives
\[
b_1 = (X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\,b_2
    = (X_1'X_1)^{-1}X_1'(y - X_2 b_2).
\]
Plug this into the second block of the normal equations. Then we have
\[
\begin{aligned}
X_2'X_1 b_1 + X_2'X_2 b_2
&= X_2'X_1 (X_1'X_1)^{-1} X_1'y - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2\,b_2 + X_2'X_2\,b_2 \\
&= X_2'X_1 (X_1'X_1)^{-1} X_1'y + X_2' (I - P_{X_1}) X_2\,b_2 \\
&= X_2'y,
\end{aligned}
\]
where P_{X_1} = X_1(X_1'X_1)^{-1}X_1'.
Rearranging gives X_2'(I - P_{X_1})X_2\,b_2 = X_2'(I - P_{X_1})y, and thus
\[
b_2 = \left( X_2'(I - P_{X_1})X_2 \right)^{-1} X_2'(I - P_{X_1})\,y.
\]
In the same manner,
\[
b_1 = \left( X_1'(I - P_{X_2})X_1 \right)^{-1} X_1'(I - P_{X_2})\,y.
\]
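The partitioned-regression formula for b_2 can be checked against the corresponding block of the full OLS estimate. The sketch below uses simulated data (dimensions and coefficients are illustrative assumptions):

```python
import numpy as np

# Simulated data: X1 holds a constant plus one regressor, X2 holds three more.
rng = np.random.default_rng(3)
n, K1, K2 = 150, 2, 3
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, K1 - 1))])
X2 = rng.normal(size=(n, K2))
X = np.column_stack([X1, X2])
y = X @ rng.normal(size=K1 + K2) + rng.normal(size=n)

# Full OLS on [X1, X2].
b = np.linalg.solve(X.T @ X, X.T @ y)

# Partitioned formula: b2 = (X2' M1 X2)^{-1} X2' M1 y with M1 = I - P_{X1}.
P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
M1 = np.eye(n) - P1
b2 = np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1 @ y)

assert np.allclose(b2, b[K1:])  # matches the X2 block of the full estimate
```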
Suppose that
\[
X_1 = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = 1
\quad \text{and} \quad
X_2 = Z \;\; (n \times K_2).
\]
Then
\[
b_2 = \left( Z'(I - P_1)Z \right)^{-1} Z'(I - P_1)\,y.
\]
But
\[
(I - P_1)Z = Z - 1(1'1)^{-1}1'Z,
\]
and
\[
1'1 = n, \qquad
1'Z = \begin{pmatrix} 1 & \cdots & 1 \end{pmatrix}
\begin{pmatrix}
z_{11} & \cdots & z_{1K_2} \\
\vdots &        & \vdots \\
z_{n1} & \cdots & z_{nK_2}
\end{pmatrix}
= \begin{pmatrix} \sum_{i=1}^{n} z_{i1} & \cdots & \sum_{i=1}^{n} z_{iK_2} \end{pmatrix}.
\]
Thus,
\[
(I - P_1)Z = Z - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}
\begin{pmatrix} \bar{z}_1 & \cdots & \bar{z}_{K_2} \end{pmatrix}
= \begin{pmatrix}
z_{11} - \bar{z}_1 & \cdots & z_{1K_2} - \bar{z}_{K_2} \\
z_{21} - \bar{z}_1 & \cdots & z_{2K_2} - \bar{z}_{K_2} \\
\vdots             &        & \vdots \\
z_{n1} - \bar{z}_1 & \cdots & z_{nK_2} - \bar{z}_{K_2}
\end{pmatrix}.
\]
In the same way,
\[
(I - P_1)\,y = \begin{pmatrix} y_1 - \bar{y} \\ \vdots \\ y_n - \bar{y} \end{pmatrix}.
\]
These results show that b_2 is equivalent to the OLS estimator of β in the demeaned regression equation
\[
y_i - \bar{y} = \beta'(z_i - \bar{z}) + \varepsilon_i,
\qquad \bar{z} = (\bar{z}_1, \cdots, \bar{z}_{K_2})'.
\]
Whether we demean the data and run the regression without a constant, or include a constant term in the model and run the regression on the original data, we obtain the same slope estimates. This is a special case of the Frisch–Waugh–Lovell theorem.
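This equivalence is easy to demonstrate numerically. The sketch below (simulated data, chosen for illustration) runs both versions and compares the slope estimates:

```python
import numpy as np

# Simulated data: intercept 2.0 plus three regressors in Z.
rng = np.random.default_rng(4)
n, K2 = 120, 3
Z = rng.normal(size=(n, K2))
y = 2.0 + Z @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)

# Version 1: regression with a constant term.
X = np.column_stack([np.ones(n), Z])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Version 2: demeaned regression without a constant.
Zd = Z - Z.mean(axis=0)
yd = y - y.mean()
b2 = np.linalg.solve(Zd.T @ Zd, Zd.T @ yd)

assert np.allclose(b2, b[1:])  # slope estimates coincide
```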