Ch. 9 Heteroscedasticity
Regression disturbances whose variances are not constant across observations
are heteroscedastic. In the heteroscedastic model, we assume that
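\[
E(\varepsilon\varepsilon') = \sigma^2\Omega = \sigma^2
\begin{bmatrix}
\omega_1 & 0 & \cdots & 0 \\
0 & \omega_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \omega_N
\end{bmatrix}
=
\begin{bmatrix}
\sigma_1^2 & 0 & \cdots & 0 \\
0 & \sigma_2^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_N^2
\end{bmatrix}.
\]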
It will sometimes prove useful to write $\sigma_i^2 = \sigma^2\omega_i$. This form is an arbitrary
scaling which allows us to use the normalization
\[
\operatorname{tr}(\Omega) = \sum_{i=1}^{N} \omega_i = N.
\]
(For example, $\sigma^2 = \sum_{i=1}^{N} \sigma_i^2 / N$.) This makes the classical regression with
homoscedastic disturbances a simple special case with $\omega_i = 1$, $i = 1, 2, \ldots, N$.
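The normalization can be checked numerically. The following is a minimal sketch assuming NumPy; the variance values are made up for illustration and do not come from the text.

```python
import numpy as np

# Hypothetical observation-specific variances sigma_i^2 (illustrative values only).
sigma2_i = np.array([0.5, 1.0, 1.5, 4.0])
N = sigma2_i.size

# Normalization from the text: sigma^2 is the average of the sigma_i^2,
# and omega_i = sigma_i^2 / sigma^2.
sigma2 = sigma2_i.mean()
omega = sigma2_i / sigma2

# tr(Omega) is the sum of the omega_i, which equals N by construction.
trace_omega = omega.sum()
```

With equal variances every `omega` entry would be 1, which is the homoscedastic special case.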
Example:
See the residuals in Figure 11.1.
1 Ordinary Least Squares Estimation
We showed in Section 8.1 that in the presence of heteroscedasticity, the OLS
estimator $\hat\beta$ is unbiased and consistent. However, it is inefficient relative to the
GLS estimator.
1.1 Estimating the Appropriate Covariance Matrix for
OLS Estimators
If the type of heteroscedasticity is known with certainty, then the OLS estimator
is undesirable; we should use the GLS instead. The precise form of the het-
eroscedasticity is usually unknown, however. In that case, GLS is not usable,
and we may need to salvage what we can from the results of OLS estimators.
The conventional estimated covariance matrix for the OLS estimator, $\sigma^2(X'X)^{-1}$,
is inappropriate; the appropriate matrix is $\sigma^2(X'X)^{-1}X'\Omega X(X'X)^{-1}$. White
(1980) has shown that it is still possible to obtain an appropriate covariance
estimator for the OLS estimator even when the form of heteroscedasticity is unknown.
What is actually required is an estimate of
\[
\Sigma = \frac{1}{N}\sigma^2 X'\Omega X = \frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 x_i x_i'.
\]
White (1980) shows that under very general conditions, the matrix
\[
S_0 = \frac{1}{N}\sum_{i=1}^{N} e_i^2 x_i x_i',
\]
where $e_i$ is the $i$th least squares residual, is a consistent estimator of $\Sigma$.
Therefore, the White estimator,
\[
\widehat{\operatorname{Var}}(\hat\beta) = N(X'X)^{-1}S_0(X'X)^{-1},
\]
can be used as an estimator of the true variance of the OLS estimator. Inference
concerning $\beta$ is still possible by means of the OLS estimator even when the specific
structure of $\Omega$ is not specified, as $\hat\beta$ is asymptotically normally distributed. More
generally, White shows that for tests of the general linear hypothesis $R\beta = q$, under
the null hypothesis, the statistic
\[
(R\hat\beta - q)'\left[R(X'X)^{-1}NS_0(X'X)^{-1}R'\right]^{-1}(R\hat\beta - q) \sim \chi^2_m
\]
asymptotically, where $m$ denotes the number of restrictions imposed.
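The White estimator and the associated Wald statistic can be computed in a few lines. The following is a minimal sketch assuming NumPy; the function names `white_covariance` and `wald_statistic` and the simulated data are illustrative assumptions, not part of the text.

```python
import numpy as np

def white_covariance(X, y):
    """OLS coefficients with conventional and White covariance estimates."""
    N, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    e = y - X @ beta_hat                      # OLS residuals
    s2 = e @ e / (N - k)
    V_conv = s2 * XtX_inv                     # s^2 (X'X)^{-1}
    S0 = (X * e[:, None] ** 2).T @ X / N      # (1/N) sum e_i^2 x_i x_i'
    V_white = N * XtX_inv @ S0 @ XtX_inv      # N (X'X)^{-1} S0 (X'X)^{-1}
    return beta_hat, V_conv, V_white

def wald_statistic(R, q, beta_hat, V):
    """Wald statistic (R b - q)' [R V R']^{-1} (R b - q)."""
    d = R @ beta_hat - q
    return float(d @ np.linalg.solve(R @ V @ R.T, d))

# Simulated data (assumption): disturbance standard deviation proportional to x.
rng = np.random.default_rng(0)
N = 200
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N) * x

beta_hat, V_conv, V_white = white_covariance(X, y)

# Test the single restriction "slope = 2" (m = 1) with the robust covariance.
R = np.array([[0.0, 1.0]])
q = np.array([2.0])
W = wald_statistic(R, q, beta_hat, V_white)
```

`W` would be compared with a chi-squared critical value with one degree of freedom, since one restriction is imposed here.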
Exercise:
Reproduce the results in Table 11.1.
2 Testing for Heteroscedasticity
One can rarely be certain that the disturbances are heteroscedastic, however, nor,
unfortunately, what form the heteroscedasticity takes if they are. As such, it is
useful to be able to test for homoscedasticity and, if necessary, modify our
estimation procedure accordingly.
Most of the tests for heteroscedasticity are based on the following strategy. The OLS
estimator is a consistent estimator of $\beta$ even in the presence of heteroscedasticity.
As such, the OLS residuals will mimic, albeit imperfectly because of sampling
variability, the heteroscedasticity of the true disturbances. Therefore, tests
designed to detect heteroscedasticity will, in most cases, be applied to the OLS
residuals.
2.1 Nonspecific Tests for Heteroscedasticity
There may be instances when the form of the heteroscedasticity is not known, but
it is nevertheless known that the disturbance variance is monotonically related
to the size of a known exogenous variable z by which observations on the dependent
variable y can be ordered. One frequently used test in this instance is the
Goldfeld-Quandt test.
Perhaps it is also believed that the broader class of heteroscedasticity
$\sigma_i^2 = h(z_i'\alpha)$, where $h(\cdot)$ is a general function independent of $i$, is applicable
(such as $\sigma_i^2 = z_i'\alpha$, $\sigma_i^2 = (z_i'\alpha)^2$ and $\sigma_i^2 = \exp(z_i'\alpha)$). If so, the Breusch-Pagan
test is appropriate. If nothing is known a priori other than that the heteroscedastic
variances are uniformly bounded, White's general test is applicable.
2.1.1 The Goldfeld-Quandt Test
A very popular test for the presence of heteroscedasticity that is
monotonically related to an exogenous variable by which observations on the
dependent variable can be ordered is the Goldfeld-Quandt (1965) test. The steps
of this test are as follows:
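1. Order the observations by the values of the variable z.
2. Choose p central observations and omit them.
3. Fit separate regressions by OLS to the two remaining groups, provided $(N - p)/2 > k$.
4. Let $SSE_1$ and $SSE_2$ denote the sums of squared residuals from the small-variance
group (as you suppose it to be) and the large-variance group, respectively.
Form the statistic
\[
F = \frac{SSE_1/(N_1 - k)}{SSE_2/(N_2 - k)} = \frac{e_1'e_1/(N_1 - k)}{e_2'e_2/(N_2 - k)},
\]
which reduces to $SSE_1/SSE_2$ when $N_1 = N_2$ and which is distributed as $F_{N_1-k,\,N_2-k}$ under
the null hypothesis of homoscedasticity with normally distributed disturbances,
since $e_1'e_1 \sim \sigma^2\chi^2_{N_1-k}$ and $e_2'e_2 \sim \sigma^2\chi^2_{N_2-k}$ independently, by the null assumption
that $\sigma^2 = \sigma_1^2 = \sigma_2^2$. Under the alternative, F tends to fall below 1, so the null is
rejected for small values of F (equivalently, for large values of 1/F).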
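The four steps above can be sketched in code. This is a minimal illustration assuming NumPy; the function name `goldfeld_quandt`, the choice of how to split an odd remainder, and the simulated data are assumptions made for the example.

```python
import numpy as np

def sse(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def goldfeld_quandt(X, y, z, p):
    """Return (F, df1, df2) with the low-z group in the numerator."""
    order = np.argsort(z)                 # step 1: order by z
    X, y = X[order], y[order]
    N, k = X.shape
    n1 = (N - p) // 2                     # step 2: drop p central observations
    n2 = N - p - n1
    if min(n1, n2) <= k:
        raise ValueError("each group needs more than k observations")
    sse1 = sse(X[:n1], y[:n1])            # step 3: separate OLS fits
    sse2 = sse(X[N - n2:], y[N - n2:])
    F = (sse1 / (n1 - k)) / (sse2 / (n2 - k))   # step 4
    return F, n1 - k, n2 - k

# Simulated data (assumption): disturbance standard deviation proportional to z.
rng = np.random.default_rng(1)
N = 120
z = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), z])
y = 1.0 + 2.0 * z + rng.normal(size=N) * z

F, df1, df2 = goldfeld_quandt(X, y, z, p=20)
```

Because the variance increases with `z` in this simulation, `F` should come out well below 1; it would be compared with the lower tail of the F distribution with (`df1`, `df2`) degrees of freedom.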
2.1.2 The Breusch-Pagan Test
The Goldfeld-Quandt test has been found to be reasonably powerful when we
are able to identify correctly the variable to use in the sample separation. This
requirement does limit its generality, however. Breusch and Pagan (1979) assume a
broader class of heteroscedasticity defined by
\[
\sigma_i^2 = \sigma^2 h(\alpha_0 + z_i'\alpha_1),
\]
where $z_i$ is a $(p \times 1)$ vector of exogenous variables. This model is homoscedastic
if $\alpha_1 = 0$. Breusch and Pagan consider the general estimation equation
\[
\frac{\hat e_i^2}{\hat\sigma^2} = \alpha_0 + z_i'\alpha_1 + v_i,
\]
where $\hat e_i$ represents the $i$th OLS residual and $\hat\sigma^2 = \sum_{i=1}^{N} \hat e_i^2 / N$. The null
hypothesis $\alpha_1 = 0$ can be tested if the $\varepsilon_i$ are normally distributed. Let SSR denote
the explained (regression) sum of squares obtained in an OLS estimation of this equation.
(Let $y_i = \hat e_i^2/\hat\sigma^2$, $\bar y = \sum_{i=1}^{N} y_i/N$, and $\hat y_i = \hat\alpha_0 + z_i'\hat\alpha_1$. Then $SSR = \sum_{i=1}^{N}(\hat y_i - \bar y)^2$.)
Breusch and Pagan show that, if $\alpha_1 = 0$, then asymptotically
\[
\tfrac{1}{2}\,SSR \sim \chi^2_p.
\]
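The Breusch-Pagan statistic described above can be sketched as follows, assuming NumPy. The function name `breusch_pagan` and the simulated data are illustrative assumptions; `Z` holds the p variance regressors without a constant.

```python
import numpy as np

def breusch_pagan(X, y, Z):
    """Return (LM, p), where LM = SSR / 2 from the auxiliary regression."""
    N = X.shape[0]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                          # OLS residuals
    sigma2_hat = e @ e / N                    # sum of e_i^2 divided by N
    g = e ** 2 / sigma2_hat                   # dependent variable e_i^2 / sigma2_hat
    Zc = np.column_stack([np.ones(N), Z])     # constant plus z_i
    alpha, *_ = np.linalg.lstsq(Zc, g, rcond=None)
    g_fit = Zc @ alpha
    ssr = float(np.sum((g_fit - g.mean()) ** 2))   # explained sum of squares
    return 0.5 * ssr, Z.shape[1]

# Simulated data (assumption): disturbance standard deviation proportional to x.
rng = np.random.default_rng(2)
N = 200
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 2.0 * x + rng.normal(size=N) * x
Z = x[:, None]                                # one variance regressor, p = 1

lm, p = breusch_pagan(X, y, Z)
```

`lm` would be compared with a chi-squared critical value with `p` degrees of freedom. Because the squared residuals are divided by `sigma2_hat`, the statistic does not change when `y` is rescaled.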
2.1.3 White’s General Test
White addresses the case where nothing is known about the structure of the
heteroscedasticity other than that the heteroscedastic variances $\sigma_i^2$ are uniformly bounded.
It would be desirable to be able to test a general hypothesis of the form
\[
H_0: \sigma_i^2 = \sigma^2 \text{ for all } i, \qquad H_1: \text{not } H_0.
\]
If there is no heteroscedasticity (under $H_0$), then $s^2(X'X)^{-1}$ gives a consistent
estimator of the variance of $\hat\beta$, whereas if there is, it does not (see Ch. 8, Sec. 1). White
derives a test for heteroscedasticity which consists of comparing the elements of
$NS_0$ ($= \sum_{i=1}^{N} e_i^2 x_i x_i'$) and $s^2(X'X)$ ($= s^2\sum_{i=1}^{N} x_i x_i'$), thus indicating whether or
not the usual OLS formula $s^2(X'X)^{-1}$ is a consistent covariance estimator. Large
discrepancies between $NS_0$ and $s^2(X'X)$ support the contention of heteroscedasticity,
while small discrepancies support homoscedasticity.
A simple operational version of this test is carried out by obtaining $NR^2$ in
the regression of $e_i^2$ on a constant and all unique variables in $x \otimes x$. This statistic
is asymptotically distributed as $\chi^2_{p-1}$, where $p$ is the number of regressors in the
auxiliary regression, including the constant.
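The operational version of White's test can be sketched as follows, assuming NumPy and assuming the first column of `X` is the constant. The function name `white_test` and the simulated data are illustrative. The degrees of freedom are taken here as the number of linearly independent auxiliary regressors excluding the constant, which is the usual convention for this statistic.

```python
import numpy as np
from itertools import combinations_with_replacement

def white_test(X, y):
    """Return (N * R^2, df) from regressing e_i^2 on the products in x (x) x."""
    N, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2                  # squared OLS residuals
    # All products x_j * x_l with j <= l. Because column 0 is the constant,
    # this set contains the constant, the levels, squares and cross products.
    W = np.column_stack([X[:, j] * X[:, l]
                         for j, l in combinations_with_replacement(range(k), 2)])
    gamma, *_ = np.linalg.lstsq(W, e2, rcond=None)
    resid = e2 - W @ gamma
    r2 = 1.0 - float(resid @ resid) / float(np.sum((e2 - e2.mean()) ** 2))
    df = int(np.linalg.matrix_rank(W)) - 1    # redundant columns are not counted
    return N * r2, df, r2

# Simulated data (assumption): disturbance standard deviation proportional to x.
rng = np.random.default_rng(3)
N = 200
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 2.0 * x + rng.normal(size=N) * x

stat, df, r2 = white_test(X, y)
```

With a constant and one regressor the auxiliary regression uses 1, x and x squared, so `df` is 2 in this example.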
Exercise:
Reproduce the results of Example 11.3 on p. 224.
3 Weighted Least Squares When $\Omega$ is Known
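Having tested for and found evidence of heteroscedasticity, the logical next step is to
revise the estimation technique to account for it. The GLS estimator is
\[
\tilde\beta = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y.
\]
Consider the most general case, $\sigma_i^2 = \sigma^2\omega_i$. Then $\Omega^{-1}$ is a diagonal matrix
whose $i$th diagonal element is $1/\omega_i$. The GLS estimator is obtained by regressing
\[
PY =
\begin{bmatrix}
y_1/\sqrt{\omega_1} \\
y_2/\sqrt{\omega_2} \\
\vdots \\
y_N/\sqrt{\omega_N}
\end{bmatrix}
\quad\text{on}\quad
PX =
\begin{bmatrix}
x_1'/\sqrt{\omega_1} \\
x_2'/\sqrt{\omega_2} \\
\vdots \\
x_N'/\sqrt{\omega_N}
\end{bmatrix}.
\]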
Applying OLS to the transformed model, we obtain the weighted least squares
(WLS) estimator,
\[
\tilde\beta = \left[\sum_{i=1}^{N} w_i x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} w_i x_i y_i\right],
\]
where $w_i = 1/\omega_i$. The logic of the computation is that observations with smaller
variances receive a larger weight in the computation of the sums and therefore
have greater influence on the estimate obtained.
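The equivalence between the weighted-sums formula and OLS on the transformed data can be checked numerically. The following is a minimal sketch assuming NumPy; the function names and the simulated data, including the choice of $\omega_i$ proportional to $x_i^2$, are illustrative assumptions.

```python
import numpy as np

def wls(X, y, omega):
    """WLS via the weighted sums, with weights w_i = 1 / omega_i."""
    w = 1.0 / omega
    A = (X * w[:, None]).T @ X        # sum of w_i x_i x_i'
    b = (X * w[:, None]).T @ y        # sum of w_i x_i y_i
    return np.linalg.solve(A, b)

def ols_on_transformed(X, y, omega):
    """OLS after dividing each row of X and y by sqrt(omega_i)."""
    s = 1.0 / np.sqrt(omega)
    beta, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return beta

# Simulated data (assumption): omega_i proportional to x_i^2, scaled to sum to N.
rng = np.random.default_rng(4)
N = 100
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])
omega = x ** 2 / np.mean(x ** 2)
y = 1.0 + 2.0 * x + rng.normal(size=N) * np.sqrt(omega)

beta_wls = wls(X, y, omega)
beta_transformed = ols_on_transformed(X, y, omega)
```

The two coefficient vectors agree up to floating-point error, which is the sense in which GLS here is OLS on the transformed model.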
A common specification is that the variance is proportional to one of the
regressors or its square. If
\[
\sigma_i^2 = \sigma^2 x_{ik}^2,
\]
then the transformed regression model for GLS is
\[
\frac{y}{x_k} = \beta_k + \beta_1\frac{x_1}{x_k} + \beta_2\frac{x_2}{x_k} + \cdots + \frac{\varepsilon}{x_k}.
\]
If the variance is proportional to $x_k$ instead of $x_k^2$, then the weight applied to each
observation is $1/\sqrt{x_k}$ instead of $1/x_k$.
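A short check of why these weights work, treating $x_{ik}$ as fixed and assuming it is nonzero (positive in the second case): the transformed disturbance has constant variance in both specifications.

```latex
\begin{align*}
\text{If } \sigma_i^2 = \sigma^2 x_{ik}^2:\quad
  \operatorname{Var}\!\left(\frac{\varepsilon_i}{x_{ik}}\right)
  &= \frac{\sigma_i^2}{x_{ik}^2}
   = \frac{\sigma^2 x_{ik}^2}{x_{ik}^2}
   = \sigma^2, \\
\text{If } \sigma_i^2 = \sigma^2 x_{ik}:\quad
  \operatorname{Var}\!\left(\frac{\varepsilon_i}{\sqrt{x_{ik}}}\right)
  &= \frac{\sigma_i^2}{x_{ik}}
   = \frac{\sigma^2 x_{ik}}{x_{ik}}
   = \sigma^2.
\end{align*}
```

In the first case the coefficient on $x_k$ becomes the constant term of the transformed regression, and the original constant becomes the coefficient on $1/x_k$.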
4 Estimation When $\Omega$ Contains Unknown Parameters
The general form of the heteroscedastic regression model has too many parameters
to estimate by ordinary methods. Typically, the model is restricted by formulating
$\sigma^2\Omega$ as a function of a few parameters, such as $\sigma_i^2 = \sigma^2 x_i^{\alpha}$ or $\sigma_i^2 = \sigma^2[x_i'\alpha]^2$.
Write this as $\Omega(\alpha)$. FGLS based on a consistent estimator of $\Omega(\alpha)$ is asymptotically
equivalent to GLS. The new problem is that we must first find consistent
estimators of the unknown parameters in $\Omega(\alpha)$. Two methods are typically used:
two-step GLS and maximum likelihood.
4.1 Two-Step Estimation
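For the heteroscedastic model, the GLS estimator is
\[
\tilde\beta = \left[\sum_{i=1}^{N} \frac{1}{\sigma_i^2} x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} \frac{1}{\sigma_i^2} x_i y_i\right].
\]
The two-step estimators are computed by first obtaining estimates $\hat\sigma_i^2$, usually
using some function of the OLS residuals. The FGLS estimator is then
\[
\check\beta = \left[\sum_{i=1}^{N} \frac{1}{\hat\sigma_i^2} x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} \frac{1}{\hat\sigma_i^2} x_i y_i\right].
\]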
Let
\[
\varepsilon_i^2 = \sigma_i^2 + v_i,
\]
where $v_i$ is just the difference between the random variable $\varepsilon_i^2$ and its expectation.
Since $\varepsilon_i$ is unobservable, we would use the OLS residual, for which
\[
e_i = \varepsilon_i - x_i'(\hat\beta - \beta) = \varepsilon_i + u_i.
\]
But in large samples, as $\hat\beta \xrightarrow{p} \beta$, terms in $u_i$ will become negligible, so that at
least approximately,
\[
e_i^2 = \sigma_i^2 + v_i^*.
\]
The procedure suggested is to treat the variance function as a regression and
use the squares of the OLS residuals as the dependent variable. For example, if
$\sigma_i^2 = \alpha_0 + z_i'\alpha_1$, then a consistent estimator of $\alpha$ will be the OLS estimator in the model
\[
e_i^2 = \alpha_0 + z_i'\alpha_1 + v_i^*.
\]
In this model, $v_i^*$ is both heteroscedastic and autocorrelated, so $\hat\alpha$ is consistent
but inefficient. But consistency is all that is required for asymptotically efficient
estimation of $\beta$ using $\Omega(\hat\alpha)$.
The two-step estimator may be iterated by recomputing the residuals after
computing the FGLS estimate and then reentering the computation
(OLS $\hat\beta \to e \to \hat\alpha \to \check\beta \to e \to \cdots$, where $\check\beta$ denotes the FGLS estimate).
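The two-step procedure and its iteration can be sketched as follows for the linear variance function $\sigma_i^2 = \alpha_0 + z_i'\alpha_1$, assuming NumPy. The function name `fgls`, the `n_steps` parameter and the simulated data are illustrative assumptions. The floor on the fitted variances is a safeguard added for this sketch, since a linear regression can produce non-positive fitted values; it is not part of the procedure in the text.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls(X, y, Z, n_steps=1, floor=1e-8):
    """Two-step FGLS (n_steps=1) or iterated FGLS (n_steps > 1)."""
    N = X.shape[0]
    Zc = np.column_stack([np.ones(N), Z])
    beta = ols(X, y)                              # start from OLS
    for _ in range(n_steps):
        e2 = (y - X @ beta) ** 2                  # squared residuals
        alpha = ols(Zc, e2)                       # variance regression
        sigma2_hat = np.clip(Zc @ alpha, floor, None)   # safeguard (assumption)
        w = 1.0 / sigma2_hat
        A = (X * w[:, None]).T @ X                # sum of x_i x_i' / sigma2_hat_i
        b = (X * w[:, None]).T @ y                # sum of x_i y_i / sigma2_hat_i
        beta = np.linalg.solve(A, b)              # FGLS estimate
    return beta, alpha

# Simulated data (assumption): sigma_i^2 = x_i^2, so z_i = x_i^2 fits the linear form.
rng = np.random.default_rng(5)
N = 500
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 2.0 * x + rng.normal(size=N) * x
Z = (x ** 2)[:, None]

beta_two_step, alpha_hat = fgls(X, y, Z, n_steps=1)
beta_iterated, _ = fgls(X, y, Z, n_steps=5)
```

Setting `n_steps` above 1 reenters the loop with residuals computed from the latest FGLS estimate, which is the iteration described above.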
Exercise:
Reproduce the results in Table 11.2 on p. 231.
4.2 Maximum Likelihood Estimation
5 Autoregressive Conditional Heteroscedasticity (ARCH)