Ch. 9 Heteroscedasticity

Regression disturbances whose variances are not constant across observations are heteroscedastic. In the heteroscedastic model, we assume that

$$E(\varepsilon\varepsilon') = \sigma^2\Omega = \sigma^2 \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_N \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N^2 \end{bmatrix}.$$

It will sometimes prove useful to write $\sigma_i^2 = \sigma^2\omega_i$. This form is an arbitrary scaling which allows us to use the normalization

$$\mathrm{tr}(\Omega) = \sum_{i=1}^{N} \omega_i = N.$$

(It implies, for example, that $\sigma^2 = \sum_{i=1}^{N} \sigma_i^2 / N$.) This makes the classical regression with homoscedastic disturbances a simple special case with $\omega_i = 1$, $i = 1, 2, \ldots, N$.

Example: See the residuals at Figure 11.1.

1 Ordinary Least Squares Estimation

We showed in Section 8.1 that in the presence of heteroscedasticity, the OLS estimator $\hat\beta$ is unbiased and consistent. However, it is inefficient relative to the GLS estimator.

1.1 Estimating the Appropriate Covariance Matrix for OLS Estimators

If the type of heteroscedasticity is known with certainty, then the OLS estimator is undesirable; we should use GLS instead. The precise form of the heteroscedasticity is usually unknown, however. In that case, GLS is not usable, and we may need to salvage what we can from the results of OLS estimation.

The conventional estimated covariance matrix for the OLS estimator, $\sigma^2(X'X)^{-1}$, is inappropriate; the appropriate matrix is $\sigma^2(X'X)^{-1}X'\Omega X(X'X)^{-1}$. White (1980) has shown that it is still possible to obtain an appropriate covariance estimator for the OLS estimator even when the form of the heteroscedasticity is unknown.

What is actually required is an estimate of

$$\Sigma = \frac{1}{N}\sigma^2 X'\Omega X = \frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 x_i x_i'.$$

White (1980) shows that under very general conditions, the matrix

$$S_0 = \frac{1}{N}\sum_{i=1}^{N} e_i^2 x_i x_i',$$

where $e_i$ is the $i$th least squares residual, is a consistent estimator of $\Sigma$.
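As a minimal sketch of how $S_0$ might be computed, the following NumPy code forms the OLS residuals, builds $S_0$, and substitutes $N S_0$ for $\sigma^2 X'\Omega X = N\Sigma$ in the appropriate covariance matrix given above. The function name `white_covariance`, the simulated data, and the divisor $N - k$ in the conventional $s^2$ are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def white_covariance(X, y):
    """OLS estimate plus conventional and heteroscedasticity-robust covariances.

    X is an (N, k) regressor matrix whose rows are x_i'; y has length N.
    """
    N, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y            # OLS estimator
    e = y - X @ beta_hat                    # least squares residuals e_i
    # S_0 = (1/N) * sum_i e_i^2 x_i x_i'
    S0 = (X * (e ** 2)[:, None]).T @ X / N
    # Replace sigma^2 X' Omega X by N * S_0 inside the sandwich formula
    V_white = N * XtX_inv @ S0 @ XtX_inv
    # Conventional formula s^2 (X'X)^{-1}; the N - k divisor is an assumption
    s2 = e @ e / (N - k)
    V_conv = s2 * XtX_inv
    return beta_hat, V_conv, V_white

# Hypothetical data: the disturbance standard deviation is proportional to x
rng = np.random.default_rng(0)
N = 500
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=N)

beta_hat, V_conv, V_white = white_covariance(X, y)
print("OLS estimate:          ", beta_hat)
print("conventional std. err.:", np.sqrt(np.diag(V_conv)))
print("robust std. err.:      ", np.sqrt(np.diag(V_white)))
```

The two sets of standard errors generally differ when the disturbances are heteroscedastic; only the second set remains appropriate in that case.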
Therefore, the White estimator,

$$\widehat{\mathrm{Var}}(\hat\beta) = N(X'X)^{-1}S_0(X'X)^{-1},$$

can be used as an estimator of the true variance of the OLS estimator.

Inference concerning $\beta$ is still possible by means of the OLS estimator even when the specific structure of $\Omega$ is not specified, because $\hat\beta$ is asymptotically normally distributed. More generally, White shows that the general linear hypothesis $R\beta = q$ can be tested with the statistic

$$(R\hat\beta - q)'\left[R(X'X)^{-1}NS_0(X'X)^{-1}R'\right]^{-1}(R\hat\beta - q) \sim \chi^2_m,$$

which holds asymptotically under the null hypothesis, where $m$ denotes the number of restrictions imposed.

Exercise: Reproduce the results at Table 11.1.

2 Testing for Heteroscedasticity

One can rarely be certain that the disturbances are heteroscedastic, however, nor, unfortunately, what form the heteroscedasticity takes if they are. As such, it is useful to be able to test for homoscedasticity and, if necessary, modify our estimation procedure accordingly.

Most of the tests for heteroscedasticity are based on the following strategy. The OLS estimator is a consistent estimator of $\beta$ even in the presence of heteroscedasticity. As such, the OLS residuals will mimic, albeit imperfectly because of sampling variability, the heteroscedasticity of the true disturbances. Therefore, tests designed to detect heteroscedasticity will, in most cases, be applied to the OLS residuals.

2.1 Nonspecific Tests for Heteroscedasticity

There may be instances when the form of the heteroscedasticity is not known, but it is nevertheless known that the disturbance variance is monotonically related to the size of a known exogenous variable $z$ by which the observations on the dependent variable $y$ can be ordered. One frequently used test in this instance is the Goldfeld-Quandt test.

Perhaps it is also believed that the broader class of heteroscedasticity $\sigma_i^2 = h(z_i'\alpha)$, where $h(\cdot)$ is a general function independent of $i$, is applicable (such as $\sigma_i^2 = z_i'\alpha$, $\sigma_i^2 = (z_i'\alpha)^2$, and $\sigma_i^2 = \exp(z_i'\alpha)$). If so, the Breusch-Pagan test is appropriate.
If nothing is known a priori other than that the heteroscedastic variances are uniformly bounded, White's general test is applicable.

2.1.1 The Goldfeld-Quandt Test

A very popular test for detecting heteroscedasticity that is monotonically related to an exogenous variable by which the observations on the dependent variable can be ordered is the Goldfeld-Quandt (1965) test. The steps of this test are as follows:

1. Order the observations by the values of the variable $z$.

2. Choose $p$ central observations and omit them.

3. Fit separate regressions by OLS to the two groups, provided that $(N - p)/2 > k$.

4. Let $SSE_1$ and $SSE_2$ denote the sums of squared residuals from the group presumed to have the small variance and the group presumed to have the large variance, respectively. Form the statistic

$$F = \frac{SSE_1}{SSE_2} = \frac{e_1'e_1/(N_1 - k)}{e_2'e_2/(N_2 - k)},$$

where the two ratios coincide because the groups are of equal size, $N_1 = N_2 = (N - p)/2$. The statistic is distributed as $F_{N_1 - k,\, N_2 - k}$ under the null hypothesis of homoscedasticity, since $e_1'e_1/\sigma_1^2 \sim \chi^2_{N_1 - k}$ (and likewise for the second group) and, by the null assumption, $\sigma^2 = \sigma_1^2 = \sigma_2^2$.

2.1.2 The Breusch-Pagan Test

The Goldfeld-Quandt test has been found to be reasonably powerful when we are able to identify correctly the variable to use in the sample separation. This requirement does limit its generality, however. Breusch and Pagan (1979) assume a broader class of heteroscedasticity defined by

$$\sigma_i^2 = \sigma^2 h(\alpha_0 + z_i'\alpha_1),$$

where $z_i$ is a $(p \times 1)$ vector of exogenous variables. This model is homoscedastic if $\alpha_1 = 0$. Breusch and Pagan consider the general estimation equation

$$\frac{\hat e_i^2}{\hat\sigma^2} = \alpha_0 + z_i'\alpha_1 + v_i,$$

where $\hat e_i$ represents the $i$th OLS residual and $\hat\sigma^2 = \sum_{i=1}^{N} \hat e_i^2 / N$. The null hypothesis $\alpha_1 = 0$ can be tested if the $\varepsilon_i$ are normally distributed. Let SSR denote the regression sum of squares obtained in an OLS estimation of this equation. (Let $y_i = \hat e_i^2/\hat\sigma^2$, $\bar y = \sum_{i=1}^{N} y_i / N$, and $\hat y_i = \hat\alpha_0 + z_i'\hat\alpha_1$. Then $SSR = \sum_{i=1}^{N} (\hat y_i - \bar y)^2$.)
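The auxiliary regression just described can be sketched in NumPy as follows. The code computes SSR as defined in the parenthetical remark and returns $SSR/2$, the quantity that the Breusch-Pagan test compares with a chi-squared distribution with $p$ degrees of freedom. The function name `breusch_pagan_statistic` and the simulated data are hypothetical, introduced only for illustration.

```python
import numpy as np

def breusch_pagan_statistic(X, y, Z):
    """Half the regression sum of squares from the Breusch-Pagan regression.

    X: (N, k) regressors including the constant.
    Z: (N,) or (N, p) variables thought to drive the variance, no constant.
    """
    N = X.shape[0]
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_hat                     # OLS residuals
    sigma2_hat = e @ e / N                   # sum of e_i^2 divided by N
    g = e ** 2 / sigma2_hat                  # dependent variable e_i^2 / sigma2_hat
    W = np.column_stack([np.ones(N), Z])     # constant plus z_i
    a_hat, *_ = np.linalg.lstsq(W, g, rcond=None)
    g_fit = W @ a_hat                        # fitted values
    ssr = np.sum((g_fit - g.mean()) ** 2)    # regression sum of squares
    return 0.5 * ssr

# Hypothetical data: the disturbance standard deviation is proportional to x
rng = np.random.default_rng(1)
N = 500
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=N)

bp = breusch_pagan_statistic(X, y, x)
print("SSR / 2 =", bp)
```

With a single variable in $z_i$ the statistic has one degree of freedom, so values above roughly 3.84 would reject homoscedasticity at the 5% level.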
Breusch and Pagan show that, if $\alpha_1 = 0$, then asymptotically

$$\tfrac{1}{2}SSR \sim \chi^2_p.$$

2.1.3 White's General Test

White addresses the case where nothing is known about the structure of the heteroscedasticity other than that the heteroscedastic variances $\sigma_i^2$ are uniformly bounded. It would be desirable to be able to test a general hypothesis of the form

$$H_0: \sigma_i^2 = \sigma^2 \text{ for all } i; \qquad H_1: \text{not } H_0.$$

If there is no heteroscedasticity (under $H_0$), then $s^2(X'X)^{-1}$ gives a consistent estimator of the variance of $\hat\beta$, whereas if there is, it does not (see Ch. 8, sec. 1). White derives a test for heteroscedasticity which consists of comparing the elements of $NS_0$ $(= \sum_{i=1}^{N} e_i^2 x_i x_i')$ and $s^2(X'X)$ $(= s^2\sum_{i=1}^{N} x_i x_i')$, thus indicating whether or not the usual OLS formula $s^2(X'X)^{-1}$ is a consistent covariance estimator. Large discrepancies between $NS_0$ and $s^2(X'X)$ support the contention of heteroscedasticity, while small discrepancies support homoscedasticity.

A simple operational version of this test is carried out by obtaining $NR^2$ in the regression of $e_i^2$ on a constant and all unique variables in $x \otimes x$. This statistic is asymptotically distributed as $\chi^2_{p-1}$, where $p$ is the number of regressors in this auxiliary regression, including the constant.

Exercise: Reproduce the results of Example 11.3 at p.224.

3 Weighted Least Squares When $\Omega$ is Known

Having tested for and found evidence of heteroscedasticity, the logical next step is to revise the estimation technique to account for it. The GLS estimator is

$$\tilde\beta = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y.$$

Consider the most general case, $\sigma_i^2 = \sigma^2\omega_i$. Then $\Omega^{-1}$ is a diagonal matrix whose $i$th diagonal element is $1/\omega_i$. The GLS estimator is obtained by regressing

$$PY = \begin{bmatrix} y_1/\sqrt{\omega_1} \\ y_2/\sqrt{\omega_2} \\ \vdots \\ y_N/\sqrt{\omega_N} \end{bmatrix} \quad\text{on}\quad PX = \begin{bmatrix} x_1'/\sqrt{\omega_1} \\ x_2'/\sqrt{\omega_2} \\ \vdots \\ x_N'/\sqrt{\omega_N} \end{bmatrix}.$$

Applying OLS to the transformed model, we obtain the weighted least squares (WLS) estimator,

$$\tilde\beta = \left[\sum_{i=1}^{N} w_i x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} w_i x_i y_i\right],$$

where $w_i = 1/\omega_i$.
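A short NumPy sketch may help connect the two descriptions of WLS above: the weighted-sum formula and OLS on the transformed data $PY$, $PX$. The function names and the simulated data (with $\omega_i = x_i^2$) are hypothetical and serve only as an illustration under the assumption that the $\omega_i$ are known.

```python
import numpy as np

def wls_weighted_sums(X, y, omega):
    """WLS via [sum w_i x_i x_i']^{-1} [sum w_i x_i y_i] with w_i = 1/omega_i."""
    w = 1.0 / omega
    A = (X * w[:, None]).T @ X      # sum of w_i x_i x_i'
    b = (X * w[:, None]).T @ y      # sum of w_i x_i y_i
    return np.linalg.solve(A, b)

def wls_transformed(X, y, omega):
    """WLS via OLS after dividing each observation by sqrt(omega_i)."""
    s = np.sqrt(omega)
    PX = X / s[:, None]
    Py = y / s
    beta, *_ = np.linalg.lstsq(PX, Py, rcond=None)
    return beta

# Hypothetical data with known omega_i = x_i^2
rng = np.random.default_rng(2)
N = 200
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])
omega = x ** 2
y = 1.0 + 2.0 * x + np.sqrt(omega) * rng.normal(size=N)

b_sums = wls_weighted_sums(X, y, omega)
b_trans = wls_transformed(X, y, omega)
# Rescaling omega so that its elements sum to N leaves the estimate unchanged
b_scaled = wls_weighted_sums(X, y, omega * N / omega.sum())
print(b_sums, b_trans, b_scaled)
```

The three results agree up to rounding, which reflects that only the relative sizes of the $\omega_i$ matter for the estimate.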
The logic of the computation is that observations with smaller variances receive a larger weight in the computation of the sums and therefore have greater influence on the estimate obtained.

A common specification is that the variance is proportional to one of the regressors or its square. If

$$\sigma_i^2 = \sigma^2 x_{ik}^2,$$

then the transformed regression model for GLS is

$$\frac{y}{x_k} = \beta_k + \beta_1\frac{x_1}{x_k} + \beta_2\frac{x_2}{x_k} + \cdots + \frac{\varepsilon}{x_k}.$$

If the variance is proportional to $x_k$ instead of $x_k^2$, then the weight applied to each observation is $1/\sqrt{x_k}$ instead of $1/x_k$.

4 Estimation When $\Omega$ Contains Unknown Parameters

The general form of the heteroscedastic regression model has too many parameters to estimate by ordinary methods. Typically, the model is restricted by formulating $\sigma^2\Omega$ as a function of a few parameters, such as $\sigma_i^2 = \sigma^2 x_i^{\alpha}$ or $\sigma_i^2 = \sigma^2[x_i'\alpha]^2$. Write this as $\Omega(\alpha)$. FGLS based on a consistent estimator of $\Omega(\alpha)$ is asymptotically equivalent to GLS. The new problem is that we must first find consistent estimators of the unknown parameters in $\Omega(\alpha)$. Two methods are typically used: two-step GLS and maximum likelihood.

4.1 Two-Step Estimation

For the heteroscedastic model, the GLS estimator is

$$\tilde\beta = \left[\sum_{i=1}^{N} \frac{1}{\sigma_i^2} x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} \frac{1}{\sigma_i^2} x_i y_i\right].$$

The two-step estimators are computed by first obtaining estimates $\hat\sigma_i^2$, usually using some function of the OLS residuals; the FGLS estimator is then

$$\hat{\hat\beta} = \left[\sum_{i=1}^{N} \frac{1}{\hat\sigma_i^2} x_i x_i'\right]^{-1}\left[\sum_{i=1}^{N} \frac{1}{\hat\sigma_i^2} x_i y_i\right].$$

Let

$$\varepsilon_i^2 = \sigma_i^2 + v_i,$$

where $v_i$ is just the difference between the random variable $\varepsilon_i^2$ and its expectation. Since $\varepsilon_i$ is unobservable, we would use the OLS residual, for which

$$e_i = \varepsilon_i - x_i'(\hat\beta - \beta) = \varepsilon_i + u_i.$$

But in large samples, as $\hat\beta \xrightarrow{p} \beta$, terms in $u_i$ become negligible, so that, at least approximately,

$$e_i^2 = \sigma_i^2 + v_i^*.$$

The procedure suggested is to treat the variance function as a regression and use the squares of the OLS residuals as the dependent variable.
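The two-step procedure can be sketched in NumPy as below, assuming purely for illustration a variance function that is linear in a single variable $z_i$. The function name `two_step_fgls`, the simulated data, and the small positive floor on the fitted variances are assumptions added here; the floor is a practical safeguard because a linear variance regression can produce non-positive fitted values.

```python
import numpy as np

def two_step_fgls(X, y, Z, floor=1e-8):
    """Two-step FGLS with a linear variance function (illustrative assumption)."""
    N = X.shape[0]
    # Step 1: OLS and its residuals
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_ols
    # Variance regression: squared residuals on a constant and z_i
    W = np.column_stack([np.ones(N), Z])
    alpha_hat, *_ = np.linalg.lstsq(W, e ** 2, rcond=None)
    # Fitted variances, floored to stay positive (safeguard, not in the notes)
    sigma2_hat = np.clip(W @ alpha_hat, floor, None)
    # Step 2: weighted least squares with weights 1 / sigma2_hat
    w = 1.0 / sigma2_hat
    A = (X * w[:, None]).T @ X
    b = (X * w[:, None]).T @ y
    beta_fgls = np.linalg.solve(A, b)
    return beta_ols, alpha_hat, beta_fgls

# Hypothetical data: sigma_i^2 = 1 + 2 z_i, with z_i also used as the regressor
rng = np.random.default_rng(3)
N = 1000
z = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), z])
y = 1.0 + 2.0 * z + np.sqrt(1.0 + 2.0 * z) * rng.normal(size=N)

beta_ols, alpha_hat, beta_fgls = two_step_fgls(X, y, z)
print("OLS: ", beta_ols)
print("FGLS:", beta_fgls)
```

Iterating the estimator would amount to recomputing the residuals from `beta_fgls` and repeating the variance regression and the weighted step.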
For example, if $\sigma_i^2 = \alpha_0 + z_i'\alpha_1$, then a consistent estimator of $\alpha$ is OLS in the model

$$e_i^2 = \alpha_0 + z_i'\alpha_1 + v_i^*.$$

In this model, $v_i^*$ is both heteroscedastic and autocorrelated, so $\hat\alpha$ is consistent but inefficient. But consistency is all that is required for asymptotically efficient estimation of $\beta$ using $\Omega(\hat\alpha)$.

The two-step estimator may be iterated by recomputing the residuals after computing the FGLS estimate and then reentering the computation (OLS $\hat\beta \to e \to \hat\alpha \to$ FGLS estimate $\hat{\hat\beta} \to e \to \cdots$).

Exercise: Reproduce the results at Table 11.2 on p.231.

4.2 Maximum Likelihood Estimation

5 Autoregressive Conditional Heteroscedasticity (ARCH)