Ch. 20 Processes with Deterministic Trends

1 Traditional Asymptotic Results of OLS

Suppose a linear regression model with stochastic regressors is given by
$$Y_t=\mathbf{x}_t'\boldsymbol\beta+\varepsilon_t,\qquad t=1,2,\dots,T,\qquad \mathbf{x}_t,\ \boldsymbol\beta\in\mathbb{R}^k,\tag{1}$$
or, in matrix form, $\mathbf{y}=\mathbf{X}\boldsymbol\beta+\boldsymbol\varepsilon$. We are interested in the asymptotic properties, such as consistency and the limiting distribution, of the OLS estimator $\hat{\boldsymbol\beta}=(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$ as $T\to\infty$, under simple traditional assumptions.

1.1 Independent Identically Distributed Observations

1.1.1 Consistency

To prove consistency of $\hat{\boldsymbol\beta}$, we use Kolmogorov's law of large numbers from Ch. 4. Rewriting
$$\hat{\boldsymbol\beta}-\boldsymbol\beta=(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol\varepsilon
=\left(\frac{\mathbf{X}'\mathbf{X}}{T}\right)^{-1}\frac{\mathbf{X}'\boldsymbol\varepsilon}{T}
=\left(\frac{\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T\mathbf{x}_t\varepsilon_t}{T}\right),$$
we have the following result.

Theorem:
In addition to (1), suppose that
(1) $\{(\mathbf{x}_t',\varepsilon_t)'\}$, a $(k+1)\times 1$ vector, is an i.i.d. sequence;
(2) (a) $E(\mathbf{x}_t\varepsilon_t)=\mathbf{0}$; (b) $E|X_{ti}\varepsilon_t|<\infty$, $i=1,2,\dots,k$;
(3) (a) $E|X_{ti}|^2<\infty$, $i=1,2,\dots,k$; (b) $\mathbf{M}\equiv E(\mathbf{x}_t\mathbf{x}_t')$ is positive definite.
Then $\hat{\boldsymbol\beta}\xrightarrow{a.s.}\boldsymbol\beta$.

Remark:
1. Assumption (2a) specifies the mean of the i.i.d. sequence $(X_{ti}\varepsilon_t,\ i=1,2,\dots,k)$ (see Proposition 3.3 of White, 2001, p. 32), and (2b) guarantees that its first moment exists.
2. Assumption (3a) guarantees that the first moment of $X_{ti}X_{tj}$ exists, by the Cauchy–Schwarz inequality, while (3b) specifies the mean of the i.i.d. sequence $(X_{ti}X_{tj},\ i,j=1,2,\dots,k)$. Existence of the first moment is all that the LLN for an i.i.d. sequence requires; see p. 15 of Ch. 4.

Proof:
From these assumptions we have
$$\frac{\mathbf{X}'\boldsymbol\varepsilon}{T}=\frac{\sum_{t=1}^T\mathbf{x}_t\varepsilon_t}{T}\xrightarrow{a.s.}E(\mathbf{x}_t\varepsilon_t)=\mathbf{0}$$
and
$$\frac{\mathbf{X}'\mathbf{X}}{T}=\frac{\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'}{T}\xrightarrow{a.s.}E(\mathbf{x}_t\mathbf{x}_t')=\mathbf{M}.\tag{2}$$
Therefore $\hat{\boldsymbol\beta}-\boldsymbol\beta\xrightarrow{a.s.}\mathbf{M}^{-1}\mathbf{0}=\mathbf{0}$, i.e., $\hat{\boldsymbol\beta}\xrightarrow{a.s.}\boldsymbol\beta$.

1.1.2 Asymptotic Normality

To prove asymptotic normality of $\hat{\boldsymbol\beta}$, we use Kolmogorov's LLN and the Lindeberg–Lévy central limit theorem of Ch. 4. Rewriting
$$\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)=\left(\frac{\mathbf{X}'\mathbf{X}}{T}\right)^{-1}\frac{\mathbf{X}'\boldsymbol\varepsilon}{\sqrt{T}}=\left(\frac{\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T\mathbf{x}_t\varepsilon_t}{\sqrt{T}}\right),$$
we have the following result.

Theorem:
In addition to (1), suppose
(i) $\{(\mathbf{x}_t',\varepsilon_t)'\}$ is an i.i.d. sequence;
(ii) (a) $E(\mathbf{x}_t\varepsilon_t)=\mathbf{0}$; (b) $E|X_{ti}\varepsilon_t|^2<\infty$, $i=1,2,\dots,k$; (c) $\mathbf{V}_T\equiv\mathrm{Var}(T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon)=\mathbf{V}$ is positive definite;
(iii) (a) $\mathbf{M}\equiv E(\mathbf{x}_t\mathbf{x}_t')$ is positive definite; (b) $E|X_{ti}|^2<\infty$, $i=1,2,\dots,k$.
Then $\mathbf{D}^{-1/2}\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}N(\mathbf{0},\mathbf{I}_k)$, where $\mathbf{D}\equiv\mathbf{M}^{-1}\mathbf{V}\mathbf{M}^{-1}$.

Remark:
1. Assumption (ii.a) specifies the mean of the i.i.d. sequence $(X_{ti}\varepsilon_t,\ i=1,2,\dots,k)$; (ii.b) guarantees that its second moment exists, which is needed for the application of the Lindeberg–Lévy central limit theorem (see p. 22 of Ch. 4); and (ii.c) standardizes the random vector $T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon$ so that the asymptotic distribution is unit multivariate normal.
2. Assumption (iii.a) specifies the mean of the i.i.d. sequence $(X_{ti}X_{tj},\ i,j=1,2,\dots,k)$, and (iii.b) guarantees that its first moment exists, by the Cauchy–Schwarz inequality. Existence of the first moment is all that the LLN for an i.i.d. sequence requires; see p. 15 of Ch. 4.

Proof:
From these assumptions we have
$$T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon\xrightarrow{L}N(\mathbf{0},\mathbf{V})$$
and
$$\frac{\mathbf{X}'\mathbf{X}}{T}=\frac{\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'}{T}\xrightarrow{a.s.}E(\mathbf{x}_t\mathbf{x}_t')=\mathbf{M}.$$
Therefore
$$\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}\mathbf{M}^{-1}\,N(\mathbf{0},\mathbf{V})\sim N(\mathbf{0},\mathbf{M}^{-1}\mathbf{V}\mathbf{M}^{-1}),$$
or $(\mathbf{M}^{-1}\mathbf{V}\mathbf{M}^{-1})^{-1/2}\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}N(\mathbf{0},\mathbf{I}_k)$.
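To see the theorem at work, here is a minimal Monte Carlo sketch (names and design are illustrative, not from the notes). It assumes a constant plus one standard-normal regressor and homoskedastic non-Gaussian errors independent of the regressors, so that $\mathbf{V}=\sigma^2\mathbf{M}$ and hence $\mathbf{D}=\sigma^2\mathbf{M}^{-1}$; the sample covariance of $\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)$ should then be close to $\mathbf{D}$.

```python
import numpy as np

rng = np.random.default_rng(0)
T, R = 500, 2000
beta = np.array([1.0, 0.5])

draws = np.empty((R, 2))
for r in range(R):
    X = np.column_stack([np.ones(T), rng.standard_normal(T)])  # i.i.d. regressors
    eps = rng.uniform(-1.0, 1.0, T) * np.sqrt(3.0)             # non-Gaussian, mean 0, variance 1
    y = X @ beta + eps
    b = np.linalg.solve(X.T @ X, X.T @ y)                      # OLS estimate
    draws[r] = np.sqrt(T) * (b - beta)

M = np.eye(2)                        # E(x_t x_t') for this particular design
D = 1.0 * np.linalg.inv(M)           # D = M^{-1} V M^{-1} = sigma^2 M^{-1} here
print(np.cov(draws, rowvar=False))   # should be close to D = I_2
```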
1.2 Independent Heterogeneously Distributed Observations

1.2.1 Consistency

To prove consistency of $\hat{\boldsymbol\beta}$, we use the revised Markov law of large numbers of Ch. 4. Rewriting
$$\hat{\boldsymbol\beta}-\boldsymbol\beta=(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol\varepsilon
=\left(\frac{\mathbf{X}'\mathbf{X}}{T}\right)^{-1}\frac{\mathbf{X}'\boldsymbol\varepsilon}{T}
=\left(\frac{\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T\mathbf{x}_t\varepsilon_t}{T}\right),$$
we have the following result.

Theorem:
In addition to (1), suppose
(i) $\{(\mathbf{x}_t',\varepsilon_t)'\}$ is an independent sequence;
(ii) (a) $E(\mathbf{x}_t\varepsilon_t)=\mathbf{0}$; (b) $E|X_{ti}\varepsilon_t|^{1+\delta}<\Delta<\infty$ for some $\delta>0$, $i=1,2,\dots,k$;
(iii) (a) $\mathbf{M}_T\equiv E(\mathbf{X}'\mathbf{X}/T)$ is positive definite; (b) $E|X_{ti}^2|^{1+\delta}<\Delta<\infty$ for some $\delta>0$, $i=1,2,\dots,k$.
Then $\hat{\boldsymbol\beta}\xrightarrow{a.s.}\boldsymbol\beta$.

Remark:
1. Assumption (ii.a) specifies the mean of the independent sequence $(X_{ti}\varepsilon_t,\ i=1,2,\dots,k)$, and (ii.b) guarantees that its $(1+\delta)$th moment exists, uniformly in $t$.
2. Assumption (iii.a) describes the limit toward which $\mathbf{X}'\mathbf{X}/T$ converges almost surely, and (iii.b) guarantees that the $(1+\delta)$th moment of $(X_{ti}X_{tj},\ i,j=1,2,\dots,k)$ exists, by the Cauchy–Schwarz inequality. Existence of the $(1+\delta)$th moment, uniformly in $t$, is what the LLN for an independent (heterogeneously distributed) sequence requires; see p. 15 of Ch. 4.

Proof:
From these assumptions we have
$$\frac{\mathbf{X}'\boldsymbol\varepsilon}{T}=\frac{\sum_{t=1}^T\mathbf{x}_t\varepsilon_t}{T}\xrightarrow{a.s.}\mathbf{0}
\qquad\text{and}\qquad
\frac{\mathbf{X}'\mathbf{X}}{T}-\mathbf{M}_T\xrightarrow{a.s.}\mathbf{0}.$$
Therefore $\hat{\boldsymbol\beta}-\boldsymbol\beta\xrightarrow{a.s.}\mathbf{M}_T^{-1}\mathbf{0}=\mathbf{0}$, i.e., $\hat{\boldsymbol\beta}\xrightarrow{a.s.}\boldsymbol\beta$.

1.2.2 Asymptotic Normality

To prove asymptotic normality of $\hat{\boldsymbol\beta}$, we use the revised Markov LLN together with the Liapounov and Lindeberg–Feller central limit theorems of Ch. 4. Rewriting
$$\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)=\left(\frac{\mathbf{X}'\mathbf{X}}{T}\right)^{-1}\frac{\mathbf{X}'\boldsymbol\varepsilon}{\sqrt{T}},$$
we have the following result.

Theorem:
In addition to (1), suppose
(i) $\{(\mathbf{x}_t',\varepsilon_t)'\}$ is an independent sequence;
(ii) (a) $E(\mathbf{x}_t\varepsilon_t)=\mathbf{0}$; (b) $E|X_{ti}\varepsilon_t|^{2+\delta}<\Delta<\infty$ for some $\delta>0$, $i=1,2,\dots,k$; (c) $\mathbf{V}_T\equiv\mathrm{Var}(T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon)$ is positive definite;
(iii) (a) $\mathbf{M}_T\equiv E(\mathbf{X}'\mathbf{X}/T)$ is positive definite; (b) $E|X_{ti}^2|^{1+\delta}<\Delta<\infty$ for some $\delta>0$, $i=1,2,\dots,k$.
Then $\mathbf{D}_T^{-1/2}\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}N(\mathbf{0},\mathbf{I}_k)$, where $\mathbf{D}_T\equiv\mathbf{M}_T^{-1}\mathbf{V}_T\mathbf{M}_T^{-1}$.

Remark:
1. Assumption (ii.a) specifies the mean of the independent sequence $(X_{ti}\varepsilon_t,\ i=1,2,\dots,k)$; (ii.b) guarantees that its $(2+\delta)$th moment exists, which is needed for the application of Liapounov's central limit theorem (see p. 23 of Ch. 4); and (ii.c) standardizes the random vector $T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon$ so that the asymptotic distribution is unit multivariate normal.
2. Assumption (iii.a) describes the limit toward which $\mathbf{X}'\mathbf{X}/T$ converges almost surely, and (iii.b) guarantees that the $(1+\delta)$th moment of $(X_{ti}X_{tj},\ i,j=1,2,\dots,k)$ exists, by the Cauchy–Schwarz inequality. Existence of the $(1+\delta)$th moment is what the LLN for an independent sequence requires; see p. 15 of Ch. 4.

Proof:
From these assumptions we have
$$T^{-1/2}\mathbf{X}'\boldsymbol\varepsilon\xrightarrow{L}N(\mathbf{0},\mathbf{V}_T)
\qquad\text{and}\qquad
\frac{\mathbf{X}'\mathbf{X}}{T}-\mathbf{M}_T\xrightarrow{a.s.}\mathbf{0}.$$
Therefore
$$\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}\mathbf{M}_T^{-1}\,N(\mathbf{0},\mathbf{V}_T)\sim N(\mathbf{0},\mathbf{M}_T^{-1}\mathbf{V}_T\mathbf{M}_T^{-1}),$$
or $(\mathbf{M}_T^{-1}\mathbf{V}_T\mathbf{M}_T^{-1})^{-1/2}\sqrt{T}(\hat{\boldsymbol\beta}-\boldsymbol\beta)\xrightarrow{L}N(\mathbf{0},\mathbf{I}_k)$.

From the results above, the asymptotic normality of the OLS estimator depends crucially on the existence of at least second moments of the regressors $X_{ti}$, from which we obtain an LLN of the form $\mathbf{X}'\mathbf{X}/T-\mathbf{M}_T\xrightarrow{a.s.}\mathbf{0}$ with $\mathbf{M}_T=O(1)$. As we saw in the last chapter, an $I(1)$ variable does not have a finite (bounded) second moment, so when the regressor is a unit root process the traditional asymptotic results for the OLS estimator do not apply. There is, however, also a case in which the regressor is not stochastic yet violates the condition $\mathbf{X}'\mathbf{X}/T\to\mathbf{M}_T=O(1)$; as we will see in the following, asymptotic normality remains valid there, although the rate of convergence to normality changes.
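A quick numerical illustration of this last point (a hypothetical sketch, not part of the original derivation): with a deterministic trend regressor, $\mathbf{x}_t'=(1,t)$, the matrix $\mathbf{X}'\mathbf{X}/T$ diverges elementwise rather than settling at an $O(1)$ limit.

```python
import numpy as np

for T in (10**2, 10**3, 10**4):
    t = np.arange(1, T + 1)
    X = np.column_stack([np.ones(T), t])  # x_t' = (1, t)
    print(T)
    print(X.T @ X / T)  # (1,2) entry ~ T/2 and (2,2) entry ~ T^2/3: no O(1) limit
```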
2 Processes with Deterministic Time Trends

The coefficients of regression models involving unit roots or deterministic time trends are typically estimated by OLS. However, the asymptotic distributions of the coefficient estimates cannot be calculated in the same way as those for regression models involving stationary variables. Among other difficulties, the estimates of different parameters will in general have different asymptotic rates of convergence.

2.1 Asymptotic Distribution of OLS Estimators of the Simple Time Trend Model

Consider OLS estimation of the parameters of a simple time trend,
$$Y_t=\alpha+\delta t+\varepsilon_t,\tag{3}$$
for $\varepsilon_t$ a white noise process. If $\varepsilon_t\sim N(0,\sigma^2)$, model (3) satisfies the classical assumptions, and the standard OLS $t$ or $F$ statistics have exact small-sample $t$ or $F$ distributions. On the other hand, if $\varepsilon_t$ is non-Gaussian, then a slightly different technique from the one employed in the last section is needed to find the asymptotic distribution of the OLS estimates of $\alpha$ and $\delta$.

Write (3) in the form of the standard regression model,
$$Y_t=\mathbf{x}_t'\boldsymbol\beta+\varepsilon_t,\qquad \mathbf{x}_t'\equiv\begin{bmatrix}1&t\end{bmatrix},\qquad \boldsymbol\beta\equiv(\alpha,\delta)'.$$
Let $\hat{\boldsymbol\beta}_T$ denote the OLS estimate of $\boldsymbol\beta$ based on a sample of size $T$. The deviation of $\hat{\boldsymbol\beta}_T$ from the true value can be expressed as
$$(\hat{\boldsymbol\beta}_T-\boldsymbol\beta)=\Big[\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'\Big]^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t\varepsilon_t\Big].\tag{4}$$
To find the asymptotic distribution of $(\hat{\boldsymbol\beta}_T-\boldsymbol\beta)$, the approach of the last section was to multiply (4) by $\sqrt{T}$, resulting in
$$\sqrt{T}(\hat{\boldsymbol\beta}_T-\boldsymbol\beta)=\Big[(1/T)\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'\Big]^{-1}\Big[(1/\sqrt{T})\sum_{t=1}^T\mathbf{x}_t\varepsilon_t\Big].\tag{5}$$
The usual assumption was that $(1/T)\sum\mathbf{x}_t\mathbf{x}_t'$ converges in probability to a nonsingular matrix $\mathbf{M}$ while $(1/\sqrt{T})\sum\mathbf{x}_t\varepsilon_t$ converges in distribution to a $N(\mathbf{0},\mathbf{V})$ random variable, implying $\sqrt{T}(\hat{\boldsymbol\beta}_T-\boldsymbol\beta)\xrightarrow{L}N(\mathbf{0},\mathbf{M}^{-1}\mathbf{V}\mathbf{M}^{-1})$. For $\mathbf{x}_t$ given in (5), however, note that
$$\frac{1}{T^{v+1}}\sum_{t=1}^T t^v\to\frac{1}{v+1},\tag{6}$$
implying
$$\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'=\begin{bmatrix}\sum 1&\sum t\\ \sum t&\sum t^2\end{bmatrix}
=\begin{bmatrix}T&T(T+1)/2\\ T(T+1)/2&T(T+1)(2T+1)/6\end{bmatrix}
=\begin{bmatrix}O(T)&O(T^2)\\ O(T^2)&O(T^3)\end{bmatrix}.\tag{7}$$
In contrast to the usual result (2), the matrix $(1/T)\sum\mathbf{x}_t\mathbf{x}_t'$ in (5) diverges. To obtain a convergent and nondegenerate limiting distribution, we can premultiply and postmultiply $\big[\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'\big]$ by $\boldsymbol\Upsilon_T^{-1}$, where
$$\boldsymbol\Upsilon_T=\begin{bmatrix}T^{1/2}&0\\0&T^{3/2}\end{bmatrix},$$
and obtain
$$\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t\mathbf{x}_t'\Big]\boldsymbol\Upsilon_T^{-1}
=\begin{bmatrix}T^{-1}\sum 1&T^{-2}\sum t\\ T^{-2}\sum t&T^{-3}\sum t^2\end{bmatrix}\to\mathbf{Q},
\qquad\text{where }\mathbf{Q}=\begin{bmatrix}1&\tfrac12\\ \tfrac12&\tfrac13\end{bmatrix},\tag{8}$$
according to (6). Turning next to the second term in (4) and premultiplying it by $\boldsymbol\Upsilon_T^{-1}$,
$$\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t\varepsilon_t\Big]
=\begin{bmatrix}T^{-1/2}&0\\0&T^{-3/2}\end{bmatrix}\begin{bmatrix}\sum\varepsilon_t\\ \sum t\varepsilon_t\end{bmatrix}
=\begin{bmatrix}(1/\sqrt{T})\sum\varepsilon_t\\ (1/\sqrt{T})\sum(t/T)\varepsilon_t\end{bmatrix}.\tag{9}$$

We now prove the asymptotic normality of (9) under standard assumptions about $\varepsilon_t$. Suppose that $\varepsilon_t$ is i.i.d. with mean zero, variance $\sigma^2$, and finite fourth moment. Then the first element of the vector in (9) satisfies
$$(1/\sqrt{T})\sum\varepsilon_t\xrightarrow{L}N(0,\sigma^2)$$
by the Lindeberg–Lévy CLT. For the second element of the vector in (9), observe that $\{(t/T)\varepsilon_t\}$ is a martingale difference sequence satisfying the definition on p. 13 of Ch. 4. Specifically, its variance is
$$\sigma_t^2=E[(t/T)\varepsilon_t]^2=\sigma^2(t^2/T^2),\qquad\text{where}\qquad
(1/T)\sum_{t=1}^T\sigma_t^2=\sigma^2(1/T^3)\sum_{t=1}^T t^2\to\sigma^2/3.$$
Furthermore, to apply the CLT for a martingale difference sequence, we need to show that $(1/T)\sum_{t=1}^T[(t/T)\varepsilon_t]^2\xrightarrow{p}\sigma^2/3$, as required by condition (iii) on p. 26 of Ch. 4. To prove this, notice that (the cross-product terms vanish because the $\varepsilon_t^2-\sigma^2$ are independent with mean zero)
$$E\Big[(1/T)\sum_{t=1}^T[(t/T)\varepsilon_t]^2-(1/T)\sum_{t=1}^T\sigma_t^2\Big]^2
=E\Big[(1/T)\sum_{t=1}^T(t/T)^2(\varepsilon_t^2-\sigma^2)\Big]^2
=(1/T^2)\sum_{t=1}^T(t/T)^4E(\varepsilon_t^2-\sigma^2)^2
=E(\varepsilon_t^2-\sigma^2)^2\Big(T^{-6}\sum_{t=1}^T t^4\Big)\to 0,$$
according to (6) and the existence of the fourth moment of $\varepsilon_t$. This implies
$$(1/T)\sum_{t=1}^T[(t/T)\varepsilon_t]^2-(1/T)\sum_{t=1}^T\sigma_t^2\xrightarrow{m.s.}0,$$
which in turn implies $(1/T)\sum_{t=1}^T[(t/T)\varepsilon_t]^2\xrightarrow{p}\sigma^2/3$. Hence, from the CLT for martingale difference sequences (p. 26 of Ch. 4), $(1/\sqrt{T})\sum_{t=1}^T(t/T)\varepsilon_t$ satisfies the CLT:
$$(1/\sqrt{T})\sum_{t=1}^T(t/T)\varepsilon_t\xrightarrow{L}N(0,\sigma^2/3).$$
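A small simulation sketch (illustrative only; it takes $\sigma^2=1$ and centered exponential errors, which are non-Gaussian with a finite fourth moment) can be used to check the limiting variance $\sigma^2/3$ of this second element:

```python
import numpy as np

rng = np.random.default_rng(0)
T, R = 1000, 5000
w = np.arange(1, T + 1) / T                      # weights t/T
eps = rng.exponential(1.0, size=(R, T)) - 1.0    # i.i.d., mean 0, sigma^2 = 1, finite 4th moment
stat = (w * eps).sum(axis=1) / np.sqrt(T)        # (1/sqrt(T)) * sum (t/T) eps_t, per replication
print(stat.var(), 1.0 / 3.0)                     # sample variance vs sigma^2 / 3
```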
Finally, consider the joint distribution of the two elements in the $(2\times 1)$ vector described by (9). Any linear combination of these elements takes the form
$$(1/\sqrt{T})\sum_{t=1}^T[\lambda_1+\lambda_2(t/T)]\varepsilon_t.$$
Then $[\lambda_1+\lambda_2(t/T)]\varepsilon_t$ is also a martingale difference sequence, with positive variance given by $\sigma^2[\lambda_1^2+2\lambda_1\lambda_2(t/T)+\lambda_2^2(t/T)^2]$ satisfying
$$(1/T)\sum_{t=1}^T\sigma^2\big[\lambda_1^2+2\lambda_1\lambda_2(t/T)+\lambda_2^2(t/T)^2\big]
\to\sigma^2\big[\lambda_1^2+2\lambda_1\lambda_2\cdot\tfrac12+\lambda_2^2\cdot\tfrac13\big]
=\sigma^2\boldsymbol\lambda'\mathbf{Q}\boldsymbol\lambda$$
for $\boldsymbol\lambda=(\lambda_1,\lambda_2)'$ and $\mathbf{Q}$ the matrix in (8). Furthermore, we can show that
$$(1/T)\sum_{t=1}^T[\lambda_1+\lambda_2(t/T)]^2\varepsilon_t^2\xrightarrow{p}\sigma^2\boldsymbol\lambda'\mathbf{Q}\boldsymbol\lambda;$$
that is, the CLT applies to the martingale difference sequence $[\lambda_1+\lambda_2(t/T)]\varepsilon_t$. Thus, any linear combination of the two elements of the vector in (9) is asymptotically Gaussian, implying a bivariate Gaussian limit:
$$\begin{bmatrix}(1/\sqrt{T})\sum\varepsilon_t\\ (1/\sqrt{T})\sum(t/T)\varepsilon_t\end{bmatrix}\xrightarrow{L}N(\mathbf{0},\sigma^2\mathbf{Q})$$
by the Cramér–Wold device and the fact that
$$E\Big\{\big[(1/\sqrt{T})\textstyle\sum\varepsilon_t\big]\big[(1/\sqrt{T})\textstyle\sum(t/T)\varepsilon_t\big]\Big\}=\sigma^2 T^{-2}\textstyle\sum t\to\tfrac12\sigma^2.$$

Collecting results, we have
$$\begin{bmatrix}T^{1/2}(\hat\alpha_T-\alpha)\\ T^{3/2}(\hat\delta_T-\delta)\end{bmatrix}
=\boldsymbol\Upsilon_T\Big[\sum\mathbf{x}_t\mathbf{x}_t'\Big]^{-1}\Big[\sum\mathbf{x}_t\varepsilon_t\Big]
=\Big\{\boldsymbol\Upsilon_T^{-1}\Big[\sum\mathbf{x}_t\mathbf{x}_t'\Big]\boldsymbol\Upsilon_T^{-1}\Big\}^{-1}\Big\{\boldsymbol\Upsilon_T^{-1}\Big[\sum\mathbf{x}_t\varepsilon_t\Big]\Big\}
\xrightarrow{L}\mathbf{Q}^{-1}\,N(\mathbf{0},\sigma^2\mathbf{Q})\sim N(\mathbf{0},\sigma^2\mathbf{Q}^{-1}).$$
It turns out that the OLS estimators $\hat\alpha_T$ and $\hat\delta_T$ have different asymptotic rates of convergence. Note that the time trend estimator $\hat\delta_T$ is superconsistent: not only does $\hat\delta_T\xrightarrow{p}\delta$, but even when multiplied by $T$ we still have $T(\hat\delta_T-\delta)\xrightarrow{p}0$.

2.2 Asymptotic Distribution of OLS Estimators for an Autoregressive Process Around a Deterministic Time Trend

The same principles can be used to study a general autoregressive process around a deterministic time trend:
$$Y_t=\alpha+\delta t+\phi_1Y_{t-1}+\phi_2Y_{t-2}+\cdots+\phi_pY_{t-p}+\varepsilon_t.\tag{10}$$
It is assumed that $\varepsilon_t$ is i.i.d. with mean zero, variance $\sigma^2$, and finite fourth moment, and that the roots of
$$1-\phi_1z-\phi_2z^2-\cdots-\phi_pz^p=0$$
lie outside the unit circle. Consider a sample of $T+p$ observations on $Y$, and let $\hat\alpha_T,\hat\delta_T,\hat\phi_{1,T},\dots,\hat\phi_{p,T}$ denote the coefficients from OLS estimation of (10) for $t=1,2,\dots,T$.

Remark:
Each regressor $Y_{t-i}$, $i=1,\dots,p$, in (10) is a trend-stationary process (it is nonstationary itself!). To remove the nonstationarity of the regressors so that the LLN remains valid ($\mathbf{X}'\mathbf{X}/T-E(\mathbf{X}'\mathbf{X}/T)\xrightarrow{p}\mathbf{0}$), we may transform the regressors into zero-mean covariance-stationary processes by subtracting the time trend from each regressor.

2.2.1 A Useful Transformation of the Regressors

By adding and subtracting $\phi_j[\alpha+\delta(t-j)]$ for $j=1,2,\dots,p$ on the right side (each regressor $Y_{t-j}$ carries a constant and a trend $(t-j)$), the regression model (10) can equivalently be written as
$$Y_t=\alpha(1+\phi_1+\cdots+\phi_p)-\delta(\phi_1+2\phi_2+\cdots+p\phi_p)+\delta(1+\phi_1+\cdots+\phi_p)t
+\phi_1[Y_{t-1}-\alpha-\delta(t-1)]+\cdots+\phi_p[Y_{t-p}-\alpha-\delta(t-p)]+\varepsilon_t,$$
or
$$Y_t=\alpha^*+\delta^*t+\phi_1^*Y_{t-1}^*+\phi_2^*Y_{t-2}^*+\cdots+\phi_p^*Y_{t-p}^*+\varepsilon_t,\tag{11}$$
where
$$\alpha^*=\alpha(1+\phi_1+\cdots+\phi_p)-\delta(\phi_1+2\phi_2+\cdots+p\phi_p),\qquad
\delta^*=\delta(1+\phi_1+\cdots+\phi_p),$$
$$\phi_j^*=\phi_j\quad\text{and}\quad Y_{t-j}^*=Y_{t-j}-\alpha-\delta(t-j)\qquad\text{for }j=1,2,\dots,p.$$

The original regression model (10) can be written
$$Y_t=\mathbf{x}_t'\boldsymbol\beta+\varepsilon_t,\tag{12}$$
where
$$\mathbf{x}_t=(Y_{t-1},Y_{t-2},\dots,Y_{t-p},1,t)',\qquad
\boldsymbol\beta=(\phi_1,\phi_2,\dots,\phi_p,\alpha,\delta)'.$$
The algebraic transformation arriving at (11) can then be described as rewriting (12) in the form
$$Y_t=\mathbf{x}_t'\mathbf{G}'[\mathbf{G}']^{-1}\boldsymbol\beta+\varepsilon_t=[\mathbf{x}_t^*]'\boldsymbol\beta^*+\varepsilon_t,\tag{13}$$
where
$$\mathbf{G}'\equiv\begin{bmatrix}
1&0&\cdots&0&0&0\\
0&1&\cdots&0&0&0\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
0&0&\cdots&1&0&0\\
-\alpha+\delta&-\alpha+2\delta&\cdots&-\alpha+p\delta&1&0\\
-\delta&-\delta&\cdots&-\delta&0&1
\end{bmatrix},\qquad
[\mathbf{G}']^{-1}=\begin{bmatrix}
1&0&\cdots&0&0&0\\
0&1&\cdots&0&0&0\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
0&0&\cdots&1&0&0\\
\alpha-\delta&\alpha-2\delta&\cdots&\alpha-p\delta&1&0\\
\delta&\delta&\cdots&\delta&0&1
\end{bmatrix}$$
(hint: by the partitioned inverse rule, $\begin{bmatrix}\mathbf{I}&\mathbf{0}\\ \mathbf{H}&\mathbf{I}\end{bmatrix}^{-1}=\begin{bmatrix}\mathbf{I}&\mathbf{0}\\ -\mathbf{H}&\mathbf{I}\end{bmatrix}$), and
$$\mathbf{x}_t^*\equiv\mathbf{G}\mathbf{x}_t=(Y_{t-1}^*,Y_{t-2}^*,\dots,Y_{t-p}^*,1,t)',\qquad
\boldsymbol\beta^*\equiv[\mathbf{G}']^{-1}\boldsymbol\beta=(\phi_1^*,\dots,\phi_p^*,\alpha^*,\delta^*)'.$$

The OLS estimate of $\boldsymbol\beta^*$ based on a regression of $Y_t$ on $\mathbf{x}_t^*$ is given by
$$\hat{\boldsymbol\beta}^*=\Big[\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'\Big]^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*Y_t\Big]
=\Big[\mathbf{G}\Big(\sum\mathbf{x}_t\mathbf{x}_t'\Big)\mathbf{G}'\Big]^{-1}\mathbf{G}\Big(\sum\mathbf{x}_tY_t\Big)
=[\mathbf{G}']^{-1}\Big(\sum\mathbf{x}_t\mathbf{x}_t'\Big)^{-1}\mathbf{G}^{-1}\mathbf{G}\Big(\sum\mathbf{x}_tY_t\Big)
=[\mathbf{G}']^{-1}\Big(\sum\mathbf{x}_t\mathbf{x}_t'\Big)^{-1}\Big(\sum\mathbf{x}_tY_t\Big)
=[\mathbf{G}']^{-1}\hat{\boldsymbol\beta},$$
where $\hat{\boldsymbol\beta}$ is the OLS coefficient estimate from the original data, i.e., from regressing $Y_t$ on $\mathbf{x}_t$. The asymptotic distribution of $\hat{\boldsymbol\beta}$ can therefore be inferred from
$$\hat{\boldsymbol\beta}=\mathbf{G}'\hat{\boldsymbol\beta}^*.\tag{14}$$
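Since (14) is a pure algebraic identity, it can be verified numerically. The sketch below (hypothetical parameter values, $p=1$) simulates (10), runs OLS on both $\mathbf{x}_t$ and $\mathbf{x}_t^*=\mathbf{G}\mathbf{x}_t$, and confirms $\hat{\boldsymbol\beta}^*=[\mathbf{G}']^{-1}\hat{\boldsymbol\beta}$ up to rounding:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, delta, phi, T = 2.0, 0.1, 0.6, 400

Y = np.zeros(T + 1)                        # Y[0] is the presample value Y_0
for s in range(1, T + 1):
    Y[s] = alpha + delta * s + phi * Y[s - 1] + rng.standard_normal()

t = np.arange(1, T + 1)
X = np.column_stack([Y[:-1], np.ones(T), t])                  # x_t = (Y_{t-1}, 1, t)'
Xs = np.column_stack([Y[:-1] - alpha - delta * (t - 1),       # Y*_{t-1} = Y_{t-1} - alpha - delta(t-1)
                      np.ones(T), t])                         # x*_t = G x_t
y = Y[1:]

b = np.linalg.solve(X.T @ X, X.T @ y)      # beta-hat from the original regression
bs = np.linalg.solve(Xs.T @ Xs, Xs.T @ y)  # beta*-hat from the transformed regression
Gp = np.array([[1.0, 0.0, 0.0],            # G' for p = 1
               [-alpha + delta, 1.0, 0.0],
               [-delta, 0.0, 1.0]])
print(bs)
print(np.linalg.inv(Gp) @ b)               # identical up to rounding: beta*-hat = [G']^{-1} beta-hat
```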
2.2.2 The Asymptotic Distribution of OLS Estimates for the Transformed Regression

To derive the asymptotic distribution of $\hat{\boldsymbol\beta}_T^*$, we note that
$$\boldsymbol\Upsilon_T(\hat{\boldsymbol\beta}_T^*-\boldsymbol\beta^*)
=\boldsymbol\Upsilon_T\Big[\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'\Big]^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*\varepsilon_t\Big]
=\Big\{\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'\Big]\boldsymbol\Upsilon_T^{-1}\Big\}^{-1}\Big\{\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*\varepsilon_t\Big]\Big\},$$
where now $\boldsymbol\Upsilon_T$ is the $(p+2)\times(p+2)$ diagonal matrix
$$\boldsymbol\Upsilon_T=\mathrm{diag}\big(T^{1/2},\,T^{1/2},\,\dots,\,T^{1/2},\,T^{1/2},\,T^{3/2}\big),$$
with $T^{1/2}$ in the first $p+1$ diagonal positions and $T^{3/2}$ in the last. From (13),
$$\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'=\begin{bmatrix}
\sum(Y_{t-1}^*)^2&\sum Y_{t-1}^*Y_{t-2}^*&\cdots&\sum Y_{t-1}^*Y_{t-p}^*&\sum Y_{t-1}^*&\sum tY_{t-1}^*\\
\sum Y_{t-2}^*Y_{t-1}^*&\sum(Y_{t-2}^*)^2&\cdots&\sum Y_{t-2}^*Y_{t-p}^*&\sum Y_{t-2}^*&\sum tY_{t-2}^*\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
\sum Y_{t-p}^*Y_{t-1}^*&\sum Y_{t-p}^*Y_{t-2}^*&\cdots&\sum(Y_{t-p}^*)^2&\sum Y_{t-p}^*&\sum tY_{t-p}^*\\
\sum Y_{t-1}^*&\sum Y_{t-2}^*&\cdots&\sum Y_{t-p}^*&\sum 1&\sum t\\
\sum tY_{t-1}^*&\sum tY_{t-2}^*&\cdots&\sum tY_{t-p}^*&\sum t&\sum t^2
\end{bmatrix},$$
and therefore
$$\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'\Big]\boldsymbol\Upsilon_T^{-1}=\begin{bmatrix}
T^{-1}\sum(Y_{t-1}^*)^2&\cdots&T^{-1}\sum Y_{t-1}^*Y_{t-p}^*&T^{-1}\sum Y_{t-1}^*&T^{-2}\sum tY_{t-1}^*\\
\vdots&\ddots&\vdots&\vdots&\vdots\\
T^{-1}\sum Y_{t-p}^*Y_{t-1}^*&\cdots&T^{-1}\sum(Y_{t-p}^*)^2&T^{-1}\sum Y_{t-p}^*&T^{-2}\sum tY_{t-p}^*\\
T^{-1}\sum Y_{t-1}^*&\cdots&T^{-1}\sum Y_{t-p}^*&T^{-1}\cdot T&T^{-2}\sum t\\
T^{-2}\sum tY_{t-1}^*&\cdots&T^{-2}\sum tY_{t-p}^*&T^{-2}\sum t&T^{-3}\sum t^2
\end{bmatrix}.\tag{15}$$
For the first $p$ rows and columns, the row $i$, column $j$ element of the matrix (15) satisfies
$$T^{-1}\sum_{t=1}^T Y_{t-i}^*Y_{t-j}^*\xrightarrow{p}\gamma_{|i-j|}$$
by the LLN for covariance-stationary processes, where $\gamma_j$ denotes the $j$th autocovariance of $Y_t^*$. The first $p$ elements of row $p+1$ (and the first $p$ elements of column $p+1$) satisfy
$$T^{-1}\sum_{t=1}^T Y_{t-i}^*\xrightarrow{p}0,$$
also by the LLN for a zero-mean covariance-stationary process. The first $p$ elements of row $p+2$ (and the first $p$ elements of column $p+2$) satisfy
$$T^{-2}\sum_{t=1}^T tY_{t-i}^*\xrightarrow{p}0$$
by the theorem below. Thus
$$\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*[\mathbf{x}_t^*]'\Big]\boldsymbol\Upsilon_T^{-1}\xrightarrow{p}\mathbf{Q}^*,\tag{16}$$
where
$$\mathbf{Q}^*=\begin{bmatrix}
\gamma_0&\gamma_1&\cdots&\gamma_{p-1}&0&0\\
\gamma_1&\gamma_0&\cdots&\gamma_{p-2}&0&0\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
\gamma_{p-1}&\gamma_{p-2}&\cdots&\gamma_0&0&0\\
0&0&\cdots&0&1&\tfrac12\\
0&0&\cdots&0&\tfrac12&\tfrac13
\end{bmatrix}.$$
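As a numerical check of (16) (a sketch only: it directly simulates the transformed regressor $Y_t^*$ as a zero-mean Gaussian AR(1) with coefficient $\phi$, as the derivation assumes, so that $\gamma_0=\sigma^2/(1-\phi^2)$ and $p=1$):

```python
import numpy as np

rng = np.random.default_rng(0)
phi, sigma2, T = 0.6, 1.0, 20000

Ys = np.zeros(T + 1)                          # zero-mean stationary AR(1) standing in for Y*_t
for s in range(1, T + 1):
    Ys[s] = phi * Ys[s - 1] + rng.standard_normal()

t = np.arange(1, T + 1)
Xs = np.column_stack([Ys[:-1], np.ones(T), t])   # x*_t = (Y*_{t-1}, 1, t)'
U_inv = np.diag([T**-0.5, T**-0.5, T**-1.5])     # Upsilon_T^{-1} for p = 1
S = U_inv @ (Xs.T @ Xs) @ U_inv                  # rescaled moment matrix, eq. (15)

gamma0 = sigma2 / (1.0 - phi**2)                 # gamma_0 of the AR(1)
Qstar = np.array([[gamma0, 0.0, 0.0],
                  [0.0,    1.0, 0.5],
                  [0.0,    0.5, 1.0 / 3.0]])
print(S.round(3))
print(Qstar.round(3))                            # the two should be close
```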
Theorem:
Let $Y_{t-i}^*$ be covariance-stationary with mean zero and absolutely summable autocovariances. Then
$$T^{-2}\sum_{t=1}^T tY_{t-i}^*\xrightarrow{p}0,\qquad i=1,2,\dots,p.$$

Proof:
We show that $E\big(T^{-2}\sum tY_{t-i}^*-0\big)^2\to 0$, which implies $T^{-2}\sum tY_{t-i}^*\xrightarrow{m.s.}0$ and hence $T^{-2}\sum tY_{t-i}^*\xrightarrow{p}0$. To see this, write
$$E\Big(T^{-2}\sum_{t=1}^T tY_{t-i}^*\Big)^2
=T^{-4}E\big[(Y_{1-i}^*+2Y_{2-i}^*+\cdots+TY_{T-i}^*)^2\big]
=T^{-4}\Big\{\sum_{t=1}^T t^2\gamma_0+2\sum_{t=1}^{T-1}t(t+1)\gamma_1+2\sum_{t=1}^{T-2}t(t+2)\gamma_2+\cdots+2T\gamma_{T-1}\Big\}$$
$$=(1/T)\Big\{\Big[\sum_{t=1}^T t^2/T^3\Big]\gamma_0+\Big[\sum_{t=1}^{T-1}t(t+1)/T^3\Big]2\gamma_1+\Big[\sum_{t=1}^{T-2}t(t+2)/T^3\Big]2\gamma_2+\cdots+\big[T/T^3\big]2\gamma_{T-1}\Big\}.$$
Then
$$T\,E\Big(T^{-2}\sum tY_{t-i}^*\Big)^2
\le\Big[\sum_{t=1}^T t^2/T^3\Big]|\gamma_0|+\Big[\sum_{t=1}^{T-1}t(t+1)/T^3\Big]2|\gamma_1|+\Big[\sum_{t=1}^{T-2}t(t+2)/T^3\Big]2|\gamma_2|+\cdots
\le|\gamma_0|+2|\gamma_1|+2|\gamma_2|+\cdots<\infty,$$
since each bracketed term is bounded (by (6), $(1/T^{v+1})\sum t^v\to 1/(v+1)<1$) and the autocovariances are absolutely summable. So $E\big(T^{-2}\sum tY_{t-i}^*\big)^2=O(1/T)\to 0$, and therefore $T^{-2}\sum tY_{t-i}^*\xrightarrow{p}0$, as claimed.

We now turn to the second term of the OLS expression:
$$\boldsymbol\Upsilon_T^{-1}\Big[\sum_{t=1}^T\mathbf{x}_t^*\varepsilon_t\Big]
=\begin{bmatrix}T^{-1/2}\sum Y_{t-1}^*\varepsilon_t\\ \vdots\\ T^{-1/2}\sum Y_{t-p}^*\varepsilon_t\\ T^{-1/2}\sum\varepsilon_t\\ T^{-1/2}\sum(t/T)\varepsilon_t\end{bmatrix}
=T^{-1/2}\sum_{t=1}^T\boldsymbol\xi_t,\qquad\text{where}\qquad
\boldsymbol\xi_t\equiv\begin{bmatrix}Y_{t-1}^*\varepsilon_t\\ \vdots\\ Y_{t-p}^*\varepsilon_t\\ \varepsilon_t\\ (t/T)\varepsilon_t\end{bmatrix}.$$
But $\boldsymbol\xi_t$ is a martingale difference sequence with variance $E(\boldsymbol\xi_t\boldsymbol\xi_t')=\sigma^2\mathbf{Q}_t^*$, where
$$\mathbf{Q}_t^*=\begin{bmatrix}
\gamma_0&\gamma_1&\cdots&\gamma_{p-1}&0&0\\
\gamma_1&\gamma_0&\cdots&\gamma_{p-2}&0&0\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
\gamma_{p-1}&\gamma_{p-2}&\cdots&\gamma_0&0&0\\
0&0&\cdots&0&1&t/T\\
0&0&\cdots&0&t/T&t^2/T^2
\end{bmatrix}
\qquad\text{and}\qquad
(1/T)\sum_{t=1}^T\mathbf{Q}_t^*\to\mathbf{Q}^*.$$
Applying the CLT for martingale difference sequences, it can be shown that
$$\boldsymbol\Upsilon_T^{-1}\sum_{t=1}^T\mathbf{x}_t^*\varepsilon_t\xrightarrow{L}N(\mathbf{0},\sigma^2\mathbf{Q}^*).\tag{17}$$
It follows from (16) and (17) that
$$\boldsymbol\Upsilon_T(\hat{\boldsymbol\beta}_T^*-\boldsymbol\beta^*)\xrightarrow{L}N\big(\mathbf{0},[\mathbf{Q}^*]^{-1}\sigma^2\mathbf{Q}^*[\mathbf{Q}^*]^{-1}\big)=N\big(\mathbf{0},\sigma^2[\mathbf{Q}^*]^{-1}\big).\tag{18}$$

2.2.3 The Asymptotic Distribution of OLS Estimators for the Original Regression

From (14) we have
$$\begin{bmatrix}\hat\phi_1\\ \vdots\\ \hat\phi_p\\ \hat\alpha\\ \hat\delta\end{bmatrix}
=\begin{bmatrix}
1&0&\cdots&0&0&0\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
0&0&\cdots&1&0&0\\
-\alpha+\delta&-\alpha+2\delta&\cdots&-\alpha+p\delta&1&0\\
-\delta&-\delta&\cdots&-\delta&0&1
\end{bmatrix}
\begin{bmatrix}\hat\phi_1^*\\ \vdots\\ \hat\phi_p^*\\ \hat\alpha^*\\ \hat\delta^*\end{bmatrix}.$$
The original OLS estimators $\hat\phi_i$ are identical to the estimators $\hat\phi_i^*$ from the transformed regression, so the asymptotic distribution of $\hat\phi_i$ follows immediately from (18). The estimator $\hat\alpha$ is a linear combination of variables that converge to a Gaussian distribution at rate $\sqrt{T}$, so $\hat\alpha$ behaves the same way. Specifically, $\hat\alpha=\mathbf{g}_\alpha'\hat{\boldsymbol\beta}_T^*$, where
$$\mathbf{g}_\alpha'\equiv\big(-\alpha+\delta,\ -\alpha+2\delta,\ \dots,\ -\alpha+p\delta,\ 1,\ 0\big),$$
so from (18),
$$\sqrt{T}(\hat\alpha-\alpha)\xrightarrow{L}N\big(0,\sigma^2\mathbf{g}_\alpha'[\mathbf{Q}^*]^{-1}\mathbf{g}_\alpha\big).$$
Finally, the estimator $\hat\delta$ is a linear combination of variables converging at different rates:
$$\hat\delta=\mathbf{g}_\delta'\hat{\boldsymbol\beta}_T^*+\hat\delta_T^*,\qquad\text{where}\qquad
\mathbf{g}_\delta'\equiv\big(-\delta,\ -\delta,\ \dots,\ -\delta,\ 0,\ 0\big).$$
Since $\mathbf{g}_\delta'(\hat{\boldsymbol\beta}_T^*-\boldsymbol\beta^*)$ is $O_p(T^{-1/2})$ and $(\hat\delta_T^*-\delta^*)$ is $O_p(T^{-3/2})$, the deviation $\hat\delta-\delta$ is $O_p(T^{-1/2})$ (see p. 3 of Ch. 4); therefore,
$$\sqrt{T}(\hat\delta-\delta)\xrightarrow{L}N\big(0,\sigma^2\mathbf{g}_\delta'[\mathbf{Q}^*]^{-1}\mathbf{g}_\delta\big).$$
Thus, each element of $\hat{\boldsymbol\beta}_T$ individually is asymptotically Gaussian with deviation of order $O_p(T^{-1/2})$, and the asymptotic distribution of the full vector $\sqrt{T}(\hat{\boldsymbol\beta}_T-\boldsymbol\beta)$ is multivariate Gaussian.
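A closing Monte Carlo sketch (hypothetical parameter values, $p=1$) illustrates this last claim: the spread of $\sqrt{T}(\hat\delta-\delta)$ from the original regression stays roughly constant as $T$ grows, consistent with the $O_p(T^{-1/2})$ rate for the trend coefficient here, in contrast with the $T^{3/2}$ rate of the pure time trend model of Section 2.1.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, delta, phi, R = 1.0, 0.5, 0.5, 500

for T in (200, 800, 3200):
    dev = np.empty(R)
    for r in range(R):
        Y = np.zeros(T + 1)
        for s in range(1, T + 1):
            Y[s] = alpha + delta * s + phi * Y[s - 1] + rng.standard_normal()
        t = np.arange(1, T + 1)
        X = np.column_stack([Y[:-1], np.ones(T), t])   # original regressors (Y_{t-1}, 1, t)
        b = np.linalg.solve(X.T @ X, X.T @ Y[1:])
        dev[r] = b[2] - delta                          # delta-hat minus delta
    print(T, np.sqrt(T) * dev.std())                   # roughly constant across T
```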