Ch. 20 Processes with Deterministic Trends
1 Traditional Asymptotic Results of OLS
Suppose a linear regression model with stochastic regressors given by
\[
Y_t = x_t'\beta + \varepsilon_t, \qquad t = 1, 2, \ldots, T, \qquad x_t, \beta \in \mathbb{R}^k, \tag{1}
\]
or in matrix form:
\[
y = X\beta + \varepsilon.
\]
We are interested in the asymptotic properties, such as consistency and the limiting distribution, of the OLS estimator $\hat\beta = (X'X)^{-1}X'y$ as $T \to \infty$, under simple traditional assumptions.
1.1 Independent Identically Distributed Observations
1.1.1 Consistency
To prove consistency of $\hat\beta$, we use Kolmogorov's law of large numbers from Ch. 4. Rewriting
\[
\hat\beta - \beta = (X'X)^{-1}X'\varepsilon
= \left(\frac{X'X}{T}\right)^{-1}\frac{X'\varepsilon}{T}
= \left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T x_t \varepsilon_t}{T}\right),
\]
we have the following result.
Theorem:
In addition to (1), suppose that
(1) $\{(x_t', \varepsilon_t)'\}_{(k+1)\times 1}$ is an i.i.d. sequence;
(2) (a) $E(x_t\varepsilon_t) = 0$;
    (b) $E|X_{ti}\varepsilon_t| < \infty$, $i = 1, 2, \ldots, k$;
(3) (a) $E|X_{ti}|^2 < \infty$, $i = 1, 2, \ldots, k$;
    (b) $M \equiv E(x_t x_t')$ is positive definite.
Then $\hat\beta \xrightarrow{a.s.} \beta$.
Remark:
1. Assumption (2a) specifies the mean of the i.i.d. sequence $\{X_{ti}\varepsilon_t\}$, $i = 1, 2, \ldots, k$ (see Proposition 3.3 of White, 2001, p. 32), and (2b) requires that its first moment exist.
2. Assumption (3a) guarantees, via the Cauchy-Schwarz inequality, that the first moment of $X_{ti}X_{tj}$ exists, and (3b) concerns the mean of the i.i.d. sequence $\{X_{ti}X_{tj}\}$, $i, j = 1, 2, \ldots, k$.
Existence of the first moment is what is needed for the LLN of an i.i.d. sequence; see p. 15 of Ch. 4.
Proof:
It is immediate from these assumptions that
\[
\frac{X'\varepsilon}{T} = \frac{\sum_{t=1}^T x_t\varepsilon_t}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t\varepsilon_t}{T}\right) = 0
\]
and
\[
\frac{X'X}{T} = \frac{\sum_{t=1}^T x_t x_t'}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right) = M. \tag{2}
\]
Therefore
\[
\hat\beta - \beta \xrightarrow{a.s.} M^{-1} \cdot 0 = 0,
\]
or
\[
\hat\beta \xrightarrow{a.s.} \beta.
\]
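As an illustration (not part of the formal development), the theorem can be checked with a small Monte Carlo sketch in Python; the regressor and error distributions below are arbitrary choices that satisfy assumptions (1)-(3).
\begin{verbatim}
# A minimal simulation sketch of the consistency theorem above: i.i.d.
# regressors and errors with E(x_t e_t) = 0 and the required moments.
# The distributions chosen here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0])              # true beta, k = 2

for T in [100, 1_000, 10_000, 100_000]:
    X = rng.normal(size=(T, 2))          # E|X_ti|^2 < inf, E(x_t x_t') = I
    eps = rng.standard_t(df=5, size=T)   # non-Gaussian, independent of X
    y = X @ beta + eps
    b_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
    print(T, b_hat)                      # approaches (1.0, 2.0) as T grows
\end{verbatim}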
1.1.2 Asymptotic Normality
To prove asymptotic normality of $\hat\beta$, we use Kolmogorov's LLN and the Lindeberg-Lévy central limit theorem from Ch. 4. Rewriting
\[
\sqrt{T}(\hat\beta - \beta)
= \left(\frac{X'X}{T}\right)^{-1}\frac{X'\varepsilon}{\sqrt{T}}
= \left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T x_t\varepsilon_t}{\sqrt{T}}\right),
\]
we have the following result.
Theorem:
In addition to (1), suppose that
(i) $\{(x_t', \varepsilon_t)'\}$ is an i.i.d. sequence;
(ii) (a) $E(x_t\varepsilon_t) = 0$;
     (b) $E|X_{ti}\varepsilon_t|^2 < \infty$, $i = 1, 2, \ldots, k$;
     (c) $V_T \equiv \mathrm{Var}(T^{-1/2}X'\varepsilon) = V$ is positive definite;
(iii) (a) $M \equiv E(x_t x_t')$ is positive definite;
      (b) $E|X_{ti}|^2 < \infty$, $i = 1, 2, \ldots, k$.
Then $D^{-1/2}\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} N(0, I)$, where $D \equiv M^{-1}VM^{-1}$.
Remark:
1. Assumption (ii.a) specifies the mean of the i.i.d. sequence $\{X_{ti}\varepsilon_t\}$, $i = 1, 2, \ldots, k$; (ii.b) requires that its second moment exist, which is needed for the application of the Lindeberg-Lévy central limit theorem (see p. 22 of Ch. 4); and (ii.c) standardizes the random vector $T^{-1/2}X'\varepsilon$ so that the asymptotic distribution is unit multivariate normal.
2. Assumption (iii.a) concerns the mean of the i.i.d. sequence $\{X_{ti}X_{tj}\}$, $i, j = 1, 2, \ldots, k$, and (iii.b) guarantees, via the Cauchy-Schwarz inequality, that its first moment exists. Existence of the first moment is what is needed for the LLN of an i.i.d. sequence; see p. 15 of Ch. 4.
Proof:
It is immediate from these assumptions that
\[
\sqrt{T}\,\frac{X'\varepsilon}{T} = T^{-1/2}X'\varepsilon
\xrightarrow{L} N\big(0, \mathrm{Var}(T^{-1/2}X'\varepsilon)\big) \equiv N(0, V)
\]
and
\[
\frac{X'X}{T} = \frac{\sum_{t=1}^T x_t x_t'}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right) = M.
\]
Therefore
\[
\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} M^{-1} \cdot N(0, V) \equiv N(0, M^{-1}VM^{-1}),
\]
or
\[
(M^{-1}VM^{-1})^{-1/2}\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} N(0, I).
\]
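Again purely as an illustration, here is a minimal Monte Carlo sketch of the standardized statistic in this theorem. With standard normal regressors and uniform errors (an illustrative choice), $M = I$ and $V = \mathrm{Var}(\varepsilon_t)\,M = (1/3)I$, so $D$ is known in closed form.
\begin{verbatim}
# Monte Carlo sketch: D^{-1/2} sqrt(T)(b_hat - beta) should be close to
# N(0, I). Here M = E(x_t x_t') = I and V = Var(eps_t) M = (1/3) I, so
# D = M^{-1} V M^{-1} = (1/3) I. Illustrative assumptions throughout.
import numpy as np

rng = np.random.default_rng(1)
beta, T, reps = np.array([1.0, 2.0]), 2_000, 5_000
draws = np.empty((reps, 2))

for r in range(reps):
    X = rng.normal(size=(T, 2))
    eps = rng.uniform(-1.0, 1.0, size=T)      # mean 0, variance 1/3
    y = X @ beta + eps
    b_hat = np.linalg.solve(X.T @ X, X.T @ y)
    draws[r] = np.sqrt(T) * (b_hat - beta)

z = draws * np.sqrt(3.0)                      # D^{-1/2} = sqrt(3) I
print(z.mean(axis=0))                         # approx (0, 0)
print(z.std(axis=0))                          # approx (1, 1)
\end{verbatim}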
1.2 Independent Heterogeneously Distributed Observations
1.2.1 Consistency
To prove consistency of $\hat\beta$, we use the revised Markov law of large numbers from Ch. 4. Rewriting
\[
\hat\beta - \beta = (X'X)^{-1}X'\varepsilon
= \left(\frac{X'X}{T}\right)^{-1}\frac{X'\varepsilon}{T}
= \left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T x_t \varepsilon_t}{T}\right),
\]
we have the following result.
Theorem:
In addition to (1), suppose
(i) $\{(x_t', \varepsilon_t)'\}$ is an independent sequence;
(ii) (a) $E(x_t\varepsilon_t) = 0$;
     (b) $E|X_{ti}\varepsilon_t|^{1+\delta} \le \Delta < \infty$ for some $\delta > 0$, $i = 1, 2, \ldots, k$;
(iii) (a) $M_T \equiv E(X'X/T)$ is positive definite;
      (b) $E|X_{ti}^2|^{1+\delta} \le \Delta < \infty$ for some $\delta > 0$, $i = 1, 2, \ldots, k$.
Then $\hat\beta \xrightarrow{a.s.} \beta$.
Remark:
1. Assumption (ii.a) specifies the mean of the independent sequence $\{X_{ti}\varepsilon_t\}$, $i = 1, 2, \ldots, k$, and (ii.b) requires that its $(1+\delta)$-th moment exist.
2. Assumption (iii.a) identifies the limit of the almost sure convergence of $X'X/T$, and (iii.b) guarantees, via the Cauchy-Schwarz inequality, that the $(1+\delta)$-th moment of $X_{ti}X_{tj}$, $i, j = 1, 2, \ldots, k$, exists.
Existence of a $(1+\delta)$-th moment is what is needed for the LLN of an independent sequence; see p. 15 of Ch. 4.
Proof:
It is immediate from these assumptions that
\[
\frac{X'\varepsilon}{T} = \frac{\sum_{t=1}^T x_t\varepsilon_t}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t\varepsilon_t}{T}\right) = 0
\]
and
\[
\frac{X'X}{T} = \frac{\sum_{t=1}^T x_t x_t'}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right) = M_T.
\]
Therefore
\[
\hat\beta - \beta \xrightarrow{a.s.} M_T^{-1} \cdot 0 = 0,
\]
or
\[
\hat\beta \xrightarrow{a.s.} \beta.
\]
1.2.2 Asymptotic Normality
To prove asymptotic normality of $\hat\beta$, we use the revised Markov LLN and the Liapounov and Lindeberg-Feller central limit theorems from Ch. 4. Rewriting
\[
\sqrt{T}(\hat\beta - \beta)
= \left(\frac{X'X}{T}\right)^{-1}\frac{X'\varepsilon}{\sqrt{T}}
= \left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right)^{-1}\left(\frac{\sum_{t=1}^T x_t\varepsilon_t}{\sqrt{T}}\right),
\]
we have the following result.
Theorem:
In addition to (1), suppose that
(i) $\{(x_t', \varepsilon_t)'\}$ is an independent sequence;
(ii) (a) $E(x_t\varepsilon_t) = 0$;
     (b) $E|X_{ti}\varepsilon_t|^{2+\delta} \le \Delta < \infty$ for some $\delta > 0$, $i = 1, 2, \ldots, k$;
     (c) $V_T \equiv \mathrm{Var}(T^{-1/2}X'\varepsilon)$ is positive definite;
(iii) (a) $M_T \equiv E(X'X/T)$ is positive definite;
      (b) $E|X_{ti}^2|^{1+\delta} \le \Delta < \infty$ for some $\delta > 0$, $i = 1, 2, \ldots, k$.
Then $D_T^{-1/2}\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} N(0, I)$, where $D_T \equiv M_T^{-1}V_T M_T^{-1}$.
Remark:
1. Assumption (ii.a) specifies the mean of the independent sequence $\{X_{ti}\varepsilon_t\}$, $i = 1, 2, \ldots, k$; (ii.b) requires that its $(2+\delta)$-th moment exist, which is needed for the application of Liapounov's central limit theorem (see p. 23 of Ch. 4); and (ii.c) standardizes the random vector $T^{-1/2}X'\varepsilon$ so that the asymptotic distribution is unit multivariate normal.
2. Assumption (iii.a) identifies the limit of the almost sure convergence of $X'X/T$, and (iii.b) guarantees, via the Cauchy-Schwarz inequality, that the $(1+\delta)$-th moment of $X_{ti}X_{tj}$, $i, j = 1, 2, \ldots, k$, exists. Existence of a $(1+\delta)$-th moment is what is needed for the LLN of an independent sequence; see p. 15 of Ch. 4.
Proof:
It is immediate from these assumptions that
\[
\sqrt{T}\,\frac{X'\varepsilon}{T} = T^{-1/2}X'\varepsilon
\xrightarrow{L} N\big(0, \mathrm{Var}(T^{-1/2}X'\varepsilon)\big) \equiv N(0, V_T)
\]
and
\[
\frac{X'X}{T} = \frac{\sum_{t=1}^T x_t x_t'}{T}
\xrightarrow{a.s.} E\!\left(\frac{\sum_{t=1}^T x_t x_t'}{T}\right) = M_T.
\]
Therefore
\[
\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} M_T^{-1} \cdot N(0, V_T) \equiv N(0, M_T^{-1}V_T M_T^{-1}),
\]
or
\[
(M_T^{-1}V_T M_T^{-1})^{-1/2}\sqrt{T}(\hat\beta - \beta) \xrightarrow{L} N(0, I).
\]
From the results above, the asymptotic normality of the OLS estimator depends crucially on the existence of at least second moments of the regressors $X_{ti}$, from which the LLN gives $X'X/T \xrightarrow{a.s.} E\left(\sum_{t=1}^T x_t x_t'/T\right) = M_T = O(1)$. As we saw in the last chapter, an I(1) variable does not have finite second moments; therefore, when the regressor is a unit root process, the traditional asymptotic results for the OLS estimator do not apply. There is, however, a case in which the regressor is not stochastic but nevertheless violates the condition $X'X/T \xrightarrow{a.s.} E\left(\sum_{t=1}^T x_t x_t'/T\right) = M_T = O(1)$; as we will see in the following, asymptotic normality is still valid there, though the rate of convergence to normality changes.
2 Processes with Deterministic Time Trends
The coefficients of regression models involving unit roots or deterministic time trends are typically estimated by OLS. However, the asymptotic distributions of the coefficient estimates cannot be calculated in the same way as those for regression models involving stationary variables. Among other difficulties, the estimates of different parameters will in general have different asymptotic rates of convergence.
2.1 Asymptotic Distribution of OLS Estimators of the Simple Time Trend Model
Consider the OLS estimation of the parameters of a simple time trend,
\[
Y_t = \alpha + \delta t + \varepsilon_t, \tag{3}
\]
for $\varepsilon_t$ a white noise process. If $\varepsilon_t \sim N(0, \sigma^2)$, the model (3) satisfies the classical assumptions and the standard OLS $t$ or $F$ statistics have exact small-sample $t$ or $F$ distributions. On the other hand, if $\varepsilon_t$ is non-Gaussian, then a slightly different technique from that employed in the last section is needed for finding the asymptotic distribution of the OLS estimates of $\alpha$ and $\delta$.
Write (3) in the form of the standard regression model,
\[
Y_t = x_t'\beta + \varepsilon_t,
\]
where
\[
x_t' \equiv \begin{bmatrix} 1 & t \end{bmatrix}, \qquad \beta \equiv (\alpha, \delta)'.
\]
Let $\hat\beta_T$ denote the OLS estimate of $\beta$ based on a sample of size $T$; the deviation of $\hat\beta_T$ from the true value can be expressed as
\[
(\hat\beta_T - \beta) = \left[\sum_{t=1}^T x_t x_t'\right]^{-1}\left[\sum_{t=1}^T x_t \varepsilon_t\right]. \tag{4}
\]
To find the asymptotic distribution of $(\hat\beta_T - \beta)$, the approach in the last section was to multiply (4) by $\sqrt{T}$, resulting in
\[
\sqrt{T}(\hat\beta_T - \beta) = \left[(1/T)\sum_{t=1}^T x_t x_t'\right]^{-1}\left[(1/\sqrt{T})\sum_{t=1}^T x_t \varepsilon_t\right]. \tag{5}
\]
The usual assumption was that $(1/T)\sum_{t=1}^T x_t x_t'$ converges in probability to a nonsingular matrix $M$ while $(1/\sqrt{T})\sum_{t=1}^T x_t \varepsilon_t$ converges in distribution to a $N(0, V)$ random variable, implying that $\sqrt{T}(\hat\beta_T - \beta) \xrightarrow{L} N(0, M^{-1}VM^{-1})$.
For $x_t$ given in (5), we note that
\[
\frac{1}{T^{v+1}}\sum_{t=1}^T t^v \to \frac{1}{v+1}, \tag{6}
\]
implying that
\[
\sum_{t=1}^T x_t x_t'
= \begin{bmatrix} \sum 1 & \sum t \\ \sum t & \sum t^2 \end{bmatrix}
= \begin{bmatrix} T & T(T+1)/2 \\ T(T+1)/2 & T(T+1)(2T+1)/6 \end{bmatrix}
= \begin{bmatrix} O(T) & O(T^2) \\ O(T^2) & O(T^3) \end{bmatrix}. \tag{7}
\]
In contrast to the usual result (2), the matrix $(1/T)\sum_{t=1}^T x_t x_t'$ in (5) diverges. To obtain a convergent and nondegenerate limiting distribution, we premultiply and postmultiply $\left[\sum_{t=1}^T x_t x_t'\right]$ by the matrix
\[
\Upsilon_T^{-1} = \begin{bmatrix} T^{1/2} & 0 \\ 0 & T^{3/2} \end{bmatrix}^{-1},
\]
obtaining
\[
\left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t x_t'\right]\Upsilon_T^{-1}\right\}
= \begin{bmatrix} T^{-1/2} & 0 \\ 0 & T^{-3/2} \end{bmatrix}
\begin{bmatrix} \sum 1 & \sum t \\ \sum t & \sum t^2 \end{bmatrix}
\begin{bmatrix} T^{-1/2} & 0 \\ 0 & T^{-3/2} \end{bmatrix}
= \begin{bmatrix} T^{-1}\sum 1 & T^{-2}\sum t \\ T^{-2}\sum t & T^{-3}\sum t^2 \end{bmatrix}
\to Q,
\]
where
\[
Q = \begin{bmatrix} 1 & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{3} \end{bmatrix} \tag{8}
\]
according to (6).
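A quick numerical check of (6)-(8), in Python and purely illustrative: scaling $\sum x_t x_t'$ by $\Upsilon_T^{-1}$ on both sides indeed stabilizes the matrix at $Q$.
\begin{verbatim}
# Numerical check that Upsilon_T^{-1} [sum x_t x_t'] Upsilon_T^{-1} -> Q
# for x_t' = [1, t], with Q = [[1, 1/2], [1/2, 1/3]] as in (8).
import numpy as np

for T in [10, 100, 10_000]:
    t = np.arange(1, T + 1)
    X = np.column_stack([np.ones(T), t])    # rows are x_t' = [1, t]
    S = X.T @ X                             # [[T, sum t], [sum t, sum t^2]]
    U_inv = np.diag([T ** -0.5, T ** -1.5]) # Upsilon_T^{-1}
    print(T)
    print(U_inv @ S @ U_inv)                # -> [[1, 0.5], [0.5, 0.3333]]
\end{verbatim}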
Turning next to the second term in (4) and premultiplying it by $\Upsilon_T^{-1}$,
\[
\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t \varepsilon_t\right]
= \begin{bmatrix} T^{-1/2} & 0 \\ 0 & T^{-3/2} \end{bmatrix}
\begin{bmatrix} \sum \varepsilon_t \\ \sum t\varepsilon_t \end{bmatrix}
= \begin{bmatrix} (1/\sqrt{T})\sum \varepsilon_t \\ (1/\sqrt{T})\sum (t/T)\varepsilon_t \end{bmatrix}. \tag{9}
\]
We now prove the asymptotic normality of (9) under standard assumptions about $\varepsilon_t$. Suppose that $\varepsilon_t$ is i.i.d. with mean zero, variance $\sigma^2$, and finite fourth moment. Then the first element of the vector in (9) satisfies
\[
(1/\sqrt{T})\sum \varepsilon_t \xrightarrow{L} N(0, \sigma^2)
\]
by the Lindeberg-Lévy CLT.
For the second element of the vector in (9), observe that $\{(t/T)\varepsilon_t\}$ is a martingale difference sequence satisfying the definition on p. 13 of Ch. 4. Specifically, its variance is
\[
\sigma_t^2 = E[(t/T)\varepsilon_t]^2 = \sigma^2 (t^2/T^2),
\]
where
\[
(1/T)\sum_{t=1}^T \sigma_t^2 = \sigma^2 (1/T^3)\sum_{t=1}^T t^2 \to \sigma^2/3.
\]
Furthermore, to apply the CLT for a martingale difference sequence, we need to show that $(1/T)\sum_{t=1}^T [(t/T)\varepsilon_t]^2 \xrightarrow{p} \sigma^2/3$, as required by condition (iii) on p. 26 of Ch. 4.
To prove this, notice that
\begin{align*}
E\left((1/T)\sum_{t=1}^T [(t/T)\varepsilon_t]^2 - (1/T)\sum_{t=1}^T \sigma_t^2\right)^2
&= E\left((1/T)\sum_{t=1}^T [(t/T)\varepsilon_t]^2 - (1/T)\sum_{t=1}^T (t/T)^2\sigma^2\right)^2 \\
&= E\left((1/T)\sum_{t=1}^T (t/T)^2(\varepsilon_t^2 - \sigma^2)\right)^2 \\
&= (1/T)^2\sum_{t=1}^T (t/T)^4 E(\varepsilon_t^2 - \sigma^2)^2 \\
&= E(\varepsilon_t^2 - \sigma^2)^2 \left((1/T^6)\sum_{t=1}^T t^4\right) \to 0,
\end{align*}
according to (6), and because the fourth moment of $\varepsilon_t$ exists by assumption.
This implies that
\[
(1/T)\sum_{t=1}^T [(t/T)\varepsilon_t]^2 - (1/T)\sum_{t=1}^T \sigma_t^2 \xrightarrow{m.s.} 0,
\]
which in turn implies that
\[
(1/T)\sum_{t=1}^T [(t/T)\varepsilon_t]^2 \xrightarrow{p} \sigma^2/3.
\]
Hence, from the theorem on p. 26 of Ch. 4, $(1/\sqrt{T})\sum_{t=1}^T (t/T)\varepsilon_t$ satisfies the CLT:
\[
(1/\sqrt{T})\sum_{t=1}^T (t/T)\varepsilon_t \xrightarrow{L} N(0, \sigma^2/3).
\]
Finally, consider the joint distribution of the two elements in the $(2\times 1)$ vector described by (9). Any linear combination of these elements takes the form
\[
(1/\sqrt{T})\sum_{t=1}^T [\lambda_1 + \lambda_2(t/T)]\varepsilon_t.
\]
Then $[\lambda_1 + \lambda_2(t/T)]\varepsilon_t$ is also a martingale difference sequence, with positive variance given by $\sigma^2[\lambda_1^2 + 2\lambda_1\lambda_2(t/T) + \lambda_2^2(t/T)^2]$, satisfying
\[
(1/T)\sum_{t=1}^T \sigma^2\left[\lambda_1^2 + 2\lambda_1\lambda_2(t/T) + \lambda_2^2(t/T)^2\right]
\to \sigma^2\left[\lambda_1^2 + 2\lambda_1\lambda_2 \cdot \tfrac{1}{2} + \lambda_2^2 \cdot \tfrac{1}{3}\right]
= \sigma^2 \begin{bmatrix} \lambda_1 & \lambda_2 \end{bmatrix}
\begin{bmatrix} 1 & 1/2 \\ 1/2 & 1/3 \end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix}
= \sigma^2 \lambda' Q \lambda
\]
for $\lambda = (\lambda_1, \lambda_2)'$ and $Q$ the matrix in (8). Furthermore, we can show that
\[
(1/T)\sum_{t=1}^T [\lambda_1 + \lambda_2(t/T)]^2\varepsilon_t^2 \xrightarrow{p} \sigma^2 \lambda' Q \lambda.
\]
That is, the martingale difference sequence $[\lambda_1 + \lambda_2(t/T)]\varepsilon_t$ satisfies the conditions for the CLT. Thus, any linear combination of the two elements in the vector in (9) is asymptotically Gaussian, implying a bivariate Gaussian distribution:
\[
\begin{bmatrix} (1/\sqrt{T})\sum \varepsilon_t \\ (1/\sqrt{T})\sum (t/T)\varepsilon_t \end{bmatrix}
\xrightarrow{L} N(0, \sigma^2 Q)
\]
from the Cramér-Wold device and the fact that
\[
E\left\{\left[(1/\sqrt{T})\sum \varepsilon_t\right]\left[(1/\sqrt{T})\sum (t/T)\varepsilon_t\right]\right\}
= \frac{1}{T^2}\,\sigma^2 \sum t \to \frac{1}{2}\sigma^2.
\]
Collecting results, we have
\begin{align*}
\begin{bmatrix} T^{1/2}(\hat\alpha_T - \alpha) \\ T^{3/2}(\hat\delta_T - \delta) \end{bmatrix}
&= \Upsilon_T\left[\sum_{t=1}^T x_t x_t'\right]^{-1}\left[\sum_{t=1}^T x_t \varepsilon_t\right] \\
&= \Upsilon_T\left[\sum_{t=1}^T x_t x_t'\right]^{-1}\Upsilon_T\,\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t \varepsilon_t\right] \\
&= \left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t x_t'\right]\Upsilon_T^{-1}\right\}^{-1}\left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t \varepsilon_t\right]\right\} \\
&\xrightarrow{L} Q^{-1} \cdot N(0, \sigma^2 Q) \equiv N(0, \sigma^2 Q^{-1}).
\end{align*}
It turns out that the OLS estimators $\hat\alpha_T$ and $\hat\delta_T$ have different asymptotic rates of convergence. Note that the time trend estimator $\hat\delta_T$ is superconsistent: not only is $\hat\delta_T \xrightarrow{p} \delta$, but even when multiplied by $T$, we still have
\[
T(\hat\delta_T - \delta) \xrightarrow{p} 0.
\]
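The two rates can be seen in a short Monte Carlo sketch (illustrative; parameter values are arbitrary). The standard deviations of $\sqrt{T}(\hat\alpha_T - \alpha)$ and $T^{3/2}(\hat\delta_T - \delta)$ stay roughly constant as $T$ grows, matching $\sigma^2 Q^{-1}$.
\begin{verbatim}
# Monte Carlo sketch of the two convergence rates in Y_t = a + d t + e_t.
# With sigma^2 = 1, the limit covariance is Q^{-1} = [[4, -6], [-6, 12]],
# so sqrt(T)(a_hat - a) has sd ~ 2 and T^{3/2}(d_hat - d) has sd ~ 3.46.
import numpy as np

rng = np.random.default_rng(3)
a, d, reps = 1.0, 0.5, 2_000

for T in [100, 400, 1_600]:
    t = np.arange(1, T + 1)
    X = np.column_stack([np.ones(T), t])
    dev = np.empty((reps, 2))
    for r in range(reps):
        y = a + d * t + rng.normal(size=T)
        dev[r] = np.linalg.solve(X.T @ X, X.T @ y) - (a, d)
    print(T, np.sqrt(T) * dev[:, 0].std(), T ** 1.5 * dev[:, 1].std())
\end{verbatim}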
2.2 Asymptotic Distribution of OLS Estimators for an Autoregressive Process Around a Deterministic Time Trend
The same principles can be used to study a general autoregressive process around a deterministic time trend:
\[
Y_t = \alpha + \delta t + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t. \tag{10}
\]
It is assumed that $\varepsilon_t$ is i.i.d. with mean zero, variance $\sigma^2$, and finite fourth moment, and that the roots of
\[
1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p = 0
\]
lie outside the unit circle. Consider a sample of $T + p$ observations on $Y$, and let $\hat\alpha_T, \hat\delta_T, \hat\phi_{1,T}, \ldots, \hat\phi_{p,T}$ denote the coefficients from OLS estimation of (10) for $t = 1, 2, \ldots, T$.
Remark: Each regressor $Y_{t-i}$, $i = 1, \ldots, p$, in (10) is a trend-stationary process (and hence nonstationary itself!). To remove the nonstationarity of the regressors so that the LLN ($X'X/T \xrightarrow{p} E(X'X/T)$) remains valid, we may transform the regressors into zero-mean covariance-stationary processes by subtracting the time trend from each regressor.
2.2.1 A Useful Transformation of the Regressors
By adding and subtracting $\phi_j[\alpha + \delta(t - j)]$ for $j = 1, 2, \ldots, p$ on the right side, the regression model (10) can equivalently be written as (each regressor $Y_{t-j}$ carries its own constant and trend $(t-j)$)
\begin{align*}
Y_t &= \alpha(1 + \phi_1 + \phi_2 + \cdots + \phi_p) - \delta(\phi_1 + 2\phi_2 + \cdots + p\phi_p) \\
&\quad + \delta(1 + \phi_1 + \phi_2 + \cdots + \phi_p)\,t \\
&\quad + \phi_1[Y_{t-1} - \alpha - \delta(t-1)] + \phi_2[Y_{t-2} - \alpha - \delta(t-2)] + \cdots \\
&\quad + \phi_p[Y_{t-p} - \alpha - \delta(t-p)] + \varepsilon_t
\end{align*}
or
\[
Y_t = \alpha^* + \delta^* t + \phi_1^* Y_{t-1}^* + \phi_2^* Y_{t-2}^* + \cdots + \phi_p^* Y_{t-p}^* + \varepsilon_t, \tag{11}
\]
where
\begin{align*}
\alpha^* &= \alpha(1 + \phi_1 + \phi_2 + \cdots + \phi_p) - \delta(\phi_1 + 2\phi_2 + \cdots + p\phi_p), \\
\delta^* &= \delta(1 + \phi_1 + \phi_2 + \cdots + \phi_p), \\
\phi_j^* &= \phi_j \quad \text{for } j = 1, 2, \ldots, p,
\end{align*}
and
\[
Y_{t-j}^* = Y_{t-j} - \alpha - \delta(t - j) \quad \text{for } j = 1, 2, \ldots, p.
\]
The original regression model (10) can be written
\[
Y_t = x_t'\beta + \varepsilon_t, \tag{12}
\]
where
\[
x_t = \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-p} \\ 1 \\ t \end{bmatrix},
\qquad
\beta = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \\ \alpha \\ \delta \end{bmatrix}.
\]
The algebraic transformation in arriving at (11) can then be described as rewriting (12) in the form
\[
Y_t = x_t' G' [G']^{-1}\beta + \varepsilon_t = [x_t^*]'\beta^* + \varepsilon_t, \tag{13}
\]
where
\[
G' \equiv \begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & 0 \\
-\alpha + \delta & -\alpha + 2\delta & \cdots & -\alpha + p\delta & 1 & 0 \\
-\delta & -\delta & \cdots & -\delta & 0 & 1
\end{bmatrix},
\]
\[
[G']^{-1} \equiv \begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & 0 \\
\alpha - \delta & \alpha - 2\delta & \cdots & \alpha - p\delta & 1 & 0 \\
\delta & \delta & \cdots & \delta & 0 & 1
\end{bmatrix}
\]
(hint: from the partitioned inverse rule,
\[
\begin{bmatrix} I & 0 \\ H & I \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ -H & I \end{bmatrix}\Big),
\]
\[
x_t^* \equiv G x_t = \begin{bmatrix} Y_{t-1}^* \\ Y_{t-2}^* \\ \vdots \\ Y_{t-p}^* \\ 1 \\ t \end{bmatrix},
\qquad
\beta^* \equiv [G']^{-1}\beta = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \\ \alpha^* \\ \delta^* \end{bmatrix}.
\]
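The transformation can be verified numerically for a small case. The sketch below (illustrative, with $p = 2$ and arbitrary parameter values) builds $G'$ as displayed above and checks that $[G']^{-1}\beta$ reproduces $(\phi_1, \phi_2, \alpha^*, \delta^*)'$.
\begin{verbatim}
# Illustrative check of the transformation for p = 2: construct G' as
# displayed above and verify beta* = [G']^{-1} beta matches the formulas
# for alpha*, delta*, phi_j*. Parameter values are arbitrary.
import numpy as np

alpha, delta = 1.0, 0.5
phi = np.array([0.4, 0.3])       # roots of 1 - .4z - .3z^2 lie outside unit circle
p = len(phi)

G_t = np.eye(p + 2)              # G' (G transpose)
G_t[p, :p] = [-alpha + delta * j for j in range(1, p + 1)]  # row p+1
G_t[p + 1, :p] = -delta                                     # row p+2

beta = np.concatenate([phi, [alpha, delta]])
beta_star = np.linalg.solve(G_t, beta)     # [G']^{-1} beta

alpha_star = alpha * (1 + phi.sum()) - delta * (np.arange(1, p + 1) * phi).sum()
delta_star = delta * (1 + phi.sum())
print(beta_star)                           # [0.4, 0.3, alpha*, delta*]
print(alpha_star, delta_star)              # matches the last two entries
\end{verbatim}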
The OLS estimate of $\beta^*$ based on a regression of $Y_t$ on $x_t^*$ is given by
\begin{align*}
\hat\beta^* &= \left[\sum_{t=1}^T x_t^*[x_t^*]'\right]^{-1}\left[\sum_{t=1}^T x_t^* Y_t\right] \\
&= \left[G\left(\sum_{t=1}^T x_t x_t'\right)G'\right]^{-1} G\left(\sum_{t=1}^T x_t Y_t\right) \\
&= [G']^{-1}\left(\sum_{t=1}^T x_t x_t'\right)^{-1} G^{-1} G\left(\sum_{t=1}^T x_t Y_t\right) \\
&= [G']^{-1}\left(\sum_{t=1}^T x_t x_t'\right)^{-1}\left(\sum_{t=1}^T x_t Y_t\right) \\
&= [G']^{-1}\hat\beta,
\end{align*}
where $\hat\beta$ is the OLS coefficient estimate from the original regression of $Y_t$ on $x_t$. The asymptotic distribution of $\hat\beta$ can therefore be inferred from
\[
\hat\beta = G'\hat\beta^*. \tag{14}
\]
2.2.2 The Asymptotic Distribution of OLS Estimates for the Transformed Regression
To derive the asymptotic distribution of $\hat\beta_T^*$, we note that
\begin{align*}
\Upsilon_T(\hat\beta_T^* - \beta^*)
&= \Upsilon_T\left[\sum_{t=1}^T x_t^*[x_t^*]'\right]^{-1}\left[\sum_{t=1}^T x_t^*\varepsilon_t\right] \\
&= \Upsilon_T\left[\sum_{t=1}^T x_t^*[x_t^*]'\right]^{-1}\Upsilon_T\,\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t^*\varepsilon_t\right] \\
&= \left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t^*[x_t^*]'\right]\Upsilon_T^{-1}\right\}^{-1}\left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t^*\varepsilon_t\right]\right\},
\end{align*}
where
\[
\Upsilon_T = \begin{bmatrix}
T^{1/2} & 0 & \cdots & 0 & 0 & 0 \\
0 & T^{1/2} & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & T^{1/2} & 0 & 0 \\
0 & 0 & \cdots & 0 & T^{1/2} & 0 \\
0 & 0 & \cdots & 0 & 0 & T^{3/2}
\end{bmatrix}.
\]
From (13),
\[
\sum_{t=1}^T x_t^*[x_t^*]' = \begin{bmatrix}
\sum (Y_{t-1}^*)^2 & \sum Y_{t-1}^* Y_{t-2}^* & \cdots & \sum Y_{t-1}^* Y_{t-p}^* & \sum Y_{t-1}^* & \sum t\,Y_{t-1}^* \\
\sum Y_{t-2}^* Y_{t-1}^* & \sum (Y_{t-2}^*)^2 & \cdots & \sum Y_{t-2}^* Y_{t-p}^* & \sum Y_{t-2}^* & \sum t\,Y_{t-2}^* \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
\sum Y_{t-p}^* Y_{t-1}^* & \sum Y_{t-p}^* Y_{t-2}^* & \cdots & \sum (Y_{t-p}^*)^2 & \sum Y_{t-p}^* & \sum t\,Y_{t-p}^* \\
\sum Y_{t-1}^* & \sum Y_{t-2}^* & \cdots & \sum Y_{t-p}^* & \sum 1 & \sum t \\
\sum t\,Y_{t-1}^* & \sum t\,Y_{t-2}^* & \cdots & \sum t\,Y_{t-p}^* & \sum t & \sum t^2
\end{bmatrix},
\]
and therefore,
\[
\Upsilon_T^{-1}\sum_{t=1}^T x_t^*[x_t^*]'\,\Upsilon_T^{-1} = \begin{bmatrix}
T^{-1}\sum (Y_{t-1}^*)^2 & \cdots & T^{-1}\sum Y_{t-1}^* Y_{t-p}^* & T^{-1}\sum Y_{t-1}^* & T^{-2}\sum t\,Y_{t-1}^* \\
\vdots & & \vdots & \vdots & \vdots \\
T^{-1}\sum Y_{t-p}^* Y_{t-1}^* & \cdots & T^{-1}\sum (Y_{t-p}^*)^2 & T^{-1}\sum Y_{t-p}^* & T^{-2}\sum t\,Y_{t-p}^* \\
T^{-1}\sum Y_{t-1}^* & \cdots & T^{-1}\sum Y_{t-p}^* & T^{-1}\,T & T^{-2}\sum t \\
T^{-2}\sum t\,Y_{t-1}^* & \cdots & T^{-2}\sum t\,Y_{t-p}^* & T^{-2}\sum t & T^{-3}\sum t^2
\end{bmatrix}. \tag{15}
\]
For the first $p$ rows and columns, the row $i$, column $j$ element of the matrix (15) satisfies
\[
T^{-1}\sum_{t=1}^T Y_{t-i}^* Y_{t-j}^* \xrightarrow{p} \gamma_{|i-j|}^*
\]
by the LLN for covariance-stationary processes. The first $p$ elements of row $p+1$ (and the first $p$ elements of column $p+1$) satisfy
\[
T^{-1}\sum_{t=1}^T Y_{t-i}^* \xrightarrow{p} 0,
\]
also by the LLN for zero-mean covariance-stationary processes. The first $p$ elements of row $p+2$ (and the first $p$ elements of column $p+2$) satisfy
\[
T^{-2}\sum_{t=1}^T t\,Y_{t-i}^* \xrightarrow{p} 0
\]
by the theorem below. Thus,
\[
\Upsilon_T^{-1}\sum_{t=1}^T x_t^*[x_t^*]'\,\Upsilon_T^{-1} \xrightarrow{p} Q^*, \tag{16}
\]
where
\[
Q^* = \begin{bmatrix}
\gamma_0^* & \gamma_1^* & \cdots & \gamma_{p-1}^* & 0 & 0 \\
\gamma_1^* & \gamma_0^* & \cdots & \gamma_{p-2}^* & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
\gamma_{p-1}^* & \gamma_{p-2}^* & \cdots & \gamma_0^* & 0 & 0 \\
0 & 0 & \cdots & 0 & 1 & \tfrac{1}{2} \\
0 & 0 & \cdots & 0 & \tfrac{1}{2} & \tfrac{1}{3}
\end{bmatrix}.
\]
Theorem:
Let $Y_{t-i}^*$ be covariance-stationary with mean zero and absolutely summable autocovariances; then $T^{-2}\sum_t t\,Y_{t-i}^* \xrightarrow{p} 0$, $i = 1, 2, \ldots, p$.
Proof:
We show that $E\big(T^{-2}\sum_t t\,Y_{t-i}^* - 0\big)^2 \to 0$, which implies $T^{-2}\sum_t t\,Y_{t-i}^* \xrightarrow{m.s.} 0$ and hence also $T^{-2}\sum_t t\,Y_{t-i}^* \xrightarrow{p} 0$.
To see this, write
\begin{align*}
E\Big(T^{-2}\sum_t t\,Y_{t-i}^* - 0\Big)^2
&= (1/T^4)\,E\big[(Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*)(Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*)\big] \\
&= (1/T^4)\,E\big\{(Y_{1-i}^*)[Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*] \\
&\qquad\quad + (2Y_{2-i}^*)[Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*] \\
&\qquad\quad + (3Y_{3-i}^*)[Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*] \\
&\qquad\quad + \cdots + (T\,Y_{T-i}^*)[Y_{1-i}^* + 2Y_{2-i}^* + \cdots + T\,Y_{T-i}^*]\big\} \\
&= (1/T^4)\big\{[1\cdot 1\,\gamma_0^* + 1\cdot 2\,\gamma_1^* + 1\cdot 3\,\gamma_2^* + 1\cdot 4\,\gamma_3^* + \cdots + 1\cdot T\,\gamma_{T-1}^*] \\
&\qquad\quad + [2\cdot 1\,\gamma_1^* + 2\cdot 2\,\gamma_0^* + 2\cdot 3\,\gamma_1^* + 2\cdot 4\,\gamma_2^* + \cdots + 2\cdot T\,\gamma_{T-2}^*] \\
&\qquad\quad + [3\cdot 1\,\gamma_2^* + 3\cdot 2\,\gamma_1^* + 3\cdot 3\,\gamma_0^* + 3\cdot 4\,\gamma_1^* + 3\cdot 5\,\gamma_2^* + \cdots + 3\cdot T\,\gamma_{T-3}^*] \\
&\qquad\quad + \cdots + [T\cdot 1\,\gamma_{T-1}^* + T\cdot 2\,\gamma_{T-2}^* + T\cdot 3\,\gamma_{T-3}^* + \cdots + T\cdot T\,\gamma_0^*]\big\} \\
&= (1/T^4)\Big\{\sum_{t=1}^T t^2\gamma_0^* + 2\sum_{t=1}^{T-1} t(t+1)\gamma_1^* + 2\sum_{t=1}^{T-2} t(t+2)\gamma_2^* + \cdots + 2T\gamma_{T-1}^*\Big\} \\
&= (1/T)\Big\{\Big[\sum_{t=1}^T t^2/T^3\Big]\gamma_0^* + \Big[\sum_{t=1}^{T-1} t(t+1)/T^3\Big]2\gamma_1^*
 + \Big[\sum_{t=1}^{T-2} t(t+2)/T^3\Big]2\gamma_2^* + \cdots + [T/T^3]\,2\gamma_{T-1}^*\Big\};
\end{align*}
then
\begin{align*}
T\cdot E\Big(T^{-2}\sum_t t\,Y_{t-i}^* - 0\Big)^2
&= \Big[\sum_{t=1}^T t^2/T^3\Big]\gamma_0^* + \Big[\sum_{t=1}^{T-1} t(t+1)/T^3\Big]2\gamma_1^*
 + \Big[\sum_{t=1}^{T-2} t(t+2)/T^3\Big]2\gamma_2^* + \cdots + [T/T^3]\,2\gamma_{T-1}^* \\
&\le \Big[\sum_{t=1}^T t^2/T^3\Big]|\gamma_0^*| + \Big[\sum_{t=1}^{T-1} t(t+1)/T^3\Big]2|\gamma_1^*|
 + \Big[\sum_{t=1}^{T-2} t(t+2)/T^3\Big]2|\gamma_2^*| + \cdots + [T/T^3]\,2|\gamma_{T-1}^*| \\
&\le \big\{|\gamma_0^*| + 2|\gamma_1^*| + 2|\gamma_2^*| + \cdots\big\}
\qquad \Big(\text{since } (1/T^{v+1})\sum_{t=1}^T t^v \to 1/(v+1) < 1\Big) \\
&< \infty.
\end{align*}
So $E\big(T^{-2}\sum_t t\,Y_{t-i}^*\big)^2 \to 0$, and therefore $T^{-2}\sum_t t\,Y_{t-i}^* \xrightarrow{p} 0$ as claimed.
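An illustrative simulation of this theorem, using a stationary AR(1) for $Y_t^*$ (whose autocovariances are absolutely summable); the AR coefficient is an arbitrary choice.
\begin{verbatim}
# Simulation sketch of the theorem: for a zero-mean stationary AR(1)
# (absolutely summable autocovariances), T^{-2} sum_t t * Ystar_t -> 0.
import numpy as np

rng = np.random.default_rng(4)
rho = 0.8                                  # illustrative AR(1) coefficient

for T in [100, 1_000, 10_000, 100_000]:
    e = rng.normal(size=T)
    ystar = np.empty(T)
    ystar[0] = e[0]
    for s in range(1, T):
        ystar[s] = rho * ystar[s - 1] + e[s]
    t = np.arange(1, T + 1)
    print(T, (t * ystar).sum() / T**2)     # magnitude shrinks like 1/sqrt(T)
\end{verbatim}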
We now turn to the second term in the expression for the OLS estimator,
\[
\left\{\Upsilon_T^{-1}\left[\sum_{t=1}^T x_t^*\varepsilon_t\right]\right\}
= \begin{bmatrix}
T^{-1/2}\sum Y_{t-1}^*\varepsilon_t \\
T^{-1/2}\sum Y_{t-2}^*\varepsilon_t \\
\vdots \\
T^{-1/2}\sum Y_{t-p}^*\varepsilon_t \\
T^{-1/2}\sum \varepsilon_t \\
T^{-1/2}\sum (t/T)\varepsilon_t
\end{bmatrix}
= T^{-1/2}\sum_{t=1}^T \xi_t,
\]
where
\[
\xi_t = \begin{bmatrix}
Y_{t-1}^*\varepsilon_t \\
Y_{t-2}^*\varepsilon_t \\
\vdots \\
Y_{t-p}^*\varepsilon_t \\
\varepsilon_t \\
(t/T)\varepsilon_t
\end{bmatrix}.
\]
But $\xi_t$ is a martingale difference sequence with variance
\[
E(\xi_t \xi_t') = \sigma^2 Q_t^*,
\]
where
\[
Q_t^* = \begin{bmatrix}
\gamma_0^* & \gamma_1^* & \cdots & \gamma_{p-1}^* & 0 & 0 \\
\gamma_1^* & \gamma_0^* & \cdots & \gamma_{p-2}^* & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
\gamma_{p-1}^* & \gamma_{p-2}^* & \cdots & \gamma_0^* & 0 & 0 \\
0 & 0 & \cdots & 0 & 1 & t/T \\
0 & 0 & \cdots & 0 & t/T & t^2/T^2
\end{bmatrix}
\]
and
\[
(1/T)\sum_{t=1}^T Q_t^* \to Q^*.
\]
Applying the CLT for martingale difference sequences, it can be shown that
\[
\Upsilon_T^{-1}\sum_{t=1}^T x_t^*\varepsilon_t \xrightarrow{L} N(0, \sigma^2 Q^*). \tag{17}
\]
It follows from (16) and (17) that
\[
\Upsilon_T(\hat\beta_T^* - \beta^*) \xrightarrow{L} N\big(0, [Q^*]^{-1}\sigma^2 Q^*[Q^*]^{-1}\big) = N\big(0, \sigma^2 [Q^*]^{-1}\big). \tag{18}
\]
2.2.3 The Asymptotic Distribution of OLS Estimators for the Original Regression
From (14) we have
\[
\begin{bmatrix} \hat\phi_1 \\ \hat\phi_2 \\ \vdots \\ \hat\phi_p \\ \hat\alpha \\ \hat\delta \end{bmatrix}
= \begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & 0 \\
-\alpha + \delta & -\alpha + 2\delta & \cdots & -\alpha + p\delta & 1 & 0 \\
-\delta & -\delta & \cdots & -\delta & 0 & 1
\end{bmatrix}
\begin{bmatrix} \hat\phi_1^* \\ \hat\phi_2^* \\ \vdots \\ \hat\phi_p^* \\ \hat\alpha^* \\ \hat\delta^* \end{bmatrix}.
\]
The original OLS estimators $\hat\phi_i$ are identical to the estimators from the transformed regression, $\hat\phi_i^*$, so the asymptotic distribution of $\hat\phi_i$ follows immediately from (18). The estimator $\hat\alpha$ is a linear combination of variables that converge to a Gaussian distribution at rate $\sqrt{T}$, so $\hat\alpha$ behaves the same way. Specifically, $\hat\alpha = g_\alpha'\hat\beta_T^*$, where
\[
g_\alpha' \equiv \begin{bmatrix} -\alpha + \delta & -\alpha + 2\delta & \cdots & -\alpha + p\delta & 1 & 0 \end{bmatrix},
\]
so from (18),
\[
\sqrt{T}(\hat\alpha - \alpha) \xrightarrow{L} N\big(0,\; g_\alpha'\,\sigma^2[Q^*]^{-1}g_\alpha\big).
\]
Finally, the estimator $\hat\delta$ is a linear combination of variables converging at different rates:
\[
\hat\delta = g_\delta'\hat\beta_T^* + \hat\delta_T^*,
\]
where
\[
g_\delta' = \begin{bmatrix} -\delta & -\delta & \cdots & -\delta & 0 & 0 \end{bmatrix}.
\]
Since $g_\delta'(\hat\beta_T^* - \beta^*)$ is $O_p(T^{-1/2})$ and $(\hat\delta_T^* - \delta^*)$ is $O_p(T^{-3/2})$, $\hat\delta - \delta$ is $O_p(T^{-1/2})$ (see p. 3 of Ch. 4); therefore,
\[
\sqrt{T}(\hat\delta - \delta) \xrightarrow{L} N\big(0,\; g_\delta'\,\sigma^2[Q^*]^{-1}g_\delta\big).
\]
Thus, each of the elements of $\hat\beta_T$ individually is asymptotically Gaussian and is $O_p(T^{-1/2})$. The asymptotic distribution of the full vector $\sqrt{T}(\hat\beta_T - \beta)$ is multivariate Gaussian.
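To close, an illustrative end-to-end simulation of (10): OLS applied to the original (untransformed) regression recovers all the coefficients, with the trend coefficient estimated far more precisely, as the theory above predicts. All parameter values are arbitrary.
\begin{verbatim}
# End-to-end sketch: simulate (10) with p = 2 and estimate it by OLS on
# [Y_{t-1}, Y_{t-2}, 1, t]. Parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(5)
alpha, delta = 1.0, 0.05
phi1, phi2 = 0.5, 0.2                    # roots outside the unit circle
p, T = 2, 20_000

Y = np.zeros(T + p)
for t in range(p, T + p):
    Y[t] = alpha + delta * t + phi1 * Y[t - 1] + phi2 * Y[t - 2] + rng.normal()

idx = np.arange(p, T + p)
X = np.column_stack([Y[idx - 1], Y[idx - 2], np.ones(T), idx])
b, *_ = np.linalg.lstsq(X, Y[idx], rcond=None)   # more stable than (X'X)^{-1}X'y
print(b)                                 # approx [0.5, 0.2, 1.0, 0.05]
\end{verbatim}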