8-1
Chapter 8
Modelling volatility and correlation
8-2
1 An Excursion into Non-linearity Land
• Motivation: the linear structural (and time series) models cannot explain a number of important features common to much financial data:
- leptokurtosis: peaked distributions with fat tails
- volatility clustering or volatility pooling
- leverage effects: volatility rises by more after a large price fall than after a price rise of the same magnitude
• Our "traditional" structural model could be something like:
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t, \quad \text{or} \quad y = X\beta + u.$$
We also assumed $u_t \sim N(0, \sigma^2)$.
8-3
A Sample Financial Asset Returns Time Series
[Figure: Daily S&P 500 Returns for January 1990 – December 1999. Vertical axis: Return (-0.08 to 0.06); horizontal axis: Date.]
8-4
Non-linear Models: A Definition
• Campbell, Lo and MacKinlay (1997) define a non-linear data generating process as one that can be written
$$y_t = f(u_t, u_{t-1}, u_{t-2}, \dots)$$
where $u_t$ is an iid error term and $f$ is a non-linear function.
• They also give a slightly more specific definition as
$$y_t = g(u_{t-1}, u_{t-2}, \dots) + u_t \, \sigma^2(u_{t-1}, u_{t-2}, \dots)$$
where $g$ is a function of past error terms only and $\sigma^2$ is a variance term.
• Models with non-linear $g(\cdot)$ are "non-linear in mean", while those with non-linear $\sigma^2(\cdot)$ are "non-linear in variance".
• Models can be linear in mean and variance (CLRM, ARMA), or linear in mean but non-linear in variance (GARCH).
8-5
1.1 Types of non-linear models
• The linear paradigm is a useful one. Many apparently non-linear relationships can be made linear by a suitable transformation. On the other hand, it is likely that many relationships in finance are intrinsically non-linear.
• There are many types of non-linear models, e.g.
- ARCH / GARCH models for modelling and forecasting volatility
- switching models, which allow the behaviour of a series to follow different processes at different points in time
- bilinear models
8-6
1.2 Testing for Non-linearity
• The "traditional" tools of time series analysis (acf's, spectral analysis) may find no evidence that we could use a linear model, but the data may still not be independent.
• General ("portmanteau") tests for non-linear dependence have been developed. The simplest is Ramsey's RESET test (chapter 4).
• Many other non-linearity tests are available:
- the BDS test (1996), which tests whether the data are purely random; it is available in EViews 4
- the bispectrum test (Hinich, 1982)
- the bicorrelation test (Hsieh, 1993; Hinich, 1996)
• Specific tests are designed to find specific types of non-linear structure.
• One particular non-linear model that has proved very useful in finance is the ARCH model due to Engle (1982).
8-7
2 Models for volatility
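• Modelling and forecasting stock market volatility has been a major topic of empirical and theoretical research over the past decade.
• Volatility is one of the most important concepts in finance. It is usually measured by the standard deviation or variance of returns, and is often used as a crude measure of the total risk of a financial asset.
• Many value-at-risk (VaR) models for measuring market risk require a volatility parameter to be estimated and forecast. The Black-Scholes option pricing model also requires the volatility of the stock price.
• Some models that describe the typical features of volatility:
- Historical volatility: compute the variance of returns over some past period and use it as the forecast of future volatility. It can serve as a benchmark against which other methods are compared.
- Implied volatility models: given the price of an option, the market's forecast of the volatility of the underlying asset's returns can be calculated.
- Exponentially weighted moving average (EWMA) models: more recent data have a larger influence on the volatility forecast.
- Autoregressive volatility models: fit an ARMA model to a series that proxies volatility and use it for forecasting.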
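The first and third approaches in the list above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, returns are assumed to have zero mean in the EWMA recursion, and the decay factor `lam = 0.94` and the start-up value are assumptions that do not come from the slides.

```python
import numpy as np

def historical_variance(returns, window):
    """Sample variance of the last `window` returns (the historical benchmark)."""
    r = np.asarray(returns, dtype=float)
    return r[-window:].var(ddof=1)

def ewma_variance(returns, lam=0.94):
    """EWMA variance: sigma2 <- lam * sigma2 + (1 - lam) * r^2, applied to each return in turn.
    lam = 0.94 is an assumed decay factor, not a value from the slides."""
    r = np.asarray(returns, dtype=float)
    sigma2 = r[0] ** 2              # start-up value: first squared return (an assumption)
    for x in r[1:]:
        sigma2 = lam * sigma2 + (1.0 - lam) * x ** 2
    return sigma2

returns = [0.01, -0.02, 0.015, -0.005]
hist = historical_variance(returns, window=4)
ewma = ewma_variance(returns, lam=0.5)
```

A smaller `lam` puts more weight on the most recent squared return, which is the sense in which recent data matter more for the forecast.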
8-8
Heteroscedasticity Revisited
• An example of a structural model is
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$$
with $u_t \sim N(0, \sigma_u^2)$.
The assumption that the variance of the errors is constant is known as homoscedasticity, i.e. $\mathrm{Var}(u_t) = \sigma_u^2$.
• What if the variance of the errors is not constant?
- heteroscedasticity
- would imply that standard error estimates could be wrong.
• Is the variance of the errors likely to be constant over time? Not for financial data.
8-9
3 Autoregressive Conditionally Heteroscedastic (ARCH) Models
• Use a model which does not assume that the variance is constant.
• Recall the definition of the conditional variance of $u_t$:
$$\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[(u_t - E(u_t))^2 \mid u_{t-1}, u_{t-2}, \dots]$$
We usually assume that $E(u_t) = 0$, so
$$\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[u_t^2 \mid u_{t-1}, u_{t-2}, \dots].$$
• What could the current value of the variance of the errors plausibly depend upon? Previous squared error terms.
• This leads to the autoregressive conditionally heteroscedastic model for the variance of the errors:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$$
• This is known as an ARCH(1) model.
8-10
ARCH Models (cont’d)
• One example of a full model would be
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
where $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$.
• We can easily extend this to the general case where the error variance depends on $q$ lags of squared errors:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2$$
• This is an ARCH(q) model.
• Instead of calling the variance $\sigma_t^2$, in the literature it is usually called $h_t$, so the model is
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t, \quad u_t \sim N(0, h_t)$$
where $h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2$.
8-11
Another Way of Writing ARCH Models
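• For illustration, consider an ARCH(1). Instead of the above, we can write
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t, \quad u_t = v_t \sigma_t$$
$$\sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2}, \quad v_t \sim N(0, 1)$$
• The two are different ways of expressing exactly the same model. The first form is easier to understand, while the second form is required for simulating from an ARCH model, for example.
• Non-negativity constraints: the conditional variance must be strictly positive, so the coefficients in the conditional variance equation are usually required to be non-negative.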
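The second form, $u_t = v_t \sigma_t$, translates directly into a simulation. The sketch below is illustrative only: the function name and the parameter values are hypothetical, and starting the recursion at $\alpha_0 / (1 - \alpha_1)$ is an assumption that requires $\alpha_1 < 1$.

```python
import numpy as np

def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate u_t = v_t * sigma_t with sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2, v_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    u = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = alpha0 / (1.0 - alpha1)    # assumed start-up value (needs alpha1 < 1)
    u[0] = v[0] * np.sqrt(sigma2[0])
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha1 * u[t - 1] ** 2
        u[t] = v[t] * np.sqrt(sigma2[t])
    return u, sigma2

u, sigma2 = simulate_arch1(alpha0=0.2, alpha1=0.5, n=1000)
```

With non-negative coefficients every simulated variance is at least `alpha0`, which is the non-negativity constraint at work.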
8-12
Testing for "ARCH Effects"
1. Run any postulated linear regression of the form given in the equation above, e.g.
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$$
saving the residuals, $\hat{u}_t$.
2. Then square the residuals, and regress them on $q$ own lags to test for ARCH of order $q$, i.e. run the regression
$$\hat{u}_t^2 = \gamma_0 + \gamma_1 \hat{u}_{t-1}^2 + \gamma_2 \hat{u}_{t-2}^2 + \dots + \gamma_q \hat{u}_{t-q}^2 + v_t$$
where $v_t$ is iid. Obtain $R^2$ from this regression.
3. The test statistic is defined as $TR^2$ from the last regression, and is distributed as a $\chi^2(q)$.
8-13
Testing for "ARCH Effects" (cont'd)
4. The null and alternative hypotheses are
$H_0$: $\gamma_1 = 0$ and $\gamma_2 = 0$ and $\gamma_3 = 0$ and ... and $\gamma_q = 0$
$H_1$: $\gamma_1 \neq 0$ or $\gamma_2 \neq 0$ or $\gamma_3 \neq 0$ or ... or $\gamma_q \neq 0$.
If the value of the test statistic is greater than the critical value from the $\chi^2$ distribution, then reject the null hypothesis.
• Note that the ARCH test is also sometimes applied directly to returns instead of the residuals from Stage 1 above.
• Testing for ARCH effects in the S&P 500 index using EViews: see p. 449.
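The test steps can be sketched with NumPy alone. This is a hedged illustration rather than the EViews procedure: `arch_lm_stat` is a hypothetical helper, $T$ is taken to be the number of observations in the auxiliary regression, and the simulated ARCH(1) series and its parameter values are assumptions used only to exercise the function.

```python
import numpy as np

def arch_lm_stat(resid, q):
    """Return T * R^2 from regressing squared residuals on a constant and q own lags."""
    u2 = np.asarray(resid, dtype=float) ** 2
    y = u2[q:]
    # column j holds the squared residual lagged j periods
    lags = [u2[q - j:len(u2) - j] for j in range(1, q + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return len(y) * r2

# Simulated ARCH(1) data (illustrative parameter values)
rng = np.random.default_rng(1)
u = np.zeros(2000)
for t in range(1, 2000):
    u[t] = rng.standard_normal() * np.sqrt(0.2 + 0.5 * u[t - 1] ** 2)

stat = arch_lm_stat(u, q=1)   # compare with the chi-squared(q) critical value
```

For $q = 1$ the 5% critical value of the $\chi^2(1)$ distribution is 3.84, so a statistic above that value leads to rejection of the null of no ARCH effects.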
8-14
Problems with ARCH(q) Models
• How do we decide on $q$?
- One approach: a likelihood ratio test.
• The required value of $q$ might be very large to capture all of the dependence in the conditional variance, so the model would not be parsimonious.
- See Engle (1982); p. 452.
• Non-negativity constraints might be violated.
- When we estimate an ARCH model, we require $\alpha_i \ge 0 \;\; \forall \; i = 1, 2, \dots, q$ (since variance cannot be negative).
• A natural extension of an ARCH(q) model which gets around some of these problems is a GARCH model. GARCH models are widely employed in practice.
8-15
4 Generalised ARCH (GARCH) Models
• Due to Bollerslev (1986). Allow the conditional variance to be dependent upon its own previous lags.
• The variance equation is now
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 \qquad (1)$$
• This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the variance equation (p. 453).
• We could also write
$$\sigma_{t-1}^2 = \alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2$$
$$\sigma_{t-2}^2 = \alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2$$
• Substituting into (1) for $\sigma_{t-1}^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta(\alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2 \sigma_{t-2}^2 \qquad (2)$$
8-16
GARCH Models (cont’d)
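• Now substituting into (2) for $\sigma_{t-2}^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2(\alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \alpha_0 \beta^2 + \alpha_1 \beta^2 u_{t-3}^2 + \beta^3 \sigma_{t-3}^2$$
$$\sigma_t^2 = \alpha_0(1 + \beta + \beta^2) + \alpha_1 u_{t-1}^2(1 + \beta L + \beta^2 L^2) + \beta^3 \sigma_{t-3}^2$$
• An infinite number of successive substitutions would yield
$$\sigma_t^2 = \alpha_0(1 + \beta + \beta^2 + \dots) + \alpha_1 u_{t-1}^2(1 + \beta L + \beta^2 L^2 + \dots) + \beta^{\infty} \sigma_0^2$$
• So the GARCH(1,1) model can be written as an infinite order ARCH model. For $|\beta| < 1$ the last term vanishes and
$$\sigma_t^2 = \gamma_0 + \alpha_1 u_{t-1}^2(1 + \beta L + \beta^2 L^2 + \dots) = \gamma_0 + \gamma_1 u_{t-1}^2 + \gamma_2 u_{t-2}^2 + \dots$$
where $\gamma_0 = \alpha_0 / (1 - \beta)$ and $\gamma_i = \alpha_1 \beta^{i-1}$.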
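The equivalence between GARCH(1,1) and an infinite-order ARCH model can be checked numerically. The sketch below uses illustrative parameter values (not estimates from the slides), an arbitrary shock sequence, and an assumed truncation at `K = 200` lags; it compares the recursion with the truncated sum $\gamma_0 + \sum_i \alpha_1 \beta^{i-1} u_{t-i}^2$.

```python
import numpy as np

alpha0, alpha1, beta = 0.1, 0.1, 0.8       # illustrative values only
rng = np.random.default_rng(0)
u = rng.standard_normal(500)               # any shock sequence works for this identity

# GARCH(1,1) recursion, started at an arbitrary positive value
sigma2 = np.empty(len(u))
sigma2[0] = 1.0
for t in range(1, len(u)):
    sigma2[t] = alpha0 + alpha1 * u[t - 1] ** 2 + beta * sigma2[t - 1]

# ARCH(infinity) representation truncated after K lags
K = 200
t = len(u) - 1
gamma0 = alpha0 / (1.0 - beta)
weights = alpha1 * beta ** np.arange(K)             # gamma_1, ..., gamma_K
arch_inf = gamma0 + weights @ (u[t - 1 - np.arange(K)] ** 2)

gap = abs(sigma2[t] - arch_inf)                     # negligible because beta^K is tiny
```

The weights on past squared shocks decay geometrically at rate `beta`, which is why a single GARCH term can stand in for a long ARCH lag structure.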
8-17
GARCH Models (cont’d)
• Why is GARCH better than ARCH?
- more parsimonious, so it avoids overfitting
- less likely to breach non-negativity constraints
• We can again extend the GARCH(1,1) model to a GARCH(p,q):
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2 + \dots + \beta_p \sigma_{t-p}^2$$
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$
• But in general a GARCH(1,1) model will be sufficient to capture the volatility clustering in the data.
8-18
The Unconditional Variance under the GARCH Specification
• The unconditional variance of $u_t$ is given by
$$\mathrm{Var}(u_t) = \frac{\alpha_0}{1 - (\alpha_1 + \beta)}$$
when $\alpha_1 + \beta < 1$.
• $\alpha_1 + \beta \ge 1$ is termed "non-stationarity" in variance.
• $\alpha_1 + \beta = 1$ is termed integrated GARCH.
• For non-stationarity in variance, the conditional variance forecasts will not converge on their unconditional value as the horizon increases.
8-19
5 Estimation of ARCH / GARCH Models
• Since the model is no longer of the usual linear form, we cannot use OLS: the RSS depends only on the parameters in the conditional mean equation, not on those in the conditional variance.
• We use another technique known as maximum likelihood.
• The method works by finding the most likely values of the parameters given the actual data.
• More specifically, a log-likelihood function is formed and the values of the parameters that maximise it are sought.
• ML can be employed to find parameter values for both linear and non-linear models.
8-20
Estimation of ARCH / GARCH Models
• The steps involved in actually estimating an ARCH or GARCH model are as follows:
1. Specify the appropriate equations for the mean and the variance, e.g. an AR(1)-GARCH(1,1) model:
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
2. Specify the log-likelihood function to maximise:
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$$
3. The computer will maximise the function and give parameter values and their standard errors.
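Step 2 can be written as a function that a numerical optimiser would then maximise. This is a minimal sketch under stated assumptions: the function name and parameter ordering are hypothetical, and initialising the variance recursion at the sample variance of the residuals is one common convention rather than something the slides specify.

```python
import numpy as np

def ar1_garch11_loglik(params, y):
    """Gaussian log-likelihood of an AR(1)-GARCH(1,1) model.
    params = (mu, phi, alpha0, alpha1, beta)."""
    mu, phi, alpha0, alpha1, beta = params
    y = np.asarray(y, dtype=float)
    u = y[1:] - mu - phi * y[:-1]           # residuals from the mean equation
    T = len(u)
    sigma2 = np.empty(T)
    sigma2[0] = u.var()                     # assumed start-up value for the recursion
    for t in range(1, T):
        sigma2[t] = alpha0 + alpha1 * u[t - 1] ** 2 + beta * sigma2[t - 1]
    return (-0.5 * T * np.log(2.0 * np.pi)
            - 0.5 * np.sum(np.log(sigma2))
            - 0.5 * np.sum(u ** 2 / sigma2))

# With alpha1 = beta = 0 the variance is constant and the value can be checked by hand
ll = ar1_garch11_loglik((0.0, 0.0, 1.0, 0.0, 0.0), [0.0, 1.0, -1.0, 1.0, -1.0])
```

In the hand-checkable case the four residuals are ±1 with unit variance, so the log-likelihood is $-2\log(2\pi) - 2$.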
8-21
Appendix: Parameter Estimation using Maximum Likelihood
• Consider the bivariate regression case with homoscedastic errors for simplicity:
$$y_t = \beta_1 + \beta_2 x_t + u_t$$
• Assuming that $u_t \sim N(0, \sigma^2)$, then $y_t \sim N(\beta_1 + \beta_2 x_t, \sigma^2)$, so that the probability density function for a normally distributed random variable with this mean and variance is given by
$$f(y_t \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \qquad (1)$$
• Successive values of $y_t$ would trace out the familiar bell-shaped curve.
• Assuming that the $u_t$ are iid, then the $y_t$ will also be iid.
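8-22
Parameter Estimation using Maximum Likelihood (cont'd)
• Then the joint pdf for all the $y$'s can be expressed as a product of the individual density functions:
$$f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 X_t, \sigma^2) = f(y_1 \mid \beta_1 + \beta_2 X_1, \sigma^2)\, f(y_2 \mid \beta_1 + \beta_2 X_2, \sigma^2) \cdots f(y_T \mid \beta_1 + \beta_2 X_T, \sigma^2)$$
$$= \prod_{t=1}^{T} f(y_t \mid \beta_1 + \beta_2 X_t, \sigma^2) \qquad (2)$$
• Substituting into equation (2) for every $y_t$ from equation (1),
$$f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma^T(\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \qquad (3)$$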
8-23
Parameter Estimation using Maximum Likelihood (cont'd)
• The typical situation we have is that the $x_t$ and $y_t$ are given and we want to estimate $\beta_1$, $\beta_2$, $\sigma^2$. If this is the case, then $f(\cdot)$ is known as the likelihood function, denoted $LF(\beta_1, \beta_2, \sigma^2)$, so we write
$$LF(\beta_1, \beta_2, \sigma^2) = \frac{1}{\sigma^T(\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \qquad (4)$$
• Maximum likelihood estimation involves choosing parameter values $(\beta_1, \beta_2, \sigma^2)$ that maximise this function.
• We want to differentiate (4) w.r.t. $\beta_1$, $\beta_2$, $\sigma^2$, but (4) is a product containing $T$ terms, so it would be difficult.
8-24
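Parameter Estimation using Maximum Likelihood (cont'd)
• Since $\arg\max_x f(x) = \arg\max_x \log(f(x))$, we can take logs of (4).
• Then, we obtain the log-likelihood function, LLF:
$$LLF = -T\log\sigma - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}$$
• which is equivalent to
$$LLF = -\frac{T}{2}\log\sigma^2 - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2} \qquad (5)$$
• Differentiating (5) w.r.t. $\beta_1$, $\beta_2$, $\sigma^2$, we obtain
$$\frac{\partial LLF}{\partial \beta_1} = -\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t) \cdot 2 \cdot (-1)}{\sigma^2} \qquad (6)$$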
8-25
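Parameter Estimation using Maximum Likelihood (cont'd)
$$\frac{\partial LLF}{\partial \beta_2} = -\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t) \cdot 2 \cdot (-x_t)}{\sigma^2} \qquad (7)$$
$$\frac{\partial LLF}{\partial \sigma^2} = -\frac{T}{2}\frac{1}{\sigma^2} + \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^4} \qquad (8)$$
• Set (6)-(8) to zero to maximise the function, and put hats above the parameters to denote the maximum likelihood estimators. From (6),
$$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$$
$$\sum y_t - T\hat{\beta}_1 - \hat{\beta}_2 \sum x_t = 0$$
$$\frac{1}{T}\sum y_t - \hat{\beta}_1 - \hat{\beta}_2 \frac{1}{T}\sum x_t = 0$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} \qquad (9)$$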
8-26
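Parameter Estimation using Maximum Likelihood (cont'd)
• From (7),
$$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)\, x_t = 0$$
$$\sum y_t x_t - \hat{\beta}_1 \sum x_t - \hat{\beta}_2 \sum x_t^2 = 0$$
$$\sum y_t x_t - (\bar{y} - \hat{\beta}_2 \bar{x}) \sum x_t - \hat{\beta}_2 \sum x_t^2 = 0$$
$$\hat{\beta}_2 \sum x_t^2 = \sum y_t x_t - T\bar{x}\bar{y} + \hat{\beta}_2 T \bar{x}^2$$
$$\hat{\beta}_2 \left(\sum x_t^2 - T\bar{x}^2\right) = \sum y_t x_t - T\bar{x}\bar{y}$$
$$\hat{\beta}_2 = \frac{\sum y_t x_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \qquad (10)$$
• From (8),
$$\frac{T}{\hat{\sigma}^2} = \frac{1}{\hat{\sigma}^4}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2$$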
8-27
Parameter Estimation using Maximum Likelihood (cont'd)
• Rearranging,
$$\hat{\sigma}^2 = \frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2 = \frac{1}{T}\sum \hat{u}_t^2 \qquad (11)$$
• How do these formulae compare with the OLS estimators? (9) and (10) are identical to OLS; (11) is different. The OLS estimator was
$$\hat{\sigma}^2 = \frac{1}{T - k}\sum \hat{u}_t^2$$
• Therefore the ML estimator of the variance of the disturbances is biased, although it is consistent.
• The ML estimator is consistent and asymptotically efficient.
• But how does this help us in estimating heteroscedastic models?
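The comparison between the ML and OLS estimators can be verified numerically. The sketch below applies formulae (9)-(11) to simulated data and compares them with a least-squares fit; the sample size, seed and true coefficient values are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
x = rng.standard_normal(T)
y = 1.0 + 2.0 * x + rng.standard_normal(T)    # simulated data; true values are assumptions

# ML estimators from (9)-(11)
beta2_hat = (np.sum(x * y) - T * x.mean() * y.mean()) / (np.sum(x ** 2) - T * x.mean() ** 2)
beta1_hat = y.mean() - beta2_hat * x.mean()
resid = y - beta1_hat - beta2_hat * x
sigma2_ml = np.sum(resid ** 2) / T

# OLS for comparison: same coefficients, but the variance estimator divides by T - k
X = np.column_stack([np.ones(T), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
k = X.shape[1]
sigma2_ols = np.sum((y - X @ b_ols) ** 2) / (T - k)
```

The two variance estimates differ only by the factor $(T - k)/T$, which tends to one as $T$ grows; this is the sense in which the ML estimator is biased but consistent.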
8-28
Estimation of GARCH Models Using Maximum Likelihood
• Now we have
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$$
• Unfortunately, the LLF for a model with time-varying variances cannot be maximised analytically, except in the simplest of cases. So a numerical procedure is used to maximise the log-likelihood function.
• Potential problems:
1) local optima or multimodalities in the likelihood surface;
2) the LLF is flat around the maximum (p. 459).
• Different optimisation procedures can lead to different coefficient estimates (and especially different standard errors). The choice of initial values is therefore important.
8-29
Using ML estimation in practice
• The way we do the optimisation is:
1. Set up the LLF.
2. Use regression to get initial guesses for the mean parameters.
3. Choose some initial guesses for the conditional variance parameters.
4. Specify a convergence criterion, either in terms of the change in the value of the LLF or in terms of the change in the parameter values.
• Different algorithms for optimisation:
- BHHH (1974)
- Marquardt (in EViews)
- BFGS (1965, 1963)
8-30
Non-Normality and Maximum Likelihood
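• Recall that the conditional normality assumption for $u_t$ is essential.
• We can test for normality using the following representation:
$$u_t = v_t \sigma_t, \quad v_t \sim N(0, 1)$$
$$\sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2}, \quad v_t = \frac{u_t}{\sigma_t}$$
• The sample counterpart is the standardised residual
$$\hat{v}_t = \frac{\hat{u}_t}{\hat{\sigma}_t}$$
• Are the $\hat{v}_t$ normal? Typically the $\hat{v}_t$ are still leptokurtic, although less so than the $\hat{u}_t$. So GARCH captures part of the distributional features of asset returns.
• Is this a problem? Not really. Even though the conditional normality assumption does not hold, the parameter estimates will still be consistent if the mean and variance equations are correctly specified. Moreover, we can use ML with a robust standard errors estimator, which is called Quasi-Maximum Likelihood or QML.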
8-31
Estimating GARCH models in Eviews
• See pp. 461-465.
8-32
Extensions to the Basic GARCH Model
• Since the GARCH model was developed, a huge number of extensions and variants have been proposed. Three of the most important examples are the EGARCH, GJR, and GARCH-M models.
• Problems with GARCH(p,q) models:
- Non-negativity constraints may still be violated
- GARCH models cannot account for leverage effects
• Possible solutions: the exponential GARCH (EGARCH) model or the GJR model, which are asymmetric GARCH models.
8-33
The EGARCH Model
• Suggested by Nelson (1991). The variance equation is given by
$$\log(\sigma_t^2) = \omega + \beta \log(\sigma_{t-1}^2) + \gamma \frac{u_{t-1}}{\sqrt{\sigma_{t-1}^2}} + \alpha\left[\frac{|u_{t-1}|}{\sqrt{\sigma_{t-1}^2}} - \sqrt{\frac{2}{\pi}}\right]$$
• Advantages of the model:
- Since we model $\log(\sigma_t^2)$, then even if the parameters are negative, $\sigma_t^2$ will be positive.
- We can account for the leverage effect: if the relationship between volatility and returns is negative, $\gamma$ will be negative.
8-34
The GJR Model
• Due to Glosten, Jagannathan and Runkle (1993):
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$$
where $I_{t-1} = 1$ if $u_{t-1} < 0$, and $I_{t-1} = 0$ otherwise.
• For a leverage effect, we would see $\gamma > 0$.
• We require $\alpha_1 + \gamma \ge 0$ and $\alpha_1 \ge 0$ for non-negativity.
8-35
An Example of the use of a GJR Model
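• Using monthly S&P 500 returns, December 1979 - June 1998.
• Estimating a GJR model, we obtain the following results:
$$y_t = 0.172$$
with (3.198) reported in parentheses beneath the estimate, and
$$\sigma_t^2 = 1.243 + 0.015\, u_{t-1}^2 + 0.498\, \sigma_{t-1}^2 + 0.604\, u_{t-1}^2 I_{t-1}$$
with (16.372), (0.437), (14.999) and (5.772) reported in parentheses beneath the four coefficients respectively.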
8-36
News Impact Curves
The news impact curve plots the next period volatility ($h_t$) that would arise from various positive and negative values of $u_{t-1}$, given an estimated model.
[Figure: News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR Model Estimates. Horizontal axis: Value of Lagged Shock (-1 to 1); vertical axis: Value of Conditional Variance (0 to 0.14); series: GARCH, GJR.]
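A news impact curve of this kind can be traced out by holding the lagged conditional variance fixed and varying the lagged shock. The sketch below is illustrative only: the function names are hypothetical and every coefficient value is an assumption chosen to show the shapes, not an estimate behind the figure.

```python
import numpy as np

def nic_garch(u, alpha0, alpha1, beta, sigma2_prev):
    """Next-period variance implied by a lagged shock u under GARCH(1,1)."""
    return alpha0 + alpha1 * u ** 2 + beta * sigma2_prev

def nic_gjr(u, alpha0, alpha1, beta, gamma, sigma2_prev):
    """As above for GJR: negative shocks receive the extra weight gamma."""
    return alpha0 + (alpha1 + gamma * (u < 0)) * u ** 2 + beta * sigma2_prev

shocks = np.linspace(-1.0, 1.0, 21)
# All coefficient values below are hypothetical.
garch_curve = nic_garch(shocks, alpha0=0.01, alpha1=0.05, beta=0.90, sigma2_prev=0.02)
gjr_curve = nic_gjr(shocks, alpha0=0.01, alpha1=0.02, beta=0.90, gamma=0.08, sigma2_prev=0.02)
```

The GARCH curve is symmetric about zero, whereas the GJR curve is steeper for negative shocks whenever `gamma > 0`, which is the leverage effect.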
8-37
GARCH-in-Mean
• We expect a risk to be compensated by a higher return. So why not let the return of a security be partly determined by its risk?
• Engle, Lilien and Robins (1987) suggested the ARCH-M specification. A GARCH-M model would be
$$y_t = \mu + \delta \sigma_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
• $\delta$ can be interpreted as a sort of risk premium.
• It is possible to combine all or some of these models together to get more complex "hybrid" models, e.g. an ARMA-EGARCH(1,1)-M model.
8-38
What Use Are GARCH-type Models?
• GARCH can model the volatility clustering effect since the conditional variance is autoregressive. Such models can be used to forecast volatility.
• We could show that
$$\mathrm{Var}(y_t \mid y_{t-1}, y_{t-2}, \dots) = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots)$$
• So modelling $\sigma_t^2$ will give us models and forecasts for the conditional variance of $y_t$ as well.
• Variance forecasts are additive over time.
8-39
Forecasting Variances using GARCH Models
• Producing conditional variance forecasts from GARCH models uses a very similar approach to producing forecasts from ARMA models.
• It is again an exercise in iterating with the conditional expectations operator.
• Consider the following GARCH(1,1) model:
$$y_t = \mu + u_t, \quad u_t \sim N(0, \sigma_t^2), \quad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
• What is needed is to generate forecasts of $\sigma_{T+1}^2 \mid \Omega_T$, $\sigma_{T+2}^2 \mid \Omega_T$, ..., $\sigma_{T+s}^2 \mid \Omega_T$, where $\Omega_T$ denotes all information available up to and including observation $T$.
• Adding one to each of the time subscripts of the above conditional variance equation, and then two, and then three would yield the following equations:
$$\sigma_{T+1}^2 = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2, \quad \sigma_{T+2}^2 = \alpha_0 + \alpha_1 u_{T+1}^2 + \beta \sigma_{T+1}^2, \quad \sigma_{T+3}^2 = \alpha_0 + \alpha_1 u_{T+2}^2 + \beta \sigma_{T+2}^2$$
8-40
Forecasting Variances using GARCH Models (cont'd)
• Let $\sigma_{1,T}^{f\,2}$ be the one-step-ahead forecast for $\sigma^2$ made at time $T$. This is easy to calculate since, at time $T$, the values of all the terms on the RHS are known.
• $\sigma_{1,T}^{f\,2}$ would be obtained by taking the conditional expectation of the first of the three equations at the bottom of the previous slide (8-39):
$$\sigma_{1,T}^{f\,2} = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$$
• Given $\sigma_{1,T}^{f\,2}$, how is $\sigma_{2,T}^{f\,2}$, the 2-step-ahead forecast for $\sigma^2$ made at time $T$, calculated? Taking the conditional expectation of the second equation at the bottom of the previous slide:
$$\sigma_{2,T}^{f\,2} = \alpha_0 + \alpha_1 E(u_{T+1}^2 \mid \Omega_T) + \beta \sigma_{1,T}^{f\,2}$$
• where $E(u_{T+1}^2 \mid \Omega_T)$ is the expectation, made at time $T$, of $u_{T+1}^2$, which is the squared disturbance term.
8-41
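Forecasting Variances using GARCH Models (cont'd)
• We can write
$$E(u_{T+1}^2 \mid \Omega_T) = \sigma_{T+1}^2$$
• But $\sigma_{T+1}^2$ is not known at time $T$, so it is replaced with the forecast for it, $\sigma_{1,T}^{f\,2}$, so that the 2-step-ahead forecast is given by
$$\sigma_{2,T}^{f\,2} = \alpha_0 + \alpha_1 \sigma_{1,T}^{f\,2} + \beta \sigma_{1,T}^{f\,2} = \alpha_0 + (\alpha_1 + \beta)\, \sigma_{1,T}^{f\,2}$$
• By similar arguments, the 3-step-ahead forecast will be given by
$$\sigma_{3,T}^{f\,2} = E_T(\alpha_0 + \alpha_1 u_{T+2}^2 + \beta \sigma_{T+2}^2)$$
$$= \alpha_0 + (\alpha_1 + \beta)\, \sigma_{2,T}^{f\,2}$$
$$= \alpha_0 + (\alpha_1 + \beta)\left[\alpha_0 + (\alpha_1 + \beta)\, \sigma_{1,T}^{f\,2}\right]$$
$$= \alpha_0 + \alpha_0(\alpha_1 + \beta) + (\alpha_1 + \beta)^2 \sigma_{1,T}^{f\,2}$$
• Any $s$-step-ahead forecast ($s \ge 2$) would be produced by
$$\sigma_{s,T}^{f\,2} = \alpha_0 \sum_{i=1}^{s-1} (\alpha_1 + \beta)^{i-1} + (\alpha_1 + \beta)^{s-1} \sigma_{1,T}^{f\,2}$$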
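The forecast recursion and the closed-form $s$-step expression can be checked against each other. The sketch below uses hypothetical parameter values and end-of-sample quantities (`u_T`, `sigma2_T`); the function name is also an assumption.

```python
import numpy as np

def garch11_forecasts(alpha0, alpha1, beta, u_T, sigma2_T, steps):
    """Iterate f_1 = a0 + a1 * u_T^2 + b * sigma2_T and f_s = a0 + (a1 + b) * f_{s-1}."""
    f = np.empty(steps)
    f[0] = alpha0 + alpha1 * u_T ** 2 + beta * sigma2_T
    for s in range(1, steps):
        f[s] = alpha0 + (alpha1 + beta) * f[s - 1]
    return f

# Hypothetical parameter values and end-of-sample quantities
a0, a1, b = 0.1, 0.1, 0.8
f = garch11_forecasts(a0, a1, b, u_T=2.0, sigma2_T=1.5, steps=200)

# Closed form for the s-step-ahead forecast (here s = 5)
s = 5
closed = a0 * sum((a1 + b) ** (i - 1) for i in range(1, s)) + (a1 + b) ** (s - 1) * f[0]
long_run = a0 / (1.0 - (a1 + b))     # unconditional variance when alpha1 + beta < 1
```

Because $\alpha_1 + \beta < 1$ here, the forecasts decay geometrically towards the unconditional variance as the horizon lengthens; with $\alpha_1 + \beta \ge 1$ they would not converge.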
8-42
What Use Are Volatility Forecasts?
1. Option pricing
$$C = f(S, X, \sigma^2, T, r_f)$$
2. Conditional betas
$$\beta_{i,t} = \frac{\sigma_{im,t}}{\sigma_{m,t}^2}$$
3. Dynamic hedge ratios
The hedge ratio is the size of the futures position relative to the size of the underlying exposure, i.e. the number of futures contracts to buy or sell per unit of the spot good.
8-43
What Use Are Volatility Forecasts? (Cont’d)
• What is the optimal value of the hedge ratio?
• Assuming that the objective of hedging is to minimise the variance of the hedged portfolio, the optimal hedge ratio will be given by
$$h = p \frac{\sigma_S}{\sigma_F}$$
where $h$ = hedge ratio
$p$ = correlation coefficient between the change in the spot price ($S$) and the change in the futures price ($F$)
$\sigma_S$ = standard deviation of the change in $S$
$\sigma_F$ = standard deviation of the change in $F$
• What if the standard deviations and correlation are changing over time? Use
$$h_t = p_t \frac{\sigma_{S,t}}{\sigma_{F,t}}$$
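The static version of the formula can be computed directly from samples of price changes. The sketch below is a minimal illustration with a hypothetical function name and made-up data; it also uses the fact that $p\,\sigma_S/\sigma_F$ equals the covariance of the two series divided by the variance of the futures changes.

```python
import numpy as np

def min_variance_hedge_ratio(d_spot, d_fut):
    """h = p * sigma_S / sigma_F, computed from changes in spot and futures prices."""
    d_spot = np.asarray(d_spot, dtype=float)
    d_fut = np.asarray(d_fut, dtype=float)
    p = np.corrcoef(d_spot, d_fut)[0, 1]
    return p * d_spot.std(ddof=1) / d_fut.std(ddof=1)

# Made-up price changes for illustration
d_fut = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
d_spot = np.array([0.7, -1.5, 0.6, 2.2, -0.9])
h = min_variance_hedge_ratio(d_spot, d_fut)
```

A time-varying version would replace the sample moments with conditional variance and covariance forecasts at each date.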
8-44
Testing Non-linear Restrictions or Testing Hypotheses about Non-linear Models
• Usual t- and F-tests are still valid in non-linear models, but they are not flexible enough.
• There are three hypothesis testing procedures based on maximum likelihood principles: Wald, Likelihood Ratio, Lagrange Multiplier.
• Consider a single parameter, $\theta$, to be estimated. Denote the MLE as $\hat{\theta}$ and a restricted estimate as $\tilde{\theta}$.
8-45
Likelihood Ratio Tests
• Estimate under the null hypothesis and under the alternative.
• Then compare the maximised values of the LLF.
• So we estimate the unconstrained model and achieve a given maximised value of the LLF, denoted $L_u$.
• Then estimate the model imposing the constraint(s) and get a new value of the LLF, denoted $L_r$.
• Which will be bigger?
• $L_r \le L_u$, comparable to $RRSS \ge URSS$.
• The LR test statistic is given by
$$LR = -2(L_r - L_u) \sim \chi^2(m)$$
where $m$ = number of restrictions.
8-46
Likelihood Ratio Tests (cont’d)
• Example: we estimate a GARCH model and obtain a maximised LLF of 66.85. We are interested in testing whether $\beta = 0$ in the following equation.
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
• We estimate the model imposing the restriction and observe the maximised LLF falls to 64.54. Can we accept the restriction?
• $LR = -2(64.54 - 66.85) = 4.62$.
• The test follows a $\chi^2(1)$, whose 5% critical value is 3.84, so reject the null.
• Denote the maximised value of the LLF under unconstrained ML as $L(\hat{\theta})$ and the constrained optimum as $L(\tilde{\theta})$. Then we can illustrate the 3 testing procedures in the following diagram:
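The arithmetic of the example can be reproduced in a couple of lines. The function name is hypothetical; the two log-likelihood values and the 3.84 critical value are the ones quoted in the example.

```python
def likelihood_ratio_stat(llf_restricted, llf_unrestricted):
    """LR = -2 (L_r - L_u); non-negative because L_r <= L_u."""
    return -2.0 * (llf_restricted - llf_unrestricted)

lr = likelihood_ratio_stat(64.54, 66.85)   # maximised LLF values from the example
reject = lr > 3.84                          # 5% critical value of chi-squared(1)
```

Since the statistic exceeds the critical value, the restriction $\beta = 0$ is rejected at the 5% level.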
8-47
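Comparison of Testing Procedures under Maximum Likelihood: Diagrammatic Representation
[Figure: the log-likelihood $L(\theta)$ plotted against $\theta$. Point A marks the unrestricted maximum $L(\hat{\theta})$ at $\hat{\theta}$; point B marks the restricted value $L(\tilde{\theta})$ at $\tilde{\theta}$.]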
8-48
Hypothesis Testing under Maximum Likelihood
• The vertical distance forms the basis of the LR test.
• The Wald test is based on a comparison of the horizontal distance.
• The LM test compares the slopes of the curve at A and B.
• We know that at the unrestricted MLE, $L(\hat{\theta})$, the slope of the curve is zero.
• But is it "significantly steep" at $L(\tilde{\theta})$?
• This formulation of the test is usually easiest to estimate.
8-49
An Example of the Application of GARCH Models - Day & Lewis (1992)
• Purpose
- To consider the out-of-sample forecasting performance of GARCH and EGARCH models for predicting stock index volatility.
- Implied volatility is the market's expectation of the "average" level of volatility of an option.
- Which is better, GARCH or implied volatility?
• Data
- Weekly closing prices (Wednesday to Wednesday, and Friday to Friday) for the S&P 100 Index option and the underlying, 11 March 1983 - 31 December 1989.
- Implied volatility is calculated using a non-linear iterative procedure.
8-50
The Models
• The "Base" Models
For the conditional mean
$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \qquad (1)$$
And for the variance
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} \qquad (2)$$
or
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) \qquad (3)$$
where
$R_{Mt}$ denotes the return on the market portfolio,
$R_{Ft}$ denotes the risk-free rate,
$h_t$ denotes the conditional variance from the GARCH-type models, while $\sigma_t^2$ denotes the implied variance from option prices.
8-51
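The Models (cont'd)
• Add in a lagged value of the implied volatility parameter to equations (2) and (3).
(2) becomes
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} + \delta \sigma_{t-1}^2 \qquad (4)$$
and (3) becomes
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2) \qquad (5)$$
• We are interested in testing $H_0$: $\delta = 0$ in (4) or (5).
• Also, we want to test $H_0$: $\alpha_1 = 0$ and $\beta_1 = 0$ in (4),
• and $H_0$: $\alpha_1 = 0$ and $\beta_1 = 0$ and $\theta = 0$ and $\gamma = 0$ in (5).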
8-52
The Models (cont'd)
• If this second set of restrictions holds, then (4) collapses to
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \qquad (4')$$
• and (5) collapses to
$$\ln(h_t) = \alpha_0 + \delta \ln(\sigma_{t-1}^2) \qquad (5')$$
• We can test all of these restrictions using a likelihood ratio test.
8-53
In-sample Likelihood Ratio Test Results: GARCH Versus Implied Volatility
$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \qquad (8.78)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} \qquad (8.79)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} + \delta \sigma_{t-1}^2 \qquad (8.81)$$
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \qquad (8.81')$$

Variance equation   $\lambda_0$   $\lambda_1$   $\alpha_0 \times 10^{-4}$   $\alpha_1$   $\beta_1$   $\delta$   Log-L     $\chi^2$
(8.79)              0.0072        0.071         5.428                       0.093        0.854       -          767.321   17.77
                    (0.005)       (0.01)        (1.65)                      (0.84)       (8.17)
(8.81)              0.0015        0.043         2.065                       0.266        -0.068      0.318      776.204   -
                    (0.028)       (0.02)        (2.98)                      (1.17)       (-0.59)     (3.00)
(8.81')             0.0056        -0.184        0.993                       -            -           0.581      764.394   23.62
                    (0.001)       (-0.001)      (1.50)                                               (2.94)

Notes: t-ratios in parentheses. Log-L denotes the maximised value of the log-likelihood function in each case. $\chi^2$ denotes the value of the test statistic, which follows a $\chi^2(1)$ in the case of (8.81) restricted to (8.79), and a $\chi^2(2)$ in the case of (8.81) restricted to (8.81'). Source: Day and Lewis (1992). Reprinted with the permission of Elsevier Science.
8-54
In-sample Likelihood Ratio Test Results: EGARCH Versus Implied Volatility
$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \qquad (8.78)$$
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) \qquad (8.80)$$
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2) \qquad (8.82)$$
$$\ln(h_t) = \alpha_0 + \delta \ln(\sigma_{t-1}^2) \qquad (8.82')$$

Variance equation   $\lambda_0$   $\lambda_1$   $\alpha_0 \times 10^{-4}$   $\beta_1$   $\theta$   $\gamma$   $\delta$   Log-L     $\chi^2$
(8.80)              -0.0026       0.094         -3.62                       0.529       -0.273     0.357      -          776.436   8.09
                    (-0.03)       (0.25)        (-2.90)                     (3.26)      (-4.13)    (3.17)
(8.82)              0.0035        -0.076        -2.28                       0.373       -0.282     0.210      0.351      780.480   -
                    (0.56)        (-0.24)       (-1.82)                     (1.48)      (-4.34)    (1.89)     (1.82)
(8.82')             0.0047        -0.139        -2.76                       -           -          -          0.667      765.034   30.89
                    (0.71)        (-0.43)       (-2.30)                                                       (4.01)

Notes: t-ratios in parentheses. Log-L denotes the maximised value of the log-likelihood function in each case. $\chi^2$ denotes the value of the test statistic, which follows a $\chi^2(1)$ in the case of (8.82) restricted to (8.80), and a $\chi^2(2)$ in the case of (8.82) restricted to (8.82'). Source: Day and Lewis (1992). Reprinted with the permission of Elsevier Science.
8-55
Conclusions for In-sample Model Comparisons & Out-of-Sample Procedure
• IV has extra incremental power for modelling stock volatility beyond GARCH.
• But the models do not represent a true test of the predictive ability of IV.
• So the authors conduct an out-of-sample forecasting test.
• There are 729 data points. They use the first 410 to estimate the models, and then make a 1-step-ahead forecast of the following week's volatility.
• Then they roll the sample forward one observation at a time, constructing a new one-step-ahead forecast at each step.
8-56
Out-of-Sample Forecast Evaluation
• They evaluate the forecasts in two ways:
• The first is by regressing the realised volatility series on the forecasts plus a constant:
$$\sigma_{t+1}^2 = b_0 + b_1 \sigma_{ft}^2 + \xi_{t+1} \qquad (7)$$
where $\sigma_{t+1}^2$ is the "actual" value of volatility, and $\sigma_{ft}^2$ is the value forecasted for it during period $t$.
• Perfectly accurate forecasts imply $b_0 = 0$ and $b_1 = 1$.
• But what is the "true" value of volatility at time $t$? Day & Lewis use 2 measures:
1. The square of the weekly return on the index, which they call SR.
2. The variance of the week's daily returns multiplied by the number of trading days in that week.
Out-of-Sample Model Comparisons
$$\sigma_{t+1}^2 = b_0 + b_1 \sigma_{ft}^2 + \xi_{t+1} \qquad (8.83)$$

Forecasting model    Proxy for ex post volatility   $b_0$              $b_1$            $R^2$
Historic             SR                             0.0004 (5.60)      0.129 (21.18)    0.094
Historic             WV                             0.0005 (2.90)      0.154 (7.58)     0.024
GARCH                SR                             0.0002 (1.02)      0.671 (2.10)     0.039
GARCH                WV                             0.0002 (1.07)      1.074 (3.34)     0.018
EGARCH               SR                             0.0000 (0.05)      1.075 (2.06)     0.022
EGARCH               WV                             -0.0001 (-0.48)    1.529 (2.58)     0.008
Implied Volatility   SR                             0.0022 (2.22)      0.357 (1.82)     0.037
Implied Volatility   WV                             0.0005 (0.389)     0.718 (1.95)     0.026

Notes: Historic refers to the use of a simple historical average of the squared returns to forecast volatility; t-ratios in parentheses; SR and WV refer to the square of the weekly return on the S&P 100, and the variance of the week's daily returns multiplied by the number of trading days in that week, respectively. Source: Day and Lewis (1992). Reprinted with the permission of Elsevier Science.
8-58
Encompassing Test Results: Do the IV Forecasts Encompass those of the GARCH Models?
$$\sigma_{t+1}^2 = b_0 + b_1 \sigma_{It}^2 + b_2 \sigma_{Gt}^2 + b_3 \sigma_{Et}^2 + b_4 \sigma_{Ht}^2 + \xi_{t+1} \qquad (8.86)$$

Forecast comparison                  $b_0$               $b_1$           $b_2$             $b_3$             $b_4$           $R^2$
Implied vs. GARCH                    -0.00010 (-0.09)    0.601 (1.03)    0.298 (0.42)      -                 -               0.027
Implied vs. GARCH vs. Historical     0.00018 (1.15)      0.632 (1.02)    -0.243 (-0.28)    -                 0.123 (7.01)    0.038
Implied vs. EGARCH                   -0.00001 (-0.07)    0.695 (1.62)    -                 0.176 (0.27)      -               0.026
Implied vs. EGARCH vs. Historical    0.00026 (1.37)      0.590 (1.45)    -0.374 (-0.57)    -                 0.118 (7.74)    0.038
GARCH vs. EGARCH                     0.00005 (0.37)      -               1.070 (2.78)      -0.001 (-0.00)    -               0.018

Notes: t-ratios in parentheses; the ex post measure used in this table is the variance of the week's daily returns multiplied by the number of trading days in that week. Source: Day and Lewis (1992). Reprinted with the permission of Elsevier Science.
8-59
Conclusions of Paper
• Within-sample results suggest that IV contains extra information not contained in the GARCH / EGARCH specifications.
• Out-of-sample results suggest that nothing can accurately predict volatility!
8-60
Multivariate GARCH Models
• Multivariate GARCH models are used to estimate and to forecast covariances and correlations. The basic formulation is similar to that of the GARCH model, but where the covariances as well as the variances are permitted to be time-varying.
• There are 3 main classes of multivariate GARCH formulation that are widely used: VECH, diagonal VECH and BEKK.
VECH and Diagonal VECH
• e.g. suppose that there are two variables used in the model. The conditional covariance matrix is denoted $H_t$, and would be $2 \times 2$. $H_t$ and $VECH(H_t)$ are
$$H_t = \begin{bmatrix} h_{11t} & h_{12t} \\ h_{21t} & h_{22t} \end{bmatrix}, \qquad VECH(H_t) = \begin{bmatrix} h_{11t} \\ h_{22t} \\ h_{12t} \end{bmatrix}$$
8-61
VECH and Diagonal VECH
• In the case of the VECH, the conditional variances and covariances would each depend upon lagged values of all of the variances and covariances and on lags of the squares of both error terms and their cross products.
• In matrix form, it would be written
$$VECH(H_t) = C + A\, VECH(\Xi_{t-1}\Xi_{t-1}') + B\, VECH(H_{t-1}), \qquad \Xi_t \mid \psi_{t-1} \sim N(0, H_t)$$
• Writing out all of the elements gives the 3 equations as
$$h_{11t} = c_{11} + a_{11}u_{1t-1}^2 + a_{12}u_{2t-1}^2 + a_{13}u_{1t-1}u_{2t-1} + b_{11}h_{11t-1} + b_{12}h_{22t-1} + b_{13}h_{12t-1}$$
$$h_{22t} = c_{21} + a_{21}u_{1t-1}^2 + a_{22}u_{2t-1}^2 + a_{23}u_{1t-1}u_{2t-1} + b_{21}h_{11t-1} + b_{22}h_{22t-1} + b_{23}h_{12t-1}$$
$$h_{12t} = c_{31} + a_{31}u_{1t-1}^2 + a_{32}u_{2t-1}^2 + a_{33}u_{1t-1}u_{2t-1} + b_{31}h_{11t-1} + b_{32}h_{22t-1} + b_{33}h_{12t-1}$$
• Such a model would be hard to estimate. The diagonal VECH is much simpler and is specified, in the 2 variable case, as follows:
$$h_{11t} = \alpha_0 + \alpha_1 u_{1t-1}^2 + \alpha_2 h_{11t-1}$$
$$h_{22t} = \beta_0 + \beta_1 u_{2t-1}^2 + \beta_2 h_{22t-1}$$
$$h_{12t} = \gamma_0 + \gamma_1 u_{1t-1}u_{2t-1} + \gamma_2 h_{12t-1}$$
8-62
BEKK and Model Estimation for M-GARCH
• Neither the VECH nor the diagonal VECH ensures a positive definite variance-covariance matrix.
• An alternative approach is the BEKK model (Engle & Kroner, 1995).
• In matrix form, the BEKK model is
$$H_t = W'W + A'H_{t-1}A + B'\Xi_{t-1}\Xi_{t-1}'B$$
• Model estimation for all classes of multivariate GARCH model is again performed using maximum likelihood with the following LLF:
$$\ell(\theta) = -\frac{TN}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left(\log|H_t| + \Xi_t' H_t^{-1} \Xi_t\right)$$
where $N$ is the number of variables in the system (assumed 2 above), $\theta$ is a vector containing all of the parameters to be estimated, and $T$ is the number of observations.
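The reason the BEKK form keeps $H_t$ positive definite is that each term is a quadratic form. The sketch below iterates the recursion for a two-variable system and inspects the result; the function name and all parameter matrices are hypothetical values chosen only for illustration.

```python
import numpy as np

def bekk_step(W, A, B, H_prev, xi_prev):
    """One step of H_t = W'W + A' H_{t-1} A + B' xi_{t-1} xi_{t-1}' B."""
    xi_prev = np.asarray(xi_prev, dtype=float).reshape(-1, 1)
    return W.T @ W + A.T @ H_prev @ A + B.T @ (xi_prev @ xi_prev.T) @ B

# Hypothetical 2x2 parameter matrices, for illustration only
W = np.array([[0.3, 0.1], [0.0, 0.2]])
A = np.array([[0.9, 0.05], [0.02, 0.85]])
B = np.array([[0.2, -0.1], [0.05, 0.3]])

H = np.eye(2)
rng = np.random.default_rng(0)
for _ in range(50):
    xi = rng.multivariate_normal(np.zeros(2), H)   # draw shocks with covariance H
    H = bekk_step(W, A, B, H, xi)

eigenvalues = np.linalg.eigvalsh(H)                # all positive if H is positive definite
```

As long as `W` is non-singular, `W.T @ W` is positive definite and the other two terms are positive semi-definite, so no parameter restrictions are needed to keep `H` a valid covariance matrix.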
8-63
An Example: Estimating a Time-Varying Hedge Ratio for FTSE Stock Index Returns (Brooks, Henry and Persand, 2002)
• Data comprises 3580 daily observations on the FTSE 100 stock index and stock index futures contract spanning the period 1 January 1985 - 9 April 1999.
• Several competing models for determining the optimal hedge ratio are constructed. Define the hedge ratio as $\beta$.
- No hedge ($\beta = 0$)
- Naïve hedge ($\beta = 1$)
- Multivariate GARCH hedges:
  - Symmetric BEKK
  - Asymmetric BEKK
In both cases, estimating the OHR involves forming a 1-step-ahead forecast and computing
$$OHR_{t+1} = \left.\frac{h_{CF,t+1}}{h_{F,t+1}}\,\right|\,\Omega_t$$
8-64
OHR Results

In Sample
            Unhedged     Naïve Hedge   Symmetric Time-Varying Hedge   Asymmetric Time-Varying Hedge
            $\beta = 0$  $\beta = 1$   $\beta_t = h_{FC,t}/h_{F,t}$   $\beta_t = h_{FC,t}/h_{F,t}$
Return      0.0389       -0.0003       0.0061                         0.0060
            {2.3713}     {-0.0351}     {0.9562}                       {0.9580}
Variance    0.8286       0.1718        0.1240                         0.1211

Out of Sample
            Unhedged     Naïve Hedge   Symmetric Time-Varying Hedge   Asymmetric Time-Varying Hedge
            $\beta = 0$  $\beta = 1$   $\beta_t = h_{FC,t}/h_{F,t}$   $\beta_t = h_{FC,t}/h_{F,t}$
Return      0.0819       -0.0004       0.0120                         0.0140
            {1.4958}     {0.0216}      {0.7761}                       {0.9083}
Variance    1.4972       0.1696        0.1186                         0.1188
8-65
Plot of the OHR from Multivariate GARCH
[Figure: Time Varying Hedge Ratios from the Symmetric BEKK and Asymmetric BEKK models. Horizontal axis: observation number (500 to 3000); vertical axis: hedge ratio (0.65 to 1.00).]
Conclusions
- OHR is time-varying and less than 1
- M-GARCH OHR provides a better hedge, both in-sample and out-of-sample.
- No role in calculating OHR for asymmetries