Chapter 9
Regression on Dummy
Explanatory Variables
9.1 The Nature of Dummy Variables
1. Concept:
Dummy variables (also called indicator, binary, categorical,
or dichotomous variables) represent qualitative variables in a
regression model.
For example: sex, race, color, religion, nationality, marital
status, etc.
Qualitative variables can be quantified by constructing
artificial variables that take on values of 1 or 0:
0, indicating the absence of an attribute;
1, indicating the presence (or possession) of that attribute.
A dummy variable (D) is a variable that takes on the values
0 and 1.
2. ANOVA model:
Regression models that contain only dummy
explanatory variables are called analysis-of-variance
(ANOVA) models.
Yi = B1+B2Di +ui
ANOVA models are commonly used in fields such as
sociology, psychology, education, and market research.
(1) Dummy variables generally take on values of 1 or 0.
They are nonstochastic; that is, their values are fixed.
(2) Estimation:
Dummy explanatory variables do not pose any new
estimation problems. Under the assumptions of the CLRM, we
can use the customary OLS method to estimate the
parameters of models that contain dummy variables.
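As a minimal sketch of estimating the ANOVA model Yi = B1 + B2Di + ui by OLS, the snippet below uses NumPy and a small made-up data set (the numbers are hypothetical, not from the text). With a single dummy, the OLS intercept should equal the mean of the base category and B2 the difference between the two category means.

```python
import numpy as np

# Hypothetical data: Y is an outcome; D = 1 for one category, 0 for the base category.
Y = np.array([21.0, 23.0, 22.0, 18.0, 17.0, 19.0])
D = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

# Design matrix: a column of ones (intercept) and the dummy.
X = np.column_stack([np.ones_like(D), D])

# OLS estimates of (B1, B2) by least squares.
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
b1, b2 = b

mean_base = Y[D == 0].mean()    # mean of the category coded 0
mean_other = Y[D == 1].mean()   # mean of the category coded 1

print("B1 estimate:", b1, "| base-category mean:", mean_base)
print("B2 estimate:", b2, "| difference in means:", mean_other - mean_base)
```

The comparison printed at the end is the point of the example: the dummy coefficient is a difference in means, not a slope in the usual sense.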
9.2 Regression with One Quantitative Variable and
One Qualitative Variable with Two Categories
Models of this kind are called analysis-of-covariance (ANCOVA) models.
Yi = B1+B2Di +B3Xi +ui ( 9.6)
Features:
1. If a qualitative variable has m categories, introduce (m - 1)
dummy variables.
If there are only two categories, use only one dummy variable.
2. The assignment of 1 and 0 values to two categories, such as male
and female, is arbitrary.
3. The category that is assigned the value of 0 is often referred to as
the base, benchmark, control, comparison, or omitted category.
4. The coefficient B2 attached to the dummy variable D can be called
the differential intercept coefficient because it tells by how much the
value of the intercept term of the category that receives the value of 1
differs from the intercept coefficient of the base category.
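The interpretation of B2 in feature 4 can be seen by taking conditional expectations of (9.6), assuming E(ui) = 0 as under the CLRM:

```latex
\begin{align*}
E(Y_i \mid D_i = 0, X_i) &= B_1 + B_3 X_i \\
E(Y_i \mid D_i = 1, X_i) &= (B_1 + B_2) + B_3 X_i
\end{align*}
```

The two regression lines share the slope B3 and differ only in their intercepts, by the amount B2.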
9.3 Regression on a Quantitative Variable and a
Qualitative Variable with More Than Two Classes or
Categories: Introduce m - 1 Dummy Variables
1. Model (for a qualitative variable with m = 3 categories):
Yi = B1 + B2D2i + B3D3i + B4Xi + ui (9.13)
E(Yi | D2=0, D3=0, Xi) = B1 + B4Xi (9.14)
E(Yi | D2=1, D3=0, Xi) = (B1+B2) + B4Xi (9.15)
E(Yi | D2=0, D3=1, Xi) = (B1+B3) + B4Xi (9.16)
2. Estimate
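The estimation step for (9.13) can be sketched as follows. The helper `make_dummies`, the `region` variable, and all numbers are hypothetical, chosen only for illustration; Y is generated without noise so that OLS recovers the assumed coefficients exactly. The last lines illustrate one common reason for using m - 1 rather than m dummies: with an intercept and all m dummies, the columns of the design matrix are linearly dependent.

```python
import numpy as np

def make_dummies(categories, base):
    """Return (levels, matrix) with one 0/1 column per non-base category."""
    levels = [c for c in sorted(set(categories)) if c != base]
    cols = [[1.0 if c == lev else 0.0 for c in categories] for lev in levels]
    return levels, np.array(cols).T

# Hypothetical three-category variable (m = 3) and a quantitative X.
region = ["north", "south", "west", "south", "north", "west", "north", "south"]
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

levels, Dm = make_dummies(region, base="north")   # m - 1 = 2 dummies

# Noise-free Y from assumed coefficients (B1, B2, B3, B4).
B1, B2, B3, B4 = 10.0, 2.0, -3.0, 0.5
Y = B1 + B2 * Dm[:, 0] + B3 * Dm[:, 1] + B4 * X

# Design matrix for (9.13): intercept, D2, D3, X.
Z = np.column_stack([np.ones(len(X)), Dm, X])
b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print("levels with a dummy:", levels)
print("estimates (B1, B2, B3, B4):", b)

# Intercept plus all m dummies: the dummy columns sum to the intercept column,
# so the matrix does not have full column rank.
all_dummies = np.array([[1.0 if c == lev else 0.0 for c in region]
                        for lev in sorted(set(region))]).T
Z_trap = np.column_stack([np.ones(len(X)), all_dummies, X])
print("columns:", Z_trap.shape[1], "rank:", np.linalg.matrix_rank(Z_trap))
```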
9.4 Regression on One Quantitative
Variable and Two Qualitative Variables
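1. Model
Yi = B1 + B2D2i + B3D3i + B4Xi + ui (9.18)
Here D2 and D3 are dummies for two different qualitative
variables, so both can equal 1 for the same observation.
E(Yi | D2=0, D3=0, Xi) = B1 + B4Xi (9.19)
E(Yi | D2=1, D3=0, Xi) = (B1+B2) + B4Xi (9.20)
E(Yi | D2=0, D3=1, Xi) = (B1+B3) + B4Xi (9.21)
E(Yi | D2=1, D3=1, Xi) = (B1+B2+B3) + B4Xi (9.22)
2. Estimate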
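Once the coefficients of (9.18) have been estimated, the four intercepts in (9.19) to (9.22) follow by simple addition. The short sketch below uses assumed coefficient values (hypothetical, not estimates from the text) to tabulate them.

```python
# Assumed (hypothetical) coefficient values for model (9.18).
B1, B2, B3, B4 = 5.0, 1.5, -2.0, 0.8

def intercept(d2, d3):
    """Intercept of E(Y | D2=d2, D3=d3, X) under model (9.18)."""
    return B1 + B2 * d2 + B3 * d3

# One entry per combination of the two dummies; the slope B4 is common to all.
cells = {(d2, d3): intercept(d2, d3) for d2 in (0, 1) for d3 in (0, 1)}
for (d2, d3), a in cells.items():
    print(f"D2={d2}, D3={d3}: E(Y|X) = {a} + {B4} X")
```

Note that the model is additive: the intercept for D2 = 1, D3 = 1 is B1 + B2 + B3, with no separate term for the combination.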
9.5 A Generalization
We can extend our model to include more
than one quantitative variable and more than two
qualitative variables, but the number of
dummies for each qualitative variable must be one less
than the number of categories of that variable.
9.6 Structural Stability of Regression Models:
The Dummy Variable Approach
The Chow test does not tell us whether the difference between the two
regressions lies in their intercept values, their slope values, or both.
Yt =A1 +A2Xt +u1t (9.23)
Yt =B1 +B2Xt +u2t (9.24)
1. A1=B1 and A2=B2: coincident regressions; the two regressions are
identical.
2. A1≠B1 but A2=B2: parallel regressions; the two regressions have
different intercepts but the same slope.
3. A1=B1 but A2≠B2: concurrent regressions; the two regressions have
the same intercept but different slopes.
4. A1≠B1 and A2≠B2: dissimilar regressions; the two regressions
differ in both intercept and slope.
To find out which of these possibilities holds, we can use
the dummy variable technique and estimate the model:
Yt = C1 + C2Dt + C3Xt + C4(Dt·Xt) + ut (9.25)
E(Yt | Dt=0, Xt) = C1 + C3Xt (9.26)
E(Yt | Dt=1, Xt) = (C1+C2) + (C3+C4)Xt (9.27)
C2: differential intercept coefficient;
C4: differential slope coefficient.
Relation to the separate regressions (9.23) and (9.24):
When A1 = C1 and A2 = C3, (9.26) is the same as (9.23).
When B1 = C1+C2 and B2 = C3+C4, (9.27) is the same as (9.24).
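The correspondence between the pooled model (9.25) and the two separate regressions can be checked numerically. The sketch below uses simulated data (the seed, sample size, and true coefficients are arbitrary assumptions) and a small hypothetical helper `ols`. Because (9.25) lets both intercept and slope differ, its point estimates should reproduce the separate regressions.

```python
import numpy as np

def ols(Z, y):
    """Least-squares coefficients for design matrix Z and response y."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

rng = np.random.default_rng(0)   # fixed seed; the data are simulated, not real
n = 20
X = rng.uniform(0.0, 10.0, size=2 * n)
D = np.repeat([0.0, 1.0], n)     # D = 0: first subsample, D = 1: second subsample
u = rng.normal(0.0, 1.0, size=2 * n)
Y = np.where(D == 0, 1.0 + 0.5 * X, 3.0 + 0.9 * X) + u

# Separate regressions (9.23) and (9.24), one per subsample.
first, second = D == 0, D == 1
A = ols(np.column_stack([np.ones(n), X[first]]), Y[first])     # (A1, A2)
B = ols(np.column_stack([np.ones(n), X[second]]), Y[second])   # (B1, B2)

# Pooled dummy-variable regression (9.25): intercept, D, X, D*X.
Z = np.column_stack([np.ones(2 * n), D, X, D * X])
C1, C2, C3, C4 = ols(Z, Y)

print("A1, A2       :", A)
print("C1, C3       :", C1, C3)
print("B1, B2       :", B)
print("C1+C2, C3+C4 :", C1 + C2, C3 + C4)
```

In practice one would also look at the standard errors of C2 and C4 to judge whether the differences are statistically significant; that step is omitted from this sketch.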
Advantages of the dummy variable approach:
1. Instead of running the three regressions
(7.54), (7.55), and (7.56) required by the Chow test,
the dummy variable approach needs just one regression.
2. From the differential intercept and differential
slope coefficients, we can pinpoint the
source(s) of the difference.
9.7 The Use of Dummy Variables in Seasonal
Analysis
The process of removing the seasonal component from a time
series is known as deseasonalization, or seasonal adjustment.
Dummy variables can be used to deseasonalize a series:
1. We assume that the seasonal effect shifts only the intercept
term and not the slope coefficient. This gives a model with
differential intercepts but a common slope:
Yt = B1 + B2D2t + B3D3t + B4D4t + B5Xt + ut (9.35)
The seasonal dummies, the Ds, are defined as:
D2t = 1 if the observation lies in the IInd quarter
    = 0 otherwise
D3t = 1 if the observation lies in the IIIrd quarter
    = 0 otherwise
D4t = 1 if the observation lies in the IVth quarter
    = 0 otherwise
Mean consumption expenditure in quarter I:
E(Yt | D2=0, D3=0, D4=0, Xt) = B1 + B5Xt (9.36)
Mean consumption expenditure in quarter II:
E(Yt | D2=1, D3=0, D4=0, Xt) = (B1+B2) + B5Xt (9.37)
Mean consumption expenditure in quarter III:
E(Yt | D2=0, D3=1, D4=0, Xt) = (B1+B3) + B5Xt (9.38)
Mean consumption expenditure in quarter IV:
E(Yt | D2=0, D3=0, D4=1, Xt) = (B1+B4) + B5Xt (9.39)
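A minimal sketch of building the seasonal dummies and estimating (9.35) is given below. The quarterly series and the coefficient values are hypothetical, and Y is generated without noise so that OLS recovers the assumed coefficients exactly. The final step, subtracting the estimated seasonal intercept shifts, is one simple way to obtain an adjusted series and is an assumption of this sketch rather than a procedure stated in the text.

```python
import numpy as np

# Hypothetical quarterly data: 3 years, quarters coded 1..4 in order.
quarter = np.tile([1, 2, 3, 4], 3)
X = np.array([10.0, 12.0, 11.0, 15.0, 13.0, 14.0,
              16.0, 18.0, 17.0, 19.0, 21.0, 20.0])

# Seasonal dummies D2, D3, D4; quarter I is the base category.
D2 = (quarter == 2).astype(float)
D3 = (quarter == 3).astype(float)
D4 = (quarter == 4).astype(float)

# Noise-free Y from assumed coefficients B1..B5.
B = np.array([50.0, 4.0, -2.0, 7.0, 0.6])
Z = np.column_stack([np.ones(len(X)), D2, D3, D4, X])
Y = Z @ B

# OLS estimates of (9.35).
b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print("estimates (B1..B5):", b)

# Adjusted series: remove the estimated seasonal shifts in the intercept.
Y_adj = Y - (b[1] * D2 + b[2] * D3 + b[3] * D4)
print("adjusted series:", Y_adj)
```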
2. A general model with differential slopes and differential
intercepts.
Starting from this more general model is the best practical
strategy to avoid model specification error (model specification bias).
Yt=B1+ B2D2t+ B3D3t+ B4D4t+ B5Xt+ B6(D2tXt)+
B7(D3tXt)+ B8(D4tXt)+ut (9.40)
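The design matrix for (9.40) adds one interaction column per seasonal dummy. The sketch below, with a hypothetical helper `seasonal_design` and made-up noise-free data, builds the eight columns and reads off the quarter-specific intercepts and slopes implied by the estimates.

```python
import numpy as np

def seasonal_design(quarter, X):
    """Columns of (9.40): 1, D2, D3, D4, X, D2*X, D3*X, D4*X."""
    dummies = [(quarter == q).astype(float) for q in (2, 3, 4)]
    interactions = [d * X for d in dummies]
    return np.column_stack([np.ones(len(X)), *dummies, X, *interactions])

# Hypothetical quarterly data: 3 years, quarters coded 1..4 in order.
quarter = np.tile([1, 2, 3, 4], 3)
X = np.array([10.0, 12.0, 11.0, 15.0, 13.0, 14.0,
              16.0, 18.0, 17.0, 19.0, 21.0, 20.0])
Z = seasonal_design(quarter, X)

# Assumed coefficients B1..B8, used to build noise-free data.
B = np.array([50.0, 4.0, -2.0, 7.0, 0.6, 0.1, 0.0, -0.2])
Y = Z @ B
b, *_ = np.linalg.lstsq(Z, Y, rcond=None)

# Quarter-specific intercepts and slopes (quarters I, II, III, IV).
intercepts = b[0] + np.array([0.0, b[1], b[2], b[3]])
slopes = b[4] + np.array([0.0, b[5], b[6], b[7]])
print("intercepts by quarter:", intercepts)
print("slopes by quarter    :", slopes)
```

If the estimated B6, B7, and B8 turn out to be statistically insignificant, the model reduces to the common-slope form (9.35).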
Summary:
1. Regression with one qualitative variable:
Yi = B1 + B2Di + ui
2. Regression with one quantitative variable and one
qualitative variable with two categories: introduce one
dummy variable.
Yi = B1 + B2Di + B3Xi + ui
3. Regression with one quantitative variable and one
qualitative variable with m categories: introduce m - 1
dummy variables.
Yi = B1 + B2D2i + B3D3i + B4Xi + ui
4. Regression with one quantitative variable and two
qualitative variables:
Yi = B1 + B2D2i + B3D3i + B4Xi + ui