Chapter 16
Multiple Regression and
Correlation
to accompany
Introduction to Business Statistics
fourth edition, by Ronald M. Weiers
Presentation by Priscilla Chaffe-Stengel
Donald N. Stengel
© 2002 The Wadsworth Group
Chapter 16 Learning Objectives
• Obtain and interpret the multiple regression
equation
• Make estimates using the regression model:
– Point value of the dependent variable, y
– Intervals:
  • Confidence interval for the conditional mean of y
  • Prediction interval for an individual y observation
• Conduct and interpret hypothesis tests on the
– Coefficient of multiple determination
– Partial regression coefficients
Chapter 16 - Key Terms
• Partial regression coefficients
• Multiple standard error of the estimate
• Conditional mean of y
• Individual y observation
• Coefficient of multiple determination
• Coefficient of partial determination
• Global F-test
• Standard deviation of $b_i$
The Multiple Regression Model
• Probabilistic Model
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i$
where $y_i$ = a value of the dependent variable, y
$\beta_0$ = the y-intercept
$x_{1i}, x_{2i}, \ldots, x_{ki}$ = individual values of the
independent variables, $x_1, x_2, \ldots, x_k$
$\beta_1, \beta_2, \ldots, \beta_k$ = the partial regression coefficients
for the independent variables, $x_1, x_2, \ldots, x_k$
$\varepsilon_i$ = random error, the residual
The Multiple Regression Model
• Sample Regression Equation
$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_k x_{ki}$
where $\hat{y}_i$ = the predicted value of the dependent
variable, y, given the values of $x_1, x_2, \ldots, x_k$
$b_0$ = the y-intercept
$x_{1i}, x_{2i}, \ldots, x_{ki}$ = individual values of the
independent variables, $x_1, x_2, \ldots, x_k$
$b_1, b_2, \ldots, b_k$ = the partial regression coefficients
for the independent variables, $x_1, x_2, \ldots, x_k$
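The slides do not say how the coefficients $b_0, b_1, \ldots, b_k$ are obtained. A minimal sketch, assuming ordinary least squares (the usual choice, but an assumption here), solves the normal equations with the standard library only. The function names and the data are hypothetical and chosen so the answer is easy to check by hand.

```python
def fit_multiple_regression(x_rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    # Design matrix: a leading 1 for the intercept b0, then the k x-values.
    X = [[1.0] + [float(v) for v in row] for row in x_rows]
    p = len(X[0])  # k + 1 coefficients

    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]

    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("x variables are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]

    # Back substitution.
    b = [0.0] * p
    for i in range(p - 1, -1, -1):
        s = sum(A[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (A[i][p] - s) / A[i][i]
    return b


def predict(b, x_values):
    """Point estimate y-hat = b0 + b1*x1 + ... + bk*xk."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x_values))


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
x_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in x_rows]
b = fit_multiple_regression(x_rows, y)
y_hat = predict(b, (2, 2))
```

Because the data have no noise, the fitted coefficients recover 1, 2 and 3 up to rounding. With real data a statistics package would normally do this step and also report the quantities discussed on the later slides.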
? 2002 The Wadsworth Group
The Amount of Scatter in the Data
• The multiple standard error of the estimate
measures the dispersion of the data points
around the regression hyperplane.
$s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - k - 1}}$
where $y_i$ = each observed value of y in the data set
$\hat{y}_i$ = the value of y that would have been
estimated from the regression equation
n = the number of data values in the set
k = the number of independent (x) variables
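As an illustration of the formula for $s_e$, the sketch below computes it from observed and fitted values. The function name and the four data points are hypothetical.

```python
import math


def multiple_standard_error(y, y_hat, k):
    """s_e = sqrt( sum (y_i - y_hat_i)^2 / (n - k - 1) )."""
    n = len(y)
    df = n - k - 1  # degrees of freedom
    if df <= 0:
        raise ValueError("need more than k + 1 observations")
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(sse / df)


# Hypothetical observed and fitted values for a model with k = 1.
y = [3.0, 5.0, 7.0, 10.0]
y_hat = [2.0, 6.0, 7.0, 9.0]
s_e = multiple_standard_error(y, y_hat, k=1)  # SSE = 3, df = 2
```

Here the squared residuals sum to 3 and n – k – 1 = 2, so $s_e = \sqrt{1.5}$.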
Approximating a Confidence
Interval for a Mean of y
• A reasonable estimate for interval bounds on the
conditional mean of y given various x values is
generated by:
$\hat{y} \pm t \cdot \dfrac{s_e}{\sqrt{n}}$
where $\hat{y}$ = the estimated value of y based on the
set of x values provided
t = critical t value for 100(1 – α)% confidence, df = n – k – 1
$s_e$ = the multiple standard error of the estimate
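A small sketch of this approximation follows. The critical t value is passed in as a number, on the assumption that it has been looked up in a t table for df = n – k – 1. All inputs are hypothetical.

```python
import math


def approx_mean_interval(y_hat, t_crit, s_e, n):
    """Approximate interval for the conditional mean: y_hat +/- t * s_e / sqrt(n)."""
    half_width = t_crit * s_e / math.sqrt(n)
    return y_hat - half_width, y_hat + half_width


# Hypothetical inputs: point estimate 50, critical t of 2.0, s_e = 4, n = 16.
lower, upper = approx_mean_interval(y_hat=50.0, t_crit=2.0, s_e=4.0, n=16)
```

With these inputs the half-width is 2.0 × 4 / 4 = 2, so the interval runs from 48 to 52.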
Approximating a Prediction
Interval for an Individual y Value
• A reasonable estimate for interval bounds on an
individual y value given various x values is
generated by:
$\hat{y} \pm t \cdot s_e$
where $\hat{y}$ = the estimated value of y based on the
set of x values provided
t = critical t value for 100(1 – α)% confidence, df = n – k – 1
$s_e$ = the multiple standard error of the estimate
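The same kind of sketch applies to an individual observation. The only change from the interval for the mean is that $s_e$ is not divided by $\sqrt{n}$, so the interval is wider. The inputs are hypothetical, and the critical t value is again assumed to come from a t table.

```python
def approx_prediction_interval(y_hat, t_crit, s_e):
    """Approximate interval for one individual y: y_hat +/- t * s_e."""
    half_width = t_crit * s_e
    return y_hat - half_width, y_hat + half_width


# Hypothetical inputs: point estimate 50, critical t of 2.0, s_e = 4.
lower, upper = approx_prediction_interval(y_hat=50.0, t_crit=2.0, s_e=4.0)
```

With these inputs the half-width is 2.0 × 4 = 8, giving an interval from 42 to 58.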
Coefficient of Multiple
Determination
• The proportion of variance in y that is
explained by the multiple regression
equation is given by:
$R^2 = 1 - \dfrac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} = 1 - \dfrac{SSE}{SST} = \dfrac{SSR}{SST}$
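To make the ratio concrete, the sketch below computes $R^2$ from observed and fitted values. The function name and the data are hypothetical.

```python
def coefficient_of_multiple_determination(y, y_hat):
    """R^2 = 1 - SSE / SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return 1.0 - sse / sst


# Hypothetical observed and fitted values.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]
r_squared = coefficient_of_multiple_determination(y, y_hat)
```

For these values SST = 10 and SSE = 0.10, so $R^2$ is about 0.99.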
Coefficients of Partial
Determination
• For each independent variable, the
coefficient of partial determination
denotes the proportion of total variation
in y that is explained by that one
independent variable alone, holding the
values of all other independent variables
constant. The coefficients are reported
on computer printouts.
Testing the Overall Significance
of the Multiple Regression Model
• Is using the regression equation to predict y
better than using the mean of y?
The Global F-Test
I. $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
The mean of y is doing as good a job at predicting the
actual values of y as the regression equation.
$H_1$: At least one $\beta_i$ does not equal 0.
The regression model is doing a better job of
predicting actual values of y than using the mean of y.
Testing Model Significance
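II. Rejection Region
Given α and
numerator df = k, denominator df = n – k – 1
Decision Rule: If F > critical value,
reject $H_0$.
[Figure: F distribution with the rejection region of area α in the right tail; "Do Not Reject $H_0$" to the left of the critical value, "Reject $H_0$" to the right.]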
Testing Model Significance
III. Test Statistic
$F = \dfrac{SSR / k}{SSE / (n - k - 1)}$
where SSR = SST – SSE
SST = $\sum (y_i - \bar{y})^2$
SSE = $\sum (y_i - \hat{y}_i)^2$
If $H_0$ is rejected:
• At least one $\beta_i$ differs from zero.
• The regression equation does a better job of predicting
the actual values of y than using the mean of y.
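A short sketch of the test statistic, using hypothetical observed and fitted values, is given here. The critical F value is not computed, on the assumption that it is read from an F table for the chosen α with k and n – k – 1 degrees of freedom.

```python
def global_f_statistic(y, y_hat, k):
    """F = (SSR / k) / (SSE / (n - k - 1)), with SSR = SST - SSE."""
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ssr = sst - sse
    return (ssr / k) / (sse / (n - k - 1))


# Hypothetical observed and fitted values for a model with k = 1.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]
f_stat = global_f_statistic(y, y_hat, k=1)
```

Here SST = 10, SSE = 0.10 and SSR = 9.90, so F is about 297 on 1 and 3 degrees of freedom. $H_0$ would be rejected if that value exceeds the tabled critical F.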
Testing the Significance of a
Single Regression Coefficient
• Is the independent variable $x_i$ useful in predicting
the actual values of y?
The Individual t-Test
I. $H_0$: $\beta_i = 0$
The dependent variable (y) does not depend on values of the
independent variable $x_i$. (This can, with reason, be structured as
a one-tail test instead.)
$H_1$: $\beta_i \neq 0$
The dependent variable (y) does change with the values of
the independent variable $x_i$.
Testing the Impact on y of a
Single Independent Variable
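II. Rejection Region
Given α and df = n – k – 1
Decision Rule:
If t > +critical value
or t < –critical value,
reject $H_0$.
[Figure: t distribution with rejection regions of area α/2 in each tail, beyond –t and +t; "Do Not Reject $H_0$" between the critical values, "Reject $H_0$" in both tails.]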
Testing the Impact on y of a
Single Independent Variable
III. Test Statistic
$t = \dfrac{b_i - 0}{s_{b_i}}$
where $b_i$ = estimate for $\beta_i$ for the multiple
regression equation
$s_{b_i}$ = the standard deviation of $b_i$
If $H_0$ is rejected:
• The dependent variable (y) does change with the
independent variable ($x_i$).
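The two-tail decision for a single coefficient can be sketched as follows. The coefficient, its standard deviation and the critical t value are hypothetical numbers of the kind read from a computer printout and a t table. The function name is also an assumption.

```python
def individual_t_test(b_i, s_b_i, t_crit):
    """Return (t, reject_h0) for H0: beta_i = 0 against H1: beta_i != 0."""
    t = (b_i - 0.0) / s_b_i
    # Two-tail rule: reject when t falls beyond either critical value.
    reject_h0 = t > t_crit or t < -t_crit
    return t, reject_h0


# Hypothetical printout values: b_i = 2.5 with standard deviation 0.5,
# and an assumed critical t of 2.1 for the chosen alpha and df.
t_stat, reject_h0 = individual_t_test(b_i=2.5, s_b_i=0.5, t_crit=2.1)
```

With these numbers t = 5.0, which lies beyond +2.1, so $H_0$ is rejected. For a one-tail test only the comparison on the relevant side would be kept.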