Chapter 10: Analysis of Variance (ANOVA)
Chapter Overview
• The Completely Randomized Model: One-Factor Analysis of Variance
  • F-Test for Difference in c Means
  • The Tukey-Kramer Procedure
  • ANOVA Assumptions
• The Factorial Design Model: Two-Way Analysis of Variance
  • Examine Effect of Factors and Interaction
• Kruskal-Wallis Rank Test for Differences in c Medians
One-Factor Analysis of Variance
• Evaluate the Difference Among the Means of 2 or More (c) Populations
  • e.g., Several Types of Tires, Oven Temperature Settings
• Assumptions:
  • Samples are Randomly and Independently Drawn (this condition must be met)
  • Populations are Normally Distributed (the F test is robust to moderate departures from normality)
  • Populations have Equal Variances
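Comparing the Means of Several Independent Samples: One-Factor Analysis of Variance
1. Data layout
$$
\begin{array}{cccc}
\text{Sample 1} & \text{Sample 2} & \cdots & \text{Sample } k \\
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & & \vdots \\
x_{n_1 1} & x_{n_2 2} & \cdots & x_{n_k k}
\end{array}
$$
Note: the k sample sizes need not be equal!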
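2. Partitioning the total variation: the key to analysis of variance!
$$\bar{X}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ij}, \quad j = 1, 2, \ldots, k$$
$$\bar{X}_{total} = \frac{1}{n}\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{ij}, \quad n = n_1 + n_2 + \cdots + n_k$$
$$SS_{total} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{X}_{total})^2$$
$$SS_{between} = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X}_{total})^2, \quad SS_{internal} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{X}_j)^2$$
$$SS_{total} = SS_{between} + SS_{internal}$$
$$MS_{between} = \frac{SS_{between}}{k-1}, \quad MS_{internal} = \frac{SS_{internal}}{n-k}$$
$$F = \frac{MS_{between}}{MS_{internal}} \sim F(k-1,\ n-k)$$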
One-Factor ANOVA
Test Hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
• All population means are equal
• No treatment effect (no variation in means among groups)
$H_1$: not all the $\mu_j$ are equal
• At least ONE population mean is different (others may be the same!)
• There is a treatment effect
• Does NOT mean that all the means are different: $\mu_1 \neq \mu_2 \neq \cdots \neq \mu_c$
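3. [Worked Example]
Average daily sales of a beverage at three groups of stores, each group selling a different packaging
Bottled   Canned   Bagged (old packaging)
75        74       60
70        78       64
66        72       65
69        68       55
          71       63
                   58
Question: Do the mean daily sales differ significantly among the three packaging types?
$H_0: \mu_1 = \mu_2 = \mu_3$
ANOVA results
ANOVA for SALE
          SS        df   Mean Square   F        Sig.
Between   420.367    2   210.183       14.661   .001
Within    172.033   12    14.336
Total     592.400   14
Conclusion: P = 0.001 < 0.05, so the mean sales under the different packaging types differ, taken as a whole, in a statistically significant way. Note: to find out which two packaging types differ, pairwise comparisons are still required.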
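The packaging example can be checked with a few lines of Python. This is a minimal sketch, not the SPSS procedure itself: the function name `one_way_anova` is illustrative, and the group sizes (4 bottled, 5 canned, 6 bagged) are an assumption read off the flattened table, chosen because they reproduce the sums of squares reported in the ANOVA output.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for a list of samples."""
    n = sum(len(g) for g in groups)          # total number of observations
    k = len(groups)                          # number of groups
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group variation: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


# Assumed column assignment of the flattened table (4, 5 and 6 observations)
bottled = [75, 70, 66, 69]
canned = [74, 78, 72, 68, 71]
bagged = [60, 64, 65, 55, 63, 58]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova([bottled, canned, bagged])
print(f"Between: SS={ss_b:.3f} df={df_b}")
print(f"Within:  SS={ss_w:.3f} df={df_w}")
print(f"F = {f_stat:.3f}")
```

The sketch stops at the F statistic; the significance level still has to come from an F table or a statistics package.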
One-Factor ANOVA: No Treatment Effect
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: not all the $\mu_j$ are equal
The null hypothesis is true. Note carefully what this means.
One-Factor ANOVA: Treatment Effect Present
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: not all the $\mu_j$ are equal
The null hypothesis is NOT true.
One-Factor ANOVA
Partitions of Total Variation
Total Variation (SST) = Variation Due to Treatment (SSA) + Variation Due to Random Sampling (SSW)
SSA is commonly referred to as:
• Sum of Squares Among, or
• Sum of Squares Between, or
• Sum of Squares Model, or
• Among Groups Variation
SSW is commonly referred to as:
• Sum of Squares Within, or
• Sum of Squares Error, or
• Within Groups Variation
Total Variation
$$SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$
$$\bar{\bar{X}} = \frac{\sum_{j=1}^{c}\sum_{i=1}^{n_j} X_{ij}}{n}$$
$\bar{\bar{X}}$ = the overall or grand mean
$X_{ij}$ = the ith observation in group j
$n_j$ = the number of observations in group j
$n$ = the total number of observations in all groups
$c$ = the number of groups
Among-Group Variation
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$
$$MSA = \frac{SSA}{c - 1}$$
$n_j$ = the number of observations in group j
$c$ = the number of groups
$\bar{X}_j$ = the sample mean of group j
$\bar{\bar{X}}$ = the overall or grand mean
Variation Due to Differences Among Groups.
Within-Group Variation
$$SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$
$$MSW = \frac{SSW}{n - c}$$
$X_{ij}$ = the ith observation in group j
$\bar{X}_j$ = the sample mean of group j
Summing the variation within each group and then adding over all groups.
Within-Group Variation
$$MSW = \frac{SSW}{n - c} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2 + \cdots + (n_c - 1)S_c^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_c - 1)}$$
• For c = 2, this is the pooled variance in the t-Test.
• If more than 2 groups, use F Test.
• For 2 groups, use t-Test; F Test more limited.
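The pooled-variance form of MSW can be verified numerically. The sketch below uses the filling-time data from this chapter's three-machine example and Python's standard `statistics.variance` (the sample variance with an n - 1 denominator); the variable names are illustrative only.

```python
from statistics import variance

# Filling times for the three machines in the chapter's example (5 workers each)
machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],
    [23.40, 21.80, 23.50, 22.75, 21.60],
    [20.00, 22.20, 19.75, 20.60, 20.40],
]

n = sum(len(g) for g in machines)   # total observations
c = len(machines)                   # number of groups

# Weighted average of the sample variances, weights (n_j - 1)
numerator = sum((len(g) - 1) * variance(g) for g in machines)
denominator = sum(len(g) - 1 for g in machines)
msw = numerator / denominator

print(f"MSW = {msw:.4f} on {denominator} degrees of freedom")
```

Because the weights $(n_j - 1)$ add up to $n - c$, the denominator equals the within-group degrees of freedom, which is why the two expressions for MSW agree.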
One-Way ANOVA Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares    Mean Square (Variance)   F Test Statistic
Among (Factor)        c - 1                SSA               MSA = SSA/(c - 1)        F = MSA/MSW
Within (Error)        n - c                SSW               MSW = SSW/(n - c)
Total                 n - 1                SST = SSA + SSW
One-Factor ANOVA
F Test Example
As production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the .05 level, is there a difference in mean filling times?
Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40
One-Factor ANOVA
Example: Scatter Diagram
[Figure: scatter diagram of filling times (vertical axis 19 to 27) by machine, marking the group means $\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$ and the grand mean $\bar{\bar{X}} = 22.71$]
One-Factor ANOVA
Example Computations
$\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$, $\bar{\bar{X}} = 22.71$
$n_j = 5$, $c = 3$, $n = 15$
$SSA = 5[(24.93 - 22.71)^2 + (22.61 - 22.71)^2 + (20.59 - 22.71)^2] = 47.164$
$SSW = 4.2592 + 3.112 + 3.682 = 11.0532$
$MSA = SSA/(c - 1) = 47.164/2 = 23.5820$
$MSW = SSW/(n - c) = 11.0532/12 = 0.9211$
Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F = MSA/MSW
Among (Machines)      3 - 1 = 2            47.1640          23.5820                  25.60
Within (Error)        15 - 3 = 12          11.0532           0.9211
Total                 15 - 1 = 14          58.2172
One-Factor ANOVA
Example Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not All Equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$
Critical Value: $F_U = 3.89$ (upper tail, $\alpha = .05$)
Test Statistic: $F = \dfrac{MSA}{MSW} = \dfrac{23.5820}{0.9211} = 25.60$
Decision: Reject $H_0$ at $\alpha = 0.05$.
Conclusion: There is evidence that at least one $\mu_j$ differs from the rest.
The Tukey-Kramer Procedure
Pairwise Comparisons Among Means
• Tells Which Population Means Are Significantly Different
  • e.g., $\mu_1 = \mu_2 \neq \mu_3$
• Post Hoc (a posteriori) Procedure
  • Done after rejection of equal means in ANOVA
• Ability for Pairwise Comparisons:
  • Compare absolute mean differences with critical range
[Figure: density curves f(X) against X, with $\mu_1 = \mu_2$ and a separate $\mu_3$: 2 groups whose means may be significantly different.]
The Tukey-Kramer Procedure: Example
1. Compute absolute mean differences:
$|\bar{X}_1 - \bar{X}_2| = |24.93 - 22.61| = 2.32$
$|\bar{X}_1 - \bar{X}_3| = |24.93 - 20.59| = 4.34$
$|\bar{X}_2 - \bar{X}_3| = |22.61 - 20.59| = 2.02$
2. Compute Critical Range:
$$\text{Critical Range} = Q_{U(c,\ n-c)} \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 1.618$$
3. Each absolute mean difference is greater than the critical range, so there is a significant difference between each pair of means.
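The three steps can be sketched in Python. The group means, sample sizes and MSW come from the filling-machine example; the studentized range value `q_upper = 3.77` (for $\alpha = .05$, c = 3, n - c = 12) is an assumption taken from a standard table, since the slides print only the resulting critical range of 1.618.

```python
from itertools import combinations
from math import sqrt

means = {"Machine1": 24.93, "Machine2": 22.61, "Machine3": 20.59}
sizes = {"Machine1": 5, "Machine2": 5, "Machine3": 5}
msw = 0.9211      # within-group mean square from the ANOVA table
q_upper = 3.77    # ASSUMPTION: studentized range table value, alpha=.05, c=3, n-c=12


def critical_range(n_a, n_b):
    """Tukey-Kramer critical range for two groups of sizes n_a and n_b."""
    return q_upper * sqrt(msw / 2 * (1 / n_a + 1 / n_b))


results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(sizes[a], sizes[b])
    results[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |diff| = {diff:.2f}, critical range = {cr:.3f}, significant = {diff > cr}")
```

With equal group sizes the critical range is the same for every pair; the Kramer form of the formula only matters when the $n_j$ differ.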
Two-Way ANOVA
• Examines the Effect of:
  • Two Factors on the Dependent Variable
    e.g., Percent Carbonation and Line Speed on Soft Drink Bottling Process
  • Interaction Between the Different Levels of these Two Factors
    e.g., Does the effect of one particular percentage of carbonation depend on the level at which the line speed is set?
Two-Way ANOVA Assumptions
• Normality
  • Populations are normally distributed
• Homogeneity of Variance
  • Populations have equal variances
• Independence of Errors
  • Independent random samples are drawn
Two-Way ANOVA
Total Variation Partitioning
SST = SSFA + SSFB + SSAB + SSE
• SST: Total Variation
• SSFA: Variation Due to Treatment A
• SSFB: Variation Due to Treatment B
• SSAB: Variation Due to Interaction
• SSE: Variation Due to Random Sampling
Two-Way ANOVA: The F Test Statistic
F Test for Factor A Effect
$H_0: \mu_{1.} = \mu_{2.} = \cdots = \mu_{r.}$
$H_1$: Not all $\mu_{i.}$ are equal
$F = MSFA/MSE$; reject if $F > F_U$
F Test for Factor B Effect
$H_0: \mu_{.1} = \mu_{.2} = \cdots = \mu_{.c}$
$H_1$: Not all $\mu_{.j}$ are equal
$F = MSFB/MSE$; reject if $F > F_U$
F Test for Interaction Effect
$H_0: (\alpha\beta)_{ij} = 0$ (for all i and j)
$H_1: (\alpha\beta)_{ij} \neq 0$
$F = MSAB/MSE$; reject if $F > F_U$
Two-Way ANOVA Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F Statistic
A (Row)               r - 1                SSFA             MSFA          MSFA/MSE
B (Column)            c - 1                SSFB             MSFB          MSFB/MSE
AB (Interaction)      (r - 1)(c - 1)       SSAB             MSAB          MSAB/MSE
Error                 rc(p - 1)            SSE              MSE
Total                 rcp - 1              SST
r = number of levels of factor A, c = number of levels of factor B, p = number of replications per cell
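Two-Factor Analysis of Variance
Two-Factor ANOVA Without Replication
1. Data layout
Columns: the c levels of factor B. Rows: the r levels of factor A.
$$
\begin{array}{cccc}
x_{11} & x_{12} & \cdots & x_{1c} \\
x_{21} & x_{22} & \cdots & x_{2c} \\
\vdots & \vdots & & \vdots \\
x_{r1} & x_{r2} & \cdots & x_{rc}
\end{array}
$$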
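2. Partitioning the variation and testing the hypotheses
$$SS_{total} = SS_A + SS_B + SS_{res}$$
$$\nu_{total} = \nu_A + \nu_B + \nu_{res}: \quad n - 1 = (r - 1) + (c - 1) + (r - 1)(c - 1), \quad n = r \times c$$
$$F_A = \frac{MS_A}{MS_{res}}, \quad F_B = \frac{MS_B}{MS_{res}}$$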
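The decomposition for the no-replication design can be sketched as follows. The function name and the 3 x 3 table are hypothetical, made up purely to illustrate the arithmetic; they are not data from this chapter.

```python
def two_way_anova_no_replication(table):
    """Partition SS_total into SS_A (rows), SS_B (columns) and SS_res for an r x c table."""
    r, c = len(table), len(table[0])
    n = r * c
    grand = sum(sum(row) for row in table) / n
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_a = c * sum((m - grand) ** 2 for m in row_means)   # factor A: row effect
    ss_b = r * sum((m - grand) ** 2 for m in col_means)   # factor B: column effect
    ss_res = ss_total - ss_a - ss_b                       # residual

    df_a, df_b, df_res = r - 1, c - 1, (r - 1) * (c - 1)
    ms_res = ss_res / df_res
    return {
        "SS": (ss_total, ss_a, ss_b, ss_res),
        "df": (n - 1, df_a, df_b, df_res),
        "F_A": (ss_a / df_a) / ms_res,
        "F_B": (ss_b / df_b) / ms_res,
    }


# Hypothetical illustration data: rows are levels of A, columns are levels of B
demo = [[10, 12, 14],
        [11, 15, 16],
        [15, 15, 18]]
result = two_way_anova_no_replication(demo)
print(result)
```

With only one observation per cell, the interaction cannot be separated from random error, so the residual mean square serves as the denominator of both F ratios.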
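3. [Worked Example] A company set up sales outlets for a small lathe in 5 regions, and sales figures were sampled at random for 5 different time points, as follows.
Question: Do sales differ in a statistically significant way among the sales outlets, and among the selling seasons?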
       t1     t2     t3     t4     t5
r1    6.5    1.8    3.6    3.7    7.6
r2   14.2    7.1   10.8    8.9   12.6
r3   13.4    9.4    7.2    8.6    7.5
r4    2.4    1.5    1.7    2.3    2.8
r5    6.2    4.8    4.9    4.6    5.2
Kruskal-Wallis Rank Test for c Medians
• Extension of Wilcoxon Rank Sum Test
  • Tests the equality of more than 2 (c) population medians
• Distribution-free test procedure
• Used to analyze completely randomized experimental designs
• Use the $\chi^2$ distribution to approximate if each sample group size $n_j > 5$, with df = c - 1
Kruskal-Wallis Rank Test
• Assumptions:
  • Independent random samples are drawn
  • Continuous dependent variable
  • Data may be ranked both within and among samples
  • Populations have same variability
  • Populations have same shape
• Robust with regard to last 2 conditions
  • Use F Test in completely randomized designs and when the more stringent assumptions hold.
Kruskal-Wallis Rank Test Procedure
1. Rank the combined data values. In the event of a tie, each of the tied values gets the average of the tied ranks.
2. Add the ranks for the data from each of the c groups to obtain $T_j$, then square to obtain $T_j^2$.
3. Compute the test statistic:
$$H = \left[\frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n+1)$$
$n = n_1 + n_2 + \cdots + n_c$
$n_j$ = number of observations in the jth sample
Kruskal-Wallis Rank Test Procedure
• Test Statistic: H may be approximated by the chi-square distribution (df = c - 1)
• Critical Value for a given $\alpha$: upper tail $\chi^2_U$
• Decision Rule: Reject $H_0: M_1 = M_2 = \cdots = M_c$ if the test statistic $H > \chi^2_U$; otherwise do not reject $H_0$
Kruskal-Wallis Rank Test: Example
As production manager, you want to see if 3 filling machines have different median filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the .05 level, is there a difference in median filling times?
Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40
Example Solution, Step 1
Obtaining a Ranking
Raw data, with each value's rank in parentheses
Machine1      Machine2      Machine3
25.40 (14)    23.40 (9)     20.00 (2)
26.31 (15)    21.80 (6)     22.20 (7)
24.10 (12)    23.50 (10)    19.75 (1)
23.74 (11)    22.75 (8)     20.60 (4)
25.10 (13)    21.60 (5)     20.40 (3)
Rank sums: $T_1 = 65$, $T_2 = 38$, $T_3 = 17$
Example Solution, Step 2
Test Statistic Computation
$$H = \left[\frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n+1) = \left[\frac{12}{15(15+1)}\left(\frac{65^2}{5} + \frac{38^2}{5} + \frac{17^2}{5}\right)\right] - 3(15+1) = 11.58$$
Kruskal-Wallis Test
Example Solution
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all equal
$\alpha = .05$
df = c - 1 = 3 - 1 = 2
Critical Value: $\chi^2_U = 5.991$ (upper tail, $\alpha = .05$)
Test Statistic: $H = 11.58$
Decision: Reject $H_0$ at $\alpha = .05$.
Conclusion: There is evidence that the population medians are not all equal.
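The ranking and the H statistic for the filling-machine example can be reproduced with a short script. This is a sketch of the formula as given in these slides, without any tie correction; the helper names `average_ranks` and `kruskal_wallis_h` are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values each receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1          # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, rank_sums) using H = 12/(n(n+1)) * sum(T_j^2 / n_j) - 3(n+1)."""
    combined = [x for g in groups for x in g]
    ranks = average_ranks(combined)
    n = len(combined)
    rank_sums, total, start = [], 0.0, 0
    for g in groups:
        t_j = sum(ranks[start:start + len(g)])   # rank sum T_j for this group
        rank_sums.append(t_j)
        total += t_j ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, rank_sums


machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],
    [23.40, 21.80, 23.50, 22.75, 21.60],
    [20.00, 22.20, 19.75, 20.60, 20.40],
]
h, rank_sums = kruskal_wallis_h(machines)
chi2_upper = 5.991   # critical value quoted in the slides (alpha = .05, df = 2)
print(f"rank sums = {rank_sums}, H = {h:.2f}, reject H0: {h > chi2_upper}")
```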
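Two-Factor ANOVA with Interaction
1. Data layout (each combination of factor levels is replicated p times)
$$
\begin{array}{c|cccc}
\text{factors} & B_1 & B_2 & \cdots & B_c \\
\hline
A_1 & A_1B_1 & A_1B_2 & \cdots & A_1B_c \\
A_2 & A_2B_1 & A_2B_2 & \cdots & A_2B_c \\
\vdots & \vdots & \vdots & & \vdots \\
A_r & A_rB_1 & A_rB_2 & \cdots & A_rB_c
\end{array}
$$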
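2. Partitioning the variation
$$SS_{total} = SS_A + SS_B + SS_{AB} + SS_{res}$$
$$\nu_{total} = \nu_A + \nu_B + \nu_{AB} + \nu_{res}: \quad n - 1 = (r - 1) + (c - 1) + (r - 1)(c - 1) + rc(p - 1), \quad n = r \times c \times p$$
$$F_A = \frac{MS_A}{MS_{res}}, \quad F_B = \frac{MS_B}{MS_{res}}, \quad F_{AB} = \frac{MS_{AB}}{MS_{res}}$$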
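A compact sketch of this four-part decomposition is given here. The function name and the 2 x 2 design with p = 2 replicates are hypothetical, invented only to make the arithmetic easy to follow; they are not the chapter's data.

```python
def two_way_anova_with_replication(cells):
    """cells[i][j] is the list of p replicates for level i of A and level j of B."""
    r, c, p = len(cells), len(cells[0]), len(cells[0][0])
    n = r * c * p
    values = [x for row in cells for cell in row for x in cell]
    grand = sum(values) / n
    row_means = [sum(x for cell in row for x in cell) / (c * p) for row in cells]
    col_means = [sum(x for i in range(r) for x in cells[i][j]) / (r * p) for j in range(c)]
    cell_means = [[sum(cell) / p for cell in row] for row in cells]

    ss_total = sum((x - grand) ** 2 for x in values)
    ss_a = c * p * sum((m - grand) ** 2 for m in row_means)
    ss_b = r * p * sum((m - grand) ** 2 for m in col_means)
    # Interaction: what the cell means show beyond the additive row and column effects
    ss_ab = p * sum((cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
                    for i in range(r) for j in range(c))
    # Residual: replicates around their own cell mean
    ss_res = sum((x - cell_means[i][j]) ** 2
                 for i in range(r) for j in range(c) for x in cells[i][j])

    df = {"A": r - 1, "B": c - 1, "AB": (r - 1) * (c - 1), "res": r * c * (p - 1)}
    ms_res = ss_res / df["res"]
    f = {"A": ss_a / df["A"] / ms_res,
         "B": ss_b / df["B"] / ms_res,
         "AB": ss_ab / df["AB"] / ms_res}
    return {"SS": {"total": ss_total, "A": ss_a, "B": ss_b, "AB": ss_ab, "res": ss_res},
            "df": df, "F": f}


# Hypothetical 2 x 2 design with p = 2 replicates per cell
demo = [[[4, 6], [8, 10]],
        [[6, 8], [14, 16]]]
out = two_way_anova_with_replication(demo)
print(out["SS"], out["F"])
```

Computing each sum of squares directly, rather than obtaining the residual by subtraction, lets the identity $SS_{total} = SS_A + SS_B + SS_{AB} + SS_{res}$ act as a check on the arithmetic.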
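3. [Worked Example] A study of how 3 different manufacturing processes and 3 different filament formulas affect light-bulb life. The experimental data are as follows:
Filament formula
I II III
Question: Do the process and the filament formula affect bulb life? Is there an interaction between the two?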
1.161.171.173.166.130.143
7.155.143.147.136.154.142
0.170.183.171.160.152.131
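Notes on Setting Up the Analysis in SPSS
1. The key is to understand the subscripts of the data values.
2. The practical solution is to define appropriate grouping variables (Group Variables).
3. Demonstrated in the SPSS Data Editor window.
(1) $X_{ij}$, $1 \le i \le r$, $1 \le j \le c$
(2) $X_{ijk}$, $1 \le i \le r$, $1 \le j \le c$, $1 \le k \le p$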
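The idea of turning subscripts into grouping variables can be illustrated outside SPSS. The Python sketch below is only an analogy for the data layout, not SPSS syntax: it converts a table of $X_{ijk}$ values into one record per observation, with the subscripts i and j stored as grouping variables. The function name, field names and sample values are hypothetical.

```python
def to_long_format(cells):
    """Flatten cells[i][j][k] into one record per observation with grouping variables."""
    records = []
    for i, row in enumerate(cells, start=1):          # i: level of factor A
        for j, cell in enumerate(row, start=1):       # j: level of factor B
            for k, x in enumerate(cell, start=1):     # k: replicate within the cell
                records.append({"A": i, "B": j, "replicate": k, "x": x})
    return records


# Hypothetical 2 x 2 layout with p = 2 replicates per cell
demo = [[[4, 6], [8, 10]],
        [[6, 8], [14, 16]]]
long_rows = to_long_format(demo)
for rec in long_rows:
    print(rec)
```

Each record then corresponds to one row of a data editor: one column for the measured value and one column per grouping variable.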
Chapter Summary
• Described The Completely Randomized Model: One-Factor Analysis of Variance
  • F-Test for Difference in c Means
  • The Tukey-Kramer Procedure
  • ANOVA Assumptions
• Discussed The Factorial Design Model: Two-Way Analysis of Variance
  • Examined effect of Factors and Interaction
• Addressed the Kruskal-Wallis Rank Test for Differences in c Medians