??c s ? F
? 9.0 ?y
? 9.1 s ? F I
? 9.2 s ? F II
9.0 ?y
??
? > í?
? ó?B?T
"'" SYVD V[¤
?B?s ?  fib
e
^?
Lf
f¥B?
9b
? Fs ?  ——L
!á
ìμ L?s ? 
{f1,…fL } ?|
ìF  ?¤?÷z
¥s ? b
Ensemble Method
? Fs ? 1 ???s ? ÷ú ?¥
 sA1Hq ——??s ? 
^,ú ?,

accurate¥i Os ? -W
^,Ms,

diverse¥b
?,ú ?,——1
??z
?,Ms,——p
^? ?¥
? +?¥) ?
? ¨]"¥+?Vr
? ¨?]¥+?Vr
? "'¥) ?
? ¨]"¥"'
? ¨?]¥"'
? s ? ¥) ?
? ¨]"¥s ? 
? ¨?]¥s ? 
? s ? 
{¥a) ?
9.1 s ? F I
? s ? ¥) ?
? ?]s ? F
? s ? 
{¥a) ? ——Stacking
? ?
?'s ? ¥
{
^
L

1 ?
?
?¥à
q5YV?t
{
2TMFaMe?ZE |
+
ü
(óF2T9 V[ |KlaKv
?ZEóF2T
? ?s ? o
{ ?YS|5 V[YV
g
D¥Z
T¨
a;E¤?F2
T
? L ?'s ? -WMo,? ?,o1
?s ? ¥? ?
q?v? 50%K?
2TB?1
?s ? ??¥2Tz
? ?L
!μ 21?s ? 
?s ? ¥p

q1 0.3i O? ?b5
g
DE¥p

q1
$$P_{error} = \sum_{i=11}^{21} C_{21}^{i}\, 0.3^{i} (1 - 0.3)^{21-i} = 0.026 \ll 0.3$$
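The majority-vote figure can be checked numerically. The sketch below assumes 21 classifiers that err independently, each with error rate 0.3, so the vote is wrong only when 11 or more of them are wrong at once. The function name `majority_vote_error` is hypothetical and chosen for illustration.

```python
from math import comb

def majority_vote_error(L, p):
    """Probability that more than half of L independent classifiers,
    each with error rate p, are wrong at the same time (binomial tail)."""
    k_min = L // 2 + 1  # smallest number of wrong votes that flips the majority
    return sum(comb(L, i) * p**i * (1 - p)**(L - i) for i in range(k_min, L + 1))

print(round(majority_vote_error(21, 0.3), 3))  # 0.026
```

The independence assumption carries the whole result: if the classifiers tend to make the same mistakes, the vote gains far less.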
? F ?
E
Weighted Majority ——
?s ? ?íB?¥ ?×bT
V?
?sp¥s ? é?ò@h
 
?×b
? ù5
L=

4s??? ?? ?¥
s ?
Eb ?T's ? -W?? ?
?
sM¤
Es ?
4b
L2T9A
U??ZEi?
Esμr
.à? ?V
's ? ?GB?? ?
qKú¥b
6?9μ |?C? ?¥s ? F?
B?1?? ?¥s ? Fz
? Wolpert? 1992
M4¥ Stacked
Generalization
Stacking
? Stacking¥
±X
^ü's ? ¥
{T
1/B)s ? ¥
{ ?é?D
? L
!Xμ ?'s ?  b
? ?¨ Leave-one-out
±X
QVT
"?G
B?"' ¨ 
μ"'sYT

¤?¥ ?s ?f
 ¥ ?
Yòμ
:S
? ? ?
L=

^"' ¥B? ??¥
í

?
a?¥+?b
c
ò?s ? [#
T
"? 
μ"'¥?
Notation: base classifiers $C_1, C_2, \ldots, C_K$; sample $x_i$; base-classifier outputs $C_1(x_i), C_2(x_i), \ldots, C_K(x_i)$.
? ?
¥T
"?
?"'?
3?B
? ?_
T1+?b
? ?"¤?B??¥T
"'"
?
"'¥+?
^?'s ? 
{F?¥
?_

? ¨???¥T
"T
B?,K? %|,
s ? b
K
K
?
MY?
k"'
?
n5| 
¨T
"?¥ ??"
'é?T
¤?¥
Yf
T¨?

¤??
k"'¥?¥+?
? |

TT1
{ ??ó,K? %|,s ? 

3Ka¥
2Tb
Notation: input $x$; base classifiers $C_1, C_2, \ldots, C_K$; output vector $(C_1(x), C_2(x), \ldots, C_K(x))$.
? Stackingw<
?-_b
L=
?¨
B?+
y¥,K? %|,s ? ü V[¤
? Leave-one-out?-_ê |Kas ?
¥2Tb h
± IK? %|,s ? ??
^
I
1"¥$
? Stacking ?¨'s ? ¥
{T1"'
¥?¥+?a ?$?1 meta level
learningmeta-learningb
? StackingKv¥ù5? ?
μr¥ ?
$b Wolpert
Stacking¥4?
.
à?-1 black art
E? b#
?Hq/ Stackingμr1
I
1μr
9.2 s ? F II
? "'¥) ? ——"'×?"
? o

?ZE Bootstrap Aggregating
(Bagging) Adaptive Boosting
(AdaBoost),?
?ZE?μ
' Y¥ ?
?v
¥a?ò? ??¥

"¥
L?? |¤
A÷¥rTb
9.21 Bagging
? Bagging
^d9DE Breiman?
M4
¥
¥
±X?÷
^d9D?dè×
1¥Bootstrap ?
b
Bootstrap
? I
n/
¥B?'?dè
' Y¥d
9ù5μ ? i.i.d. |¥"' 
oμ?t"'¥Hq/ù? (′9
 ¥d9+?b
? ?i (′9
'
&9
^B?
M
bX1ù?
¥d9
1 ?Zμb
$x_1, x_2, \ldots, x_n$
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
? ù5ù?
M
¥d9+?31v
"'?? o
¤?B?
"'? V
ù?b
? ??
± I ?Tá
ì??4?"'?
V¥
^
I
1s?á
ì? ??s?á
3#F
Fc n?"'¥
ü
V[¤?#?, (′,¥"'b
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
? ?
4?
L¥s???? b
? 3 %÷E Bootstrap
1 
?
±X

E?
L¥s?b?v,

E,¥s?
3??F
Fc n?"'¥
b
? ?

E
? ?
^ ??¥s?f
7
^B? ? ?¥à
q b
? ?n?"'?

?¥à
q1
1/n 
μ1Z1 0 b
$x_1, x_2, \ldots, x_n$
? Bootstrap
"¥
3?
?
Fc n?"'¥
b[ 1/n¥à
qV
?
 | n?
T1BFb
?
F ú
V
μ×ˉ¥
9μ
à$?
¥b
?
F9
B? (′ AT (′
T1
M
¥B?"'bμ
v
"'a?¨Y
è¥d9ZE V[¤? (′9
¥ú¨d
9+?b
$x_1, x_2, \ldots, x_n$
? 1?"

E

E¥i?
^à
q
áf

7
^à
q
áf
¥s ——à
qs?f
b
 n?"'?

?¥à
q1 1/nT1à
q
á
f
¥s?
^d9D?¥üs?f

^?
L¥à
qs?f
¥z¥/íb
? Bootstrap

E¥ZE ?
 μKa?)
?
á
ì¥S??óD Efron 1979
? Bootstrapó¥ (′d9
¥=¨ 
9
T
$$\bar{x}^{(\cdot)} = \frac{1}{B}\sum_{b=1}^{B} \bar{x}^{(b)}$$
$$Var_{boot}[\bar{x}] = \frac{1}{B}\sum_{b=1}^{B} \left[\bar{x}^{(b)} - \bar{x}^{(\cdot)}\right]^2$$
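The bootstrap estimate of the mean and of its variance can be sketched in a few lines. This is a minimal illustration: B bootstrap sets are drawn, each by sampling n values with replacement from the original sample, and the B bootstrap means are averaged. The function name `bootstrap_mean_variance` and the fixed `seed` are assumptions made for reproducibility, not part of the original slides.

```python
import random

def bootstrap_mean_variance(sample, B, seed=0):
    """Return (mean of the B bootstrap means, bootstrap variance of the mean)."""
    rng = random.Random(seed)
    n = len(sample)
    boot_means = []
    for _ in range(B):
        # One bootstrap set: n draws with replacement, each point has probability 1/n.
        resample = [rng.choice(sample) for _ in range(n)]
        boot_means.append(sum(resample) / n)
    mean_of_means = sum(boot_means) / B
    var_boot = sum((m - mean_of_means) ** 2 for m in boot_means) / B
    return mean_of_means, var_boot

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mean_est, var_est = bootstrap_mean_variance(data, B=2000)
print(mean_est, var_est)
```

For the sample mean, the bootstrap variance should land near the plug-in value (population-style variance of the data divided by n), which gives a quick sanity check on the resampling loop.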

F
¤?¥B
? (′d9
"'
(′d9
¥
D
ù?¥9
(′d9
¥Zμ
? Baggingü Bootstrap°¤?¨?

T
MY
s ? ¥
!9?,
? YV n?"'
!9¥s ? 
L=

^μ
?¥
?
? ?÷?O?"'
^?v
?à
qs?
JJE
 |¥b
? ?TμF
FO?"'¥T
"á
ì
V[T
?s ?f
b
? °4¥XE
??
k"' ?T¨?t
s ?f
¥2Tg
D[
1a;|¤
?B?,
ü (,¥2T??
ü (¥2T V
1o¨B?
¥ n?"'¥T
"¤?¥
Yf
1z?
a÷,×?,b
? Breiman ?¨ boostrapZE
3?F
F n?"'¥T
" pB?g
D¥

ü (,2Tb
e???ZE1 Bootstrap
Aggregatinge? Baggingb
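A minimal sketch of Bootstrap Aggregating, under stated assumptions: each component classifier is trained on its own bootstrap resample of the training set, and the final decision is a plain majority vote. The weak learner here is a hypothetical one-dimensional threshold rule (`train_threshold`), chosen only to keep the example short. It is not taken from the slides.

```python
import random
from collections import Counter

def train_threshold(data):
    """Illustrative weak learner: pick the threshold t with the fewest
    training errors for the rule 'predict 1 if x > t else 0'."""
    xs = sorted(set(x for x, _ in data))
    candidates = ([xs[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
                  + [xs[-1] + 1.0])
    return min(candidates,
               key=lambda t: sum((x > t) != bool(label) for x, label in data))

def bagging(data, n_models, seed=0):
    """Train n_models threshold classifiers, each on its own bootstrap set."""
    rng = random.Random(seed)
    n = len(data)
    models = []
    for _ in range(n_models):
        resample = [rng.choice(data) for _ in range(n)]  # n draws with replacement
        models.append(train_threshold(resample))
    return models

def predict(models, x):
    """Majority vote over the component classifiers."""
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]

data = [(float(x), int(x > 5)) for x in range(11)]
models = bagging(data, n_models=25)
print(predict(models, 0.0), predict(models, 10.0))
```

An odd number of models avoids tied votes in the two-class case.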
? v

L2TV
ü?,?×?,¥s
? 
9ü
^
a ?TT
"'
F?M
¤?¥
Yf
üá
3?vM
Bagging
A÷4ú
¥? ?
qb
9.22 AdaBoost
? AdaBoostá
3?9
D ?

Computational Learning Theory Valiant
1984
?? PAC
Probably Approximately Correct
Db
?
? ?T?1 PAC learnable ?T
iB?
E
[v¥à
qD?B?
ú¥ú ?i O[
T
HW
=??
HW
^
{ ?M
aà
qaú ?¥[
Tf
b
? è1S???¥ú ?
^ 13
ü??
?ú ??i?
^?s-?¥7
^[B
?v¥à
q £¥ (probably)ù?"
d¥D??-1 V L?b1S???¥
V L?
^ 95%[ 95%¥à
q £
?

"S 13
ü¥S?
=
? PACD¥'i
^ò?
?é?s ?s
't
?
^ PAC VD¥b
L=
??31
?i ú¥à
qú ?? VDoμ
¥
?
^
? PAC VDS¥ ;“ <, VD

?b
?, D, VD
?b
o
 £[ ?iú¥à
q
D?1
?zB?
[
T
HW
=??b
 ?ù5
?¥? ?
q1 50%b
? B?¥XE, D, VD
???1, <, VD

?S?÷v
? Schapire
1990£
ü
B?
7 |ù ¥2T
“ D, VD
?D, <, VD
??
?à
Q
^?N¥b'o1
?
^ D VD¥ü
s?B?
E[
T
HW
=[ ?i ú¥

qú ?D??
?b
?1oBoosting/
b?B? D VD¥

?á
ìBoosting
9 <
¥D
Eb?
/a¥
Eü
@ <D¥1 pb
BoostingZE
?
n5? Xμ¥T
"'"
!9B?0
s ? 
P s ?rT1
?zB?
GQT
BF0s ?  ?
?0s
? ¥T
"??- -ò?0s ? 
ó¥c?
Kv¥"'?F?
K?¥
%2T?ò0s ? ¥2T
] %?i OT
"'"¥ ?
q
 ?i¥úb
? è
 ?ù5 ?
P¨ BoostingZE7y
??0s ? 
?
n5Vvl1 ¥ e
S"'" ?
ê
| ?"'?
?bíF?"'"
? ? "'" T
?B?0s ? 
1
?zB?b
? V
,?¥"'??Gê ? F?
P ? B?¥"'
$ ? ?s ?
6B
?$ ps
[Figure: training set $D$ with $n$ samples; subsets $D_1$ ($n_1$ samples) and $D_2$ ($n_2$ samples); component classifier $C_1$.]
? ? "'" T
?=?0s ? 
? 
:?¥"'??ê |"'?¨
é?s ? ?T
%¥2T?]
üü??"'F ?
? ? "'" T
? ??0s ? 
¨ ??0s ? B??"' s ? ?
¥
%2TM]S:1?? ?Y ?
?]S:1 ¤?¥ ?Y
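[Figure: subsets $D_2$ and $D_3$ of the training set $D$; component classifiers $C_1$, $C_2$, $C_3$; a test sample $x$ passed to $C_1$, $C_2$ and $C_3$.]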
? Boosting¥ ?
£
üL?μí k¥"'
?
L=?
^? V
¥b
? ?%?¥T
"'"
| Boosting
L
¨ 1996
M Freund Schapire
 |4

 AdaBoost
Adaptive Boosting
E
L??VCdèz¥?
b
AdaBoost
E
? AdaBoost ?¨B?'s ? 
AT Ds ?
?
Q¥
{2T1
a?1?M
"' ?×
X$? ?s ?¥"' ?′?l$
ps¥"' ?×4úT
B"
s ? 
? s ? ¥VCF ?g
DóK?
%2Tb
? V
Ue"'" ?¥"' V
U  ?YS
| V
U? kQY}
H ?8"'¥ ?×s?
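Notation: $x_i$ and $y_i$ are the samples and labels in the training set $D$; $W_k(i)$ is the weight of sample $i$ at iteration $k$.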
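1 begin initialize $D = \{x_1, y_1, \ldots, x_n, y_n\}$, $k_{max}$, $W_1(i) = 1/n$, $i = 1, \ldots, n$
2   $k \leftarrow 0$
3   do $k \leftarrow k + 1$
4     train weak classifier $C_k$ using $D$ sampled according to $W_k(i)$
5     $E_k \leftarrow$ training error of $C_k$ measured on $D$ using $W_k(i)$
6     $\alpha_k \leftarrow \frac{1}{2}\ln[(1 - E_k)/E_k]$
7     $W_{k+1}(i) \leftarrow \frac{W_k(i)}{Z_k} \times \begin{cases} e^{-\alpha_k} & \text{if } h_k(x_i) = y_i \\ e^{\alpha_k} & \text{if } h_k(x_i) \neq y_i \end{cases}$
8   until $k = k_{max}$
9 return $C_k$ and $\alpha_k$, $k = 1, \ldots, k_{max}$
10 end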
? ? 1BB"
 1 ó 
¥ ¥S:
? 9s ? ?ò0s ? F ?
ü (¤?
? 
f ? ?
?0s ? ?
^ DD
 @v9s ? ¥T
μ
q
 ?i¥l
$Z_k$ is a normalizing constant, and $h_k(x_i)$ is the label that classifier $C_k$ assigns to $x_i$.
$$g(x) = \sum_{k=1}^{k_{max}} \alpha_k h_k(x)$$
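The AdaBoost loop and the weighted discriminant $g(x)$ can be sketched as follows. This is an illustration under stated assumptions, not the slides' own implementation: labels are +1/-1, the weak learner is a decision stump, and each stump is fitted by minimizing the weighted error directly instead of training on a set sampled according to $W_k(i)$. All function names (`train_stump`, `stump_predict`, `adaboost`, `predict`) are hypothetical.

```python
import math

def stump_predict(row, f, t, polarity):
    """Decision stump: 'polarity' if feature f exceeds t, otherwise the opposite label."""
    return polarity if row[f] > t else -polarity

def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, f, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    return best

def adaboost(X, y, k_max):
    n = len(X)
    w = [1.0 / n] * n                              # W_1(i) = 1/n
    ensemble = []                                  # (alpha_k, feature, threshold, polarity)
    for _ in range(k_max):
        err, f, t, pol = train_stump(X, y, w)      # E_k: weighted training error
        err = min(max(err, 1e-12), 1 - 1e-12)      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # alpha_k
        ensemble.append((alpha, f, t, pol))
        # exp(-alpha) for correctly classified samples, exp(+alpha) for mistakes
        w = [wi * math.exp(-alpha * yi * stump_predict(row, f, t, pol))
             for row, yi, wi in zip(X, y, w)]
        z = sum(w)                                 # Z_k, the normalizing constant
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of g(x) = sum_k alpha_k * h_k(x)."""
    g = sum(alpha * stump_predict(row, f, t, pol) for alpha, f, t, pol in ensemble)
    return 1 if g >= 0 else -1

X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, k_max=3)
predictions = [predict(ensemble, row) for row in X]
print(predictions)
```

The toy labels form a +, -, + pattern along one axis, which no single stump can separate; three weighted stumps together can.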
? D5 Boostingμ"T
L¥ ?
$?
^ ?XHq/? ?¥b3
d AdaBoost

L=ù5?¥asVCB°
^D?
ì
¥
"¥?
^à?
àμ?
¥s?b
?
" -K ?¥3
d
? Boosting the margin[
 ?s ?ù51 è
QY}¤?¥F2T
P¤T

¥
margin9vb
? AdaBoostμB?+?'
PüV ??Q
Y}T

¥
MY
qXür? 100%
??Y}
4úFs ? ?
k"
'
¥
MY
qb9v marginó??C`
B?1? ?¥3
d
?4ús ?
¥w<
?b
References
[1] D. H. Wolpert, “Stacked Generalization,” Neural Networks, Vol. 5, pp. 241-259, 1992.
[2] R. E. Schapire, “The Strength of Weak Learnability,” Machine Learning, Vol. 5(2), pp. 197-227, 1990.
[3] L. Breiman, “Bagging Predictors,” Machine Learning, Vol. 24, pp. 123-140, 1996.
[4] L. G. Valiant, “A Theory of the Learnable,” Communications of the ACM, Vol. 27(11), pp. 1134-1142, 1984.
[5] Y. Freund and R. E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” Journal of Computer and System Sciences, Vol. 55(1), pp. 119-139, 1997.