AdaBoost
These are notes from the graduate course "Statistical Data Mining" (Spring 2021). Corrections are always welcome :)
An overview of <Boosting> is available in the post below 👏
👉 Boosting
AdaBoost
<AdaBoost>, short for "Adaptive Boosting," is the first Boosting model, developed in 1996 by Schapire & Freund, who first devised the <Boosting> algorithm.
Before examining the algorithm, let's first fix the definitions.
[Data]
- p-dimensional input: $x_i \in \mathbb{R}^p$, $i = 1, \dots, N$
- Binary output: $y_i \in \{-1, 1\}$ // encoded as 1, -1

[Weak Classifier]
$G_m(x) \in \{-1, 1\}$ for $m = 1, \dots, M$

[Prediction]
Predictions are based on a weighted majority vote!

$$G(x) = \operatorname{sign} \left( \sum_{m=1}^{M} \alpha_m G_m(x) \right)$$

where $\alpha_m$ is the weight of the $m$-th weak classifier $G_m$, computed by the boosting algorithm.
Algorithm. AdaBoost

1. Initialize the weights for data points: $w_i = 1/N$ for $i = 1, \dots, N$
// training!
2. For $m = 1, \dots, M$:
├─ Obtain $G_m(x)$ by fitting a weak classifier to the training data with weights $w_i$
├─ Compute error of $G_m$: $\mathrm{err}_m = \dfrac{\sum_{i=1}^{N} w_i \, I(y_i \ne G_m(x_i))}{\sum_{i=1}^{N} w_i}$
├─ Compute weights for $G_m$: $\alpha_m = \log \dfrac{1 - \mathrm{err}_m}{\mathrm{err}_m}$
└─ Update weights for data points: $w_i \leftarrow w_i \cdot \exp \left( \alpha_m \, I(y_i \ne G_m(x_i)) \right)$ // up-weight the misclassified points
3. Output $G(x) = \operatorname{sign} \left( \sum_{m=1}^{M} \alpha_m G_m(x) \right)$
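The steps above can be sketched in Python. This is a minimal sketch, not the lecture's code: the decision-stump weak learner and all function names are my own illustrative choices.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weak classifier: best weighted decision stump over all features/thresholds."""
    best_err, best_stump = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):                      # polarity of the split
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = np.sum(w * (pred != y))
                if err < best_err:
                    best_err, best_stump = err, (j, t, s)
    return best_stump

def stump_predict(X, stump):
    j, t, s = stump
    return s * np.where(X[:, j] <= t, 1, -1)

def adaboost_fit(X, y, M=10):
    """AdaBoost with labels y encoded in {-1, +1}."""
    N = len(y)
    w = np.full(N, 1.0 / N)                        # (1) initialize weights
    model = []
    for _ in range(M):
        stump = fit_stump(X, y, w)                 # (2a) fit G_m with weights w
        miss = (stump_predict(X, stump) != y).astype(float)
        err = np.sum(w * miss) / np.sum(w)         # (2b) weighted error err_m
        err = np.clip(err, 1e-12, 1 - 1e-12)       # guard against log(0)
        alpha = np.log((1 - err) / err)            # (2c) classifier weight alpha_m
        w = w * np.exp(alpha * miss)               # (2d) up-weight misclassified points
        model.append((alpha, stump))
    return model

def adaboost_predict(X, model):
    """(3) weighted majority vote: sign(sum_m alpha_m * G_m(x))."""
    agg = sum(alpha * stump_predict(X, stump) for alpha, stump in model)
    return np.sign(agg)
```

On a toy separable dataset, a handful of rounds is enough for the weighted vote to classify the training points correctly.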
Boosting as Additive Expansion
The result of the <AdaBoost> algorithm,

$$G(x) = \operatorname{sign} \left( \sum_{m=1}^{M} \alpha_m G_m(x) \right)$$

can also be understood as an additive/basis expansion model.
Algorithm. general additive function expansion

$$f(x) = \sum_{m=1}^{M} \beta_m \, b(x; \gamma_m)$$

where $\beta_m$ is an expansion coefficient and $b(x; \gamma_m)$ is a basis function with parameter $\gamma_m$.
That is, ERM would fit all the parameters jointly by solving

$$\min_{\{\beta_m, \gamma_m\}_{1}^{M}} \; \sum_{i=1}^{N} L \left( y_i, \; \sum_{m=1}^{M} \beta_m \, b(x_i; \gamma_m) \right)$$

However, fitting an additive model by ERM (Empirical Risk Minimization) in this joint way is a hard problem; in most cases it is computationally infeasible.
Two tricks resolve this computational problem:
- Update the expansion sequentially
- Consider only a single basis function at each step
This kind of approach is commonly called the <Forward Stagewise Update>!
To put it simply: fit $\beta_1 b(x; \gamma_1)$ and keep it fixed; then fit $\beta_2 b(x; \gamma_2)$ and keep it fixed; then fit $\beta_3 b(x; \gamma_3)$, and so on.
Algorithm. Forward Stagewise Update 🔥

Suppose that the current expansion is

$$f_{m-1}(x) = \sum_{j=1}^{m-1} \beta_j \, b(x; \gamma_j)$$

then, the next update can be achieved by solving

$$(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L \left( y_i, \; f_{m-1}(x_i) + \beta \, b(x_i; \gamma) \right)$$

and setting $f_m(x) = f_{m-1}(x) + \beta_m \, b(x; \gamma_m)$.
That is, the <forward stagewise update> adds a new basis function without modifying the previously fitted solution.
Example. Forward Stagewise Update with Squared Error Loss

Consider the squared error loss:

$$L(y, f(x)) = (y - f(x))^2$$

then, the loss in forward stagewise is

$$L \left( y_i, \; f_{m-1}(x_i) + \beta \, b(x_i; \gamma) \right) = \left( y_i - f_{m-1}(x_i) - \beta \, b(x_i; \gamma) \right)^2 = \left( r_{im} - \beta \, b(x_i; \gamma) \right)^2$$

In other words, with the current residual $r_{im} = y_i - f_{m-1}(x_i)$, each step simply fits the new basis function to the residuals of the current model.
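This residual-fitting view is easy to demonstrate on a toy 1-D regression problem. Below is a small sketch; the regression-stump basis function and all names are illustrative assumptions, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * X) + rng.normal(scale=0.1, size=100)

def fit_residual_stump(X, r):
    """Fit one basis function b(x; gamma) -- here a regression stump --
    to the current residuals r by minimizing squared error."""
    best_sse, best = np.inf, None
    for t in X[:-1]:                               # candidate thresholds
        left, right = r[X <= t], r[X > t]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best_sse:
            best_sse, best = sse, (t, left.mean(), right.mean())
    return best

f = np.zeros_like(y)                               # f_0(x) = 0
for m in range(20):
    r = y - f                                      # current residual r_im
    t, c_left, c_right = fit_residual_stump(X, r)
    f = f + np.where(X <= t, c_left, c_right)      # f_m = f_{m-1} + beta * b(x; gamma)

mse = np.mean((y - f) ** 2)                        # shrinks as stages are added
```

Each stage only looks at what the previous stages failed to explain, which is exactly the sequential, single-basis-function trick described above.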
Example. Forward Stagewise Update with Exponential Loss 🔥

Consider the exponential loss:

$$L(y, f(x)) = \exp(-y \, f(x))$$

Surprisingly, <AdaBoost> is then equivalent to Forward Stagewise Updates for the ERM with Exponential Loss!! 😲

This proposition offers a completely new perspective on <Boosting>!
Note that the objective at step $m$ becomes

$$(\beta_m, G_m) = \arg\min_{\beta, G} \sum_{i=1}^{N} \exp \left( -y_i \left( f_{m-1}(x_i) + \beta \, G(x_i) \right) \right) = \arg\min_{\beta, G} \sum_{i=1}^{N} w_i^{(m)} \exp \left( -\beta \, y_i \, G(x_i) \right)$$

where $w_i^{(m)} = \exp(-y_i f_{m-1}(x_i))$ does not depend on $(\beta, G)$.

It can be shown that

$$G_m = \arg\min_{G} \sum_{i=1}^{N} w_i^{(m)} I(y_i \ne G(x_i)) \quad \text{and} \quad \beta_m = \frac{1}{2} \log \frac{1 - \mathrm{err}_m}{\mathrm{err}_m}$$

where

$$\mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i^{(m)} I(y_i \ne G_m(x_i))}{\sum_{i=1}^{N} w_i^{(m)}}$$

Then, the update is

$$f_m(x) = f_{m-1}(x) + \beta_m G_m(x)$$

and leads the weights for the new iteration to be

$$w_i^{(m+1)} = w_i^{(m)} \exp \left( -\beta_m \, y_i \, G_m(x_i) \right) = w_i^{(m)} \exp \left( \alpha_m \, I(y_i \ne G_m(x_i)) \right) \cdot e^{-\beta_m}$$

where $\alpha_m = 2 \beta_m$ is exactly the classifier weight of AdaBoost, using $-y_i G_m(x_i) = 2 I(y_i \ne G_m(x_i)) - 1$; the factor $e^{-\beta_m}$ is common to all $i$ and has no effect.
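The weight identities can be checked numerically. The snippet below is a toy check with arbitrary numbers (not a fit): it verifies that the iterative update $w_i \leftarrow w_i \exp(-\beta_m y_i G_m(x_i))$ reproduces the closed form $w_i^{(m+1)} = \exp(-y_i f_m(x_i))$, and that each update factor equals $\exp(\alpha_m I(y_i \ne G_m(x_i))) \, e^{-\beta_m}$ with $\alpha_m = 2\beta_m$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 4
y = rng.choice([-1, 1], size=N)                    # labels
G = rng.choice([-1, 1], size=(M, N))               # G[m, i] = G_m(x_i), arbitrary here
beta = rng.uniform(0.1, 1.0, size=M)               # arbitrary coefficients beta_m

w = np.ones(N)                                     # w_i^{(1)} = exp(-y_i f_0) = 1
f = np.zeros(N)                                    # f_0(x_i) = 0
for m in range(M):
    f = f + beta[m] * G[m]                         # f_m = f_{m-1} + beta_m G_m
    w = w * np.exp(-beta[m] * y * G[m])            # iterative AdaBoost-style update

# closed form: w_i^{(M+1)} = exp(-y_i f_M(x_i))
assert np.allclose(w, np.exp(-y * f))

# per-step factor: exp(-beta y G) = exp(2 beta I(y != G)) * exp(-beta)
miss = (G[0] != y).astype(float)
assert np.allclose(np.exp(-beta[0] * y * G[0]),
                   np.exp(2 * beta[0] * miss) * np.exp(-beta[0]))
```

Both checks are exact identities, so they pass for any choice of labels, classifier outputs, and coefficients.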
Q. Why Exponential Loss?

Population minimizer of the exponential loss:

$$f^*(x) = \arg\min_{f(x)} E_{Y \mid x} \left[ e^{-Y f(x)} \right] = \frac{1}{2} \log \frac{P(Y = 1 \mid x)}{P(Y = -1 \mid x)}$$

equivalently,

$$P(Y = 1 \mid x) = \frac{1}{1 + e^{-2 f^*(x)}}$$

Derivation

It is true that

$$E_{Y \mid x} \left[ e^{-Y f(x)} \right] = P(Y = 1 \mid x) \, e^{-f(x)} + P(Y = -1 \mid x) \, e^{f(x)}$$

Setting the derivative with respect to $f(x)$ to zero, $-P(Y = 1 \mid x) \, e^{-f(x)} + P(Y = -1 \mid x) \, e^{f(x)} = 0$, gives the minimizer above. So AdaBoost estimates one-half the log-odds of $P(Y = 1 \mid x)$, which justifies using its sign as the classification rule.
The one thing to remember about <AdaBoost>: what <AdaBoost> implements is, in fact, "Forward Stagewise Updates for the ERM with Exponential Loss"!! 🤔
That said, the later GBM family performs better, so AdaBoost is said to be rarely used these days 🥲