MARS
These are my study notes from the “Statistical Data Mining” course I took at university in the first semester of 2021. Corrections are always welcome :)
Introduction to MARS
The <MARS; Multivariate Adaptive Regression Splines> algorithm came up earlier, when we looked at <Multi-dimensional Splines> under <non-parametric regression>, as one of the alternatives for dealing with the problems that arise in high dimensions.
<MARS> builds a model of the following form.
\[\hat{f}(x) = \sum^k_{i=1} c_i B_i(x)\]
Here, $B_i(x)$ is a basis function. Each basis function $B_i(x)$ takes one of the following three forms.
1. an intercept
2. a hinge function 🔥
\[h(x - a) = (x - a) \cdot I(x > a) \quad \text{or} \quad h(a - x) = (a - x) \cdot I(a > x)\]
or, written more compactly,
\[h(x - a) = \max(x - a, 0) \quad \text{or} \quad h(a - x) = \max(a - x, 0)\]
3. a product of two or more hinge functions! 🔥
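As a minimal sketch of the three basis-function forms above, the hinge pair and a product of hinges can be written with NumPy. The helper names `hinge_pos` and `hinge_neg` and the sample knots are illustrative assumptions, not something taken from the course material.

```python
import numpy as np

def hinge_pos(x, a):
    """h(x - a) = max(x - a, 0): zero to the left of the knot a, linear to the right."""
    return np.maximum(x - a, 0.0)

def hinge_neg(x, a):
    """h(a - x) = max(a - x, 0): linear to the left of the knot a, zero to the right."""
    return np.maximum(a - x, 0.0)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
a = 2.0  # hypothetical knot
print(hinge_pos(x, a))  # values 0, 0, 0, 1, 2
print(hinge_neg(x, a))  # values 2, 1, 0, 0, 0

# Third form: a product of hinges on two different variables.
# It is nonzero only where BOTH hinges are active (x1 > 1 and x2 > 2 here).
x1 = np.array([0.0, 3.0, 3.0])
x2 = np.array([5.0, 0.0, 5.0])
interaction = hinge_pos(x1, 1.0) * hinge_pos(x2, 2.0)
print(interaction)  # values 0, 0, 6
```

Both members of a hinge pair are non-negative and share the same knot, so together they let the fitted slope differ on the two sides of $a$.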
Model fitting
Fitting a <MARS> model takes place in two stages.
- forward pass
- backward pass
Process. Forward pass
Start with the null model: just the intercept.
Add basis functions to the model in pairs.
- At each step, choose the pair that gives the largest reduction in RSS.
// Note: an “intercept-hinge” pair may be selected here, in which case only a single basis function ends up entering the model!
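Each search for the best pair explores
- all existing terms (as the term the new hinge pair is multiplied by)
- all variables
- all observed values of each variable (as the knot $a$)
The coefficients of the basis functions are fitted with linear regression.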
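The forward pass described above can be sketched as a brute-force greedy search. This is a simplified illustration under stated assumptions, not the reference algorithm: it refits ordinary least squares for every candidate, it does not stop a variable from appearing twice in one product, and the stopping rule (`max_terms` and `tol`) as well as all function names are hypothetical choices made for this example.

```python
import numpy as np

def hinge(z):
    """Hinge function max(z, 0), applied elementwise."""
    return np.maximum(z, 0.0)

def fit_rss(columns, y):
    """Least-squares fit on the given basis columns; returns (coefficients, RSS)."""
    B = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return coef, float(resid @ resid)

def forward_pass(X, y, max_terms=5, tol=1e-8):
    n, p = X.shape
    columns = [np.ones(n)]   # null model: just the intercept
    labels = ["intercept"]
    _, rss = fit_rss(columns, y)
    while len(columns) + 2 <= max_terms:
        best = None
        for m in range(len(columns)):             # existing terms
            for j in range(p):                    # all variables
                for a in np.unique(X[:, j]):      # all observed values as knots
                    pair = [columns[m] * hinge(X[:, j] - a),
                            columns[m] * hinge(a - X[:, j])]
                    _, cand_rss = fit_rss(columns + pair, y)
                    if best is None or cand_rss < best[0]:
                        best = (cand_rss, m, j, float(a), pair)
        # Stop when no pair reduces the RSS by more than tol (assumed rule).
        if best is None or rss - best[0] <= tol:
            break
        rss, m, j, a, pair = best
        columns += pair
        labels += [f"({labels[m]}) * h(x{j} - {a:g})",
                   f"({labels[m]}) * h({a:g} - x{j})"]
    coef, rss = fit_rss(columns, y)   # final coefficients by linear regression
    return labels, coef, rss

# Toy data: a single hinge with its knot at 0.5.
X = np.linspace(0.0, 1.0, 21).reshape(-1, 1)
y = 2.0 * hinge(X[:, 0] - 0.5)
labels, coef, rss = forward_pass(X, y)
print(labels)
```

When the parent term is the intercept, the added pair is just the two plain hinges on one variable; when the parent is an existing hinge, the pair consists of products, which is how interaction terms enter the model.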
Process. Backward pass
Terms are removed one-by-one based on “generalised cross validation; GCV”.
Image from “Jonathan Tuke”
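The pruning step can be sketched as follows: repeatedly drop the term whose removal increases the RSS the least, score each model size with GCV, and keep the best-scoring subset. The effective number of parameters used here, `n_terms + penalty * (n_terms - 1) / 2` with `penalty = 3`, is an assumption borrowed from a common software convention. The notes above do not specify it, and the function names are hypothetical.

```python
import numpy as np

def rss_of(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)

def gcv(rss, n, n_terms, penalty=3.0):
    """GCV = RSS / (n * (1 - C/n)^2), with C an ASSUMED effective parameter count."""
    eff = n_terms + penalty * (n_terms - 1) / 2.0
    if eff >= n:
        return float("inf")
    return rss / (n * (1.0 - eff / n) ** 2)

def backward_pass(B, y, penalty=3.0):
    n = B.shape[0]
    active = list(range(B.shape[1]))   # column 0 is the intercept, never removed
    best_active = list(active)
    best_gcv = gcv(rss_of(B[:, active], y), n, len(active), penalty)
    while len(active) > 1:
        # Try deleting each non-intercept term; keep the deletion with the lowest RSS.
        candidates = [(rss_of(B[:, [k for k in active if k != d]], y), d)
                      for d in active[1:]]
        rss, drop = min(candidates)
        active.remove(drop)
        score = gcv(rss, n, len(active), penalty)
        if score < best_gcv:           # remember the best model size seen so far
            best_gcv, best_active = score, list(active)
    return best_active, best_gcv

# Toy basis matrix: intercept, one useful hinge, and a deliberately redundant copy.
x = np.linspace(0.0, 1.0, 50)
h = np.maximum(x - 0.5, 0.0)
B = np.column_stack([np.ones_like(x), h, h])
y = 1.0 + 2.0 * h + 0.1 * np.sin(37.0 * x)   # deterministic wiggle as "noise"
kept, score = backward_pass(B, y)
print(kept, score)
```

Because GCV inflates the RSS by a factor that grows with the number of terms, a term that barely lowers the RSS makes the score worse and is pruned.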
MARS vs. GAM
The difference between a <MARS> model and a <GAM> model is whether the “interaction” between two input features is taken into account.
<MARS> fits the model by adding new pairs of basis functions, which may be products with terms already in the model.
<GAM>, on the other hand, fits the model under the assumption that all of its basis functions are independent!
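To make the difference concrete, here is a small sketch comparing a purely additive fit with one that also contains a product of hinges. The additive hinge basis is only a stand-in for a <GAM> (a real GAM would use smoothers), and the grid, knots, and names are assumptions chosen for illustration.

```python
import numpy as np

def hinge(z):
    """Hinge function max(z, 0), applied elementwise."""
    return np.maximum(z, 0.0)

def rss_of(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)

# Grid of two features; the response is a pure interaction of two hinges.
g = np.linspace(0.0, 1.0, 11)
x1, x2 = (m.ravel() for m in np.meshgrid(g, g))
y = hinge(x1 - 0.5) * hinge(x2 - 0.5)

# Additive basis: each column depends on one feature only.
additive = [np.ones_like(x1),
            hinge(x1 - 0.5), hinge(0.5 - x1),
            hinge(x2 - 0.5), hinge(0.5 - x2)]
rss_additive = rss_of(np.column_stack(additive), y)

# MARS-style basis: the same columns plus one product of hinges.
with_product = additive + [hinge(x1 - 0.5) * hinge(x2 - 0.5)]
rss_interaction = rss_of(np.column_stack(with_product), y)

print(rss_additive, rss_interaction)
```

The additive fit leaves a clearly nonzero residual because no sum of single-feature functions can reproduce the product, while the basis with the product term fits this toy response essentially exactly.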