2021-1ν•™κΈ°, λŒ€ν•™μ—μ„œ β€˜ν†΅κ³„μ  λ°μ΄ν„°λ§ˆμ΄λ‹β€™ μˆ˜μ—…μ„ λ“£κ³  κ³΅λΆ€ν•œ λ°”λ₯Ό μ •λ¦¬ν•œ κΈ€μž…λ‹ˆλ‹€. 지적은 μ–Έμ œλ‚˜ ν™˜μ˜μž…λ‹ˆλ‹€ :)

2 minute read

2021-1ν•™κΈ°, λŒ€ν•™μ—μ„œ β€˜ν†΅κ³„μ  λ°μ΄ν„°λ§ˆμ΄λ‹β€™ μˆ˜μ—…μ„ λ“£κ³  κ³΅λΆ€ν•œ λ°”λ₯Ό μ •λ¦¬ν•œ κΈ€μž…λ‹ˆλ‹€. 지적은 μ–Έμ œλ‚˜ ν™˜μ˜μž…λ‹ˆλ‹€ :)

Introduction to MARS

<MARS; Multivariate Adaptive Regression Splines> μ•Œκ³ λ¦¬μ¦˜μ€ 이전에 <non-parametric regression>μ—μ„œ <Multi-dimensional Splines>λ₯Ό μ‚΄νŽ΄λ³Ό λ•Œ, high dimensionμ—μ„œμ˜ 문제λ₯Ό ν•΄κ²°ν•˜κΈ° μœ„ν•œ λŒ€μ•ˆ 쀑 ν•˜λ‚˜λ‘œ μ†Œκ°œλ˜μ—ˆλ‹€.

<MARS>λŠ” μ•„λž˜μ™€ 같은 ν˜•νƒœμ˜ λͺ¨λΈμ„ κ΅¬μΆ•ν•œλ‹€.

\[\hat{f}(x) = \sum^k_{i=1} c_i B_i(x)\]

μ΄λ•Œ, $B_i(x)$λŠ” basis function이닀. basis func. $B_i(x)$λŠ” μ•„λž˜μ˜ μ„Έ 가지 ν˜•νƒœ 쀑 ν•˜λ‚˜λ‘œ νŠΉμ •λœλ‹€.

1. an intercept

2. a hinge function πŸ”₯

\[h(x - a) = (x - a) \cdot I(x > a) \quad \text{or} \quad h(a - x) = (a - x) \cdot I(a > x)\]

λ˜λŠ” 쒀더 κ°„λ‹¨ν•˜κ²Œ ν‘œν˜„ν•΄

\[h_+ (x-a) = max(x-a, 0) \quad \text{or} \quad h_- (x-a) = min(x-a, 0)\]

3. a product of two or more hinge functions! πŸ”₯


Model fitting

<MARS> λͺ¨λΈμ„ fitting ν•˜λŠ” 것은 두 가지 과정에 μ˜ν•΄ 이루어진닀.

  1. forward pass
  2. backward pass

Process. Forward pass

Start with null model - just intercept

Adds basis functions in pairs to model
- choose the pair that gives the largest reduction in RSS.
// μ΄λ•Œ, β€œintercept-hinge” 쌍이 μ„ νƒλ˜μ–΄, ν•˜λ‚˜μ˜ basis func.이 λ“€μ–΄κ°€κ²Œ 될 μˆ˜λ„ 있음!

Explores
- existing terms
- all variables
- all values of variables

Coefficients for basis are fitted with linear regression.

Process. Back pass

Terms are removed one-by-one based on β€œgeneralised cross validation; GCV”.


MARS vs. GAM

<MARS> λͺ¨λΈκ³Ό <GAM> λͺ¨λΈμ˜ 차이점은 두 input feature 사이에 β€œinteractionβ€œμ„ κ³ λ €ν•˜λŠ”μ§€ 여뢀이닀.

<MARS>λŠ” μƒˆλ‘œμš΄ basis func.의 pairλ₯Ό μΆ”κ°€ν•˜λ©΄μ„œ, λͺ¨λΈμ„ fitting ν•œλ‹€.

λ°˜λ©΄μ— <GAM>은 λͺ¨λΈμ˜ basis func.이 λͺ¨λ‘ independent ν•˜λ‹€κ³  κ°€μ •ν•˜κ³  λͺ¨λΈμ„ fitting ν•œλ‹€!


references