This post summarizes what I studied in the "Statistical Data Mining" course at my university in the first semester of 2021. Corrections are always welcome :)

This post continues from the Regression Spline post 😊

Non-parametric Logistic Regression

The ordinary <Binary Logistic Regression> models the log-odds as follows.

\[\log \frac{P(Y = 1 \mid X=x)}{P(Y = 0 \mid X=x)} = \beta^T x\]

Solving the equation above for the probability gives the following.

\[P(Y = 1 \mid X = x) = \frac{e^{\beta^T x}}{1 + e^{\beta^T x}}\]
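As a quick sanity check on the two formulas above, here is a minimal sketch showing that the log-odds map and the $e^{z} / (1 + e^{z})$ map are inverses of each other. The coefficient vector `beta` and the input `x` are made-up values for illustration only; they do not come from the course.

```python
import math

def sigmoid(z: float) -> float:
    """Map a log-odds value z to the probability e^z / (1 + e^z)."""
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_odds(p: float) -> float:
    """Inverse map: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and input, chosen only for illustration.
beta = [0.5, -1.0]
x = [2.0, 0.5]

z = sum(b * xi for b, xi in zip(beta, x))  # beta^T x
p = sigmoid(z)                             # P(Y = 1 | X = x)
recovered = log_odds(p)                    # should match z up to rounding
```

Taking the log-odds of `p` gives back `z`, which is exactly the statement that the two equations describe the same model.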

<Non-parametric (binary) logistic regression> replaces $\beta^T x$ in the equation above with $f(x)$!

\[P(Y = 1 \mid X = x) = \frac{e^{f(x)}}{1 + e^{f(x)}}\]

Here, $f(x)$ is unknown, and it is the object we have to estimate!

In the lecture, $f(\cdot)$ is estimated by maximizing the following "penalized log-likelihood function".

\[\ell_\lambda (f) = \sum^n_{i=1} \left[ y_i f(x_i) - \log (1 + e^{f(x_i)}) \right] - \frac{\lambda}{2} \int \left\{ f''(t) \right\}^2 \; dt\]

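Rather than overthinking it, just note that this has a form similar to the <smoothing spline>: the first term is the logistic log-likelihood (the data fit), and the second term is the same roughness penalty on $f''$!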
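To make the objective concrete, here is a rough sketch that only evaluates $\ell_\lambda(f)$; it does not maximize it. It assumes a simplification that is not from the lecture: $f$ is represented by its values on an equally spaced grid with spacing `h`, one observation per grid point, and the integral of $\{f''(t)\}^2$ is approximated with second finite differences. The function names are hypothetical.

```python
import math

def log1pexp(z: float) -> float:
    """Numerically stable log(1 + e^z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def penalized_loglik(f_vals, y, h, lam):
    """Evaluate the penalized log-likelihood on an equally spaced grid.

    f_vals[i] plays the role of f(x_i), y[i] is the 0/1 label at x_i,
    h is the grid spacing, lam is the smoothing parameter lambda.
    """
    # Data-fit term: sum_i [ y_i f(x_i) - log(1 + e^{f(x_i)}) ]
    loglik = sum(yi * fi - log1pexp(fi) for yi, fi in zip(y, f_vals))

    # Assumption: approximate f'' by second finite differences,
    # and the integral by a Riemann sum with step h.
    second = [
        (f_vals[i - 1] - 2.0 * f_vals[i] + f_vals[i + 1]) / h ** 2
        for i in range(1, len(f_vals) - 1)
    ]
    roughness = sum(s * s for s in second) * h

    return loglik - 0.5 * lam * roughness

# Toy data, invented for illustration.
y = [0, 0, 1, 1]
linear_f = [0.0, 0.5, 1.0, 1.5]   # a straight line: no curvature
wiggly_f = [0.0, 2.0, -1.0, 1.5]  # strongly curved

no_penalty_for_line = (
    penalized_loglik(linear_f, y, 1.0, 10.0)
    - penalized_loglik(linear_f, y, 1.0, 0.0)
)
penalty_for_wiggle = (
    penalized_loglik(wiggly_f, y, 1.0, 0.0)
    - penalized_loglik(wiggly_f, y, 1.0, 1.0)
)
```

A straight line pays no penalty regardless of $\lambda$, while a wiggly candidate loses objective value as $\lambda$ grows, which is the same trade-off as in the smoothing spline.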


Multi-dimensional Splines

The <Spline Methods> covered so far were all 1-dimensional spline models. In many cases, however, we need a multi-dimensional approach that accounts for interactions between features.

A <Multi-dimensional Spline> is modeled as follows (shown here for two inputs, $X = (X_1, X_2)$).

\[f(X) = \sum^{M_1}_{i=1} \sum^{M_2}_{j=1} \; \theta_{ij} \cdot g_{ij} (X)\]

where $g_{ij}(X)$ is the tensor product of the one-dimensional basis functions, defined by

\[g_{ij}(X) = h_{1i} (X_1) \cdot h_{2j} (X_2)\]

In other words, a "multi-dimensional spline" uses the products of two one-dimensional spline basis functions as its basis functions!
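The tensor-product construction can be sketched in a few lines. The one-dimensional basis used here (a constant, a linear term, and one hinge $(x - k)_+$ per knot) is only an assumed stand-in for $h_{1i}$ and $h_{2j}$, and the knots and coefficients are made up; any spline basis could be plugged in the same way.

```python
def linear_spline_basis(x, knots):
    """Hypothetical 1-D basis: 1, x, and one hinge (x - k)_+ per knot."""
    return [1.0, x] + [max(x - k, 0.0) for k in knots]

def tensor_basis(x1, x2, knots1, knots2):
    """Return the M1 x M2 table g[i][j] = h1_i(x1) * h2_j(x2)."""
    h1 = linear_spline_basis(x1, knots1)
    h2 = linear_spline_basis(x2, knots2)
    return [[a * b for b in h2] for a in h1]

def spline_2d(x1, x2, theta, knots1, knots2):
    """f(X) = sum_i sum_j theta[i][j] * g_ij(X)."""
    g = tensor_basis(x1, x2, knots1, knots2)
    return sum(
        theta[i][j] * g[i][j]
        for i in range(len(g))
        for j in range(len(g[0]))
    )

# Made-up knots: M1 = 3 basis functions for X1, M2 = 4 for X2.
knots1 = [1.0]
knots2 = [1.0, 2.5]
g = tensor_basis(2.0, 3.0, knots1, knots2)

# With every coefficient set to 1, f(X) is just the sum of all g_ij(X).
theta = [[1.0] * 4 for _ in range(3)]
value = spline_2d(2.0, 3.0, theta, knots1, knots2)
```

Note that the coefficient table `theta` has $M_1 \times M_2$ entries, one per pair of one-dimensional basis functions.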

With this approach, the model generalizes easily from 2 dimensions to $d$ dimensions.

However, as the number of input variables $d$ grows, the number of basis functions that a multi-dimensional model requires grows exponentially. This brings not only a heavy computational cost but also problems such as the curse of dimensionality.
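To get a feel for that growth, here is a small count. It assumes each input variable gets its own one-dimensional basis, so the tensor-product basis has one function per combination, i.e. the product of the one-dimensional basis sizes. The size 7 per dimension is an arbitrary choice for illustration.

```python
import math

def n_tensor_basis(sizes):
    """Number of tensor-product basis functions for 1-D bases of the given sizes."""
    return math.prod(sizes)

# Assumed: 7 one-dimensional basis functions per input variable.
per_dim = 7
counts = {d: n_tensor_basis([per_dim] * d) for d in (1, 2, 3, 5, 10)}

for d, n in counts.items():
    print(f"d = {d:2d}: {n} basis functions")
```

Each of these basis functions needs its own coefficient, so the number of parameters to estimate quickly outgrows any realistic sample size.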

As an alternative that addresses these problems, <MARS; Multivariate Adaptive Regression Splines> was proposed in 1991.

The <Additive Model>, which is covered toward the end of the course, is another alternative that addresses these problems of multi-dimensional models.


The next post looks at KNN-based non-parametric methods.

👉 KNN & kernel method