This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. You can find the full series of posts at Probability and Statistics 🎲



์ด ๊ธ€์€ โ€œIntroduction to Linear Regressionโ€ ํฌ์ŠคํŠธ์—์„œ ์ œ์‹œํ•œ ์ˆ™์ œ๋“ค์„ ํ’€์ดํ•œ ํฌ์ŠคํŠธ์ž…๋‹ˆ๋‹ค.

Theorem.

The sum of residuals is zero.

\[\sum_{i=1}^n e_i = \sum_{i=1}^n (y_i - \hat{y}_i) = 0\]

proof.

Recall that the least-squares intercept satisfies $b_0 = \bar{y} - b_1 \bar{x}$, so $\hat{y}_i = b_0 + b_1 x_i = \bar{y} + b_1 (x_i - \bar{x})$. Then,

\[\begin{aligned} \sum_{i=1}^n e_i &= \sum_{i=1}^n (y_i - \hat{y}_i) \\ &= \sum_{i=1}^n (y_i - (b_0 + b_1 x_i)) \\ &= \sum_{i=1}^n \left(y_i - (\bar{y} + b_1 (x_i - \bar{x})) \right) \\ &= \sum_{i=1}^n (y_i - \bar{y} - b_1 (x_i - \bar{x})) \\ &= \cancelto{0}{\sum_{i=1}^n (y_i - \bar{y})} - b_1 \cancelto{0}{\sum_{i=1}^n (x_i - \bar{x})} \\ &= 0 \end{aligned}\]

$\blacksquare$
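The identity is easy to confirm numerically. Below is a minimal sketch in Python, assuming NumPy and made-up sample data (neither appears in the original post): it fits the least-squares line by hand and checks that the residuals sum to zero up to floating-point error.

```python
import numpy as np

# Made-up sample data for illustration (not from the post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares estimates: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Residuals e_i = y_i - yhat_i; their sum vanishes up to rounding.
e = y - (b0 + b1 * x)
print(abs(e.sum()) < 1e-9)  # True
```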


Theorem.

The sum of the products of the residuals and the $x_i$ is zero.

\[\sum_{i=1}^n x_i e_i = \sum_{i=1}^n x_i (y_i - \hat{y}_i) = 0\]

proof.

\[\begin{aligned} \sum_{i=1}^n x_i e_i &= \sum_{i=1}^n x_i (y_i - \hat{y}_i) \\ &= \sum_{i=1}^n x_i (y_i - \bar{y} - b_1 (x_i - \bar{x})) \\ &= \sum_{i=1}^n x_i (y_i - \bar{y}) - b_1 \sum_{i=1}^n x_i (x_i - \bar{x}) \\ &= \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) - b_1 \sum_{i=1}^n (x_i - \bar{x}) (x_i - \bar{x}) \\ &= S_{xy} - \frac{S_{xy}}{\cancel{S_{xx}}} \cdot \cancel{S_{xx}} \\ &= 0 \end{aligned}\]

$\blacksquare$
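This second identity can be checked the same way. A minimal Python sketch, again with NumPy and the same made-up data (assumptions, not from the post): after fitting the least-squares line, the residuals are orthogonal to the $x_i$.

```python
import numpy as np

# Made-up sample data for illustration (not from the post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares estimates: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# sum of x_i * e_i vanishes up to rounding: residuals are orthogonal to x.
e = y - (b0 + b1 * x)
print(abs(np.sum(x * e)) < 1e-9)  # True
```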


Theorem.

\[\begin{aligned} \sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2 \\ \text{SST} &= \text{SSR} + \text{SSE} \end{aligned}\]

proof.

(์Šคํฌ) ์ฆ๋ช… ๊ณผ์ •์—์„œ ์œ„์—์„œ ์ฆ๋ช…ํ–ˆ๋˜ ๋‘ ๋ช…์ œ๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค!

\[\begin{aligned} \sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (y_i - \hat{y}_i + \hat{y}_i - \bar{y})^2 \\ &= \sum_{i=1}^n \left((y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\right)^2 \\ &= \sum_{i=1}^n (y_i - \hat{y}_i)^2 + 2 \sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) + \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 \\ \end{aligned}\]

์ด๋•Œ, ์œ„์˜ ์‹์—์„œ ์ค‘๊ฐ„์˜ ํ…€๋งŒ ๋”ฐ๋กœ ๋–ผ์–ด๋ณด์ž. ๊ทธ๋ฆฌ๊ณ  $\hat{y}_i$์— ๋Œ€ํ•œ ์‹์„ ๋Œ€์ž…ํ•˜๋ฉด,

\[\begin{aligned} \sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) &= \sum_{i=1}^n (y_i - \hat{y}_i)(b_0 + b_1 x_i - \bar{y}) \\ &= \sum_{i=1}^n (y_i - \hat{y}_i)((\cancel{\bar{y}} - b_1 \bar{x}) + b_1 x_i - \cancel{\bar{y}}) \\ &= \sum_{i=1}^n (y_i - \hat{y}_i) \cdot b_1 (x_i - \bar{x}) \\ &= b_1 \cdot \left( \cancelto{0}{\sum_{i=1}^n (y_i - \hat{y}_i) x_i} - \bar{x} \cdot \cancelto{0}{\sum_{i=1}^n (y_i - \hat{y}_i)} \right) \\ &= 0 \end{aligned}\]

Since the cross term vanishes by the two theorems above, we are left with $\text{SST} = \text{SSR} + \text{SSE}$.

$\blacksquare$
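The decomposition can also be verified numerically. A minimal Python sketch with NumPy and the same made-up data used as an assumption throughout: it computes SST, SSR, and SSE directly from their definitions and checks the identity.

```python
import numpy as np

# Made-up sample data for illustration (not from the post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares estimates: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# Sum-of-squares decomposition: SST = SSR + SSE.
sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
sse = np.sum((y - y_hat) ** 2)          # error (residual) sum of squares
print(np.isclose(sst, ssr + sse))  # True
```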