β€œν™•λ₯ κ³Ό 톡계(MATH230)” μˆ˜μ—…μ—μ„œ 배운 것과 κ³΅λΆ€ν•œ 것을 μ •λ¦¬ν•œ ν¬μŠ€νŠΈμž…λ‹ˆλ‹€. 전체 ν¬μŠ€νŠΈλŠ” Probability and Statisticsμ—μ„œ ν™•μΈν•˜μ‹€ 수 μžˆμŠ΅λ‹ˆλ‹€ 🎲


β€œν™•λ₯ κ³Ό 톡계(MATH230)” μˆ˜μ—…μ—μ„œ 배운 것과 κ³΅λΆ€ν•œ 것을 μ •λ¦¬ν•œ ν¬μŠ€νŠΈμž…λ‹ˆλ‹€. 전체 ν¬μŠ€νŠΈλŠ” Probability and Statisticsμ—μ„œ ν™•μΈν•˜μ‹€ 수 μžˆμŠ΅λ‹ˆλ‹€ 🎲

이 글은 β€œIntroduction to Linear Regression” ν¬μŠ€νŠΈμ—μ„œ μ œμ‹œν•œ μˆ™μ œλ“€μ„ ν’€μ΄ν•œ ν¬μŠ€νŠΈμž…λ‹ˆλ‹€.

Theorem.

The sum of residuals is zero.

\[\sum_{i=1}^n e_i = \sum_{i=1}^n (y_i - \hat{y}_i) = 0\]

proof.

\[\begin{aligned} \sum_{i=1}^n e_i &= \sum_{i=1}^n (y_i - \hat{y}_i) \\ &= \sum_{i=1}^n (y_i - (b_0 + b_1 x_i)) \\ &= \sum_{i=1}^n \left(y_i - (\bar{y} + b_1 (x_i - \bar{x})) \right) \\ &= \sum_{i=1}^n (y_i - \bar{y} - b_1 (x_i - \bar{x})) \\ &= \cancelto{0}{\sum_{i=1}^n (y_i - \bar{y})} - b_1 \cancelto{0}{\sum_{i=1}^n (x_i - \bar{x})} \\ &= 0 \end{aligned}\]

$\blacksquare$
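As a quick numerical sanity check (not part of the proof), we can fit a least-squares line to a small made-up dataset in plain Python and confirm that the residuals sum to zero. The data and variable names below are illustrative; `b1` and `b0` are the closed-form estimates $b_1 = S_{xy}/S_{xx}$ and $b_0 = \bar{y} - b_1 \bar{x}$.

```python
# Fit y = b0 + b1*x by least squares on toy data, then check sum(e_i) = 0.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# S_xy and S_xx as defined in the regression post.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)

b1 = s_xy / s_xx          # slope estimate
b0 = y_bar - b1 * x_bar   # intercept estimate

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(abs(sum(residuals)) < 1e-9)  # True, up to floating-point error
```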


Theorem.

The sum of the products of the residuals $e_i$ and the $x_i$ is zero.

\[\sum_{i=1}^n x_i e_i = \sum_{i=1}^n x_i (y_i - \hat{y}_i) = 0\]

proof.

\[\begin{aligned} \sum_{i=1}^n x_i e_i &= \sum_{i=1}^n x_i (y_i - \hat{y}_i) \\ &= \sum_{i=1}^n x_i (y_i - \bar{y} - b_1 (x_i - \bar{x})) \\ &= \sum_{i=1}^n x_i (y_i - \bar{y}) - b_1 \sum_{i=1}^n x_i (x_i - \bar{x}) \\ &= \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) - b_1 \sum_{i=1}^n (x_i - \bar{x}) (x_i - \bar{x}) \\ &= S_{xy} - \frac{S_{xy}}{\cancel{S_{xx}}} \cdot \cancel{S_{xx}} \\ &= 0 \end{aligned}\]

$\blacksquare$
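This identity can be checked numerically the same way (again a sketch on toy data, not part of the proof): weight each residual by its $x_i$ and the sum still collapses to zero.

```python
# Verify sum(x_i * e_i) = 0 for the least-squares fit on toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
weighted_sum = sum(x * e for x, e in zip(xs, residuals))
print(abs(weighted_sum) < 1e-9)  # True, up to floating-point error
```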


Theorem.

\[\begin{aligned} \sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2 \\ \text{SST} &= \text{SSR} + \text{SSE} \end{aligned}\]

proof.

(Spoiler) The proof uses the two propositions we just proved above!

\[\begin{aligned} \sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (y_i - \hat{y}_i + \hat{y}_i - \bar{y})^2 \\ &= \sum_{i=1}^n \left((y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\right)^2 \\ &= \sum_{i=1}^n (y_i - \hat{y}_i)^2 + 2 \sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) + \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 \\ \end{aligned}\]

μ΄λ•Œ, μœ„μ˜ μ‹μ—μ„œ μ€‘κ°„μ˜ ν…€λ§Œ λ”°λ‘œ λ–Όμ–΄λ³΄μž. 그리고 $\hat{y}_i$에 λŒ€ν•œ 식을 λŒ€μž…ν•˜λ©΄,

\[\begin{aligned} \sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) &= \sum_{i=1}^n (y_i - \hat{y}_i)(b_0 + b_1 x_i - \bar{y}) \\ &= \sum_{i=1}^n (y_i - \hat{y}_i)((\cancel{\bar{y}} - b_1 \bar{x}) + b_1 x_i - \cancel{\bar{y}}) \\ &= \sum_{i=1}^n (y_i - \hat{y}_i) \cdot b_1 (x_i - \bar{x}) \\ &= b_1 \cdot \left( \cancelto{0}{\sum_{i=1}^n (y_i - \hat{y}_i) x_i} - \bar{x} \cdot \cancelto{0}{\sum_{i=1}^n (y_i - \hat{y}_i)} \right) \\ &= 0 \end{aligned}\]

Since the cross term vanishes, only the two squared sums remain, giving

\[\sum_{i=1}^n (y_i - \bar{y})^2 = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2\]

that is, $\text{SST} = \text{SSR} + \text{SSE}$.

$\blacksquare$
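The decomposition can also be confirmed numerically (a sketch on the same kind of toy data as above): computing SST, SSR, and SSE separately, the first equals the sum of the other two up to floating-point error.

```python
# Check the decomposition SST = SSR + SSE on toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * x for x in xs]  # fitted values

sst = sum((y - y_bar) ** 2 for y in ys)                  # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # regression
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))     # error
print(abs(sst - (ssr + sse)) < 1e-9)  # True, up to floating-point error
```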