Linear Regression - 1-2
These are my study notes from the "Statistical Data Mining" course I took at university in the first semester of 2021. Corrections are always welcome :)
Motivation.
Given an estimator, how can we tell which features of the input vector $\mathbf{x}$ affect the response vector $\mathbf{y}$? A simple idea is to look at the estimate $\hat{\beta}$ and judge whether each coefficient $\hat{\beta}_i$ is meaningfully different from 0. This task of deciding whether a feature does or does not affect the outcome is called <statistical inference>.
The assumption below is the classical one made when carrying out <statistical inference>.
Assumption. Classical Assumption
Assume that the true distribution of the data is
\[Y = X^T \beta + \epsilon, \quad \epsilon \sim N(0, \sigma^2)\]
Equivalently,
\[(Y \mid X = x) \sim N(x^T \beta, \; \sigma^2)\]
If this assumption holds, the following properties can be proved.
Property.
Suppose that the classical assumption holds. Then,
\[\hat{\beta} \sim N(\beta, \; (\mathbf{X}^T \mathbf{X})^{-1} \sigma^2)\]
Moreover, with a suitable scaling of $\hat{\sigma}^2$,
\[\frac{(n-p) \hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}\]
Finally, $\hat{\beta}$ and $\hat{\sigma}^2$ are independent of each other.
\[\hat{\beta} \perp \hat{\sigma}^2\]
I will expand on this part later.
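Until the proofs are filled in, the three properties can at least be sanity-checked numerically. The sketch below is a rough Monte Carlo check, not a proof. The sample size `n`, the number of coefficients `p`, the true `beta`, `sigma`, and the number of replications are arbitrary values chosen for illustration. It also assumes $\hat{\sigma}^2 = \mathrm{RSS}/(n-p)$ and a fixed design matrix $\mathbf{X}$, neither of which is stated explicitly above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation settings (illustrative, not from the course notes).
n, p = 50, 3
sigma = 2.0
beta = np.array([1.0, 0.0, -2.0])  # the second feature has no effect
n_rep = 20_000

# Fixed design matrix: the properties are checked conditionally on X.
X = rng.normal(size=(n, p))
XtX_inv = np.linalg.inv(X.T @ X)

# Draw all noise vectors at once; each row of Y is one simulated dataset.
eps = rng.normal(scale=sigma, size=(n_rep, n))
Y = X @ beta + eps  # shape (n_rep, n)

# OLS estimate per replication: beta_hat = (X^T X)^{-1} X^T y.
beta_hats = Y @ X @ XtX_inv  # shape (n_rep, p)

# Assumed variance estimator: sigma2_hat = RSS / (n - p).
resid = Y - beta_hats @ X.T
sigma2_hats = (resid ** 2).sum(axis=1) / (n - p)

# Property 1: beta_hat ~ N(beta, (X^T X)^{-1} sigma^2).
emp_mean = beta_hats.mean(axis=0)
emp_cov = np.cov(beta_hats, rowvar=False)
theory_cov = XtX_inv * sigma ** 2

# Property 2: (n - p) * sigma2_hat / sigma^2 ~ chi^2_{n-p},
# which has mean n - p and variance 2 (n - p).
scaled = (n - p) * sigma2_hats / sigma ** 2

# Property 3: independence implies zero correlation
# (a necessary condition only, so this is a weak check).
corrs = np.array(
    [np.corrcoef(beta_hats[:, j], sigma2_hats)[0, 1] for j in range(p)]
)

print("mean of beta_hat:", emp_mean)
print("max |emp_cov - theory_cov|:", np.abs(emp_cov - theory_cov).max())
print("mean, var of scaled sigma2_hat:", scaled.mean(), scaled.var())
print("corr(beta_hat_j, sigma2_hat):", corrs)
```

With these settings, the empirical mean and covariance of `beta_hats` should land close to `beta` and `theory_cov`, the scaled variance estimates should have mean near `n - p` and variance near `2 * (n - p)`, and the correlations should be near zero.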