Student's t-distribution
This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full list of posts is available at Probability and Statistics 🎲
Series: Sampling Distributions
[toc]
- Student's t-distribution
- Sampling Distribution of Mean (unknown $\sigma^2$)
Student's t-distribution
Definition. Student's t-distribution
Let $Z \sim N(0, 1)$, and $V \sim \chi^2(n)$, and $Z \perp V$.
Define $T$ as
\[T := \frac{Z}{\sqrt{V / n}}\]
Then, the distribution of $T$ is called <Student's t-distribution with $n$ degrees of freedom>.
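The definition can be tried out directly by simulation. The sketch below is only an illustration, not part of the course material: the helper name `sample_t`, the seed, and the choice $n = 10$ are assumptions. It builds $V$ as a sum of $n$ independent squared standard normals, drawn independently of $Z$, and compares the sample variance of $T$ with $n / (n - 2)$, the known variance of a t-distribution when $n > 2$.

```python
import math
import random


def sample_t(n, rng):
    """Draw one value of T = Z / sqrt(V / n) (hypothetical helper)."""
    z = rng.gauss(0.0, 1.0)  # Z ~ N(0, 1)
    # V ~ chi^2(n): sum of n squared standard normals, independent of z
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return z / math.sqrt(v / n)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10                  # degrees of freedom (illustrative choice)
draws = [sample_t(n, rng) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

print("sample mean:", sample_mean)      # should be near 0
print("sample variance:", sample_var)   # should be near n / (n - 2)
print("n / (n - 2):", n / (n - 2))
```

The variance being larger than 1 reflects the heavier tails of $T$ compared with $Z$.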
Remark.
1. The pdf of $t$-distribution is
\[f(x) = \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2}\]
for $x \in \mathbb{R}$.
(Don't worry: we will never have to memorize the pdf of the <t-distribution> in order to use it!)
2. The $t$-distribution converges to the standard normal distribution as $n \rightarrow \infty$.
\[t(x; n) \rightarrow \frac{1}{\sqrt{2\pi}} \cdot \exp(-x^2 / 2)\]
The larger $n$ is, the closer it gets to the normal distribution!
Proof.
Let's take the limit factor by factor.
[Step 1]
We start with the easier one, the factor on the right.
\[\left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2}\]
Let's use the definition of the $\exp(x)$ function. Split the factor as
\[\left( 1 + \frac{x^2}{n} \right)^{-n/2} \cdot \left( 1 + \frac{x^2}{n} \right)^{-1/2}\]
As $n \rightarrow \infty$, the left part is
\[\left( 1 + \frac{x^2}{n} \right)^{-n/2} \rightarrow \exp(-x^2 / 2)\]
and the right part is
\[\left( 1 + \frac{x^2}{n} \right)^{-1/2} \rightarrow (1)^{-1/2} = 1\]
Therefore,
\[\left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2} \rightarrow \exp(-x^2 / 2)\]
[Step 2]
\[\frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)}\]
The gamma function is defined as follows.
\[\Gamma(\alpha) = \int^{\infty}_0 t^{\alpha - 1} e^{-t} dt \quad \text{for} \; \alpha > 0\]
Here we reach a part that we simply have to accept: <Stirling's Approximation>.
According to <Stirling's Approximation>, the following holds for large $k$.
\[\Gamma(k) \approx \sqrt{\frac{2\pi}{k}} \left( \frac{k}{e} \right)^k\]
Expanding the ratio with this approximation (the common factor $\sqrt{4\pi}$ cancels), for large $k$ we get
\[\begin{aligned} \frac{\Gamma\left(\dfrac{k+1}{2}\right)}{\Gamma\left( \dfrac{k}{2} \right)} &\approx \frac{ \sqrt{\frac{1}{k + 1}} \left( \frac{k + 1}{2e} \right)^{(k+1) / 2} }{ \sqrt{\frac{1}{k}} \left( \frac{k}{2e} \right)^{k/2} } \\ &= \sqrt{\frac{k}{k+1}} \cdot \frac{(2e)^{k/2}}{(2e)^{(k+1)/2}} \cdot \frac{(k+1)^{(k+1)/2}}{(k)^{k/2}} \\ &= \sqrt{\frac{k}{k+1}} \cdot \frac{1}{\sqrt{2e}} \cdot \left(\frac{k+1}{k}\right)^{k/2} \cdot \sqrt{k+1} \\ &= \sqrt{\frac{k}{2e}} \cdot \left(1 + \frac{1}{k}\right)^{k/2} \\ \end{aligned}\]
Now, combining the expressions (with $k = n$) and using $\left(1 + \frac{1}{n}\right)^{n/2} \rightarrow \sqrt{e}$,
\[\begin{aligned} \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} &\approx \frac{1}{\sqrt{n \pi}} \cdot \sqrt{\frac{n}{2e}} \cdot \left(1 + \frac{1}{n}\right)^{n/2} \\ &\rightarrow \frac{1}{\sqrt{\pi}} \cdot \sqrt{\frac{1}{2e}} \cdot \sqrt{e} \\ &= \frac{1}{\sqrt{2\pi}} \end{aligned}\]
[Final]
Putting it all together,
\[\begin{aligned} f(x) &= \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} \cdot \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2} \\ &\rightarrow \frac{1}{\sqrt{2\pi}} \cdot \exp(-x^2 / 2) \end{aligned}\]
3. We define $t_\alpha$ as the number $x$ s.t. $P(T \ge x) = \alpha$.
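The pdf from Remark 1 and the limit from Remark 2 can also be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `t_pdf`, the grid of $x$ values, and the chosen degrees of freedom are assumptions made for illustration. It evaluates the pdf through `math.lgamma` (to avoid overflow in the gamma functions for large $n$) and measures the largest gap to the standard normal pdf on $[-5, 5]$.

```python
import math
from statistics import NormalDist


def t_pdf(x, n):
    """pdf of the t-distribution with n degrees of freedom (Remark 1)."""
    # log of Gamma((n+1)/2) / (sqrt(n*pi) * Gamma(n/2))
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    # log of (1 + x^2/n)^(-(n+1)/2)
    log_kernel = -(n + 1) / 2 * math.log1p(x * x / n)
    return math.exp(log_c + log_kernel)


std_normal = NormalDist()               # N(0, 1)
grid = [i / 10 for i in range(-50, 51)]  # x from -5.0 to 5.0

gaps = {}
for n in (1, 5, 30, 1000):
    # largest pointwise difference between the two pdfs on the grid
    gaps[n] = max(abs(t_pdf(x, n) - std_normal.pdf(x)) for x in grid)
    print(f"n = {n:4d}   max |t_pdf - normal_pdf| = {gaps[n]:.6f}")
```

The gap shrinks as $n$ grows, which is what Remark 2 claims; in particular the normalizing constant `t_pdf(0, n)` approaches $1 / \sqrt{2\pi}$.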
Sampling Distribution of Mean (unknown $\sigma^2$)
Let's continue looking at the distribution of the sample mean $\bar{X}$. In the earlier "Sampling Distribution of Mean" post, we knew the exact value of the population variance $\sigma^2$.
\[Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)\]
This time, let's model the distribution of the sample mean $\bar{X}$ when $\sigma^2$ is unknown.
Theorem.
Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$.¹
Let $T := \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$, then $T$ has a t-distribution with $(n-1)$ dof.
Proof.
Divide the numerator and the denominator of $T$ by $\sigma$:
\[T = \frac{(\bar{X} - \mu) / (\sigma / \sqrt{n})}{S / \sigma}\]
Here, the numerator $\dfrac{\overline{X} - \mu}{\sigma / \sqrt{n}}$ follows $N(0, 1)$,
and the denominator $S / \sigma$ is
\[\frac{S}{\sigma} = \sqrt{\frac{(n-1) \cdot S^2}{\sigma^2}\cdot \frac{1}{(n-1)}}\]
where $\dfrac{(n-1)\cdot S^2}{\sigma^2}$ follows $\chi^2(n-1)$.
Rearranging, $T$ can be written as
\[T = \frac{Z}{\sqrt{V/(n-1)}}\]
with $Z \sim N(0, 1)$ and $V \sim \chi^2(n-1)$. Moreover, since the sample variance and the sample mean of a normal sample are independent, $Z \perp V$.
Therefore, $T$ follows a t-distribution with $n-1$ dof. $\blacksquare$
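The theorem can be sanity-checked by simulation. The sketch below is an illustration under assumed values ($\mu = 500$, $\sigma = 40$, $n = 12$, and the helper name `t_statistic` are all hypothetical choices): it repeatedly draws a normal sample, computes $T$ using only $\bar{X}$ and $S$, and compares the variance of the simulated $T$ values with $(n-1)/(n-3)$, the variance of a t-distribution with $n-1$ degrees of freedom.

```python
import math
import random
import statistics


def t_statistic(sample, mu):
    """T = (x_bar - mu) / (s / sqrt(n)); sigma is never used."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu) / (s / math.sqrt(n))


rng = random.Random(1)          # fixed seed so the run is repeatable
mu, sigma, n = 500.0, 40.0, 12  # illustrative population and sample size

t_values = [
    t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
    for _ in range(20_000)
]

var_t = statistics.variance(t_values)
print("simulated Var(T):", var_t)
print("(n - 1) / (n - 3):", (n - 1) / (n - 3))  # variance of t(n - 1)
```

If the sample were standardized with the true $\sigma$ instead of $S$, the variance would be 1; the excess above 1 comes from estimating $\sigma$ with $S$.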
Examples
[population] $X$ follows Normal Distribution, $\mu = 500$, $\sigma$: unknown
[sample] $n=25$, $\bar{x} = 518$, $s^2 = 40^2$
[t-test] check whether or not $t \in [-t_{0.05}, t_{0.05}]$
Let $T := \dfrac{\bar{X} - \mu}{S / \sqrt{n}} \sim t(n-1) = t(24)$
t-value is
\[\frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{518-500}{40/5} = 2.25\]
Here, $t_{0.05}(24) \approx 1.711$, and $t_{0.05} < 2.25$.
Since the t-value is larger than $t_{0.05}$, the result is significant. So the population mean $\mu$ may well be larger than 500. $\blacksquare$
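The arithmetic of this example, together with the definition of $t_\alpha$, can be reproduced with the standard library alone. This is a rough sketch rather than a reference implementation: the helper names `t_pdf`, `central_area`, `upper_tail`, and `t_alpha`, the use of Simpson's rule, and the bisection bounds are all assumptions made here. The tail probability $P(T \ge x)$ is obtained from the symmetry of the pdf as $0.5 - \int_0^x f(u)\,du$.

```python
import math


def t_pdf(x, dof):
    """pdf of the t-distribution with `dof` degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log1p(x * x / dof))


def central_area(x, dof, steps=2000):
    """Integral of the pdf over [0, x] by composite Simpson's rule."""
    h = x / steps  # steps must be even
    total = t_pdf(0.0, dof) + t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return total * h / 3


def upper_tail(x, dof):
    """P(T >= x) for x >= 0, using symmetry of the pdf about 0."""
    return 0.5 - central_area(x, dof)


def t_alpha(alpha, dof, lo=0.0, hi=100.0):
    """Bisection for the x with P(T >= x) = alpha (0 < alpha < 0.5)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, dof) > alpha:
            lo = mid  # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Values from the example
mu0, n, x_bar, s = 500.0, 25, 518.0, 40.0
t_value = (x_bar - mu0) / (s / math.sqrt(n))
critical = t_alpha(0.05, n - 1)
tail_prob = upper_tail(t_value, n - 1)

print("t-value:", t_value)
print("t_0.05(24):", critical)
print("P(T >= t-value):", tail_prob)
print("inside [-t_0.05, t_0.05]:", -critical <= t_value <= critical)
```

In practice a statistics library or a printed t-table would supply the critical value; the numerical integration here only makes the definition of $t_\alpha$ concrete.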
Closing remarks
The next post looks at the <F-distribution>, which is used when comparing two sample variances.
\[F := \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F(n_1 - 1, n_2 -1)\]
👉 F-distribution
We will meet the <t-distribution> again later, in <Interval Estimation>.
👉 t-test: Estimate $\mu$ when $\sigma^2$ is unknown
Personally, I think this is the point where <t-value>, <z-value>, and <p-value> start to get confusing. If you want to compare how these concepts differ and when each of them appears, please refer to the post below.
👉 Values in Statistics
¹ To use the <t-distribution>, the sample must be drawn from a normal distribution!! 🔥