This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full series of posts is available under Probability and Statistics 🎲

[toc]

  • Student's t-distribution
  • Sampling Distribution of Mean (unknown $\sigma^2$)

Student's t-distribution

Definition. Student's t-distribution

Let $Z \sim N(0, 1)$, and $V \sim \chi^2(n)$, and $Z \perp V$.

Define $T$ as

\[T := \frac{Z}{\sqrt{V / n}}\]

Then, the distribution of $T$ is called <Student's t-distribution with $n$ degrees of freedom>.
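The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper name `sample_t`, the seed, and the choice $n = 10$ are illustrative assumptions, not part of the course material. It builds $V$ as a sum of $n$ squared standard normals and compares the simulated variance of $T$ with the known value $n / (n - 2)$, which holds for $n > 2$.

```python
import math
import random
import statistics


def sample_t(n: int, rng: random.Random) -> float:
    """Draw one value of T = Z / sqrt(V / n) straight from the definition."""
    z = rng.gauss(0.0, 1.0)                               # Z ~ N(0, 1)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))   # V ~ chi^2(n), independent of Z
    return z / math.sqrt(v / n)


rng = random.Random(0)   # fixed seed so the run is reproducible
n = 10                   # degrees of freedom (illustrative choice)
draws = [sample_t(n, rng) for _ in range(100_000)]

# For n > 2 the t-distribution has mean 0 and variance n / (n - 2).
print("sample mean    :", statistics.fmean(draws))
print("sample variance:", statistics.variance(draws))
print("theory variance:", n / (n - 2))
```

The simulated variance should land close to $1.25$, noticeably above the variance $1$ of a standard normal, which reflects the heavier tails of $T$.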

Remark.

1. The pdf of $t$-distribution is

\[f(x) = \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2}\]

for $x \in \mathbb{R}$.

(Don't worry, we will never have to memorize the pdf of the <t-distribution> and apply it by hand!)

2. The $t$-distribution converges to the standard normal distribution as $n \rightarrow \infty$.

\[t(x; n) \rightarrow \frac{1}{\sqrt{2\pi}} \cdot \exp(-x^2 / 2)\]

$n$์ด ์ปค์งˆ ์ˆ˜๋ก ์ •๊ทœ ๋ถ„ํฌ์— ๊ฐ€๊นŒ์›Œ์ง„๋‹ค!

proof.

\[f(x) = \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} \cdot \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2}\]

Let's take the limit factor by factor.

[Step 1]

We start with the easier one, the right-hand factor.

\[\left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2}\]

$\exp(x)$ ํ•จ์ˆ˜์˜ ์ •์˜๋ฅผ ์ด์šฉํ•˜์ž.

\[\left( 1 + \frac{x^2}{n} \right)^{-n/2} \cdot \left( 1 + \frac{x^2}{n} \right)^{-1/2}\]

$n \rightarrow \infty$์ผ ๋•Œ, ์™ผ์ชฝ์€

\[\left( 1 + \frac{x^2}{n} \right)^{-n/2} \rightarrow \exp(-x^2 / 2)\]

์˜ค๋ฅธ์ชฝ์€

\[\left( 1 + \frac{x^2}{n} \right)^{-1/2} \rightarrow (1)^{-1/2} = 1\]

๋”ฐ๋ผ์„œ,

\[\left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2} \rightarrow \exp(-x^2 / 2)\]


[Step 2]

\[\frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)}\]

๊ฐ๋งˆ ํ•จ์ˆ˜๋Š” ์•„๋ž˜์™€ ๊ฐ™์ด ์ƒ๊ฒผ๋‹ค.

\[\Gamma(\alpha) = \int^{\infty}_0 t^{\alpha - 1} e^{-t} dt \quad \text{for} \; \alpha > 0\]

At this point there is one fact we simply have to accept: <Stirling's Approximation>.

<์Šคํ„ธ๋ง ๊ทผ์‚ฌ>์— ๋”ฐ๋ฅด๋ฉด, ํฐ $k$์— ๋Œ€ํ•ด ์•„๋ž˜๊ฐ€ ์„ฑ๋ฆฝํ•œ๋‹ค.

\[\Gamma(k) \approx \sqrt{\frac{2\pi}{k}} \left( \frac{k}{e} \right)^k\]

์ด ์‚ฌ์‹ค์„ ๋ฐ”ํƒ•์œผ๋กœ ์ˆ˜์‹์„ ์ „๊ฐœํ•˜๋ฉด,

\[\begin{aligned} \frac{\Gamma\left(\dfrac{k+1}{2}\right)}{\Gamma\left( \dfrac{k}{2} \right)} &\approx \frac{ \sqrt{\frac{1}{k + 1}} \left( \frac{k + 1}{2e} \right)^{(k+1) / 2} }{ \sqrt{\frac{1}{k}} \left( \frac{k}{2e} \right)^{k/2} } \\ &= \sqrt{\frac{k}{k+1}} \cdot \frac{(2e)^{k/2}}{(2e)^{(k+1)/2}} \cdot \frac{(k+1)^{(k+1)/2}}{(k)^{k/2}} \\ &= \sqrt{\frac{k}{k+1}} \cdot \frac{1}{\sqrt{2e}} \cdot \left(\frac{k+1}{k}\right)^{k/2} \cdot \sqrt{k+1} \\ &= \sqrt{\frac{k}{2e}} \cdot \left(1 + \frac{1}{k}\right)^{k/2} \end{aligned}\]

์ด์ œ ์ˆ˜์‹์„ ํ•ฉ์น˜๋ฉด,

\[\begin{aligned} \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} &\approx \frac{1}{\sqrt{n \pi}} \cdot \sqrt{\frac{n}{2e}} \cdot \left(1 + \frac{1}{n}\right)^{n/2} \\ &\rightarrow \frac{1}{\sqrt{\pi}} \cdot \sqrt{\frac{1}{2e}} \cdot \sqrt{e} \\ &= \frac{1}{\sqrt{2\pi}} \end{aligned}\]

[Final]

Putting the two steps together,

\[\begin{aligned} f(x) &= \frac{\Gamma\left(\dfrac{n+1}{2}\right)}{\sqrt{n\pi} \cdot \Gamma\left( \dfrac{n}{2} \right)} \cdot \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2} \\ &\rightarrow \frac{1}{\sqrt{2\pi}} \cdot \exp(-x^2 / 2) \end{aligned}\]

3. We define $t_\alpha$ as the number $x$ s.t. $P(T \ge x) = \alpha$.
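The pdf in Remark 1 and the convergence in Remark 2 can be checked numerically. Below is a minimal standard-library sketch; the function names `t_pdf` and `normal_pdf` and the evaluation point $x = 1.5$ are illustrative assumptions. It uses `math.lgamma` so that the gamma ratio stays finite for large $n$.

```python
import math


def t_pdf(x: float, n: int) -> float:
    """Density of the t-distribution with n degrees of freedom (Remark 1)."""
    # Work in log space: lgamma avoids overflow of Gamma for large n.
    log_const = (
        math.lgamma((n + 1) / 2)
        - math.lgamma(n / 2)
        - 0.5 * math.log(n * math.pi)
    )
    return math.exp(log_const - (n + 1) / 2 * math.log1p(x * x / n))


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


x = 1.5  # arbitrary evaluation point
for n in (1, 5, 30, 1000):
    gap = abs(t_pdf(x, n) - normal_pdf(x))
    print(f"n={n:>4}  t_pdf={t_pdf(x, n):.6f}  |t - normal|={gap:.6f}")

# At x = 0 the pdf equals the normalizing constant from Step 2,
# which should approach 1 / sqrt(2 * pi) as n grows.
print("constant at n=1000:", t_pdf(0.0, 1000), "limit:", normal_pdf(0.0))
```

The printed gap should shrink as $n$ grows, and the value at $x = 0$ should approach $1/\sqrt{2\pi}$, matching the limit of the gamma-function constant.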


Sampling Distribution of Mean (unknown $\sigma^2$)

Let's continue looking at the distribution of the sample mean $\bar{X}$. In the previous "Sampling Distribution of Mean" post, we knew the exact value of the population variance $\sigma^2$.

\[Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)\]

์ด๋ฒˆ์—๋Š” $\sigma^2$๋ฅผ ๋ชจ๋ฅด๋Š” ์ƒํƒœ์—์„œ Sample Mean $\bar{X}$์˜ ๋ถ„ํฌ๋ฅผ ๋ชจ๋ธ๋ง ํ•ด๋ณด์ž.

Theorem.

Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$ [1].

Let $T := \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$, where $S$ is the sample standard deviation. Then $T$ has a t-distribution with $n-1$ degrees of freedom (dof).

Proof.

\[\begin{aligned} T &= \frac{\overline{X} - \mu}{S / \sqrt{n}} \\ &= \frac{\overline{X} - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma / \cancel{\sqrt{n}}}{S / \cancel{\sqrt{n}}} \\ &= \dfrac{\left(\dfrac{\overline{X} - \mu}{\sigma / \sqrt{n}}\right)}{S / \sigma} \end{aligned}\]

์ด๋•Œ, ๋ถ„์ž์ธ $\dfrac{\overline{X} - \mu}{\sigma / \sqrt{n}}$๋Š” $N(0, 1)$์˜ ๋ถ„ํฌ๋ฅผ ๋”ฐ๋ฅด๊ณ ,

๋ถ„๋ชจ์ธ $S / \sigma$๋Š”

\[\frac{S}{\sigma} = \sqrt{\frac{(n-1) \cdot S^2}{\sigma^2}\cdot \frac{1}{(n-1)}}\]

์ธ๋ฐ ์ด๋•Œ, $\dfrac{(n-1)\cdot S^2}{\sigma^2}$๊ฐ€ $\chi^2(n-1)$๋ฅผ ๋”ฐ๋ฅด๋ฏ€๋กœ.

Rearranging, $T$ takes the form

\[T = \frac{Z}{\sqrt{V/(n-1)}}\]

$Z \sim N(0, 1)$์ด๊ณ  $V \sim \chi^2(n-1)$์ด๋‹ค. ๊ทธ๋ฆฌ๊ณ  Sample Variance์™€ Sample Mean์ด ์„œ๋กœ ๋…๋ฆฝ์ด๋ฏ€๋กœ, $Z \perp V$์ด๋‹ค.

๋”ฐ๋ผ์„œ, $T$๋Š” dof๊ฐ€ $n-1$์ธ t-distribution์ด๋‹ค. $\blacksquare$


Examples

[population] $X$ follows Normal Distribution, $\mu = 500$, $\sigma$: unknown

[sample] $n=25$, $\bar{x} = 518$, $s^2 = 40^2$

[t-test] check whether or not $t \in [-t_{0.05}, t_{0.05}]$

Let $T := \dfrac{\bar{X} - \mu}{S / \sqrt{n}} \sim t(n-1) = t(24)$

t-value is

\[\frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{518-500}{40/5} = 2.25\]

Here, $t_{0.05}(24) = 1.711$, and $t_{0.05} < 2.25$.

Since the t-value exceeds $t_{0.05}$, the result is significant, so the population mean $\mu$ may well be greater than 500. $\blacksquare$
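The example can be reproduced numerically without a t-table. The following is a minimal standard-library sketch, not a replacement for a statistics package; the helpers `t_pdf`, `upper_tail`, and `t_critical` are hypothetical names introduced here. It integrates the pdf with Simpson's rule and finds $t_\alpha$ by bisection.

```python
import math


def t_pdf(x: float, n: int) -> float:
    """Density of the t-distribution with n degrees of freedom."""
    log_const = (
        math.lgamma((n + 1) / 2)
        - math.lgamma(n / 2)
        - 0.5 * math.log(n * math.pi)
    )
    return math.exp(log_const - (n + 1) / 2 * math.log1p(x * x / n))


def upper_tail(x: float, n: int, steps: int = 2000) -> float:
    """P(T >= x) for x >= 0: by symmetry, 0.5 minus the integral over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, n) + t_pdf(x, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    return 0.5 - total * h / 3


def t_critical(alpha: float, n: int) -> float:
    """Bisection for t_alpha with P(T >= t_alpha) = alpha, for 0 < alpha < 0.5."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, n) > alpha:
            lo = mid   # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


n, x_bar, s, mu0 = 25, 518.0, 40.0, 500.0   # values from the example
t_value = (x_bar - mu0) / (s / math.sqrt(n))
crit = t_critical(0.05, n - 1)
p_upper = upper_tail(t_value, n - 1)

print("t-value            :", t_value)
print("t_0.05 with 24 dof :", round(crit, 3))
print("P(T >= t-value)    :", round(p_upper, 4))
print("inside [-t, t]?    :", -crit <= t_value <= crit)
```

The last line reports whether the observed t-value falls inside $[-t_{0.05}, t_{0.05}]$; the upper-tail probability gives a sense of how unusual the observed value is under $\mu = 500$.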


๋งบ์Œ๋ง

The next post covers the <F-distribution>, which is used to compare two sample variances.

\[F := \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} = F(n_1 - 1, n_2 -1)\]

👉 F-distribution


<t-distribution>์€ ๋’ค์— ๋‚˜์˜ค๋Š” <Interval Estimation>์—์„œ ๋‹ค์‹œ ๋ณผ ์˜ˆ์ •์ด๋‹ค.

👉 t-test: Estimate $\mu$ when $\sigma^2$ is unknown


๊ฐœ์ธ์ ์œผ๋กœ ์—ฌ๊ธฐ๊ฐ€ <t-value>, <z-value>, <p-value>๊ฐ€ ํ—ท๊ฐˆ๋ฆฌ๋Š” ์ง€์ ์ด๋ผ๊ณ  ์ƒ๊ฐํ•œ๋‹ค. ๋งŒ์•ฝ, ๋‘ ๊ฐœ๋…์ด ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅด๊ณ , ๋˜ ์–ธ์ œ ๋“ฑ์žฅํ•˜๋Š”์ง€ ๋น„๊ตํ•˜๊ณ  ์‹ถ๋‹ค๋ฉด, ์•„๋ž˜์˜ ํฌ์ŠคํŠธ๋ฅผ ์ฐธ๊ณ ํ•˜๊ธธ ๋ฐ”๋ž€๋‹ค.

👉 Values in Statistics

References


  1. To use the <t-distribution>, the sample must be drawn from a normal distribution!! 💥