Normal Distribution
This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full list of posts is available at Probability and Statistics 🎲
Series: Continuous Probability Distributions
Normal Distribution (or Gaussian Distribution)
Definition. Gaussian Distribution
Let $\mu \in \mathbb{R}$ and $\sigma > 0$. We say that $X$ has a <normal distribution> with mean $\mu$ and variance $\sigma^2$ if its pdf $f(x; \mu, \sigma^2)$ is given by
\[f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( - \frac{(x-\mu)^2}{2\sigma^2}\right) \quad \text{for} \; x \in \mathbb{R}\]We write this as $X \sim N(\mu, \sigma^2)$.
In particular, if $\mu = 0$ and $\sigma^2 = 1$, we call $X$ a <standard normal RV>.
\[f(x; 0, 1) = \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{x^2}{2} \right)\]Now let's verify that the pdf $f(x; \mu, \sigma^2)$ of the Normal Distribution is a valid pdf, that is, that it integrates to 1.
\[\int^{\infty}_{-\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2}\right) \; dx \overset{?}{=} 1\]Proof.
Define $A$ as
\[A = \int^{\infty}_{-\infty} f(x) \; dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int^{\infty}_{-\infty} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2}\right) dx\]Let $z = \dfrac{x-\mu}{\sigma}$, so that $dx = \sigma \, dz$; then
\[A = \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{z^2}{2}\right) dz\]Squaring $A$ and renaming the integration variables as $x$ and $y$,
\[\begin{aligned} A^2 &= \left( \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{x^2}{2}\right) dx \right) \left( \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{y^2}{2}\right) dy \right) \\ &= \frac{1}{2\pi} \int^{\infty}_{-\infty} \int^{\infty}_{-\infty} \exp\left( -\frac{x^2 + y^2}{2}\right) \; dxdy \end{aligned}\]Here, let's change the integration from $xy$-coordinates to $r\theta$-coordinates (polar coordinates).
\[\begin{aligned} x &= r \cos \theta \\ y &= r \sin \theta \end{aligned}\]then,
\[A^2 = \frac{1}{2\pi} \int^{2\pi}_0 \int^{\infty}_0 \exp \left( - \frac{r^2}{2}\right) \cdot r \; drd\theta\]This integral is easy to evaluate.
\[\begin{aligned} A^2 &= \frac{1}{2\pi} \int^{2\pi}_0 \left[ - \exp \left( - \frac{r^2}{2} \right) \right]^{\infty}_0 \; d\theta \\ &= \frac{1}{2\pi} \int^{2\pi}_0 1 \; d\theta \\ &= \frac{1}{2\pi} \cdot 2\pi = 1 \end{aligned}\]Since $A > 0$, we conclude $A = 1$. $\blacksquare$ The second question is how to find the CDF of a <normal distribution>. For convenience, let's consider $Z \sim N(0, 1)$ instead of $N(\mu, \sigma^2)$.
\[F(x) = P(Z \le x) = \int^x_{-\infty} \frac{1}{\sqrt{2\pi}} \exp \left( - \frac{z^2}{2}\right) \; dz\]To begin with, the facts we know for certain are:
- $F(0) = P(Z \le 0) = 0.5$ (the pdf is symmetric about 0)
- $F(-\infty) = \lim_{x \to -\infty} P(Z \le x) = 0$
- $F(\infty) = \lim_{x \to \infty} P(Z \le x) = 1$
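Because the <normal distribution> is a continuous probability distribution, we need its CDF in order to compute probabilities. However, we do not obtain the CDF of the <normal distribution> by integrating directly, since the integral has no closed form in terms of elementary functions. Instead, we look the value up in the table in the Appendix at the back of the textbook!! 🤩 A link to such a table is given below. This kind of table for the <normal distribution> is called a <standard normal table> or <Z table>.
👉 Wikipedia/Standard normal table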
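As a complement to the table lookup, the same values can be computed numerically. The sketch below is not from the course material; it assumes the standard identity $\Phi(z) = \frac{1}{2}\left(1 + \text{erf}(z/\sqrt{2})\right)$ and uses a simple trapezoidal rule as a cross-check. The function names `std_normal_pdf`, `std_normal_cdf`, and `trapezoid` are made up for this illustration.

```python
import math


def std_normal_pdf(z: float) -> float:
    """pdf of N(0, 1): exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """CDF of N(0, 1) expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def trapezoid(f, a: float, b: float, steps: int = 20_000) -> float:
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# The pdf should integrate to (numerically) 1; [-10, 10] stands in for the real line.
total_mass = trapezoid(std_normal_pdf, -10.0, 10.0)
print("total mass :", total_mass)

# F(0) = 0.5 by symmetry.
print("Phi(0)     :", std_normal_cdf(0.0))

# A typical Z-table entry, computed two ways.
table_value = std_normal_cdf(1.96)
integrated_value = trapezoid(std_normal_pdf, -10.0, 1.96)
print("Phi(1.96)  :", table_value, "vs numeric integral", integrated_value)
```

Truncating the integral at $\pm 10$ is an assumption that works because the tails beyond that point are negligible at double precision.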
Theorem.
Let $X \sim N(\mu, \sigma^2)$, then
- $E[X] = \mu$
- $\text{Var}(X) = \sigma^2$
These statements should be proved, but the proof is straightforward, so I will omit it.
Next, let's look at the relationship between the <normal distribution> and the <standard normal distribution>.
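For reference, one possible sketch of the omitted computation is given here. It is not part of the original notes. It writes $\varphi(z)$ for the standard normal pdf $f(z; 0, 1)$, substitutes $z = (x - \mu)/\sigma$, and uses $\varphi'(z) = -z\varphi(z)$ for the integration by parts.

\[\begin{aligned} E[X] &= \int^{\infty}_{-\infty} x \, f(x; \mu, \sigma^2) \; dx = \int^{\infty}_{-\infty} (\sigma z + \mu) \, \varphi(z) \; dz \\ &= \sigma \int^{\infty}_{-\infty} z \, \varphi(z) \; dz + \mu \int^{\infty}_{-\infty} \varphi(z) \; dz = \sigma \cdot 0 + \mu \cdot 1 = \mu \end{aligned}\]

\[\begin{aligned} \text{Var}(X) &= E[(X - \mu)^2] = \sigma^2 \int^{\infty}_{-\infty} z^2 \, \varphi(z) \; dz \\ &= \sigma^2 \left( \Big[ -z \, \varphi(z) \Big]^{\infty}_{-\infty} + \int^{\infty}_{-\infty} \varphi(z) \; dz \right) = \sigma^2 (0 + 1) = \sigma^2 \end{aligned}\]

The first integral in $E[X]$ vanishes because $z\varphi(z)$ is an odd function, and the second equals 1 by the normalization shown earlier.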
Theorem.
1. If $X \sim N(\mu, \sigma^2)$, then $Z := \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.
2. If $Z \sim N(0, 1)$, then $X := \sigma Z + \mu \sim N(\mu, \sigma^2)$.
Let's briefly go through the proof. For statement 1, it suffices to show that $Z$ has the standard normal distribution.
The CDF of $Z$ is $P(Z \le z) = P\left( \dfrac{X - \mu}{\sigma} \le z \right)$. Since $\sigma > 0$, we can rescale and shift the inequality to get
\[P\left( \dfrac{X - \mu}{\sigma} \le z \right) = P ( X \le \sigma z + \mu)\]That is, the cdf of $Z$ is $F_Z (z) = F_X (\sigma z + \mu)$. To get the pdf of $Z$, differentiate with respect to $z$:
\[\begin{aligned} f_Z(z) &= \frac{d}{dz} F_X (\sigma z + \mu) = \sigma f_X (\sigma z + \mu) \\ &= \sigma \cdot \left( \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( - \frac{(\sigma z + \mu - \mu)^2}{2\sigma^2}\right) \right) \\ &= \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{z^2}{2} \right) \end{aligned}\]Since this is the pdf of $N(0, 1)$, we have $Z \sim N(0, 1)$. $\blacksquare$
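Statement 2 can be handled the same way. A sketch, assuming $\sigma > 0$ as in the definition and writing $F_Z$, $f_Z$ for the cdf and pdf of $Z \sim N(0, 1)$:

\[\begin{aligned} F_X(x) &= P(\sigma Z + \mu \le x) = P\left( Z \le \frac{x - \mu}{\sigma} \right) = F_Z\left( \frac{x - \mu}{\sigma} \right) \\ f_X(x) &= \frac{d}{dx} F_Z\left( \frac{x - \mu}{\sigma} \right) = \frac{1}{\sigma} f_Z\left( \frac{x - \mu}{\sigma} \right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( - \frac{(x - \mu)^2}{2\sigma^2} \right) \end{aligned}\]

This is the pdf of $N(\mu, \sigma^2)$, so $X \sim N(\mu, \sigma^2)$.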
Remark.
1. If $Z \sim N(0, 1)$, the <standard normal>, then its pdf and cdf are commonly denoted by $\varphi(z)$ and $\Phi(z)$.
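2. The values of $\Phi(z)$ are listed in the table in the Appendix.
3. $\Phi(-z) = 1 - \Phi(z)$
4. If $X \sim N(\mu, \sigma^2)$, then we can standardize $X$ to $Z = \dfrac{X - \mu}{\sigma}$, so that $P(X \le x) = \Phi\left( \dfrac{x - \mu}{\sigma} \right)$.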
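The remarks above translate directly into a small routine: standardize, then evaluate $\Phi$. The following is a minimal sketch rather than a reference implementation; it again assumes $\Phi$ can be written with `math.erf`, and the parameters $\mu = 70$, $\sigma = 10$ are hypothetical values chosen only for illustration.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z), the CDF of N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardizing X."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return std_normal_cdf((x - mu) / sigma)


def normal_interval_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X <= b) for X ~ N(mu, sigma^2)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Hypothetical example: X ~ N(70, 10^2), probability of landing within one sigma.
mu, sigma = 70.0, 10.0
p_within_one_sigma = normal_interval_prob(60.0, 80.0, mu, sigma)
print(p_within_one_sigma)  # about 0.6827

# Remark 3: Phi(-z) = 1 - Phi(z).
z = 1.3
symmetry_gap = abs(std_normal_cdf(-z) - (1.0 - std_normal_cdf(z)))
print(symmetry_gap)
```

With a printed Z table the workflow is the same: compute $z = (x - \mu)/\sigma$ by hand and read $\Phi(z)$ from the table instead of calling `std_normal_cdf`.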
Normal Approximation to the Binomial
We saw earlier that when a <Binomial Distribution> has a sufficiently small success probability $p \ll 1$ and a sufficiently large number of trials $1 \ll n < \infty$, it can be approximated by a <Poisson Distribution>.
Example.
Let $X \sim \text{BIN}(100, 0.02)$. Computing $P(X = 39)$ directly is cumbersome (for example, multiplying 0.02 by itself 39 times gives a number extremely close to 0).
However, if we approximate it by $\text{POI}(2)$, using $\lambda = np = 2$, then $P(X = 39) \approx e^{-2} \frac{2^{39}}{39!}$.
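To get a feel for how close the two distributions are, the pmfs can be compared numerically. This is an illustrative sketch, not part of the original example; `binom_pmf` and `poisson_pmf` are helper names introduced here, and `math.comb` requires Python 3.8 or later.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ BIN(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ POI(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


n, p = 100, 0.02
lam = n * p  # Poisson rate matched to the binomial mean

# Side-by-side values where most of the probability mass sits.
for k in range(6):
    print(k, binom_pmf(k, n, p), poisson_pmf(k, lam))

# Largest pointwise gap between the two pmfs over k = 0, ..., n.
max_abs_diff = max(
    abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1)
)
print("max |binomial - Poisson| =", max_abs_diff)

# The value from the example: both numbers are extremely small.
print(binom_pmf(39, n, p), poisson_pmf(39, lam))
```

The comparison here is in absolute difference. Far out in the tail, such as $k = 39$, both probabilities are tiny, so how well they agree in relative terms is a separate question.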
There is, however, a theorem stating that such a <Binomial Distribution> can also be approximated by a <Normal Distribution>!! 🤩 In this case, only the "sufficiently large number of trials" condition needs to hold!
Theorem. De Moivre-Laplace Central Limit Theorem
Let $X \sim \text{BIN}(n, p)$, then we have
\[\lim_{n \rightarrow \infty} P\left( \frac{X - np}{\sqrt{npq}} \le x \right) = \Phi(x)\]where $q = 1 - p$, $p \in (0, 1)$ is fixed, and $\Phi(x)$ is the CDF of the standard normal $N(0, 1)$.
※ Note that this is a special case of the CLT.
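The theorem can be checked numerically for a finite $n$. The sketch below compares the exact binomial CDF with $\Phi\left( (k - np)/\sqrt{npq} \right)$; the values $n = 1000$, $p = 0.3$, $k = 310$ are arbitrary choices for illustration, and the pmf is summed in log space with `math.lgamma` only to avoid overflow.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z), the CDF of N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ BIN(n, p), summing the pmf in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * log_p + (n - j) * log_q
        )
        total += math.exp(log_pmf)
    return total


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Normal approximation Phi((k - np) / sqrt(npq)) to P(X <= k)."""
    q = 1.0 - p
    return std_normal_cdf((k - n * p) / math.sqrt(n * p * q))


# Arbitrary illustrative parameters (not from the text).
n, p, k = 1000, 0.3, 310
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print("exact  :", exact)
print("approx :", approx)
print("gap    :", abs(exact - approx))
```

A continuity correction (using $k + 0.5$ in place of $k$) is a common refinement that usually shrinks the gap, but it is not part of the theorem as stated here.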
I recommend getting a feel for this part through examples. Working through just 2-3 problems is enough to get the hang of it.
In the posts that follow, we will meet a wider variety of remarkable distributions.