These are my notes on what I learned and studied in the “Probability and Statistics (MATH230)” course. The full series of posts is available at Probability and Statistics 🎲


Normal Distribution (or Gaussian Distribution)

Definition. Gaussian Distribution

Let $\mu \in \mathbb{R}$ and $\sigma > 0$. We say that $X$ has a <normal distribution> with mean $\mu$ and variance $\sigma^2$ if its pdf $f(x; \mu, \sigma^2)$ is given by

\[f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( - \frac{(x-\mu)^2}{2\sigma^2}\right) \quad \text{for} \; x \in \mathbb{R}\]

We write $X \sim N(\mu, \sigma^2)$.

In particular, if $\mu = 0$ and $\sigma^2 = 1$, we call $X$ a <standard normal RV>.

\[f(x; 0, 1) = \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{x^2}{2} \right)\]

Now let us verify that the pdf $f(x; \mu, \sigma^2)$ of the Normal Distribution is a valid pdf, i.e. that it integrates to 1.

\[\int^{\infty}_{-\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2}\right) \; dx \overset{?}{=} 1\]

Proof.

Define $A$ as

\[A = \int^{\infty}_{-\infty} f(x) \; dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int^{\infty}_{-\infty} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2}\right) dx\]

Substitute $z = \dfrac{x-\mu}{\sigma}$, so that $dx = \sigma \, dz$. Then

\[A = \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{z^2}{2}\right) dz\]

Squaring $A$, and renaming the integration variable to $x$ in one factor and $y$ in the other,

\[\begin{aligned} A^2 &= \left( \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{x^2}{2}\right) dx \right) \left( \frac{1}{\sqrt{2\pi}} \int^{\infty}_{-\infty} \exp\left( -\frac{y^2}{2}\right) dy \right) \\ &= \frac{1}{2\pi} \int^{\infty}_{-\infty} \int^{\infty}_{-\infty} \exp\left( -\frac{x^2 + y^2}{2}\right) \; dxdy \end{aligned}\]

Now let us change the integration from $xy$-coordinates to $r\theta$-coordinates.

\[\begin{aligned} x &= r \cos \theta \\ y &= r \sin \theta \end{aligned}\]

Since $x^2 + y^2 = r^2$ and $dx\,dy = r \, dr\,d\theta$,

\[A^2 = \frac{1}{2\pi} \int^{2\pi}_0 \int^{\infty}_0 \exp \left( - \frac{r^2}{2}\right) \cdot r \; drd\theta\]

This integral is easy to evaluate.

\[\begin{aligned} A^2 &= \frac{1}{2\pi} \int^{2\pi}_0 \left[ - \exp \left( - \frac{r^2}{2} \right) \right]^{\infty}_0 \; d\theta \\ &= \frac{1}{2\pi} \int^{2\pi}_0 1 \; d\theta \\ &= \frac{1}{2\pi} \cdot 2\pi = 1 \end{aligned}\]

Since the integrand is positive, $A > 0$, and therefore $A = 1$. $\blacksquare$
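As a numerical sanity check of the result above, the pdf can be integrated with a simple trapezoidal rule. This is only a sketch: the function names, the truncation of the range to ten standard deviations on each side, and the test parameters are choices made for this illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, following the definition above."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def integrate_pdf(mu, sigma, width=10.0, steps=20_000):
    """Trapezoidal rule on [mu - width*sigma, mu + width*sigma].

    The tails beyond `width` standard deviations are ignored, which is an
    assumption of this sketch; their mass is negligible for width=10.
    """
    a = mu - width * sigma
    b = mu + width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


# Each printed value should be very close to 1, whatever mu and sigma are.
for mu, sigma in [(0.0, 1.0), (3.0, 0.5), (-2.0, 4.0)]:
    print(mu, sigma, integrate_pdf(mu, sigma))
```

The check does not replace the proof; it only confirms that the constant $\frac{1}{\sqrt{2\pi\sigma^2}}$ is the right normalizer for a few sample parameters.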

The second question is how to find the CDF of the <normal distribution>. For convenience, we look at $Z \sim N(0, 1)$ instead of the general $N(\mu, \sigma^2)$.

\[F(x) = P(Z \le x) = \int^x_{-\infty} \frac{1}{\sqrt{2\pi}} \exp \left( - \frac{z^2}{2}\right) \; dz\]

To begin with, the facts we can be certain of are:

  • $F(0) = P(Z \le 0) = 0.5$ (by symmetry of the pdf about 0)
  • $F(-\infty) = \lim_{x \to -\infty} P(Z \le x) = 0$
  • $F(\infty) = \lim_{x \to \infty} P(Z \le x) = 1$

Because the <normal distribution> is a continuous distribution, we need its CDF in order to compute probabilities. However, we do not obtain the CDF of the <normal distribution> by integrating directly, since the integral has no closed form in elementary functions. Instead, we look the value up in the table in the Appendix at the back of the textbook!! 🤩 A link to such a table is given below. A table of this kind is called a <standard normal table> or <Z table>.

👉 Wikipedia/Standard normal table
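When no table is at hand, the same values can be computed from the error function, using the identity $F(z) = \frac{1}{2}\left(1 + \text{erf}\left(z / \sqrt{2}\right)\right)$ for the standard normal CDF. A minimal sketch with Python's standard library follows; the function name `std_normal_cdf` is made up for this example.

```python
import math


def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Print a small slice of a Z table, in steps of 0.5.
for tenths in range(0, 31, 5):
    z = tenths / 10.0
    print(f"z = {z:.1f}  F(z) = {std_normal_cdf(z):.4f}")
```

The printed rows can be compared against the corresponding entries of the Appendix table.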


Theorem.

Let $X \sim N(\mu, \sigma^2)$, then

  • $E[X] = \mu$
  • $\text{Var}(X) = \sigma^2$

The statement above ought to be proved, but it looks easy enough, so I will omit the proof.
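For readers who want the omitted step, here is one possible sketch. It uses the substitution $z = (x - \mu)/\sigma$ and writes $\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ for the standard normal pdf; this is a standard argument, not necessarily the one given in the lecture.

\[\begin{aligned} E[X] &= \int^{\infty}_{-\infty} x \, f(x; \mu, \sigma^2) \; dx = \int^{\infty}_{-\infty} (\sigma z + \mu) \, \varphi(z) \; dz \\ &= \sigma \int^{\infty}_{-\infty} z \, \varphi(z) \; dz + \mu \int^{\infty}_{-\infty} \varphi(z) \; dz = \sigma \cdot 0 + \mu \cdot 1 = \mu \end{aligned}\]

The first integral vanishes because $z \varphi(z)$ is an odd, integrable function, and the second equals 1 because $\varphi$ is a valid pdf. For the variance, integrate by parts using $\varphi'(z) = -z \varphi(z)$:

\[\begin{aligned} \text{Var}(X) &= E[(X - \mu)^2] = \sigma^2 \int^{\infty}_{-\infty} z^2 \, \varphi(z) \; dz \\ &= \sigma^2 \left( \Big[ -z \, \varphi(z) \Big]^{\infty}_{-\infty} + \int^{\infty}_{-\infty} \varphi(z) \; dz \right) = \sigma^2 (0 + 1) = \sigma^2 \end{aligned}\]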


Next, let us look at the relationship between the <normal distribution> and the <standard normal distribution>.

Theorem.

1. If $X \sim N(\mu, \sigma^2)$, then $Z := \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.

2. If $Z \sim N(0, 1)$, then $X := \sigma Z + \mu \sim N(\mu, \sigma^2)$.

Let us briefly go through the proof. For statement 1, it is enough to derive that $Z$ has a normal distribution.

The CDF of $Z$ is $P(Z \le z) = P\left( \dfrac{X - \mu}{\sigma} \le z \right)$. Since $\sigma > 0$, we can scale and shift inside the event to get

\[P\left( \dfrac{X - \mu}{\sigma} \le z \right) = P ( X \le \sigma z + \mu)\]

So the CDF of $Z$ is $F_Z (z) = F_X (\sigma z + \mu)$. To get the pdf of $Z$, take the derivative with respect to $z$:

\[\begin{aligned} f_Z(z) &= \frac{d}{dz} F_X (\sigma z + \mu) = \sigma f_X (\sigma z + \mu) \\ &= \sigma \cdot \left( \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( - \frac{(\sigma z + \mu - \mu)^2}{2\sigma^2}\right) \right) \\ &= \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{z^2}{2} \right) \end{aligned}\]

Since the pdf of $Z$ is the pdf of $N(0, 1)$, we conclude $Z \sim N(0, 1)$. $\blacksquare$
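Statement 1 can also be checked empirically. The sketch below draws samples from $N(\mu, \sigma^2)$, standardizes them, and looks at the sample mean and standard deviation of the result. The parameter values, the sample size, and the helper name `standardize` are arbitrary choices for this illustration, and a simulation only supports the theorem rather than proving it.

```python
import random
import statistics


def standardize(samples, mu, sigma):
    """Map each x to z = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in samples]


rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma = 10.0, 3.0   # illustrative values only
xs = [rng.gauss(mu, sigma) for _ in range(100_000)]
zs = standardize(xs, mu, sigma)

# Both numbers should be close to the N(0, 1) values 0 and 1.
print("sample mean of Z:", statistics.fmean(zs))
print("sample std  of Z:", statistics.pstdev(zs))
```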

Remark.

1. If $Z \sim N(0, 1)$, the <standard normal>, then its pdf and cdf are commonly denoted by $\varphi(z)$ and $\Phi(z)$.

2. The values of $\Phi(z)$ are listed in the Appendix table.

3. \(\Phi(-z) = 1 - \Phi(z)\)

4. If $X \sim N(\mu, \sigma^2)$, then we can standardize $X$ to $Z = \dfrac{X - \mu}{\sigma}$, so that $P(X \le x) = \Phi\left( \dfrac{x - \mu}{\sigma} \right)$.
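Items 3 and 4 of the remark are how the table is used in practice: every probability about $X$ is rewritten in terms of $\Phi$. A small sketch using `statistics.NormalDist` from the Python standard library follows; the helper `prob_between` and the numbers in the example are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def prob_between(a, b, mu, sigma):
    """P(a < X <= b) for X ~ N(mu, sigma^2), computed through Phi only."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return STD_NORMAL.cdf(z_b) - STD_NORMAL.cdf(z_a)


# Illustrative numbers: X ~ N(50, 10^2), probability of landing in (45, 62].
print(prob_between(45.0, 62.0, mu=50.0, sigma=10.0))

# Symmetry from item 3 of the remark: Phi(-z) = 1 - Phi(z).
print(STD_NORMAL.cdf(-1.3), 1.0 - STD_NORMAL.cdf(1.3))
```

With a paper table, `STD_NORMAL.cdf` is replaced by a lookup of $\Phi(z_b)$ and $\Phi(z_a)$, and negative arguments are handled with the symmetry rule.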


Normal Approximation to the Binomial

We saw earlier that when a <Binomial Distribution> has a sufficiently small success probability $p \ll 1$ and a sufficiently large number of trials $1 \ll n < \infty$, it can be approximated by a <Poisson Distribution>.

Example.

Let $X \sim \text{BIN}(100, 0.02)$. Computing $P(X = 39)$ directly is awkward (for instance, multiplying 0.02 by itself 39 times gives a number extremely close to 0, and so on).

However, if we approximate $X$ by $\text{POI}(2)$, where $\lambda = np = 100 \cdot 0.02 = 2$, then $P(X = 39) \approx e^{-2} \frac{2^{39}}{39!}$.
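To see how the two formulas compare numerically, both pmfs can be evaluated side by side. This is a sketch using only the standard library; the function names are made up, and the set of $k$ values is an arbitrary choice.

```python
import math


def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ BIN(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ POI(lam). Intended for moderate k only, since
    factorial(k) is converted to a float in the division."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


n, p = 100, 0.02
lam = n * p  # Poisson parameter lambda = np = 2

for k in (0, 1, 2, 3, 5, 39):
    print(k, binom_pmf(k, n, p), poisson_pmf(k, lam))
```

For small $k$, where almost all of the probability mass sits, the two columns agree closely. At $k = 39$ both values are vanishingly small, and close agreement in relative terms should not be expected that far out in the tail.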

It turns out there is a theorem that goes further and states that the <Binomial Distribution> can be approximated by the <Normal Distribution>!! 🤩 In this case, only the condition of a “sufficiently large number of trials” needs to hold!

Theorem. De Moivre-Laplace Central Limit Theorem

Let $X \sim \text{BIN}(n, p)$, then we have

\[\lim_{n \rightarrow \infty} P\left( \frac{X - np}{\sqrt{npq}} \le x \right) = \Phi(x)\]

where $q = 1 - p$ and $\Phi(x)$ is the CDF of the standard normal $N(0, 1)$.

※ Note that this is a special case of the CLT.
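The theorem can be tried out by comparing the exact binomial CDF with $\Phi$ evaluated at the standardized point. The sketch below is one way to do that; the function names and the chosen $n$, $p$, $x$ are hypothetical. The optional $+0.5$ continuity correction is a common practical refinement and is not part of the theorem as stated above.

```python
import math
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ BIN(n, p), summing the pmf."""
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(0, x + 1)
    )


def normal_approx_cdf(x, n, p, continuity=True):
    """Approximate P(X <= x) by Phi((x - np) / sqrt(npq)).

    continuity=True shifts x by +0.5 (continuity correction), an addition
    of this sketch rather than part of the theorem.
    """
    q = 1.0 - p
    shift = 0.5 if continuity else 0.0
    z = (x + shift - n * p) / math.sqrt(n * p * q)
    return NormalDist().cdf(z)


p = 0.5
for n in (20, 100, 500):
    x = n // 2 + 5  # a point a little above the mean np
    print(n, x, binom_cdf(x, n, p), normal_approx_cdf(x, n, p))
```

Trying other values of $p$, especially ones near 0 or 1, shows that a larger $n$ is needed before the two columns agree well.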


For this part, I recommend building intuition through examples. Working through just 2-3 problems is enough to get the hang of it.


In the posts that follow, we will meet a wider variety of remarkable distributions.