This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full list of posts is available at Probability and Statistics 🎲


In the previous post, we looked at the basic discrete distributions such as the <Bernoulli Distribution> and the <Binomial Distribution>. In this post, some more interesting distributions show up!

Geometric Distribution

The <Geometric Distribution> describes a slightly different situation from the distributions presented so far.

Definition. Geometric Distribution

Consider tossing a $p$-coin independently, and suppose we keep tossing until the first Head appears. If the random variable $X$ denotes the number of tosses made up to and including the first Head, then its pmf is as follows, where $q = 1 - p$.

\[g(x; p) = pq^{x-1}, \quad x = 1, 2, 3, \dots\]

This RV $X$ is called a <Geometric RV>, and we write $X \sim \text{Geo}(p)$.

A natural question is why the <Geometric Distribution> is called "Geometric". The reason becomes clear when we check that the probabilities sum to 1.

\[\begin{aligned} \sum_x g(x) &= \sum^{\infty}_{x=1} p \cdot q^{x-1}\\ &= p \; (1 + q + q^2 + \cdots + q^{n-1} + \cdots ) \\ &= \lim_{n \rightarrow \infty} p \; \frac{1-q^n}{1-q} = \frac{p}{1-q} = \frac{p}{p} = 1 \end{aligned}\]

Because a "Geometric Series" appears while showing that the probabilities sum to 1, the distribution got the name "Geometric" Distribution!!
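As a quick numerical sanity check of the sum above, here is a minimal Python sketch. The function names `geometric_pmf` and `partial_sum` and the value `p = 0.3` are my own illustrative choices, not part of the course material. It compares the partial sums of the pmf with the closed form $1 - q^n$ from the geometric series.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for X ~ Geo(p): the first Head appears on toss x (x = 1, 2, ...)."""
    if x < 1:
        return 0.0
    q = 1.0 - p
    return p * q ** (x - 1)


def partial_sum(n: int, p: float) -> float:
    """Sum of the pmf over x = 1..n; the geometric series gives 1 - q**n."""
    return sum(geometric_pmf(x, p) for x in range(1, n + 1))


p = 0.3  # hypothetical Head probability, chosen only for illustration
for n in (1, 5, 20, 100):
    # The two printed values should agree up to floating-point error.
    print(n, partial_sum(n, p), 1 - (1 - p) ** n)
```

As $n$ grows, both columns approach 1, which matches the limit taken in the derivation.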

Property. Memoryless property 🔥

The <Geometric Distribution> has an interesting property called the "Memoryless Property".

For example, someone who started buying lottery tickets a year ago and someone who starts buying today have the same chance of winning! This is because the fact that the first person has been buying for a year, and lost every time, has nothing to do with when they will win for the first time.

Written as a formula:

\[P(X = x+k \mid X > k) = P(X = x)\]

That is, the fact that I have already made $k$ attempts has no effect on the current probability.

For the Geometric Distribution specifically, we have:

\[P(X > k) = q^{k}\]

Using the second equation, the first one is easy to derive 😊
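For reference, the derivation can be written out as a short worked calculation. It uses only the pmf $pq^{x-1}$, the geometric series, and the definition of conditional probability, and it assumes $x \ge 1$, so that the event $X = x + k$ already implies $X > k$.

\[\begin{aligned} P(X > k) &= \sum^{\infty}_{x=k+1} p q^{x-1} = p q^{k} \cdot \frac{1}{1-q} = q^{k} \\ P(X = x + k \mid X > k) &= \frac{P(X = x + k, \; X > k)}{P(X > k)} = \frac{P(X = x+k)}{P(X > k)} \\ &= \frac{p q^{x+k-1}}{q^{k}} = p q^{x-1} = P(X = x) \end{aligned}\]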


Theorem.

Let $X \sim \text{Geo}(p)$, then

  • $\displaystyle E[X] = \frac{1}{p}$
  • $\displaystyle \text{Var}(X) = \frac{1-p}{p^2}$

The proof of the above is simple. Let's derive it now.


Proof.

1. $E[X]$

\[\begin{aligned} E[X] &= \sum k f(k) = p \sum^{\infty}_{k=1} k q^{k-1} \\ &= p \; (1 + 2q + 3q^2 + \cdots ) \\ \end{aligned}\]

(1) Derivation via power series

\[\begin{aligned} S &= 1 + 2q + 3q^2 + \cdots \\ qS &= 0 + q + 2q^2 + \cdots \\ (1-q)S &= 1 + q + q^2 + \cdots \\ (1-q)S &= \frac{1}{1-q} \\ S &= \frac{1}{(1-q)^2} \end{aligned}\]

(2) Derivation via differentiation

\[\begin{aligned} S &= (1 + 2q + 3q^2 + \cdots ) \\ &= (1 + q + q^2 + \cdots) ' \\ &= \left( \frac{1}{1-q} \right)' \\ &= \frac{1}{(1-q)^2} \end{aligned}\]

Therefore, $\displaystyle E[X] = p S = p \frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}$

2. $\text{Var}(X)$

To find $\text{Var}(X)$, we need $E[X^2]$. For computational convenience, we use the trick of computing $E[X(X-1)]$ instead of $E[X^2]$.

\[\begin{aligned} E[X(X-1)] &= p \sum^{\infty}_{k=1} k(k-1)q^{k-1} \\ &= pq \sum^{\infty}_{k=2} k(k-1) q^{k-2} \\ &= pq \left( \frac{1}{(1-q)^2} \right)' \\ &= pq \left( \frac{2}{(1-q)^3}\right) \\ &= pq \frac{2}{p^3} = \frac{2q}{p^2} \end{aligned}\]

Now, putting the results above together,

\[\begin{aligned} \text{Var}(X) &= E[X(X-1)] + E[X] - \left(E[X]\right)^2 \\ &= \frac{2q}{p^2} + \frac{1}{p} - \frac{1}{p^2} \\ &= \frac{1-p}{p^2} \end{aligned}\]
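The closed forms $E[X] = 1/p$ and $\text{Var}(X) = (1-p)/p^2$ can be checked numerically by truncating the defining sums. The sketch below is only an illustration: the helper name `geometric_moments`, the cutoff `n_terms`, and the value `p = 0.25` are assumptions I picked for the example.

```python
def geometric_moments(p: float, n_terms: int = 10_000) -> tuple[float, float]:
    """Approximate (mean, variance) of Geo(p) by truncating the sums at n_terms."""
    q = 1.0 - p
    mean = 0.0
    second_moment = 0.0
    for k in range(1, n_terms + 1):
        prob = p * q ** (k - 1)       # pmf value P(X = k)
        mean += k * prob              # accumulates E[X]
        second_moment += k * k * prob # accumulates E[X^2]
    return mean, second_moment - mean ** 2


p = 0.25  # hypothetical Head probability
mean, var = geometric_moments(p)
# Compare the truncated sums with the closed forms 1/p and (1-p)/p^2.
print(mean, 1 / p)
print(var, (1 - p) / p ** 2)
```

The truncation is harmless here because the terms decay geometrically, so the tail beyond `n_terms` is negligible for any $p$ that is not extremely small.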

Negative Binomial Distribution

This time the setup is similar to the <Geometric Distribution>, but we keep tossing the coin until $k$ Heads have appeared. If the random variable $X$ denotes the number of tosses, then $X$ follows the <Negative Binomial Distribution>.

Definition. Negative Binomial Distribution

Suppose we toss a $p$-coin independently. Let the RV $X$ be the number of tosses made until the $k$-th Head appears. Then its pmf is as follows.

\[b^{*}(x; k,p) =\binom{x-1}{k-1} p^k q^{x-k} \quad \text{for} \quad x = k, k+1, \dots\]

To derive this, note that exactly $(k-1)$ Heads must appear in the first $(x-1)$ tosses. This is the same as getting $(k-1)$ successes in $(x-1)$ trials of a <Binomial Distribution>.

\[\binom{x-1}{k-1} p^{k-1} q^{x-k}\]

The last toss must be a Head, so multiplying the expression above by $p$ gives the <Negative Binomial Distribution>!
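The pmf translates directly into code. Below is a minimal sketch using `math.comb` from the Python standard library. The function name `negative_binomial_pmf` and the parameter values are hypothetical choices for illustration only.

```python
from math import comb


def negative_binomial_pmf(x: int, k: int, p: float) -> float:
    """P(X = x): the k-th Head appears exactly on toss x (x = k, k+1, ...)."""
    if x < k:
        return 0.0
    q = 1.0 - p
    # (k-1) Heads somewhere in the first (x-1) tosses, then a Head on toss x.
    return comb(x - 1, k - 1) * p ** k * q ** (x - k)


k, p = 3, 0.4  # hypothetical parameters
# Truncated sum of the pmf; it should be very close to 1.
print(sum(negative_binomial_pmf(x, k, p) for x in range(k, 500)))
# With k = 1 the formula reduces to the geometric pmf p * q**(x-1).
print(negative_binomial_pmf(5, 1, p), p * (1 - p) ** 4)
```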

A Negative Binomial RV can also be thought of as a sum of $k$ mutually independent Geometric RVs. That is, a NegBIN $Y$ can be written in terms of Geo $X_i$ as

\[Y = X_1 + \cdots + X_k\]
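This sum-of-geometrics view can be illustrated with a small simulation. The sketch below is a hedged illustration, not a proof: the helper names, the seed, the number of trials, and the parameters `k = 3`, `p = 0.5` are all assumptions chosen for the example. It draws independent geometric samples by tossing a simulated coin, adds them up, and compares the empirical frequencies with the negative binomial pmf.

```python
import random
from math import comb


def sample_geometric(p: float, rng: random.Random) -> int:
    """Toss a p-coin until the first Head; return the number of tosses."""
    tosses = 1
    while rng.random() >= p:  # rng.random() < p counts as a Head
        tosses += 1
    return tosses


def sample_sum_of_geometrics(k: int, p: float, rng: random.Random) -> int:
    """Sum of k independent Geo(p) samples."""
    return sum(sample_geometric(p, rng) for _ in range(k))


rng = random.Random(0)           # fixed seed, only for reproducibility
k, p, trials = 3, 0.5, 100_000   # hypothetical parameters

counts = {}
for _ in range(trials):
    y = sample_sum_of_geometrics(k, p, rng)
    counts[y] = counts.get(y, 0) + 1

for x in range(k, k + 5):
    empirical = counts.get(x, 0) / trials
    exact = comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)
    print(x, round(empirical, 4), round(exact, 4))
```

The empirical and exact columns should agree up to sampling noise, which shrinks as `trials` grows.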

So why is it called the "Negative" Binomial? As with the <Geometric Distribution>, the name comes from the process of showing that the probabilities sum to 1.

\[\begin{aligned} \sum f(x) &= \sum^{\infty}_{x=k} \binom{x-1}{k-1} p^k q^{x-k} \\ &= p^k \sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} \\ \end{aligned}\]

Here, substitute $y = x - k$. Then $y$ is a value of $Y$, the number of failures that occur before the $k$-th success. For notational convenience, from now on I will write only the power series part.

\[\sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} = \sum^{\infty}_{y=0} \binom{y + k - 1}{k-1} q^{y}\]

By the symmetry property of combinations, the following holds.

\[\binom{y + k - 1}{k-1} = \binom{y + k - 1}{y}\]

Therefore,

\[\sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} = \sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y}\]

Now let's apply the <Negative Binomial Theorem>.

\[(1 + x)^{-n} = \sum^{\infty}_{k = 0} \binom{-n}{k} x^k = \sum^{\infty}_{k = 0} \binom{n + k - 1}{k} (-1)^k x^k\]

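Substituting $-q$ for $x$ in the theorem above, with exponent $n = k$ and summation index $y$ in place of the theorem's $k$, the factors $(-1)^y$ and $(-q)^y$ combine into $q^y$: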

\[\sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y} = (1 - q)^{-k}\]

Putting everything together,

\[\begin{aligned} \sum f(x) &= \sum^{\infty}_{x=k} \binom{x-1}{k-1} p^k q^{x-k} \\ &= p^k \sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} \\ &= p^k \sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y} \\ &= p^k \cdot (1 - q)^{-k} \\ &= p^k \cdot p^{-k} \\ &= 1 \end{aligned}\]

$\blacksquare$

In other words, the distribution is called the Negative Binomial because the negative binomial series appears in this derivation.

Theorem.

If $X \sim \text{Neg BIN}(k, p)$, then

  • $\displaystyle E[X] = \frac{1}{p}k$
  • $\displaystyle \text{Var}(X) = \left(\frac{1-p}{p^2}\right) k$

Looking closely at the result above, we can see the connection to the Geometric Distribution. For Geo, the mean was $E[X] = \dfrac{1}{p}$. If we interpret a NegBIN as a sum of $k$ Geo RVs, then we are adding up the Geo mean $\dfrac{1}{p}$ a total of $k$ times, which gives $\dfrac{1}{p}k$. The variance can be approached from the same point of view. 😎
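Spelled out as a worked calculation, write $X = X_1 + \cdots + X_k$ with the $X_i$ independent $\text{Geo}(p)$ variables. The mean only needs linearity of expectation, while the variance step relies on the independence of the $X_i$.

\[\begin{aligned} E[X] &= \sum^{k}_{i=1} E[X_i] = k \cdot \frac{1}{p} \\ \text{Var}(X) &= \sum^{k}_{i=1} \text{Var}(X_i) = k \cdot \frac{1-p}{p^2} \end{aligned}\]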

Closing Remarks

The next post features the <Poisson Distribution>, the boss of the discrete probability distributions!! The Poisson is quite important, so let's look at it closely!

👉 Poisson Distribution