This post summarizes what I learned and studied in the “Probability and Statistics (MATH230)” course. The full list of posts is available at Probability and Statistics 🎲


The previous post covered the basic building blocks of discrete distributions, such as the <Bernoulli Distribution> and the <Binomial Distribution>. This post introduces some more interesting distributions!


HyperGeometric Distribution

The <HyperGeometric Distribution> arises in a setting very similar to that of the <Binomial Distribution> covered earlier. The difference lies in how sampling is done: in the <Binomial Distribution> the trials are independent and sampling is with replacement, whereas in the <HyperGeometric Distribution> the trials are dependent and sampling is without (w/o) replacement!

One example of sampling w/o replacement is <acceptance sampling>. In this kind of quality inspection, testing an item may destroy it or leave it unusable, so the item cannot be put back. That is why sampling w/o replacement needs its own treatment.

Definition. HyperGeometric Distribution

Suppose there are $N$ items, $K$ of which are labeled success and $N-K$ of which are labeled failure, and we draw $n$ items at random w/o replacement. This is called a <HyperGeometric Experiment>. Let the RV $X$ be the number of successes drawn in the <HyperGeometric Experiment>. This RV $X$ is called a <HyperGeometric RV>.

The pmf of a <HyperGeometric RV> $X$ is defined as follows.

\[h(x; N, K, n) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}} \quad \text{where} \quad 0 \le x \le K \quad \text{and} \quad 0 \le n-x \le N-K\]

This pmf is called the <HyperGeometric Distribution>, and we write $X \sim \text{HyperGeo}(N, K, n)$.

The conditions on $x$ in the <HyperGeometric Distribution> can be simplified as follows.

\[\begin{aligned} \quad 0 \le x \le K \quad &\text{and} \quad 0 \le n-x \le N-K \\ \quad 0 \le x \le K \quad &\text{and} \quad -n \le -x \le N-K-n \\ \quad 0 \le x \le K \quad &\text{and} \quad K+n - N \le x \le n \\ \end{aligned}\] \[\therefore \max \{ 0, n-(N-K) \} \le x \le \min \{ K, n \}\]
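The pmf and its support can be sketched in a few lines of Python. This is only an illustration: the function name `hypergeom_pmf` and the numbers $N = 20$, $K = 7$, $n = 15$ are assumptions chosen so that the lower bound $n-(N-K)$ is actually positive.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, K: int, n: int) -> float:
    """h(x; N, K, n): probability of x successes in n draws w/o replacement."""
    lo, hi = max(0, n - (N - K)), min(K, n)
    if not lo <= x <= hi:
        return 0.0  # x lies outside the support
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Illustrative numbers (not from the course): 20 items, 7 successes, 15 draws
N, K, n = 20, 7, 15
support = range(max(0, n - (N - K)), min(K, n) + 1)
total = sum(hypergeom_pmf(x, N, K, n) for x in support)

print(list(support))  # the x values with nonzero probability
print(total)          # should be 1 up to floating-point rounding
```

With only $N-K = 13$ failures available, drawing 15 items forces at least 2 successes, which is exactly what the $\max$ in the lower bound expresses.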

Theorem.

Let $X \sim \text{HyperGeo}(N, K, n)$, then

  • $\displaystyle E[X] = n \frac{K}{N}$
  • $\displaystyle \text{Var}(X) = n \frac{K}{N}\left( 1 - \frac{K}{N} \right) \cdot \frac{N-n}{N-1}$

We will not prove this theorem on the mean and variance of the <HyperGeometric Distribution> right now. Looking at the formulas intuitively, though, they turn out to be very similar to those of the <Binomial Distribution>.

If we read $\dfrac{K}{N}$ in HyperGeo as the $p$ of Binomial, then the HyperGeo mean $n\dfrac{K}{N}$ has the same form as the Binomial mean $np$. The HyperGeo variance $n \dfrac{K}{N}\left( 1 - \dfrac{K}{N} \right) \cdot \dfrac{N-n}{N-1}$ also shows the Binomial form $npq$, but with an extra factor $\dfrac{N-n}{N-1}$ attached at the end.
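Since the proof is deferred, a quick exact check may help build trust in the formulas. The sketch below uses `fractions.Fraction` so there is no rounding; the parameters $N = 50$, $K = 20$, $n = 10$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, K, n):
    # Exact rational value of h(x; N, K, n) for x inside the support
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

N, K, n = 50, 20, 10  # illustrative values
xs = range(max(0, n - (N - K)), min(K, n) + 1)

# Mean and variance computed directly from the pmf
mean = sum(x * hypergeom_pmf(x, N, K, n) for x in xs)
var = sum(x * x * hypergeom_pmf(x, N, K, n) for x in xs) - mean ** 2

p = Fraction(K, N)
print(mean == n * p)                                       # compares with n K/N
print(var == n * p * (1 - p) * Fraction(N - n, N - 1))     # compares with the variance formula
```

The factor $\frac{N-n}{N-1}$ is less than 1 whenever $n > 1$, so sampling w/o replacement gives a smaller variance than the matching Binomial.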

Theorem.

In certain cases, HyperGeo can be treated as Binomial.

If $N \gg n$ and $K \gg n$, then

\[h(x; N, K, n) \approx \text{BIN}(x; n, \frac{K}{N})\]

As with the theorem above, the proof will be given separately later.
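Until the proof arrives, the approximation can at least be observed numerically. The sketch below keeps $n = 5$ and the ratio $K/N = 0.4$ fixed while $N$ grows, and records the largest pointwise gap between the two pmfs; the helper names and population sizes are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n = 5
gaps = []
for N, K in [(20, 8), (200, 80), (20000, 8000)]:  # K/N = 0.4 in every case
    gap = max(
        abs(hypergeom_pmf(x, N, K, n) - binom_pmf(x, n, K / N))
        for x in range(n + 1)
    )
    gaps.append(gap)
    print(N, gap)  # the gap should shrink as N grows
```

Intuitively, when the population is huge compared with the sample, removing a few items barely changes the success ratio, so the draws behave almost independently.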


Multivariate HyperGeometric Distribution

The <Multivariate HyperGeometric Distribution> extends HyperGeo from 2 possible outcomes to several. Its pmf can be written as follows.

Definition. Multivariate HyperGeometric Distribution

If $N$ items can be partitioned into the $k$ cells $A_1, A_2, \dots, A_k$ with $a_1, a_2, \dots, a_k$ elements, respectively, then the probability distribution of the RVs $X_1, X_2, \dots, X_k$, representing the number of elements selected from $A_1, A_2, \dots, A_k$ in a random sample of size $n$, is

\[f(x_1, \dots, x_k\; ; \; a_1, \dots, a_k, N, n) = \frac{\binom{a_1}{x_1} \cdots \binom{a_k}{x_k}}{\binom{N}{n}}\]

with $\displaystyle \sum^k_{i=1} x_i = n$ and $\displaystyle \sum^k_{i=1} a_i = N$.
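As a small worked example, the sketch below evaluates this pmf for a made-up population of 10 items split into cells of sizes 3, 4, and 3, from which 5 items are sampled. The function name `multi_hypergeom_pmf` and the numbers are hypothetical.

```python
from math import comb, prod

def multi_hypergeom_pmf(xs, sizes):
    """pmf of drawing xs[i] items from a cell of size sizes[i], w/o replacement."""
    N, n = sum(sizes), sum(xs)  # population size and sample size
    return prod(comb(a, x) for a, x in zip(sizes, xs)) / comb(N, n)

# Cells of sizes (3, 4, 3); probability of drawing (1, 2, 2) in a sample of 5
prob = multi_hypergeom_pmf((1, 2, 2), (3, 4, 3))
print(prob)  # C(3,1) C(4,2) C(3,2) / C(10,5) = 54 / 252
```

With $k = 2$ cells this reduces to the ordinary HyperGeo pmf with $a_1 = K$ and $a_2 = N-K$.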


Geometric Distribution

The setting of the <Geometric Distribution> is a little different from that of the distributions presented so far.

Definition. Geometric Distribution

Consider tossing a $p$-coin independently, and keep tossing until the first Head appears. If the Random Variable $X$ is the number of tosses made up to and including the first Head, its pmf is as follows, where $q = 1 - p$.

\[g(x; p) = pq^{x-1}, \quad x = 1, 2, 3, \dots\]

This RV $X$ is called a <Geometric RV>, and we write $X \sim \text{Geo}(p)$.

A natural question is why the <Geometric Distribution> is called “Geometric”. The reason becomes clear when we check that the Geo probabilities sum to 1.

\[\begin{aligned} \sum_x g(x) &= \sum^{\infty}_{x=1} p \cdot q^{x-1}\\ &= p \; (1 + q + q^2 + \cdots + q^{n-1} + \cdots ) \\ &= \lim_{n \rightarrow \infty} p \; \frac{1-q^n}{1-q} = \frac{p}{1-q} = \frac{p}{p} = 1 \end{aligned}\]

Since the “Geometric Series” shows up in proving that the probabilities sum to 1, the distribution got the name “Geometric” Distribution!!
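The convergence of the partial sums can also be watched numerically. In this sketch $p = 0.3$ is an arbitrary choice, and each partial sum of the first $n$ pmf values is compared with the closed form $1 - q^n$ from the geometric series.

```python
def geom_pmf(x: int, p: float) -> float:
    """g(x; p) = p q^(x-1): first Head on toss x."""
    return p * (1 - p) ** (x - 1)

p = 0.3  # illustrative value
partials = []
for n in (1, 5, 20, 100):
    partial = sum(geom_pmf(x, p) for x in range(1, n + 1))
    partials.append((n, partial))
    print(n, partial, 1 - (1 - p) ** n)  # partial sum vs. closed form 1 - q^n
```

The two columns agree up to rounding, and both approach 1 as $n$ grows.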

Property. Memoryless property 🔥

The <Geometric Distribution> has an interesting property called the <Memoryless Property>. In formulas, it reads as follows.

\[P(X = x+k \mid X > k) = P(X = x)\]

or

\[P(X > k) = q^{k}\]

Using the second equation, the first one can be derived easily 😊
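Spelled out as a sketch, with $q = 1-p$ and $x \ge 1$: the tail probability follows from the geometric series,

\[P(X > k) = \sum^{\infty}_{x=k+1} p q^{x-1} = p \cdot \frac{q^k}{1-q} = q^k\]

and since the event $\{X = x+k\}$ is contained in $\{X > k\}$,

\[P(X = x+k \mid X > k) = \frac{P(X = x+k)}{P(X > k)} = \frac{p q^{x+k-1}}{q^k} = p q^{x-1} = P(X = x)\]

In words: having already seen $k$ Tails tells us nothing about how much longer we have to wait.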


Theorem.

Let $X \sim \text{Geo}(p)$, then

  • $\displaystyle E[X] = \frac{1}{p}$
  • $\displaystyle \text{Var}(X) = \frac{1-p}{p^2}$

The proof of these formulas is simple. Let's derive them now.


Proof.

1. $E[X]$

\[\begin{aligned} E[X] &= \sum k f(k) = p \sum^{\infty}_{k=1} k q^{k-1} \\ &= p \; (1 + 2q + 3q^2 + \cdots ) \\ \end{aligned}\]

(1) Derivation via power series

\[\begin{aligned} S &= (1 + 2q + 3q^2 + \cdots ) \\ qS &= (0 + q + 2q^2 + \cdots) \\ (1-q)S &= 1 + q + q^2 + \cdots \\ (1-q)S &= \frac{1}{1-q} \\ S &= \frac{1}{(1-q)^2} \\ \end{aligned}\]

(2) Derivation via differentiation

\[\begin{aligned} S &= (1 + 2q + 3q^2 + \cdots ) \\ &= (1 + q + q^2 + \cdots) ' \\ &= \left( \frac{1}{1-q} \right)' \\ &= \frac{1}{(1-q)^2} \end{aligned}\]

Therefore, $\displaystyle E[X] = p S = p \frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}$


2. $\text{Var}(X)$

To find $\text{Var}(X)$ we need $E[X^2]$. For ease of computation, we use the trick of computing $E[X(X-1)]$ instead of $E[X^2]$.

\[\begin{aligned} E[X(X-1)] &= p \sum^{\infty}_{k=1} k(k-1)q^{k-1} \\ &= pq \sum^{\infty}_{k=2} k(k-1) q^{k-2} \\ &= pq \left( \frac{1}{(1-q)^2} \right)' \\ &= pq \left( \frac{2}{(1-q)^3}\right) \\ &= pq \frac{2}{p^3} = \frac{2q}{p^2} \end{aligned}\]

Now, putting the results above together,

\[\begin{aligned} \text{Var}(X) &= E[X(X-1)] + E[X] - \left(E[X]\right)^2 \\ &= \frac{2q}{p^2} + \frac{1}{p} - \frac{1}{p^2} \\ &= \frac{1-p}{p^2} \end{aligned}\]
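As a sanity check on the algebra, the sketch below approximates both moments with long truncated sums. The value $p = 0.25$ and the cutoff of 2000 terms are illustrative assumptions; the neglected tail is far below floating-point precision.

```python
p = 0.25          # illustrative value
q = 1 - p
terms = range(1, 2000)  # truncation point; q**2000 is negligible

# First and second moments computed directly from the pmf p q^(k-1)
mean = sum(k * p * q ** (k - 1) for k in terms)
second_moment = sum(k * k * p * q ** (k - 1) for k in terms)
var = second_moment - mean ** 2

print(mean, 1 / p)             # numerical mean vs. 1/p
print(var, (1 - p) / p ** 2)   # numerical variance vs. (1-p)/p^2
```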

Negative Binomial Distribution

This time the setup is similar to the <Geometric Distribution>, but we toss the coin until $k$ Heads have appeared. If the Random Variable $X$ is the number of tosses, it follows the <Negative Binomial Distribution>.

Definition. Negative Binomial Distribution

Suppose we toss a $p$-coin independently, and let the RV $X$ be the number of tosses needed until $k$ Heads have appeared. Then its pmf is as follows.

\[b^{*}(x; k,p) =\binom{x-1}{k-1} p^k q^{x-k} \quad \text{for} \quad x = k, k+1, \dots\]

To derive this, note that exactly $(k-1)$ Heads must appear in the first $(x-1)$ tosses, which is the same as $(k-1)$ successes in $(x-1)$ trials of a <Binomial Distribution>.

\[\binom{x-1}{k-1} p^{k-1} q^{x-k}\]

The last toss must be a Head, so multiplying the expression above by $p$ gives the <Negative Binomial Distribution>!
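The two-step derivation translates directly into code. This is a minimal sketch; `negbin_pmf` is a hypothetical helper name, and $k = 3$, $p = 0.5$ are chosen only because the arithmetic is easy to follow by hand.

```python
from math import comb

def negbin_pmf(x: int, k: int, p: float) -> float:
    """b*(x; k, p): the k-th Head arrives exactly on toss x."""
    if x < k:
        return 0.0  # k Heads need at least k tosses
    # (k-1) Heads somewhere in the first (x-1) tosses ...
    first_part = comb(x - 1, k - 1) * p ** (k - 1) * (1 - p) ** (x - k)
    # ... followed by a Head on the final toss
    return first_part * p

k, p = 3, 0.5  # illustrative values
print(negbin_pmf(3, k, p))  # three Heads in a row: 0.5^3
print(negbin_pmf(5, k, p))  # C(4,2) * 0.5^5
total = sum(negbin_pmf(x, k, p) for x in range(k, 200))
print(total)                # should be very close to 1
```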

A Negative Binomial RV can also be viewed as the sum of $k$ mutually independent Geometric RVs. That is, for independent RVs $X_i \sim \text{Geo}(p)$, a NegBIN RV $Y$ can be written as

\[Y = X_1 + \cdots + X_k\]
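This sum-of-Geometrics view can be checked with a small simulation. The sketch below is only illustrative: the seed, the parameters $k = 3$, $p = 0.4$, the number of trials, and the helper `sample_geometric` are all assumptions. It compares the empirical frequency of needing exactly 5 tosses with the Negative Binomial pmf.

```python
import random
from math import comb

def sample_geometric(p: float, rng: random.Random) -> int:
    """Toss a p-coin until the first Head; return the number of tosses."""
    tosses = 1
    while rng.random() >= p:  # this toss was a Tail, so toss again
        tosses += 1
    return tosses

rng = random.Random(0)             # fixed seed for reproducibility
k, p, trials = 3, 0.4, 100_000     # illustrative values

# Each sample is a sum of k independent Geometric draws
samples = [sum(sample_geometric(p, rng) for _ in range(k)) for _ in range(trials)]

x = 5
empirical = samples.count(x) / trials
exact = comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)
print(empirical, exact)  # the two values should be close
```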

But why the name “Negative” Binomial? As with the <Geometric Distribution>, it comes from the process of showing that the probabilities sum to 1.

\[\begin{aligned} \sum f(x) &= \sum^{\infty}_{x=k} \binom{x-1}{k-1} p^k q^{x-k} \\ &= p^k \sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} \\ \end{aligned}\]

Substitute $y = x - k$ here. Then $y$ is the value of $Y$, the number of failures that occur before the $k$-th success. For notational convenience, from now on we write only the power series part.

\[\sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} = \sum^{\infty}_{y=0} \binom{y + k - 1}{k-1} q^{y}\]

By a property of combinations, the following holds.

\[\binom{y + k - 1}{k-1} = \binom{y + k - 1}{y}\]

Therefore,

\[\sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} = \sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y}\]

Now let's apply the <Negative Binomial Theorem>.

\[(1 + x)^{-n} = \sum^{\infty}_{k = 0} \binom{-n}{k} x^k = \sum^{\infty}_{k = 0} \binom{n + k - 1}{k} (-1)^k x^k\]

Substituting $-q$ for $x$ in the theorem above (with $n = k$ and the summation index renamed to $y$), we get

\[\sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y} = (1 - q)^{-k}\]
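Written out with the symbols used here, the two sign factors cancel term by term:

\[(1 - q)^{-k} = \sum^{\infty}_{y=0} \binom{k + y - 1}{y} (-1)^y (-q)^y = \sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y}\]

As a quick check, $k = 1$ gives $\binom{y}{y} = 1$ for every $y$, so the identity reduces to the familiar geometric series $\sum^{\infty}_{y=0} q^y = \dfrac{1}{1-q}$.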

Putting everything together,

\[\begin{aligned} \sum f(x) &= \sum^{\infty}_{x=k} \binom{x-1}{k-1} p^k q^{x-k} \\ &= p^k \sum^{\infty}_{x=k} \binom{x-1}{k-1} q^{x-k} \\ &= p^k \sum^{\infty}_{y=0} \binom{k + y - 1}{y} q^{y} \\ &= p^k \cdot (1 - q)^{-k} \\ &= p^k \cdot p^{-k} \\ &= 1 \end{aligned}\]

$\blacksquare$

In short, the distribution is called Negative Binomial because the <Negative Binomial Theorem> appears in this derivation.

Theorem.

If $X \sim \text{Neg BIN}(k, p)$, then

  • $\displaystyle E[X] = \frac{1}{p}k$
  • $\displaystyle \text{Var}(X) = \left(\frac{1-p}{p^2}\right) k$

Looking closely at these results, we can see the connection to the Geometric Distribution. The mean of Geo was $E[X] = \dfrac{1}{p}$; if we read NegBIN as the sum of $k$ Geo RVs, then the Geo mean $\dfrac{1}{p}$ is added up $k$ times, giving $\dfrac{1}{p}k$. The variance can be approached from the same viewpoint. 😎
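As a sketch of that argument, write the Negative Binomial RV as a sum $Y = X_1 + \cdots + X_k$ of independent $\text{Geo}(p)$ RVs. Linearity of expectation gives the mean, and independence lets the variances add:

\[\begin{aligned} E[Y] &= \sum^{k}_{i=1} E[X_i] = k \cdot \frac{1}{p} \\ \text{Var}(Y) &= \sum^{k}_{i=1} \text{Var}(X_i) = k \cdot \frac{1-p}{p^2} \end{aligned}\]

Note that the first line holds even without independence, while the second line genuinely needs it.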


The next post introduces the <Poisson Distribution>, the boss of discrete probability distributions!! Poisson is quite important, so let's look at it closely!

👉 Poisson Distribution