This post summarizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full series is available at Probability and Statistics 🎲

Law of Total Probability

Definition. Partition

The events $\{ B_1, \dots, B_n \}$ form a partition of event space $S$ if

  1. $B_i \cap B_j = \emptyset$ for any $i \ne j$
  2. $\cup^n_{i=1} B_i = S$

Theorem. Law of Total Probability

If the events $B_1, \dots, B_n$ form a partition of $S$ such that $P(B_i) > 0$ for every $i$,

then for any event $A$

\[P(A) = \sum^{n}_{i=1} P(A \cap B_i) = \sum^{n}_{i=1} P(A \mid B_i) P(B_i)\]

The <Law of Total Probability> is also called the <Rule of Elimination>.
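Since each term satisfies $P(A \cap B_i) = P(A \mid B_i) P(B_i)$, the law can be evaluated numerically once the partition probabilities and the conditional probabilities are known. The following is a minimal sketch; the function name `total_probability` and the three-event numbers are hypothetical, chosen only for illustration.

```python
def total_probability(priors, likelihoods):
    """Return P(A) = sum_i P(A | B_i) * P(B_i) for a partition B_1..B_n.

    priors[i]      is P(B_i); the values must sum to 1 (partition of S).
    likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("P(B_i) must sum to 1 for a partition")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, likelihoods))


# Hypothetical partition with three events (illustrative numbers only).
priors = [0.5, 0.3, 0.2]       # P(B_1), P(B_2), P(B_3)
likelihoods = [0.1, 0.2, 0.5]  # P(A | B_1), P(A | B_2), P(A | B_3)

p_a = total_probability(priors, likelihoods)
print(f"P(A) = {p_a:.2f}")
```

Here the sum is $0.05 + 0.06 + 0.10$, so $P(A)$ comes out to about $0.21$.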


Bayes' Rule

Theorem. Bayes' Rule

If the events $B_1, B_2, \dots, B_n$ form a partition of the event space $S$ with $P(B_i) > 0$ for every $i$,

then for any event $A$ with $P(A) > 0$ and any $k \in \{1, \dots, n\}$

\[P(B_k \mid A) = \frac{P(B_k \cap A)}{P(A)} = \frac{P(A \mid B_k)P(B_k)}{\sum^{n}_{i=1} P(A \mid B_i)P(B_i)}\]

proof.

The proof is short.

[Step 1] By the definition of conditional probability, the following identity holds.

\[P(B_k \cap A) = P(B_k \mid A) P(A) = P(A \mid B_k) P(B_k)\]

Dividing both sides by $P(A)$ gives the following.

\[P(B_k \mid A) = \frac{P(B_k \cap A)}{P(A)}\]

[Step 2] By the Law of Total Probability, the $P(A)$ in the denominator can be rewritten as follows.

\[\frac{P(B_k \cap A)}{P(A)} = \frac{P(B_k \cap A)}{\sum^{n}_{i=1} P(A \cap B_i)}\]

[Step 3] Applying the definition of conditional probability once more, $P(A \cap B_i) = P(A \mid B_i) P(B_i)$, to the numerator and to each term of the denominator gives the final result.

\[\frac{P(B_k \cap A)}{\sum^{n}_{i=1} P(A \cap B_i)} = \frac{P(A \mid B_k)P(B_k)}{\sum^{n}_{i=1} P(A \mid B_i)P(B_i)}\]

Applications of Bayes Rule

<Bayes Rule> itself is not difficult. What matters is working through examples until you know exactly when and how to use it. 👍

Screening Test

A test performed to distinguish healthy people from people with a specific disease is called a <Screening Test>. If the screening test shows an abnormality, a more precise diagnostic test is then used to determine whether the disease is present.

Bluehorn, normally healthy, woke up with a sore throat. Wondering whether it might be COVID-19, he bought a self-test kit that morning and tried it, and oh no! It came back positive (+)!

Suppose the probability $P(C)$ of having COVID-19 in South Korea in 2022 is $0.4$. Suppose also that the accuracy of the self-test kit is as follows:

  1. The probability that a person with COVID-19 tests positive, $P(+ \mid C)$, is $0.95$
  2. The probability that a person without COVID-19 tests positive, $P(+ \mid \sim C)$, is $0.01$

Bluehorn doubts the kit, thinking "maybe I don't actually have COVID-19 and the positive result is wrong." For his sake, let's compute $P(C \mid +)$, the probability that he has COVID-19 given a positive self-test.

By Bayes' Rule,

\[\begin{aligned} P(C \mid +) &= \frac{P(+ \mid C) P(C)}{P(+)} = \frac{P(+ \mid C) P(C)}{P(+ \mid C)P(C) + P(+ \mid \sim C)P(\sim C)} \\ &= \frac{0.95 \cdot 0.4}{0.95 \cdot 0.4 + 0.01 \cdot 0.6} = \frac{0.38}{0.386} \\ &\approx 0.98 \end{aligned}\]

Ah... unfortunately, given a positive self-test, the probability that Bluehorn really has COVID-19 is extremely high!!
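The calculation above can be checked with a few lines of Python. This is a minimal sketch: the helper name `bayes_posterior` is hypothetical, and the partition is taken to be $\{C, \sim C\}$ with the numbers assumed in the example.

```python
def bayes_posterior(priors, likelihoods, k):
    """Return P(B_k | A) by Bayes' Rule for a partition B_0..B_{n-1}.

    priors[i]      is P(B_i).
    likelihoods[i] is P(A | B_i).
    Note: k is a 0-based index here, unlike the 1-based index in the theorem.
    """
    # Denominator: P(A) from the Law of Total Probability.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    if evidence == 0:
        raise ValueError("P(A) must be positive")
    return likelihoods[k] * priors[k] / evidence


# Partition {C, ~C} with the values assumed in the screening example.
priors = [0.4, 0.6]         # P(C), P(~C)
likelihoods = [0.95, 0.01]  # P(+ | C), P(+ | ~C)

p_c_given_pos = bayes_posterior(priors, likelihoods, k=0)
print(f"P(C | +) = {p_c_given_pos:.4f}")  # prints "P(C | +) = 0.9845"
```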


Meaning of Bayes Rule

Bayes' theorem is a tool for identifying the cause of an event. When an event occurs, it has a cause. The candidate causes are assumed to be mutually exclusive, meaning that no two of them occur at the same time.


<Bayesians> are those who interpret Bayes' theorem as a change in belief driven by observation (evidence).

Let's revisit the earlier example of Bluehorn and COVID-19. Before the kit came back positive (+), his belief that his cold was actually COVID-19 was only $P(C) = 0.4$. This is called the prior probability. After the positive result, his belief that he has COVID-19, $P(C \mid +)$, shot up to $0.98$! This is called the posterior probability.
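To see how much the posterior depends on the prior, the same update can be repeated for different starting beliefs. This is only a sketch: the function name `update_belief` is hypothetical, the kit accuracies are the ones assumed in the example, and the priors $0.01$ and $0.001$ are illustrative values that do not come from the example.

```python
def update_belief(prior, p_pos_given_c=0.95, p_pos_given_not_c=0.01):
    """Return the posterior P(C | +) for a given prior P(C)."""
    # P(+) by the Law of Total Probability over the partition {C, ~C}.
    evidence = p_pos_given_c * prior + p_pos_given_not_c * (1.0 - prior)
    return p_pos_given_c * prior / evidence


# 0.4 is the prior from the example; 0.01 and 0.001 are hypothetical priors.
priors = [0.4, 0.01, 0.001]
posteriors = {prior: update_belief(prior) for prior in priors}

for prior, post in posteriors.items():
    print(f"prior={prior:<6} posterior={post:.3f}")
```

With the same kit, a smaller prior gives a noticeably smaller posterior, so the evidence shifts the belief but does not replace it.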

<Bayesians> are really nothing special. Anyone who updates their beliefs based on observed facts is a <Bayesian>! Hooray for Bayesians!


Closing Remarks

The <Bayes' Rule> covered here is the first step into the field of statistics called <Bayesian Statistics>. If the idea of "updating a belief by incorporating data" interests you, go study Bayesian statistics!

Unfortunately, in the "Probability and Statistics (MATH230)" course I took at school, <Bayes' Rule> is the only place where Bayesian ideas appear. For this course, it is enough to learn this much and then forget about it. If anything, I feel I studied more Bayesian theory in machine learning and artificial intelligence classes. Let's become hybrid mathematicians!!


There is a fun problem that uses <Bayes' Rule>: the <Monty Hall Problem>. Saying more would be a spoiler, so if you're curious, give it a try!

👉 Monty Hall Problem