This post collects what I learned and studied in the course β€œProbability and Statistics (MATH230).” You can find the whole series at Probability and Statistics 🎲


Mean


Definition.

The <expectation> or <mean> of a RV $X$ is defined as

\[\mu := E[X] := \begin{cases} \displaystyle \sum_x x f(x) && X \; \text{is discrete with pmf} \; f(x) \\ \displaystyle \int^{\infty}_{-\infty} x f(x) \, dx && X \; \text{is continuous with pdf} \; f(x) \end{cases}\]
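As a concrete discrete example (a fair six-sided die, chosen purely for illustration), the mean can be computed directly from the pmf:

```python
from fractions import Fraction

# Mean of a discrete RV: a fair six-sided die with pmf f(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# mu = sum_x x * f(x)
mu = sum(x * p for x, p in pmf.items())
print(mu)  # 7/2
```

Using `Fraction` keeps the arithmetic exact, so the result is the familiar 7/2 rather than 3.4999….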

If we apply a function $g(x)$ to the RV $X$, the <Expectation> of $g(X)$ can be computed as follows.

Theorem.

Let $X$ be a random variable with probability distribution $f(x)$. The expected value of the random variable $g(X)$ is

\[\mu_{g(X)} = E\left[g(X)\right] = \sum_x g(x) f(x) \quad \text{if } X \text{ is a discrete RV}\]

and

\[\mu_{g(X)} = E\left[g(X)\right] = \int^{\infty}_{-\infty} g(x) f(x) \, dx \quad \text{if } X \text{ is a continuous RV}\]

(Applying $g$ does not change the domain of $x$, so weighting $g(x)$ by the same $f(x)$ as above is justified.)

ps) In lecture, the professor mentioned that the proof for a discrete RV is easy, but the proof for a continuous RV is rather tricky.
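For instance, $E[g(X)]$ with $g(x) = x^2$ on a fair die (the same toy example as above) follows directly from the theorem:

```python
from fractions import Fraction

# E[g(X)] with g(x) = x^2: weight g(x) by the same pmf f(x), here a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
e_g = sum(x**2 * p for x, p in pmf.items())
print(e_g)  # 91/6
```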


μ΄λ²ˆμ—λŠ” joint distributions에 λŒ€ν•œ <Expectation>을 μ‚΄νŽ΄λ³΄μž.

Definition.

Let $X$ and $Y$ be RVs with joint probability distribution $f(x, y)$. The expected value of the RV $g(X, Y)$ is

\[\mu_{g(X, Y)} = E\left[g(X, Y)\right] = \sum_x \sum_y g(x, y) f(x, y) \quad \text{if } X \text{ and } Y \text{ are discrete RVs}\]

\[\mu_{g(X, Y)} = E\left[g(X, Y)\right] = \int^{\infty}_{-\infty} \int^{\infty}_{-\infty} g(x, y) f(x, y) \; dx \, dy \quad \text{if } X \text{ and } Y \text{ are continuous RVs}\]


We can also consider the <Expectation> of a Conditional Distribution.

Definition.

\[E\left[ X \mid Y = y \right] = \begin{cases} \displaystyle \sum_x x f(x \mid y) && X \; \text{is discrete with joint pmf} \; f(x, y) \\ \displaystyle \int^{\infty}_{-\infty} x f(x \mid y) \; dx && X \; \text{is continuous with joint pdf} \; f(x, y) \end{cases}\]

Linearity of Expectation

The <Expectation> has a very nice property called <Linearity>.

Theorem.

Let $a, b \in \mathbb{R}$, then $E\left[aX + b\right] = aE[X] + b$.

μœ„μ˜ 정리가 λ§ν•΄μ£ΌλŠ” 것은 <Expectation>이 Linear Operatorμž„μ„ 말해쀀닀!! 🀩

Stating it a bit more generally,

Theorem.

\[E\left[g(X) + h(X)\right] = E\left[g(X)\right] + E\left[h(X)\right]\]


Theorem.

\[E\left[g(X, Y) + h(X, Y)\right] = E\left[g(X, Y)\right] + E\left[h(X, Y)\right]\]

Expectation with Independence

If two RVs $X$ and $Y$ are <independent>, the <Expectation> of their product is easy to compute.

Theorem.

If $X$ and $Y$ are independent, then

\[E[XY] = E[X]E[Y]\]
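A sanity check with two independent fair dice, where the joint pmf factorizes as $f(x)f(y)$:

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: joint pmf f(x, y) = f(x) f(y), so E[XY] = E[X] E[Y].
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
e_xy = sum(x * y * pmf[x] * pmf[y] for x, y in product(pmf, pmf))
e_x = sum(x * p for x, p in pmf.items())
print(e_xy == e_x * e_x)  # True
```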

Variance and Covariance

Even if two RVs $X$ and $Y$ have the same mean, $E[X] = \mu = E[Y]$, how far the individual values of each RV spread out from the mean $\mu$ can differ. The <variance Variance> measures this spread around the mean, and is defined as follows.


Definition.

The <variance> of a RV $X$ is defined as

\[\text{Var}(X) = E[(X-\mu)^2]\]

and $\sigma = \sqrt{\text{Var}(X)}$ is called the <standard deviation> of $X$.

μ•„λž˜μ˜ 곡식을 μ‚¬μš©ν•˜λ©΄, $\text{Var}(X)$λ₯Ό 쒀더 μ‰½κ²Œ ꡬ할 수 μžˆλ‹€.


Theorem.

\[\begin{aligned} \text{Var}(X) &= E[(X-\mu)^2] = E\left[ X^2 - 2 \mu X + \mu^2 \right] \\ &= E[X^2] - 2 \mu E[X] + \mu^2 \\ &= E[X^2] - 2 \mu \cdot \mu + \mu^2 \\ &= E[X^2] - \mu^2 = E[X^2] - \left(E[X]\right)^2 \end{aligned}\]

β€œVariance equals the mean of the square minus the square of the mean,” the formula we learned back in high school!
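A quick check that the two forms agree, again on a fair die:

```python
from fractions import Fraction

# Variance of a fair die, two ways: E[(X - mu)^2] vs E[X^2] - mu^2.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())     # definition
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2    # shortcut formula
print(var_def, var_short)  # 35/12 35/12
```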


We saw that the <Expectation> has the nice property of Linearity. Let us see how this plays out for the <variance Variance>.

Theorem.

For any $a, b \in \mathbb{R}$,

\[\text{Var}(aX + b) = a^2 \text{Var}(X)\]

Covariance

The <covariance Covariance> measures what kind of <relation> exists between two RVs. The <covariance> is defined as follows.

Definition.

The <covariance> of $X$ and $Y$ is defined as

\[\begin{aligned} \sigma_{XY} := \text{Cov}(X, Y) &= E \left[ (X - \mu_X) (Y - \mu_Y) \right] \\ &= E(XY) - E(X)E(Y) \end{aligned}\]
  • $\text{Cov}(X, X) = \text{Var}(X)$
  • $\text{Cov}(aX + b, Y) = a \cdot \text{Cov}(X, Y)$
  • $\text{Cov}(X, c) = 0$

μ•žμ—μ„œ μ‚΄νŽ΄λ΄€μ„ λ•Œ, 두 RV $X$, $Y$κ°€ 독립이라면, $E(XY) = E(X)E(Y)$κ°€ λ˜μ—ˆλ‹€. λ”°λΌμ„œ 두 RVκ°€ 독립일 λ•ŒλŠ” $\text{Cov}(X, Y) = 0$이 λœλ‹€! κ·ΈλŸ¬λ‚˜ μ£Όμ˜ν•  점은 λͺ…μ œμ˜ μ—­(ζ˜“)인 $\text{Cov}(X, Y) = 0$일 λ•Œ, 두 RVκ°€ 항상 λ…λ¦½μž„μ„ 보μž₯ν•˜μ§€λŠ” μ•ŠλŠ”λ‹€!

The <Covariance> is also used when computing the variance of a Linear Combination of two RVs.

Let $a, b, c \in \mathbb{R}$, then

\[\text{Var}(aX + bY + c) = a^2 \text{Var}(X) + b^2 \text{Var}(Y) + 2ab \, \text{Cov}(X, Y)\]

The proof follows easily by expanding the definition of $\text{Var}(aX + bY + c)$ directly.

\[\text{Var}(aX + bY + c) = E\left[ \left( (aX + bY) - (a\mu_X + b\mu_Y) \right)^2 \right]\]
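Expanding the square produces the cross term $2ab\,\text{Cov}(X, Y)$. A numeric sanity check of $\text{Var}(aX + bY + c) = a^2 \text{Var}(X) + b^2 \text{Var}(Y) + 2ab\,\text{Cov}(X, Y)$, on a small dependent joint pmf with all values chosen arbitrarily:

```python
from fractions import Fraction

# Arbitrary dependent joint pmf; arbitrary constants a, b, c.
joint = {(0, 0): Fraction(1, 4), (1, 1): Fraction(1, 2), (1, 0): Fraction(1, 4)}
a, b, c = 2, 3, 5

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x, e_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x**2) - e_x**2
var_y = expect(lambda x, y: y**2) - e_y**2
cov = expect(lambda x, y: x * y) - e_x * e_y

# LHS: variance of aX + bY + c computed from scratch; RHS: the formula.
lhs = expect(lambda x, y: (a*x + b*y + c)**2) - expect(lambda x, y: a*x + b*y + c)**2
rhs = a**2 * var_x + b**2 * var_y + 2*a*b*cov
print(lhs == rhs)  # True
```

Note that the constant $c$ drops out entirely, matching $\text{Var}(X + c) = \text{Var}(X)$.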

Correlation

The <Correlation> is the <covariance> normalized into an easier-to-read form.

Definition.

The <correlation> of $X$ and $Y$ is defined as

\[\rho_{XY} := \text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)} \sqrt{\text{Var}(Y)}}\]
  • if $\rho_{XY} > 0$, $X$ and $Y$ are positively correlated.
  • if $\rho_{XY} < 0$, $X$ and $Y$ are negatively correlated.
  • if $\rho_{XY} = 0$, $X$ and $Y$ are uncorrelated.

If the two RVs are perfectly linearly related, $\rho_{XY}$ is as follows.

  • if $Y = aX + b$ for $a > 0$, then $\text{Corr}(X, Y) = 1$
  • if $Y = aX + b$ for $a < 0$, then $\text{Corr}(X, Y) = -1$

μœ„μ˜ λͺ…μ œλŠ” κ·Έ 역도 μ„±λ¦½ν•œλ‹€. 증λͺ…은 μ•„λž˜μ˜ Exerciseμ—μ„œ μ§„ν–‰ν•˜κ² λ‹€.

The <Correlation> takes values in $[-1, 1]$. This can be derived from the <Cauchy-Schwarz inequality>!

Cauchy-Schwarz inequality:

\[\left( \sum a_i b_i \right)^2 \le \sum a_i^2 \sum b_i^2\]

Writing the Correlation out according to its definition, and (in the discrete case) expressing the expectations as sums with the pmf weights absorbed into the terms, we get the following.

\[\begin{aligned} \text{Corr}(X, Y) &= \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)} \sqrt{\text{Var}(Y)}} = \frac{E[(X-\mu_X)(Y - \mu_Y)]}{\sqrt{E[(X-\mu_X)^2]} \sqrt{E[(Y-\mu_Y)^2]}} \\ &= \frac{\sum (X-\mu_X)(Y - \mu_Y)}{\sqrt{\sum (X-\mu_X)^2} \sqrt{\sum (Y-\mu_Y)^2}} \end{aligned}\]

Now, squaring the expression above,

\[(\rho_{XY})^2 = \left( \frac{\sum (X-\mu_X)(Y - \mu_Y)}{\sqrt{\sum (X-\mu_X)^2} \sqrt{\sum (Y-\mu_Y)^2}} \right)^2 = \frac{\left( \sum (X-\mu_X)(Y - \mu_Y) \right)^2 }{\sum (X-\mu_X)^2 \sum (Y-\mu_Y)^2}\]

Dividing both sides of the <Cauchy-Schwarz inequality> by its right-hand side, the following inequality holds.

\[\frac{\left( \sum a_i b_i \right)^2}{\sum a_i^2 \sum b_i^2} \le 1\]

Applying this to the squared <Correlation> yields

\[(\rho_{XY})^2 = \frac{\left( \sum (X-\mu_X)(Y - \mu_Y) \right)^2 }{\sum (X-\mu_X)^2 \sum (Y-\mu_Y)^2} \le 1\]

λ”°λΌμ„œ $(\rho_{XY})^2 \le 1$μ΄λ―€λ‘œ

\[-1 \le \rho_{XY} \le 1\]

$\blacksquare$

In addition, the <Correlation> can be interpreted as the covariance of β€œstandardized” RVs.

If we standardize $Z = \dfrac{X-\mu_X}{\sigma_X}$ and $W = \dfrac{Y-\mu_Y}{\sigma_Y}$, the covariance of these two equals the Correlation of $X$ and $Y$.

\[\text{Cov}(Z, W) = \text{Corr}(X, Y)\]

The derivation is straightforward enough at a glance that I will not write it out separately.
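In place of a derivation, a quick numeric check on an arbitrarily chosen small joint pmf:

```python
from math import sqrt

# Check Cov(Z, W) = Corr(X, Y) after standardizing, on an arbitrary joint pmf.
joint = {(0, 0): 0.25, (1, 1): 0.5, (1, 0): 0.25}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
sd_x = sqrt(expect(lambda x, y: (x - mu_x) ** 2))
sd_y = sqrt(expect(lambda x, y: (y - mu_y) ** 2))

corr = expect(lambda x, y: (x - mu_x) * (y - mu_y)) / (sd_x * sd_y)
cov_zw = expect(lambda x, y: ((x - mu_x) / sd_x) * ((y - mu_y) / sd_y))
print(abs(corr - cov_zw) < 1e-12)  # True
```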


Q1. What does $\text{Var}(X) = 0$ mean?

A1.


Q2. Give an example where $\text{Cov}(X, Y) = 0$ but the two RVs are not independent.


Q3. Prove that $-1 \le \text{Corr}(X, Y) \le 1$.


Q4. Prove that if $\text{Corr}(X, Y) = 1$, then there exist $a>0$ and $b\in\mathbb{R}$ s.t. $Y = aX + b$.


A1. It means that $p(x)$ is a delta function, i.e. $X$ is (almost surely) a constant.


A2. Setting $Y = X^2$ makes this easy to show. To check (non-)independence, you may need the joint pmf $p(x, y)$, which can likewise be designed reasonably with an appropriate choice of $p(x)$.
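One concrete construction (an assumption for illustration, not the only one): take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$.

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2: Cov(X, Y) = 0, yet Y is a deterministic
# function of X, so the two are clearly not independent.
pmf = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}
e_x = sum(x * p for x, p in pmf.items())          # E[X]  = 0
e_y = sum(x**2 * p for x, p in pmf.items())       # E[Y]  = E[X^2] = 2/3
e_xy = sum(x * x**2 * p for x, p in pmf.items())  # E[XY] = E[X^3] = 0

cov = e_xy - e_x * e_y
print(cov)  # 0

# Dependence: P(X = 1, Y = 0) = 0, but P(X = 1) P(Y = 0) = 1/3 * 1/3 != 0.
```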


A3. & A4. Q3 was already proved above, but it can be proved in a different way as well! πŸ‘‰ See pp. 2-3 of the linked document.


μ΄μ–΄μ§€λŠ” λ‚΄μš©μ—μ„œλŠ” <평균>κ³Ό <λΆ„μ‚°>에 λŒ€ν•œ μ•½κ°„μ˜ 좔가적인 λ‚΄μš©μ„ μ‚΄νŽ΄λ³Έλ‹€.

πŸ‘‰ Chebyshev’s Inequality

It also looks at the basic Probability Distributions for Discrete RVs.

  • Bernoulli Distribution
  • Binomial Distributions
  • Multinomial Distribution
  • Hypergeometric Distributions
  • etc…

πŸ‘‰ Discrete Probability Distributions - 1

πŸ‘‰ Discrete Probability Distributions - 2