Point Estimation
This post collects what I learned and studied in the Probability and Statistics (MATH230) course. You can find the full series of posts at Probability and Statistics 🎲
Introduction to Estimation
"<Statistics> is the area of science that makes inferences from data sets."
"<Statistical Inference> means making generalizations about population properties based on a random sample."
Suppose someone gives you a data set $\{ x_1, \dots, x_n \}$, and it is known that this data set was taken from a normal random sample $X_i \sim N(\mu, 1)$.
Q. You are asked to estimate $\mu$. What can be a good estimate of $\mu$ from the sample?
A. The sample mean $\bar{x}$.
Why? By the LLN (Law of Large Numbers), $\bar{x} \rightarrow \mu$ as $n \rightarrow \infty$.
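As a quick illustration, here is a minimal simulation sketch of that convergence (assuming NumPy; the "unknown" mean `mu = 3.0` is an arbitrary value chosen just for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility
mu = 3.0                        # "unknown" population mean, chosen for the demo

# Draw N(mu, 1) samples of increasing size; the sample mean approaches mu
for n in [10, 100, 10_000]:
    x = rng.normal(loc=mu, scale=1.0, size=n)
    print(n, x.mean())
```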
Inferring properties of the population from a quantity such as the sample mean $\bar{x}$ above is called <Estimation>.
There are two kinds of estimation: <Point Estimation> and <Interval Estimation>.
Using the sample mean $\bar{x}$ to estimate the population mean $\mu$ is <Point Estimation>.
Proposing an interval $(a, b)$ together with a statement of the form "the population mean $\mu$ lies in the interval $(a, b)$ with high probability" is called <Interval Estimation>.
ex) $P\left( \mu \in (a, b) \right) \approx 0.99 \quad \text{or} \quad 0.95$.
🔥 Note: For the true distribution $N(\mu, \sigma^2)$, $\mu$ and $\sigma$ are unknown, and they are not random!
Point Estimation
Let $X_1, \dots, X_n$ be a random sample with $X_i \sim f(x; \theta)$ for some pdf (or pmf), and let $x_1, \dots, x_n$ be the sample points.
A <Point Estimate> of some population parameter $\theta$ is a single value $\hat{\theta}$ of a statistic¹ $\hat{\Theta}$.
Here, the statistic $\hat{\Theta}$ is called an estimator, and the estimator $\hat{\Theta}$ is a random variable.
(A hat, as in $\hat{x}$, marks a quantity derived from the random sample.)
Example.
Let $X_1, X_2, \dots, X_n$ be a random sample taken from $N(\mu, \sigma^2)$.
Q1. What can be a point estimator of $\mu$?
A1. sample mean, $\bar{X} = \dfrac{X_1 + \cdots + X_n}{n}$.
Q2. How about a point estimator of $\sigma^2$?
A2. sample variance, $\displaystyle S^2 = \dfrac{1}{n-1} \sum^n_i (X_i - \bar{X})^2$ where $E[S^2] = \sigma^2$
or $\displaystyle \hat{S}^2 = \dfrac{1}{n} \sum^n_i (X_i - \bar{X})^2$ where $E[\hat{S}^2] = \dfrac{n-1}{n} \sigma^2$.
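A small Monte Carlo sketch (assuming NumPy; the parameter values are arbitrary) that estimates the expectations of the two estimators and makes the $(n-1)/n$ factor visible:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, n, reps = 4.0, 5, 200_000    # arbitrary demo parameters

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2     = samples.var(axis=1, ddof=1)   # S^2:     divide by n-1
s2_hat = samples.var(axis=1, ddof=0)   # S-hat^2: divide by n

print(s2.mean())      # ~ 4.0 = sigma^2             (unbiased)
print(s2_hat.mean())  # ~ 3.2 = (n-1)/n * sigma^2   (biased downward)
```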
Q3. Which of the two estimators is better?
A3. Compare the <bias> of the two estimators!
Unbiased Estimator
Definition. unbiased estimator 🔥
A statistic $\hat{\Theta}$ is called an <unbiased estimator> if
\[E[\hat{\Theta}] = \theta \quad \text{for all} \quad \theta\]
That is, an unbiased estimator is one whose expectation recovers the population parameter $\theta$!
$E[\hat{\Theta} - \theta]$ is the "bias" of $\hat{\Theta}$ relative to $\theta$.
🔥 If $E[\hat{\Theta} - \theta] = 0$, the estimator is unbiased!
Example.
Let $X_1, X_2, \dots, X_n$ be a random sample taken from $N(\mu, \sigma^2)$.
Then, $\bar{X}$ is an unbiased estimator of $\mu$, and $S^2$ is an unbiased estimator of $\sigma^2$.
Note that $E \left[ \frac{2X_1 + 0.5 X_2 + 0.5 X_3 + \cdots + X_n}{n}\right] = \mu$ (the coefficients sum to $n$), so that one is also an unbiased estimator!
(Generalization) Let's consider a weighted average $\displaystyle\bar{X}_w = \sum^n_i w_i X_i$ with $\sum^n_i w_i = 1$. This estimator is also unbiased.
\[E\left[ \bar{X}_w \right] = \sum^n_i w_i E[X_i] = \cancelto{1}{\left( \sum^n_i w_i \right)} \mu = \mu\]
Q. Why do we use $\bar{X}$ instead of $\bar{X}_w$ as an estimator of $\mu$?
A. Because the "variance" of $\bar{X}$ is smaller than that of $\bar{X}_w$!
\[\text{Var}(\bar{X}) = E \left[ (\bar{X} - \mu)^2 \right] = \frac{\sigma^2}{n} \le \text{Var}(\bar{X}_w)\]
Variance of Estimator
Definition. variance of estimator 🔥
For an estimator $\hat{\Theta}$, the variance of the estimator is
\[\text{Var}(\hat{\Theta}) = E \left[ (\hat{\Theta} - E[\hat{\Theta}])^2 \right]\]
* This is just the usual definition of variance. However, since $\hat{\Theta}$ is a statistic, i.e., a function of the random sample $\hat{\Theta} = f(X_1, \dots, X_n)$, the actual computation can use the distribution of the random sample $X_i \sim g(x; \mu, \sigma)$: $\text{Var}(\hat{\Theta}) = \text{Var}(f(X_1, \dots, X_n))$.
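For example, applying this to the sample mean of an iid sample with $\text{Var}(X_i) = \sigma^2$ (and using the independence of the $X_i$):

\[\text{Var}(\bar{X}) = \text{Var}\left( \frac{1}{n} \sum^n_i X_i \right) = \frac{1}{n^2} \sum^n_i \text{Var}(X_i) = \frac{\sigma^2}{n}\]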
Claim.
Among all weighted averages $\{ \bar{X}_w : w = (w_1, \dots, w_n), \sum w_i = 1\}$, $\bar{X}$ has the smallest variance.
We know that $\displaystyle\text{Var}(\bar{X}) = \frac{\sigma^2}{n}$.
\[\begin{aligned} \text{Var}(\bar{X}_w) &= \text{Var}\left( \sum^n_i w_i X_i \right) \\ &= \sum^n_i w_i^2 \cdot \text{Var}(X_i) \\ &= \sigma^2 \cdot \sum^n_i w_i^2 \end{aligned}\]
(the second equality uses the independence of the $X_i$). For $\sum w_i = 1$,
\[0 \le \sum^n_i \left(w_i - \frac{1}{n}\right)^2 = \sum w_i^2 - \frac{2}{n} \sum w_i + n \cdot \frac{1}{n^2} = \sum w_i^2 - \frac{1}{n}\]
Therefore,
\[\text{Var}(\bar{X}) = \frac{\sigma^2}{n} \le \sigma^2 \cdot \sum^n_i w_i^2 = \text{Var}(\bar{X}_w)\]
$\blacksquare$
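A quick Monte Carlo check of this claim (assuming NumPy; the non-uniform weight vector below is an arbitrary choice that still sums to 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2, reps = 4, 1.0, 500_000

w_uniform = np.full(n, 1 / n)               # the usual sample mean
w_other   = np.array([0.4, 0.3, 0.2, 0.1])  # arbitrary weights, sum to 1

X = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
print((X @ w_uniform).var())  # ~ sigma^2 / n         = 0.25 (smallest)
print((X @ w_other).var())    # ~ sigma^2 * sum w_i^2 = 0.30
```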
The Most Efficient Estimator
By weighing both "bias" and "variance," we can judge which estimator is a good estimator.
Definition. the most efficient estimator of $\theta$ 🔥
Among all unbiased estimators of parameter $\theta$, the one with the smallest variance is called <the most efficient estimator of $\theta$>.
Remark.
When $X_i$โs are iid $N(\mu, \sigma^2)$, it is known that $\bar{X}$ is the most efficient estimator of $\mu$.
Q. Why do we pick the most efficient estimator from among the unbiased estimators only? Couldn't some biased estimator have an even smaller variance?
A. Yes — it is possible for a biased estimator to have a smaller variance than an unbiased one, as the following exercise shows.
Exercise.
Let $X_1, \dots, X_n$ be iid $N(\mu, \sigma^2)$.
Let $\displaystyle S^2 := \frac{1}{n-1} \sum^n_i (X_i - \bar{X})^2$ and $\displaystyle \hat{S}^2 := \frac{1}{n} \sum^n_i (X_i - \bar{X})^2$
Show that $\text{Var}(S^2) > \text{Var}(\hat{S}^2)$.
(Homework 📝)
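Before doing the algebra, a simulation sketch (assuming NumPy) can sanity-check which variance is larger; it is only a numerical hint, not the proof:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 5, 400_000

X = rng.normal(0.0, 1.0, size=(reps, n))   # sigma^2 = 1, arbitrary choice
print(X.var(axis=1, ddof=1).var())  # empirical Var(S^2)     -- the larger one
print(X.var(axis=1, ddof=0).var())  # empirical Var(S-hat^2) -- the smaller one
```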
Mean Squared Error
The <MSE; Mean Squared Error> can also be used as a metric for evaluating a point estimator!
Definition. MSE; Mean Squared Error 🔥
The <MSE; Mean Squared Error> of an estimator is defined as
\[\text{MSE} := E \left[ \left( \hat{\Theta} - \theta \right)^2 \right]\]
Claim.
\[\text{MSE} = \text{Var}(\hat{\Theta}) + \text{Bias}^2\]
where $\text{Bias} := E \left[ \hat{\Theta} - \theta \right]$.
Proof.
(Homework 📝) / (Solution)
For now, let us accept the claim above as true, and explain why it matters.
Do you remember that the estimator $\hat{\Theta}$ is a statistic? The estimator $\hat{\Theta}$ is expressed as a function of the random sample $X_i$:
\[\hat{\Theta} = f(X_1, X_2, ..., X_n)\]
So the mean and variance of $\hat{\Theta}$ can both be derived quite easily from the distribution of the random sample $X_i$. For example, looking again at the sample mean $\bar{X}$ from the unbiased estimator section…
Let the random sample $X_i$ be taken from $N(\mu, \sigma^2)$. Then $E(\bar{X})$ is
\[E(\bar{X}) = E \left( \frac{\sum^n_i X_i}{n} \right) = \frac{1}{n} \sum^n_i E[X_i] = \frac{1}{n} \cdot n \mu = \mu\]
Likewise, the variance of the estimator $\hat{\Theta}$ can be derived easily from the distribution of the random sample.
The MSE of an estimator, however, is not like that. Unlike the mean and variance, no straightforward way to derive it from the distribution of the random sample comes to mind. That is why it is much, much easier to compute the MSE of an estimator using the claim above: "the MSE is the sum of the estimator's variance and the square of its bias."
If you encounter the MSE without knowing this background, it is quite confusing. I first met the MSE as a model metric while doing machine learning and data analysis, so computing the MSE of an estimator felt rather unfamiliar. A model's MSE comes out as a concrete number, such as 300.5; the MSE of an estimator, on the other hand, requires understanding this background and accepting the claim above.
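To make this concrete, here is a minimal sketch (assuming NumPy; $\hat{S}^2$ from the exercise above is used as the estimator) that computes the MSE both directly from the definition and via the claim, and shows the two agree:

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma2, reps = 5, 1.0, 500_000

X = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2_hat = X.var(axis=1, ddof=0)                 # biased estimator of sigma^2

mse_direct = ((s2_hat - sigma2) ** 2).mean()   # E[(Theta-hat - theta)^2]
bias       = s2_hat.mean() - sigma2            # E[Theta-hat] - theta
mse_claim  = s2_hat.var() + bias ** 2          # Var + Bias^2

print(mse_direct, mse_claim)                   # agree up to Monte Carlo error
```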
The next post looks at the other estimation approach, <Interval Estimation>. There, the metric that tells us how good a given interval is, is precisely the <confidence level> $1 - \alpha$!
👉 Interval Estimation
The HW problems that appeared throughout this post are collected separately in the post below.
👉 Statistics - PS1
-
¹ A <statistic> is a function $f(X_1, \dots, X_n)$ of the random samples $X_1, \dots, X_n$. See the Sampling Distribution post. ↩