This post organizes what I learned and studied in the "Probability and Statistics (MATH230)" course. The full list of posts is available at Probability and Statistics 🎲


Statistical Hypothesis

Definition. Statistical Hypothesis

A <statistical hypothesis> is a statement about the population distribution, usually expressed in terms of parameter values.

Example.

Suppose we have a $p$-coin, i.e. a coin that lands heads with probability $p$. I believe it is a fair coin; you, on the other hand, think it is biased, and in particular you believe that $p=0.7$. What can we do?

  • $H_0: p = 0.5$
  • $H_1: p = 0.7$


Definition. Null Hypothesis $H_0$ & Alternative Hypothesis $H_1$

  • Null Hypothesis $H_0$: a hypothesis we expect to reject
  • Alternative Hypothesis $H_1$: a hypothesis we set out to prove

Q. How do we do <Hypothesis Test>?

A. First, we should set a <Test Statistic>!

Let's toss the coin $n$ times independently. For each toss, let $X_i$ be $1$ for heads and $0$ otherwise.

Then $X := \sum_{i=1}^{n} X_i$, the number of heads in $n$ tosses, satisfies $X \sim \text{BIN}(n, p)$.

Then, we can use $X$ as a <Test Statistic>!

We will use this <Test Statistic> to either reject $H_0$ or fail to reject it!

Consider the case $H_0: p=0.5$, $H_1: p=0.7$ above. If $X$ is large enough, that is, "$X \ge C$ for some $C$", then it is reasonable to reject $H_0$.

The range $X \ge C$ that serves as the criterion for rejecting $H_0$ is called the <rejection region> or <critical region>, and the value $C$ that defines this range is called the <critical value>!
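As a minimal sketch of the decision rule just described, the Python snippet below counts heads and checks whether the count falls in the rejection region. The toss sequence, the helper names `count_heads` and `reject_h0`, and the critical value $C = 14$ are all made up for illustration; how to actually choose $C$ is a separate question.

```python
def count_heads(tosses):
    """Test statistic X: number of heads in a sequence (1 = head, 0 = tail)."""
    return sum(tosses)


def reject_h0(x, c):
    """Decision rule with rejection region {X >= C}."""
    return x >= c


# Hypothetical record of 20 tosses (illustrative data only).
tosses = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1]

x = count_heads(tosses)
c = 14  # hypothetical critical value, assumed for this sketch
print(f"X = {x}, reject H0: {reject_h0(x, c)}")
```

The rule itself is just a comparison; all of the statistical work goes into picking $C$.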


T1 Error & T2 Error

Q. How to choose $C$?

To choose the <critical value> $C$, we need to look at the <Type 1 Error> and the <Type 2 Error>.

|   | reject $H_0$ | not reject $H_0$ |
|---|---|---|
| $H_0$ is true | Type 1 Error | good |
| $H_0$ is false | good | Type 2 Error |

(Figure: hypothetical error)

I think this picture illustrates Type 1 and Type 2 errors best, lol.

"It is best to make T1 & T2 errors as small as possible."

Case. Type 1 error; $\alpha$ error; false positive

\[\begin{aligned} P(\text{T1 error}) &= P(\text{reject} \; H_0 \mid H_0 \; \text{is true}) \\ &= P(X \ge C \mid p = 0.5) \end{aligned}\]

To make $P(T1)$ as small as possible, make $C$ as large as possible, so that $X$ satisfies $X \ge C$ only in extreme cases. In other words, make the criterion for rejecting $H_0$ strict.

Case. Type 2 error; $\beta$ error; false negative

\[\begin{aligned} P(\text{T2 error}) &= P(\text{not reject} \; H_0 \mid H_1 \; \text{is true}) \\ &= P(X < C \mid p = 0.7) \end{aligned}\]

To make $P(T2)$ as small as possible, make $C$ as small as possible, so that $X$ satisfies $X \ge C$ in most cases. In other words, make the test reject $H_0$ most of the time.


?? Something is odd. To reduce $P(T1)$ we must increase $C$, but to reduce $P(T2)$ we must decrease $C$. 😕 Which one is right?

The answer is that, of $P(T1)$ and $P(T2)$, we can make only one as small as we like 😱
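To see the trade-off numerically, here is a small Python sketch that tabulates both error probabilities for the coin example as $C$ varies. The sample size $n = 20$, the range of candidate $C$ values, and the helper names are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(c, n, p):
    """P(X >= c) for X ~ BIN(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


n = 20                    # assumed number of tosses
cs = list(range(10, 18))  # candidate critical values
alphas = [upper_tail(c, n, 0.5) for c in cs]     # P(T1) = P(X >= C | p = 0.5)
betas = [1 - upper_tail(c, n, 0.7) for c in cs]  # P(T2) = P(X <  C | p = 0.7)

for c, a, b in zip(cs, alphas, betas):
    print(f"C={c:2d}  P(T1)={a:.4f}  P(T2)={b:.4f}")
```

Reading down the printed table, $P(T1)$ shrinks while $P(T2)$ grows as $C$ increases, which is exactly the tension described here.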

"For a fixed sample size, we can make only one error as small as we want."

That raises another question.

Q. Which of $P(T1)$ and $P(T2)$ should we reduce?

Consider the following case.

  • $H_0$: defendant A is innocent
  • $H_1$: defendant A is guilty

Look carefully at what the T1 & T2 errors mean here.

  • T1 error: $H_0$ is true, but we reject $H_0$
  • T2 error: $H_1$ is true, but we do not reject $H_0$

Which of the two is worse? The "T1 error", of course! It means an innocent person has been declared guilty!


Now consider a different situation, "cancer diagnosis":

  • $H_0$: patient B is healthy.
  • $H_1$: patient B has cancer.

  • T1 error: patient B is actually healthy, but is diagnosed with cancer
  • T2 error: patient B actually has cancer, but is diagnosed as healthy

Here too, the "T1 error" is treated as the worse one, since a healthy person is diagnosed with cancer and made to spend a huge amount of money.

Based on situations like these, we conclude that if only one of the two can be reduced, we should reduce the "T1 error" as much as possible.

Then what about the "T2 error"?? They say the "T2 error" is left to luck, lol.

The reason is that a T2 error ends in the outcome "not reject $H_0$", which does not mean the same thing as "accept $H_0$". In a T2 error we end up making no statement about either $H_0$ or $H_1$, so it is regarded as the more tolerable of the two!


Significance Level; $\alpha$

Definition. Significance level; size of a test; $\alpha$

The probability of committing a <Type 1 Error> is called the <significance level>, and we use $\alpha$ to denote the significance level.

\[\alpha = P(\text{T1 Err}) = P(\text{reject} \; H_0 \mid H_0 \; \text{is true})\]

💥 Commonly used values for $\alpha$ are $0.1$, $0.05$, $0.01$.

💥 We saw something similar when doing Interval Estimation: the <Confidence Level> $1-\alpha$!

$\alpha$ is the probability of a Type 1 error. It depends on the Critical Value $C$: the stricter $C$ is, the smaller the Type 1 error probability $\alpha(C)$ becomes.

Usually the upper bound on the Type 1 error is set to around $0.1$, $0.05$, or $0.01$, and this is compared with the <p-value>, which is covered shortly below.


Example.

$H_0: p=0.5$ vs. $H_1: p=0.7$

We toss a coin 20 times independently and obtained 14 heads. Test this at $\alpha = 0.0577$.

Solve.

Let $X = \sum X_i \sim \text{BIN}(20, p)$.

The critical region is $\{ X \ge C \}$.

Here, $\alpha = P(X \ge C \mid p=0.5) = P(\text{BIN}(20, 0.5) \ge C)$.

Then, by the cdf of $\text{BIN}(20, 0.5)$,

\[P(\text{BIN}(20, 0.5) \le 13) = 0.9423\]

so $P(\text{BIN}(20, 0.5) \ge 14) = 1 - 0.9423 = 0.0577 = \alpha$. Therefore, $C = 14$.

We will reject $H_0$ if (# of heads in 20 tosses) is $\ge 14$.

Since $x=14$, we reject $H_0$ at $\alpha = 0.0577$. $\blacksquare$
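The search for $C$ can be sketched in Python as below, using only the standard library. The helper names `upper_tail` and `critical_value` are hypothetical, and taking the smallest $C$ whose tail probability does not exceed $\alpha$ is one common convention, assumed here.

```python
from math import comb


def upper_tail(c, n, p):
    """P(X >= c) for X ~ BIN(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(n, p0, alpha):
    """Smallest C such that P(X >= C | p = p0) <= alpha."""
    # C = n + 1 always qualifies (empty rejection region), so the loop returns.
    for c in range(n + 2):
        if upper_tail(c, n, p0) <= alpha:
            return c


n, p0, alpha, x = 20, 0.5, 0.0577, 14
c = critical_value(n, p0, alpha)
print(f"C = {c}, P(X >= C | H0) = {upper_tail(c, n, p0):.4f}")
print("reject H0" if x >= c else "do not reject H0")
```

Under this convention, $\alpha = 0.05$ would give $C = 15$ in the same setup, so 14 heads would no longer reject $H_0$.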

Now, let's consider the T2 error! If $P(T2)$ is small, then we might accept $H_0$ when we fail to reject it.

Example.

(Same situation with the above example)

Solve.

\[P(\text{T2 Err}) = P(X < C \mid H_1 \; \text{is true}) = P(\text{BIN}(20, 0.7) < C)\]

We've found that $C=14$ in the previous example. Then,

\[P(\text{BIN}(20, 0.7) < 14) = P(\text{BIN}(20, 0.7) \le 13) = 0.392 \approx 0.4\]

If we fail to reject $H_0$, we still can't accept $H_0$, because $P(T2)$ is too high.

Example.

(Now, everything is same but $H_1: p=0.8$)

Solve.

The critical value $C$ is the same as in the previous example, because $H_0$ doesn't change. → $C=14$

Now, the T2 error is

\[P(\text{T2 Err}) = P(X < 14 \mid p=0.8) = P(\text{BIN}(20, 0.8) < 14) \approx 0.0867\]

This time, if we fail to reject $H_0$, then we can accept $H_0$!!
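Both Type 2 error probabilities can be checked with a short Python sketch. The helper name `lower_tail_strict` is hypothetical; the only inputs are $n = 20$, $C = 14$, and the two alternatives $p = 0.7$ and $p = 0.8$ from the examples.

```python
from math import comb


def lower_tail_strict(c, n, p):
    """P(X < c) for X ~ BIN(n, p), i.e. the sum of the pmf over k = 0, ..., c - 1."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c))


n, c = 20, 14
beta_07 = lower_tail_strict(c, n, 0.7)  # P(T2) when H1: p = 0.7
beta_08 = lower_tail_strict(c, n, 0.8)  # P(T2) when H1: p = 0.8

print(f"P(T2 | p=0.7) = {beta_07:.4f}")
print(f"P(T2 | p=0.8) = {beta_08:.4f}")
```

Note that the sum stops at $k = C - 1 = 13$, because the event is the strict inequality $X < C$.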


Power of Test; $\gamma(\theta)$

Definition. Power of Test

The <power of a test> $\gamma(\theta)$ at $\theta=\theta_1$ is defined as the probability of rejection of $H_0$ when $\theta=\theta_1$ is a true value.

\[\gamma(\theta_1) = P(\text{reject} \; H_0 \mid \theta = \theta_1)\]

💥 NOTE: $1-P(\text{T2 Err}) = \gamma(\theta_1)$

That is, the <power of test> is the probability of rejecting the null hypothesis $H_0$ when $H_0$ is false!

The larger the T2 error, the smaller the <power>! So to raise the <power>, we need to propose a suitable Alternative Hypothesis $H_1: \theta = \theta_1$ that reduces the T2 error.

The <power of test> becomes larger in the following situations.

  • a suitable Alternative Hypothesis $H_1: \theta = \theta_1$ that reduces the T2 error
  • <significance level> $\alpha$ ▲
  • sample size $n$ ▲
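The three factors in the list above can be illustrated with a rough Python sketch for the coin test with rejection region $\{X \ge C\}$. The function names and the specific comparison values ($p = 0.8$, $\alpha = 0.15$, $n = 80$) are assumptions picked for illustration. Because the binomial distribution is discrete, small changes in $\alpha$ or $n$ do not always change the power, so the sketch uses well-separated values.

```python
from math import comb


def upper_tail(c, n, p):
    """P(X >= c) for X ~ BIN(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def power(n, p0, p1, alpha):
    """Power at p1 of the test rejecting when X >= C, where C is the
    smallest value with P(X >= C | p = p0) <= alpha (assumed convention)."""
    c = next(c for c in range(n + 2) if upper_tail(c, n, p0) <= alpha)
    return upper_tail(c, n, p1)


base = power(20, 0.5, 0.7, 0.0577)          # setup from the examples
farther_h1 = power(20, 0.5, 0.8, 0.0577)    # alternative farther from H0
larger_alpha = power(20, 0.5, 0.7, 0.15)    # looser significance level
larger_n = power(80, 0.5, 0.7, 0.0577)      # more tosses

for name, value in [("base", base), ("farther H1", farther_h1),
                    ("larger alpha", larger_alpha), ("larger n", larger_n)]:
    print(f"{name:13s} power = {value:.4f}")
```

The baseline power equals $1 - P(\text{T2 Err})$ for $H_1: p = 0.7$, and each of the three variations prints a larger value.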

p-value

So far, we set the <significance level> $\alpha$ to $0.1$, $0.05$, and so on, found the corresponding <critical value> $C$, and compared it with the Test Statistic $X$ to decide whether to reject $H_0$. But could we instead skip fixing $\alpha$, set the Critical Value $C$ to the boundary at which rejection is still possible, $C = X$, and work backwards to $\alpha$? The <p-value> is exactly that!

Definition. p-value

The <p-value> of a test is the lowest significance level at which $H_0$ can be rejected with the given data.

That is, the smallest $\alpha$ at which $H_0$ can be rejected based on the Test Statistic $X$ of the given data is the <p-value>!

Q. Why the 'smallest' $\alpha$?

A. When discussing the T1 Error, we said that the stricter the Critical Value $C$, the lower the probability of a T1 Error. That is, the stricter $C$ is, the smaller $\alpha$ becomes. Usually we reject $H_0$ because $X > C$; here $C$ is pushed all the way up to the boundary $C = X$, which lowers $\alpha$ as far as possible. For this reason, a smaller <p-value> means that $H_0$ is rejected even under stricter conditions than the preset $C_{0.1}$ or $C_{0.05}$.

Let's make this concrete with an example!

Example.

Everything is the same as in the situation above.

  • $H_0: p = 0.5$
  • $H_1: p = 0.7$

We toss a coin 20 times independently and obtain 14 heads.

BUT, this time, we don't have a significance level $\alpha$!!

Solve.

The rejection region is $\{ X \ge C\}$.

To reject $H_0$ with the given data $X = 14$, the value $X=14$ must lie in this rejection region. The largest (strictest) $C$ for which $X$ is still in the rejection region is $C=14$!

Wait, we already computed the T1 Error for $C=14$.

\[0.0577 = P(\text{BIN}(20, 0.5) \ge 14)\]

That is, the significance level $\alpha=0.0577$ is the smallest value at which $H_0$ is rejected. $0.0577$ is the "p-value" of this test!!

We can use the "p-value" as an indicator for deciding whether to reject $H_0$.

If the "p-value" is smaller than the significance level $\alpha$, that is, if the tail area given by $\alpha$ contains the tail area given by the "p-value", then the given data fall in the critical region for $\alpha$, so we reject $H_0$!

Conversely, if the "p-value" is larger, we cannot reject $H_0$.
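A minimal Python sketch of this comparison for the coin example follows. The function name `p_value` is hypothetical, and it covers only the one-sided rejection region $\{X \ge C\}$ used in this post.

```python
from math import comb


def p_value(x, n, p0):
    """P(X >= x | p = p0): the smallest alpha at which the test with
    rejection region {X >= C} rejects H0 for the observed value x."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


p = p_value(14, 20, 0.5)

# Compare the p-value against the commonly used significance levels.
for alpha in (0.1, 0.05, 0.01):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"alpha={alpha}: p-value={p:.4f} -> {decision}")
```

With 14 heads in 20 tosses, the p-value sits between $0.05$ and $0.1$, so the decision depends on which significance level was chosen beforehand.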


Usually, when the p-value is smaller than 5% (=0.05), we say there is "a significant difference". Here, 'significant difference' means that the result obtained from the experiment differs greatly from what the existing theory $H_0$ predicts. This leads to the conclusion that the existing theory $H_0$ should be rejected.


Personally, I often get confused about what the <p-value> means, so I think it helps to look at several meanings and interpretations together.

  • The smallest $\alpha$ at which $H_0$ can be rejected
  • The probability, assuming the existing theory $H_0$ is correct, of obtaining a Test Statistic at least as extreme as the observed $X$.
    • If this probability is low, the assumption that $H_0$ is correct is taken to be wrong. (statistical proof by contradiction)
  • The degree to which the experimental result is compatible with the existing theory $H_0$, expressed as a number in $[0, 1]$.
    • The smaller the <p-value>, the less compatible the data and $H_0$ are
  • The degree of chance
    • The lower the <p-value>, the less plausible it is that the result arose by chance alone

Closing Remarks

We have now covered all the basics needed to carry out a "Statistical Test". Starting from the next post, we will look at how to perform statistical tests in different situations. It is not that hard: just identify what is being asked and compute in the right order.

Let's go through Testing in the same order we followed for Estimation.