Statistics: Concepts of Hypothesis Testing

Hypothesis

A hypothesis is a claim about a population parameter.
The null hypothesis $H_0$ is the assumption we maintain unless there is strong evidence against it.
The alternative hypothesis $H_1$ is the claim we accept when $H_0$ is rejected.

Core Logic

The core logic of hypothesis testing is simple. We take $H_0$ as an assumption and, under this assumption, compute the probability of observing a test statistic (usually computed from a sample) at least as extreme as the one we actually obtained.

Suppose we are using the critical value approach. If our test statistic exceeds the threshold, then under the assumption $H_0$ such a value is unlikely to occur, so the assumption is likely to be false and we reject $H_0$.
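As a minimal sketch of the critical value approach, the example below runs a two-sided $z$-test for a population mean with known standard deviation. The function name `z_test_critical`, the tail probability `alpha=0.05`, and all the numbers are assumptions chosen for illustration; they do not come from the original notes.

```python
from math import sqrt
from statistics import NormalDist

def z_test_critical(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test by the critical value approach (sigma assumed known)."""
    # Test statistic: standardized distance between the sample mean and mu0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Threshold: under H0, |Z| exceeds z_crit with probability alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Illustrative numbers only: H0 claims mu = 50; a sample of 25 has mean 52.
z, z_crit, reject = z_test_critical(sample_mean=52.0, mu0=50.0, sigma=5.0, n=25)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Here the statistic lies beyond the threshold, so under $H_0$ the observed sample mean would be unlikely, and $H_0$ is rejected.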

Type I and II Errors

A false rejection of $H_0$ is called a Type I Error (rejecting $H_0$ while $H_0$ is true).

A false acceptance of $H_0$ is called a Type II Error (accepting $H_0$ while $H_1$ is true).

Significance Level and Power of Test

The significance level $\alpha$ is defined as $\alpha=\mathbb{P}(\text{Reject }H_0 \mid H_0\text{ is true})$.

The power of the test is $\pi=1-\mathbb{P}(\text{Accept }H_0 \mid H_1 \text{ is true})=1-\beta$, where $\beta$ is the probability of a Type II Error. The power is therefore the probability of correctly rejecting $H_0$.
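To make $\alpha$, $\beta$ and the power concrete, here is a rough sketch for a right-sided $z$-test of $H_0: \mu=\mu_0$ with known standard deviation. The helper names `power_right_sided_z` and `rejection_rate`, the specific alternative `mu1`, and all the numbers are assumptions for illustration. The first function uses the normal distribution directly; the second estimates the same rejection probabilities by simulation.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_right_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the right-sided z-test of H0: mu = mu0 when the true mean is mu1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # Under mu1 the statistic is N(shift, 1), so P(Z > z_alpha) = 1 - Phi(z_alpha - shift).
    return 1 - STD_NORMAL.cdf(z_alpha - shift)

def rejection_rate(true_mu, mu0, sigma, n, alpha=0.05, trials=20_000, seed=0):
    """Monte Carlo estimate of P(reject H0) when the true mean is true_mu."""
    rng = random.Random(seed)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = rng.gauss(true_mu, se)  # the sample mean is N(true_mu, se^2)
        if (sample_mean - mu0) / se > z_alpha:
            rejections += 1
    return rejections / trials

power = power_right_sided_z(mu0=50.0, mu1=52.0, sigma=5.0, n=25)
beta = 1 - power  # probability of a Type II Error against mu1
type_i_rate = rejection_rate(true_mu=50.0, mu0=50.0, sigma=5.0, n=25)
simulated_power = rejection_rate(true_mu=52.0, mu0=50.0, sigma=5.0, n=25)
print(f"power = {power:.3f}, beta = {beta:.3f}")
print(f"simulated Type I rate = {type_i_rate:.3f}, simulated power = {simulated_power:.3f}")
```

When the simulation is run with the true mean equal to `mu0`, the rejection rate estimates $\alpha$; when it is run with the true mean equal to `mu1`, it estimates the power.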

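$p$-value. With a fixed significance level, the decision rule has to be worked out again for every different $\alpha$. The $p$-value avoids this repetition: it is the smallest $\alpha$ at which $H_0$ is still rejected, so we reject $H_0$ at level $\alpha$ exactly when the $p$-value is at most $\alpha$.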
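A small sketch of this idea, assuming a two-sided $z$-test with known standard deviation: the $p$-value is computed once and then compared against several significance levels. The function name `z_test_p_value` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value of a z-test (sigma assumed known)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a statistic at least as extreme as |z|.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

z, p_value = z_test_p_value(sample_mean=52.0, mu0=50.0, sigma=5.0, n=25)

# One p-value answers the question for every alpha: reject when p <= alpha.
decisions = {alpha: p_value <= alpha for alpha in (0.10, 0.05, 0.01)}
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(decisions)
```

With these numbers $H_0$ is rejected at the 10% and 5% levels but not at the 1% level, and no threshold had to be recomputed.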

Testing Procedure


specify the null and alternative hypotheses
construct the test statistic
derive the distribution of the test statistic under the null
specify a level of significance
determine the decision rule by finding the critical value
study the power of the test
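The steps above can be sketched end to end for one concrete case. The example assumes a right-sided test of a population proportion using the large-sample normal approximation; the data (62 successes in 100 trials), the null value `p0`, and the alternative `p1` used for the power are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# 1. Hypotheses: H0: p = 0.5 versus H1: p > 0.5 (right-sided).
p0 = 0.5
successes, n = 62, 100  # hypothetical data
p_hat = successes / n

# 2. Test statistic: the standardized sample proportion.
se0 = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0

# 3. Distribution under H0: approximately N(0, 1) for large n.
# 4. Level of significance.
alpha = 0.05

# 5. Decision rule: reject H0 when z exceeds the critical value.
z_crit = std_normal.inv_cdf(1 - alpha)
reject = z > z_crit

# 6. Power against one specific alternative, p1 = 0.6 (an assumption).
p1 = 0.6
se1 = sqrt(p1 * (1 - p1) / n)
p_hat_crit = p0 + z_crit * se0  # smallest sample proportion that leads to rejection
power = 1 - std_normal.cdf((p_hat_crit - p1) / se1)

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
print(f"power against p1 = {p1}: {power:.3f}")
```

The power in step 6 depends on the alternative chosen; a value of `p1` closer to `p0` would give a lower power for the same sample size.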

Common Cases

Here, some common tests and scenarios are listed:

If only $1$ population is involved …

Case	Link
Given the population variance, check whether the population mean is as claimed.	👉 Here
Check whether the population mean is as claimed when the population variance is unknown.	👉 Here. Also introduces the $t$-distribution
Check the population proportion under a Bernoulli distribution.	👉 Here

Given only the sample variance $s^2$, check a claim about the population variance $\sigma^2$. 👉 Here.

If $2$ populations are involved …

Matched-pair problem: apart from the factor under study, the two members of each pair should resemble one another as closely as possible, or the pair consists of repeated measurements on the same subject. 👉 Here
We have two datasets drawn from $2$ distributions with known variances and want to check whether their population means are as claimed. 👉 Here
The above situation can be extended to the case where we do not know the exact variances but only know that they are equal, i.e. $\sigma_x=\sigma_y$. 👉 Here
The above situation can be relaxed further so that the variances may be unequal, also known as the Behrens-Fisher problem. 👉 Check out
Given two Bernoulli distributed populations, we want to compare their proportions. 👉 Check out here
Given two populations, we want to compare their variances without using the means. 👉 Check out
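As a rough illustration of two of the two-population cases above, the sketch below computes the test statistic and degrees of freedom for the equal-variance (pooled) test and for the unequal-variance test using Welch's approximation. The formulas are the standard textbook ones rather than anything taken from the linked pages, and the function names `pooled_t` and `welch_t` and the data are hypothetical. Turning the statistic into a $p$-value needs the $t$-distribution, which is not in the Python standard library, so that step is left out.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y, d0=0.0):
    """t statistic and degrees of freedom assuming equal population variances."""
    nx, ny = len(x), len(y)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y) - d0) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2

def welch_t(x, y, d0=0.0):
    """t statistic and approximate degrees of freedom without assuming equal variances."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y) - d0) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical samples; H0 claims the two population means are equal (d0 = 0).
x = [5.1, 4.9, 5.6, 5.8, 6.0]
y = [4.2, 4.8, 4.4, 5.0, 4.6]

t_pooled, df_pooled = pooled_t(x, y)
t_welch, df_welch = welch_t(x, y)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With equal sample sizes the two statistics coincide and only the degrees of freedom differ; with unequal sample sizes the statistics themselves generally differ as well.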