Hypothesis
- A hypothesis is a claim about a population parameter.
- The null hypothesis $H_0$ is the assumption we maintain unless there is strong evidence against it.
- The alternative hypothesis $H_1$ is the competing claim that we accept when $H_0$ is rejected.
Core Logic
The core logic of hypothesis testing is simple. We take $H_0$ as our assumption and, under that assumption, compute the probability of observing our test statistic (usually computed from a sample).
The conclusion then follows from how the statistic behaves under the hypothesis $H_0$:
- Suppose we are using the critical value approach. If the test statistic exceeds the threshold, then under the assumption $H_0$ such a value is unlikely to occur, so the assumption is likely false.
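
The critical value approach can be sketched in a few lines. This is a minimal illustration, assuming a two-sided z-test on a mean with known population standard deviation; every number (`mu0`, `sigma`, `n`, `sample_mean`, `alpha`) is hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
mu0 = 100.0          # population mean claimed by the null hypothesis
sigma = 15.0         # population standard deviation, assumed known
n = 36               # sample size
sample_mean = 106.0  # observed sample mean
alpha = 0.05         # significance level

# Test statistic: standardized distance between the sample mean and mu0.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Two-sided critical value: under the null, |Z| exceeds it with probability alpha.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Decision rule: reject the null when the statistic lands beyond the threshold.
reject_h0 = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject null: {reject_h0}")
```

Here the statistic is 2.4 standard errors away from the claimed mean, which is beyond the two-sided 5% threshold of roughly 1.96, so the null hypothesis is rejected.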
Type I and II Errors
A false rejection of $H_0$ is called a Type I error (rejecting $H_0$ while $H_0$ is true).
A false acceptance of $H_0$ is called a Type II error (accepting $H_0$ while $H_1$ is true).
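
Both error types can be made concrete with a small simulation. The sketch below assumes a two-sided z-test with known variance; the setup (`mu0`, `sigma`, `n`, `alpha`, and the alternative mean `mu_alt`) is hypothetical, and the helper `rejects` is introduced only for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical setup.
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05
mu_alt = 0.5  # hypothetical true mean used when the null is false
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

def rejects(true_mu):
    """Draw one sample from N(true_mu, sigma^2) and run the two-sided z-test of mu = mu0."""
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    return abs(z) > z_crit

trials = 5000
# Type I error: the null is true (true mean = mu0) but the test rejects it.
type1_rate = sum(rejects(mu0) for _ in range(trials)) / trials
# Type II error: the null is false (true mean = mu_alt) but the test does not reject it.
type2_rate = sum(not rejects(mu_alt) for _ in range(trials)) / trials

print(f"estimated Type I rate:  {type1_rate:.3f}")
print(f"estimated Type II rate: {type2_rate:.3f}")
```

The estimated Type I rate should land near the chosen `alpha`, while the Type II rate depends on how far `mu_alt` is from `mu0` and on the sample size.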
Significance Level and Power of Test
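The significance level is defined as $\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$.
The power of the test is $1 - \beta$, where $\beta = P(\text{Type II error})$; it is the probability of correctly rejecting $H_0$.
$p$-value. For each different $\alpha$ we would have to redo the test against a new critical value, so we report the $p$-value to avoid the repetition. The $p$-value is the smallest $\alpha$ at which $H_0$ is still rejected.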
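
For a z-test the power has a closed form, so it can be computed directly rather than simulated. The following is a sketch under stated assumptions: an upper-tailed test on a mean with known standard deviation, with the function name `power_upper_tail_z` and all numeric inputs being hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def power_upper_tail_z(mu0, mu1, sigma, n, alpha):
    """Power of the z-test of mu = mu0 against mu > mu0 when the true mean is mu1."""
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value of the test
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # mean of the statistic when the true mean is mu1
    # The test rejects when Z > z_alpha; under mu1 the statistic is N(shift, 1).
    return 1 - std_normal.cdf(z_alpha - shift)

power = power_upper_tail_z(mu0=100, mu1=105, sigma=15, n=36, alpha=0.05)
beta = 1 - power  # probability of a Type II error at mu1

# Sanity check: when the true mean equals mu0, the rejection probability is alpha.
power_at_null = power_upper_tail_z(mu0=100, mu1=100, sigma=15, n=36, alpha=0.05)

print(f"power = {power:.3f}, beta = {beta:.3f}, rejection rate under null = {power_at_null:.3f}")
```

Increasing `n` or moving `mu1` further from `mu0` raises the power, which is why the power is always quoted for a specific alternative.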
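
The relationship between the p-value and the significance level can be checked numerically. This is a minimal sketch assuming a two-sided z-test; the observed statistic `z = 2.4` and the helper `two_sided_p_value` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()

def two_sided_p_value(z):
    """Probability, under the null, of a statistic at least as extreme as z in either tail."""
    return 2 * (1 - std_normal.cdf(abs(z)))

z = 2.4  # hypothetical observed test statistic
p_value = two_sided_p_value(z)

# One p-value answers the question for every significance level at once:
# the null is rejected at level alpha exactly when p_value <= alpha.
decisions = {alpha: p_value <= alpha for alpha in (0.10, 0.05, 0.01)}

for alpha, reject in decisions.items():
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # The critical value approach gives the same decision at each level.
    print(f"alpha = {alpha:.2f}: p-value rule -> {reject}, critical value rule -> {abs(z) > z_crit}")
```

With this statistic the null is rejected at the 10% and 5% levels but not at the 1% level, without recomputing anything except the comparison.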
Testing Procedure
- specify the null and alternative hypotheses
- construct the test statistic
- derive the distribution of the test statistic under the null
- specify a level of significance
- determine the decision rule by finding the critical value
- study the power of the test
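
The six steps above can be walked through on a small example. The sketch below assumes an upper-tailed test on a population proportion using the large-sample normal approximation; the data (`n`, `successes`), the null value `p0`, and the alternative `p1` used for the power are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# 1. Specify the hypotheses: null p = p0, alternative p > p0.
p0 = 0.5

# Hypothetical data: 60 successes in 100 Bernoulli trials.
n, successes = 100, 60
p_hat = successes / n

# 2. Construct the test statistic.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# 3. Distribution under the null: approximately standard normal for large n
#    (normal approximation to the binomial, an assumption of this sketch).

# 4. Specify the level of significance.
alpha = 0.05

# 5. Decision rule: reject when z exceeds the upper-tail critical value.
z_crit = std_normal.inv_cdf(1 - alpha)
reject_h0 = z > z_crit

# 6. Study the power at a hypothetical alternative p1.
p1 = 0.6
p_hat_crit = p0 + z_crit * sqrt(p0 * (1 - p0) / n)  # rejection threshold on the p_hat scale
power = 1 - std_normal.cdf((p_hat_crit - p1) / sqrt(p1 * (1 - p1) / n))

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject null: {reject_h0}")
print(f"approximate power at p1 = {p1}: {power:.3f}")
```

Note that the standard error in step 2 uses `p0`, because the statistic's distribution is derived under the null, while the power in step 6 uses `p1`, because it is evaluated under the alternative.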
Common Cases
Some common tests and scenarios are listed here:
If only one population is involved …
| Case | Link |
|---|---|
| Given the population variance, check whether the population mean is as claimed. | 👉 Here |
| Check the population mean without knowing the population variance. | 👉 Here. Also introduces the $t$-distribution |
| Check the population proportion under a Bernoulli distribution. | 👉 Here |
| Knowing only the sample variance $s^2$, check the population variance $\sigma^2$. | 👉 Here |
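
For quick reference, the usual textbook test statistics for these one-population cases are summarized below. The notation is an assumption of this summary and may differ from the linked pages: $\bar x$ is the sample mean, $s^2$ the sample variance, $\hat p$ the sample proportion, $n$ the sample size, and $\mu_0$, $p_0$, $\sigma_0^2$ the values claimed under the null hypothesis. The $t$ and $\chi^2$ results assume a normal population, and the proportion test relies on a large-sample normal approximation.

$$
\begin{aligned}
\text{Mean, known variance:}\quad & z = \frac{\bar x - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1) \\
\text{Mean, unknown variance:}\quad & t = \frac{\bar x - \mu_0}{s/\sqrt{n}} \sim t_{n-1} \\
\text{Proportion:}\quad & z = \frac{\hat p - p_0}{\sqrt{p_0 (1 - p_0)/n}} \approx N(0, 1) \\
\text{Variance:}\quad & \chi^2 = \frac{(n-1)\, s^2}{\sigma_0^2} \sim \chi^2_{n-1}
\end{aligned}
$$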
If two populations are involved …
- Matched-pair problem: apart from the factor under study, the pairs should resemble one another as closely as possible; this also covers repeated measurements. 👉 Here
- We have two datasets drawn from distributions with known variances and want to check whether their population means are as claimed. 👉 Here
- The above situation can be extended to the case where we don’t know the exact variances but know they are equal, i.e. $\sigma_1^2 = \sigma_2^2$. 👉 Here
- The above situation can be relaxed further so that the variances may be unequal, also known as the Behrens-Fisher problem. 👉 Check out
- Given two Bernoulli-distributed populations, we want to compare their proportions. 👉 Check out here
- Given two populations, we want to test their variances without using the means. 👉 Check out
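
The usual textbook statistics for these two-population cases, in the same order, are sketched below. The notation is an assumption of this summary and may differ from the linked pages: subscripts 1 and 2 index the two samples, $\bar d$ and $s_d$ are the mean and standard deviation of the paired differences, $\delta_0$ is the difference in means claimed under the null, and $x_1$, $x_2$ are the success counts. For the unequal-variance case, Welch's approximation is shown as one common approach to the Behrens-Fisher problem, not the only one. The $t$ and $F$ results assume normal populations, and the matched-pair and two-proportion lines are written for a null of no difference.

$$
\begin{aligned}
\text{Matched pairs:}\quad & t = \frac{\bar d}{s_d/\sqrt{n}} \sim t_{n-1} \\
\text{Known variances:}\quad & z = \frac{(\bar x_1 - \bar x_2) - \delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0, 1) \\
\text{Equal unknown variances:}\quad & t = \frac{(\bar x_1 - \bar x_2) - \delta_0}{s_p \sqrt{1/n_1 + 1/n_2}} \sim t_{n_1 + n_2 - 2}, \qquad s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} \\
\text{Unequal unknown variances:}\quad & t = \frac{(\bar x_1 - \bar x_2) - \delta_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \quad \text{(approximately } t \text{ with Welch-Satterthwaite degrees of freedom)} \\
\text{Two proportions:}\quad & z = \frac{\hat p_1 - \hat p_2}{\sqrt{\hat p (1 - \hat p)\,(1/n_1 + 1/n_2)}} \approx N(0, 1), \qquad \hat p = \frac{x_1 + x_2}{n_1 + n_2} \\
\text{Two variances:}\quad & F = \frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\; n_2 - 1}
\end{aligned}
$$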