Hypothesis

  • A hypothesis is a claim about a population parameter.
  • Null hypothesis $H_0$ is the maintained assumption unless there’s strong evidence against it.
  • Alternative hypothesis $H_1$

Core Logic

The core logic of hypothesis testing is simple. We take $H_0$ as an assumption and, under that assumption, compute the probability of observing a test statistic at least as extreme as ours (the statistic is usually computed from a sample).

Take the critical value approach as an example. If our test statistic surpasses the threshold (the critical value), then under the assumption $H_0$ such a value is unlikely to occur, so the assumption is likely to be false and we reject $H_0$.
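The critical value approach can be sketched in a few lines. This is a minimal illustration, assuming a right-tailed $z$-test for a mean with known population standard deviation; the function name `z_test_critical` and all numbers are hypothetical and not part of the notes above.

```python
from math import sqrt
from statistics import NormalDist


def z_test_critical(sample_mean, mu0, sigma, n, alpha=0.05):
    """Right-tailed z-test of H0: mu = mu0 against H1: mu > mu0.

    Assumes the population standard deviation sigma is known.
    Returns (z statistic, critical value, whether H0 is rejected).
    """
    # Standardise the sample mean under the assumption that H0 holds.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Threshold that a standard normal variable exceeds with probability alpha.
    critical = NormalDist().inv_cdf(1 - alpha)
    return z, critical, z > critical


# Hypothetical numbers: H0 claims mu = 100, and 36 observations average 103.
z, critical, reject = z_test_critical(sample_mean=103, mu0=100, sigma=9, n=36)
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

With these made-up numbers the statistic lies beyond the threshold, so $H_0$ is rejected at the 5% level.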

Type I and II Errors

A false rejection of $H_0$ is called a Type I Error (rejecting $H_0$ while $H_0$ is true).

A false acceptance of $H_0$ is called a Type II Error (accepting $H_0$ while $H_1$ is true).
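Both error types can be estimated by simulation. The sketch below assumes normally distributed data with known standard deviation and a right-tailed $z$-test; the helper `rejection_rate`, the sample size, and the alternative mean $0.5$ are illustrative choices, not values from these notes. Note that the Type II error rate depends on which alternative is actually true.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=4000, seed=0):
    """Fraction of simulated samples in which a right-tailed z-test rejects H0."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        # Draw one sample from the "true" population and test H0: mu = mu0.
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > critical:
            rejections += 1
    return rejections / trials


# H0 is true (mu = mu0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=0.0)
# H1 is true (mu = 0.5 > mu0): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=0.5)
print(f"Type I rate ~ {type_1_rate:.3f}, Type II rate ~ {type_2_rate:.3f}")
```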

Significance Level and Power of Test

Significance level $\alpha$ is defined as $\alpha=\mathbb{P}(\text{Reject }H_0 \mid H_0\text{ is true})$.

Power of the test $\pi=1-\mathbb{P}(\text{Accept }H_0 \mid H_1 \text{ is true})=1-\beta$, which is the probability of correctly rejecting $H_0$. Here $\beta$ is the probability of a Type II Error.
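For a simple case the power can be computed in closed form. The following sketch assumes a right-tailed $z$-test with known standard deviation and one specific alternative mean `mu1`; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu1, mu0, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)
    # When the true mean is mu1, the z statistic is normal with this mean
    # and unit variance.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    beta = std_normal.cdf(critical - shift)  # P(accept H0 | H1 is true)
    return 1 - beta


power = power_right_tailed_z(mu1=0.5, mu0=0.0, sigma=1.0, n=25)
# When mu1 equals mu0, the rejection probability reduces to alpha itself.
size = power_right_tailed_z(mu1=0.0, mu0=0.0, sigma=1.0, n=25)
print(f"power = {power:.3f}, size = {size:.3f}")
```

The second call is a sanity check: plugging the null value in as the "alternative" recovers the significance level.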

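$p$-value. With the critical value approach, every new choice of $\alpha$ requires recomputing the critical value and redoing the comparison, so we use the $p$-value to avoid this repetition. The $p$-value is the smallest $\alpha$ at which $H_0$ is still rejected, so $H_0$ is rejected exactly when the $p$-value is at most $\alpha$.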
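A small sketch of how a single $p$-value settles the decision for several significance levels, again assuming a right-tailed $z$-test with known standard deviation. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value_right_tailed_z(sample_mean, mu0, sigma, n):
    """p-value of a right-tailed z-test of H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a statistic at least as large as the observed one.
    return 1 - NormalDist().cdf(z)


# Hypothetical numbers: H0 claims mu = 100, and 36 observations average 103.
p = p_value_right_tailed_z(sample_mean=103, mu0=100, sigma=9, n=36)

# One p-value answers the reject/accept question for every alpha at once.
decisions = {alpha: p <= alpha for alpha in (0.10, 0.05, 0.01)}
print(f"p-value = {p:.4f}", decisions)
```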

Testing Procedure

  1. specify the null and alternative hypotheses
  2. construct the test statistic
  3. derive the distribution of the test statistic under the null
  4. specify a level of significance
  5. determine the decision rule by finding the critical value
  6. study the power of the test
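The six steps can be traced in code. The sketch below assumes a two-sided $z$-test for a mean with known standard deviation; the data, the null value $50$, and the alternative $52$ used for the power calculation are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Hypothetical data: 10 measurements from a process with known sigma = 2.
sample = [51.2, 49.8, 52.5, 50.9, 53.1, 48.7, 51.8, 52.2, 50.4, 52.4]
sigma = 2.0

# 1. Hypotheses: H0: mu = 50 against H1: mu != 50.
mu0 = 50.0

# 2. Test statistic: the standardised sample mean.
n = len(sample)
z = (fmean(sample) - mu0) / (sigma / sqrt(n))

# 3. Distribution under the null: z is standard normal.
null_dist = NormalDist(mu=0.0, sigma=1.0)

# 4. Level of significance.
alpha = 0.05

# 5. Decision rule: reject H0 when |z| exceeds the two-sided critical value.
critical = null_dist.inv_cdf(1 - alpha / 2)
reject = abs(z) > critical

# 6. Power against one specific alternative, here mu = 52:
#    probability that z falls in either rejection tail when mu = 52.
shift = (52.0 - mu0) / (sigma / sqrt(n))
power = null_dist.cdf(-critical - shift) + 1 - null_dist.cdf(critical - shift)

print(f"z = {z:.3f}, critical = {critical:.3f}, reject H0: {reject}, power = {power:.3f}")
```

In practice the power is usually studied over a range of alternatives rather than a single one; the single value here only keeps the sketch short.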

Common Cases

Some common tests and scenarios are listed here:

If only $1$ population is involved …

  • Given the population variance, check if the population mean is as claimed. 👉 Here
  • Check if the population mean is as claimed, without knowledge of the population variance. 👉 Here. Also introduces the $t$-distribution.
  • Check the population proportion under a Bernoulli distribution. 👉 Here
  • If we only know the sample variance $s^2$, but we want to check the population variance $\sigma^2$. 👉 Here.

If $2$ populations are involved …

  • Matched-pair problem: apart from the factor under study, the pairs should resemble one another as closely as possible; this also covers repeated measurements. 👉 Here
  • We have two datasets drawn from $2$ distributions with known variances, and want to check if their population means are as claimed. 👉 Here
  • The above situation can be further extended to the case where we don’t know the exact variances but only know they are equal, i.e. $\sigma_x=\sigma_y$ 👉 Here
  • The above situation is further relaxed so that the variances may also be unequal, a.k.a. the Behrens-Fisher problem. 👉 Check out
  • Given two Bernoulli distributed populations, we want to compare their proportions. 👉 Check out here
  • Given two populations, we want to test their variances without using the means. 👉 Check out