We have two independent datasets of sizes $n_x$ and $n_y$, drawn from two normal distributions with known variances, i.e. $x_i\sim \mathcal{N}(\mu_x, \sigma_x^2)$ and $y_i\sim\mathcal{N}(\mu_y, \sigma_y^2)$.

Then

$$z=\frac{\bar{x}-\bar{y}}{\sqrt{\sigma_x^2/n_x + \sigma_y^2/n_y}}$$

follows the $\mathcal{N}(0,1)$ distribution under $H_0$ because

  1. $\mathbb{E}[\bar{x}-\bar{y}]=\mu_x-\mu_y=0$ under $H_0$.
  2. $\text{Var}(\bar{x}-\bar{y})=\text{Var}(\bar{x})+\text{Var}(\bar{y})=\frac{\sigma_x^2}{n_x}+\frac{\sigma_y^2}{n_y}$, since the two samples are independent.
  3. $\bar{x}-\bar{y}$ is normally distributed, being a linear combination of independent normal variables.
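
The three facts above can be checked empirically. The following is a minimal simulation sketch, not part of the original derivation: it assumes $H_0$ holds with a common mean `mu`, and the function name `simulate_z`, the sample sizes, and the parameter values are all illustrative choices. It draws many pairs of samples, computes $z$ for each, and prints the empirical mean and variance, which should be close to 0 and 1.

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def simulate_z(n_x, n_y, mu, sigma_x, sigma_y, reps, seed=0):
    """Simulate the z statistic under H0 (both means equal to mu)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    # Standard deviation of x_bar - y_bar (item 2 in the list above).
    scale = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    zs = []
    for _ in range(reps):
        x_bar = fmean([rng.gauss(mu, sigma_x) for _ in range(n_x)])
        y_bar = fmean([rng.gauss(mu, sigma_y) for _ in range(n_y)])
        zs.append((x_bar - y_bar) / scale)
    return zs


# Illustrative parameters (assumptions, not taken from the text).
zs = simulate_z(n_x=20, n_y=30, mu=1.0, sigma_x=2.0, sigma_y=3.0, reps=5000)
print("mean of z:", fmean(zs))          # expected to be near 0
print("variance of z:", pvariance(zs))  # expected to be near 1
```

A histogram of `zs` would additionally show the bell shape claimed in item 3; the sketch only checks the first two moments.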
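
| $H_0$ | $H_1$ | $p$-value | Decision rule (reject $H_0$ if) |
|---|---|---|---|
| $\mu_x=\mu_y$ | $\mu_x\gt\mu_y$ | $1-\Phi(z)$ | $z\gt z_\alpha$ |
| $\mu_x=\mu_y$ | $\mu_x\lt\mu_y$ | $\Phi(z)$ | $z\lt-z_\alpha$ |
| $\mu_x=\mu_y$ | $\mu_x\neq\mu_y$ | $2\,(1-\Phi(\vert z \vert))$ | $\vert z \vert \gt z_{\alpha/2}$ |

Here $\Phi$ is the standard normal CDF and $z_\alpha$ is the upper-$\alpha$ quantile, i.e. $\Phi(z_\alpha)=1-\alpha$.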
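
As a rough illustration of the test, here is a small Python sketch using only the standard library. The function name `two_sample_z_test`, its `alternative` argument, and the default `alpha=0.05` are assumptions made for this example. It takes the population standard deviations as known inputs, as the $z$ statistic requires, and rejects $H_0$ when the $p$-value is below `alpha`, which is equivalent to comparing $z$ with the critical value $z_\alpha$ (or $z_{\alpha/2}$ for the two-sided case).

```python
from math import sqrt
from statistics import NormalDist, fmean


def two_sample_z_test(x, y, sigma_x, sigma_y, alternative="two-sided", alpha=0.05):
    """Two-sample z-test with known standard deviations sigma_x and sigma_y.

    Returns (z, p_value, reject), where reject is True when H0 is rejected
    at significance level alpha.
    """
    n_x, n_y = len(x), len(y)
    z = (fmean(x) - fmean(y)) / sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    phi = NormalDist().cdf  # standard normal CDF

    if alternative == "greater":      # H1: mu_x > mu_y
        p_value = 1.0 - phi(z)
    elif alternative == "less":       # H1: mu_x < mu_y
        p_value = phi(z)
    elif alternative == "two-sided":  # H1: mu_x != mu_y
        p_value = 2.0 * (1.0 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

    return z, p_value, p_value < alpha


# Toy data, chosen only to make the arithmetic easy to follow.
x = [2.0, 2.0, 2.0, 2.0]
y = [0.0, 0.0, 0.0, 0.0]
z, p, reject = two_sample_z_test(x, y, sigma_x=1.0, sigma_y=1.0, alternative="greater")
print(z, p, reject)
```

With this toy data, $\bar{x}-\bar{y}=2$ and the denominator is $\sqrt{1/4+1/4}=\sqrt{0.5}$, so $z=2/\sqrt{0.5}\approx 2.83$, which exceeds $z_{0.05}\approx 1.645$ and leads to rejecting $H_0$ in favour of $\mu_x\gt\mu_y$.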