This tutorial explains how to perform and interpret a Shapiro-Wilk and other normality tests in Excel using XLSTAT.
Normality tests: what null hypothesis are we challenging?
Normality tests involve the null hypothesis that the variable from which the sample is drawn follows a normal distribution. Thus, a low p-value indicates a low risk of being wrong when stating that the data are not normal. In other words, if p-value < alpha risk threshold, the data are significantly not normal.
Normality of residuals is a required assumptions in common statistical modeling methods.
Dataset for Shapiro-Wilk and other normality tests
An Excel sheet with both the data and the results can be downloaded by clicking on the button below:
Download the data
The data represent two samples each containing the average math score of 1000 students.
Setting up a Shapiro-Wilk and other normality tests
We then want to tests the normality of the two samples. Select the XLSTAT / Describing data / Normality tests, or click on the corresponding button of the Describing data menu.
Once you've clicked on the button, the dialog box appears.
Select the two samples in the Data field.
The Q-Q plot option is activated to allow us to visually check the normality of the samples.
The computations begin once you have clicked on the OK button, and the results are displayed on a new sheet.
Interpreting the results of a Shapiro-Wilk and other normality tests
The results are first displayed for the first sample and then for the second sample.
The first result displayed is the Q-Q plot for the first sample. The Q-Q plot allows to compare the cumulative distribution function (cdf) of the sample (abscissa) to the cumulative distribution function of normal distribution with the same mean and standard deviation (ordinates). In the case of sample following a normal distribution, we should observe an alignment with the first bisecting line. In the other cases some deviations from the bisecting line should be observed.
We can see here that the empiric cdf is very close to the bissecting line. The Shapiro-Wilk and Jarque-Bera confirm that we cannot reject the normality assumption for the sample. We notice that with the Shapiro-Wilk test, the risk of being wrong when rejecting the null assumption is smaller than with the Jarque-Bera test.
The following results are for the second sample. Contrary to what we observed for the first sample, we notice here on the Q-Q plot that there are two strong deviations indicating that the distribution is most probably not normal.
This gap is confirmed by the normality tests (see below) which allow to assert with no hesitation that we need to reject the hypothesis that the sample might be normally distributed.
As a conclusion, in this tutorial, we have seen how to test two sample for normality using Shapiro-Wilk and Jarque-Bera tests. These tests did not reject the normality assumption for the first sample and allowed us to reject it for the second sample.