What is a statistical test?

A statistical test is a way to evaluate the evidence the data provides against a hypothesis. This hypothesis is called the null hypothesis and is often referred to as H0. Under H0, data are generated by random processes. In other words, the controlled processes (the experimental manipulations for example) do not affect the data. Usually, H0 is a statement of equality (equality between averages or between variances or between a correlation coefficient and zero, for example).

H0 is usually opposed to a hypothesis called the alternative hypothesis, referred to as H1 or Ha. Most of the time, the alternative hypothesis is the one the user would like to demonstrate. It involves a statement of difference (difference between averages for example).

If the data do not provide enough evidence against H0, H0 is not rejected. If, instead, the data show strong evidence against H0, H0 is rejected and Ha is considered true with a quantified (low) risk of being wrong. A statistical test therefore leads to one of two decisions: reject H0 or do not reject H0.

Let’s have a look at an example!

Suppose you're comparing two varieties of apples and you're wondering whether the average size of apples from variety 1 differs from the average size of apples from variety 2. Here's how we would write down the null and alternative hypotheses:

H0: average size of apple from variety 1 = average size of apple from variety 2.
Ha: average size of apple from variety 1 ≠ average size of apple from variety 2.

[Figure: bar charts of the real data]
[Figure: bar charts of data following the null hypothesis]

After looking at the above charts, a statistical test can be used to answer the question: how different is my data from equivalent data under the null hypothesis? In other words, how different is my data from equivalent data under a hypothesis where apple size does not change according to variety? This looks like our original question, more or less.
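
The idea of comparing the real data with equivalent data generated under the null hypothesis can be sketched with a permutation test. This is only an illustration, not the method XLSTAT applies: the apple sizes below are made-up values, and the function name `permutation_test` and its parameters are assumptions introduced for this example. If variety has no effect on size, the variety labels are interchangeable, so shuffling them produces data that follow H0.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def permutation_test(sample_1, sample_2, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference between two means.

    Returns an estimated p-value: the share of label shufflings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(sample_1) - mean(sample_2))
    pooled = list(sample_1) + list(sample_2)
    n1 = len(sample_1)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the variety labels at random (H0)
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Count the observed arrangement too, so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical apple diameters in centimeters (invented for illustration).
variety_1 = [7.1, 6.8, 7.4, 7.0, 7.3, 6.9, 7.2, 7.5]
variety_2 = [7.9, 8.1, 7.6, 8.3, 7.8, 8.0, 7.7, 8.2]

p_value = permutation_test(variety_1, variety_2)
print(f"Estimated p-value: {p_value:.4f}")
```

A small estimated p-value means that shuffled (null) data rarely show a difference as large as the one in the real data.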

Other examples of null hypotheses and their alternative hypotheses

H0: the insulin rate of the group of patients receiving a placebo is equal to the insulin rate of patients receiving a medication.
Ha: the insulin rate of the group of patients receiving a placebo is different from the insulin rate of patients receiving a medication.

H0: the presence of attribute A does not affect consumer preference toward this product.
Ha: the presence of attribute A affects consumer preference toward this product.

H0: there is no trend in this time series.
Ha: there is a trend in this time series.

H0: corn fields treated with fertilizers A, B, C or D produce equivalent yields.
Ha: at least one fertilizer induces a difference in corn yield.

How to interpret the output of a statistical test: the significance level alpha and the p-value

When setting up a study, a risk threshold above which H0 should not be rejected must be specified. This threshold is referred to as the significance level alpha and should lie between 0 and 1. Low alpha values are more conservative. The choice of alpha should depend on how dangerous it is to reject H0 when it is actually true. For example, in a study aiming to demonstrate the benefits of a medical treatment, alpha should be low. On the other hand, when screening the effects of many attributes on the appreciation of a product, alpha can be more moderate. Very often, alpha is set at 0.05, 0.01 or 0.001.

The statistical test produces a number called the p-value (which also lies between 0 and 1). The p-value is the probability of obtaining the observed data, or more extreme data, assuming the null hypothesis is true.

More practically, the p-value should be compared to alpha:

If p-value < alpha, we reject H0 in favor of Ha. The risk of wrongly rejecting a true H0 is then at most alpha.
If p-value > alpha, we do not reject H0, but this does not necessarily imply that we should accept it. It means either that H0 is true, or that H0 is false but our experiment and statistical test were not “strong” enough to lead to a p-value lower than alpha.
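
The comparison between the p-value and alpha can be written as a small decision rule. This is a minimal sketch; the function name `decide` and its messages are assumptions made for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    # Not rejecting H0 is not the same as accepting it.
    return "do not reject H0"


# The same p-value can lead to different decisions depending on alpha.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: {decide(0.03, alpha)}")
```

This shows why alpha has to be chosen when the study is set up, before looking at the p-value.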

What is statistical power and in what case can we accept H0?

Statistically speaking, the ability of an experiment or a test to lead to a rejection of the null hypothesis when it is actually false is called statistical power. The power of an experiment increases with alpha, with the precision of the measurements and with the number of repetitions. Power also changes according to the type of statistical test being used (see the last section of this tutorial). Power may be computed before or after an experiment. It equals 1 minus the risk of being wrong when accepting H0 (also called the beta risk). So the higher the power, the lower the risk of being wrong when accepting H0 (when p-value > alpha, of course).

In summary, if p-value > alpha AND statistical power is high enough (usually higher than 0.95), then we may accept H0 with a risk of being wrong equal to (1 – Power).
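
To see how power reacts to alpha, to measurement precision and to the number of repetitions, here is a sketch for one simple case: a two-sided test comparing two means when the standard deviation is assumed to be known (a z-test). The function name `z_test_power`, its parameters and the numbers used are assumptions chosen for illustration, and other tests have different power formulas.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample z-test with known sigma.

    delta: true difference between the two means
    sigma: standard deviation of the measurements (same in both groups)
    n_per_group: number of repetitions in each group
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection threshold
    # True difference expressed in standard errors of the mean difference.
    shift = delta / (sigma * sqrt(2 / n_per_group))
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical setting: true difference 0.5, standard deviation 1.
for n in (10, 20, 50, 100):
    power = z_test_power(delta=0.5, sigma=1.0, n_per_group=n)
    print(f"n per group = {n:3d}: power = {power:.3f}, beta = {1 - power:.3f}")
```

With no true difference (delta = 0), the formula returns alpha, which is the probability of rejecting H0 while it is true.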

What are the kinds of statistical tests?

A statistical test can be:

How do I know what statistical test to use?

Here is a grid which will help you choose an appropriate test according to your question.

How can I run a statistical test in XLSTAT?

In XLSTAT, there is an entire section dedicated to statistical tests.
[Figure: Test a hypothesis menu in XLSTAT]
For example, if you want to test the difference between the means of two samples, then you can use our two-sample t-test.
[Figure: Two-sample t-test and z-test dialog box in XLSTAT]
You just have to select your samples from your Excel sheet and the analysis will compute a test statistic and calculate whether there is a significant difference between the means of the samples.
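
To give an idea of what a test statistic for two sample means looks like, here is a sketch of Welch's version of the two-sample t statistic. It is not a description of XLSTAT's internal computation; the function name `welch_t` and the sample values are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_1, sample_2):
    """Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    # Squared standard error of each sample mean.
    se1_sq = variance(sample_1) / n1
    se2_sq = variance(sample_2) / n2
    t = (mean(sample_1) - mean(sample_2)) / sqrt(se1_sq + se2_sq)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df


# Made-up samples, for illustration only.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.2f}")
```

The p-value is then obtained by comparing the t statistic with a Student's t distribution with that number of degrees of freedom, a step left out of this sketch.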
