A guide to choosing an appropriate test according to the situation
We have drawn the grid below to guide you through the choice of an appropriate statistical test according to your question and the data you have. The guide proposes a formulation of the null hypothesis as well as an example for each situation. Conditions of validity of parametric tests are listed in the paragraph following the grid. When available, nonparametric equivalents are proposed. In some situations, parametric tests do not exist, so only nonparametric solutions are proposed.
For more details about statistical testing please read this tutorial.
For a quick introduction to the difference between parametric and nonparametric tests, please read this tutorial.
The displayed tests are the most commonly used tests in statistics. They are all available in XLSTAT. Please note that the list is not exhaustive, and that many other situations and tests exist. Please scroll down to see the grid.
|Test family||Question||Data||Null Hypothesis||Example||Parametric tests||Conditions of validity (parametric tests)||Non-parametric equivalents|
|Compare locations*||Compare an observed mean to a theoretical one||Measurements on one sample and 1 theoretical mean (1 number)||Observed mean = theoretical mean||Compare an observed pollution rate to a standard||One-sample t-test||2||One sample Wilcoxon signed rank test|
|Compare two observed means (independent samples)||Measurements on two samples||means* are identical||Compare hemoglobin concentration between two groups of patients||t-test on two independent samples||1 ; 3 ; 5||Mann-Whitney's test|
|Test the equivalence between two samples||Measurements on two samples||means* are different||Check if the effect of medication A is the same as the effect of medication B on the concentration of a molecule in mice||Equivalence test (TOST)||1 ; 3 ; 5|
|Compare several observed means (independent samples)||Measurements on several samples||means* are identical||Compare corn yields according to 4 different fertilizers||ANOVA||1 ; 3 ; 4 ; 6||Kruskal-Wallis test; Mood's test|
|Compare two observed means (dependent measurements)||Two series of quantitative measurements on the same units (before-after…)||means* are identical||Compare the mean hemoglobin concentration before and after a treatment has been applied to a group of patients||t-test on two paired samples||10||Wilcoxon's test|
|Compare several observed means (dependent measurements)||Several series of quantitative measurements on the same units||means* are identical||Follow the concentration of a trace element in a group of plants across time||Repeated measures ANOVA, mixed models||10 ; Sphericity||Friedman's test for complete block designs; Durbin, Skillings-Mack's test for incomplete block designs; Page test for cases where series scores are expected to increase or to decrease (across time for example)|
|Compare series of binary data||Compare series of binary data (dependent measurements)||Several series of binary measurements on the same units||Locations* are identical||A group of assessors (units) evaluate the presence/absence of an attribute in a group of products||McNemar's test (for 2 series); Cochran's Q test (for more than 2 series)|
|Compare variances||Compare 2 variances (could be used to test assumption 3)||Measurements on two samples||variance(1) = variance(2)||Compare the natural dispersion of size in 2 different varieties of a fruit||Fisher's test|
|Compare several variances (could be used to test assumption 3)||Measurements on several samples||variance(1) = variance(2) = variance(n)||Compare the natural dispersion of size in several different varieties of a fruit||Levene's test|
|Compare proportions||Compare an observed proportion to a theoretical one||1 observed proportion with its associated sample size, one theoretical proportion||observed proportion = theoretical proportion||Compare the proportion of females to a proportion of 0.5 in a sample||Tests for one proportion (chi-square)|
|Compare observed proportions to each other||Sample size associated to every category||proportion(1) = proportion(2) = proportion(n)||Compare the proportions of different eye colors in a sample||Chi-square|
|Compare observed proportions to theoretical ones||Sample size and theoretical proportion associated to every category||observed proportions = theoretical proportions||Compare observed F1xF1 cross-breeding frequencies to Mendelian frequencies (1/4, 1/2, 1/4)||Multinomial Goodness-Of-Fit test|
|Association tests||Test the association between two qualitative variables||Contingency table or two qualitative variables||variable 1 & variable 2 are independent||Is the presence of a trace element linked to the presence of another trace element?||Chi-square on contingency table||1 ; 9||Fisher's exact test; Monte Carlo method|
|Test the association between two qualitative variables across several strata||Several contingency tables or two qualitative variables with a stratum identifier||variable 1 & variable 2 are independent||Is the presence of a trace element linked to the presence of another trace element? Assessed over several sites (strata)||Cochran-Mantel-Haenszel (CMH) test|
|Test the association between two quantitative variables||Measurements of two quantitative variables on the same sample||variable 1 & variable 2 are independent||Does plant biomass change with soil Pb content?||Pearson's correlation test||7 ; 8||Spearman's correlation test|
|Test the association between a binary variable and a quantitative one||Measurements on one binary variable and one quantitative variable||the two variables are independent||Is the concentration of a molecule in rats linked to the rats' sex (M/F)?||Biserial correlation||Normality of the quantitative variable|
|Test the association between a series of proportions and an ordinal variable||Contingency table or proportions and sample sizes||Proportions do not change according to the ordinal variable||Did birth rates change from year to year during the last decade?||Cochran-Armitage trend test|
|Test the association between two tables of quantitative variables||Two tables of quantitative variables||Tables are independent||Does the evaluation of a series of products on a series of attributes change from one panel to another?||RV coefficient test|
|Test the association between two proximity matrices||Two proximity matrices||Proximity matrices are independent||Is geographic distance between populations of bees correlated with genetic distance?||Mantel's test|
|Time series tests||Test the presence of a trend across time||One series of data sorted by date (time series)||There is no trend across time for the measured variable||Did the stock price change over the last 10 years?||Mann-Kendall trend test|
|Tests on distributions||Compare an observed distribution to a theoretical one||Measurements of a quantitative variable on one sample; parameters of the theoretical distribution||The observed and the theoretical distributions are the same||Do the salaries of a company follow a normal distribution with mean 2500 and standard deviation 150?||Kolmogorov-Smirnov's test|
|Compare two observed distributions||Measurements of a quantitative variable on two samples||The two samples follow the same distribution||Is the distribution of human weight the same in those two geographical regions?||Kolmogorov-Smirnov's test|
|Test the normality of a series of measurements (could be used to test assumptions 2, 4, 7)||Measurements on one sample||The sample follows a normal distribution||Is the observed sample distribution significantly different from a normal distribution?||Normality tests|
|Tests for outliers||Test for outliers||Measurements on one sample||The sample does not contain an outlier (following the normal distribution)||Is this data point an outlier?||Dixon's test / Grubbs test||Boxplot (not a statistical test)|
*Locations are means in parametric tests and mean ranks in nonparametric equivalents.
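To illustrate how the grid is read, the "Compare locations" rows can be encoded as a small lookup. This is only a sketch: the names `LOCATION_TESTS` and `suggest_location_test` are hypothetical, only the first test listed in each cell of the grid is kept, and the equivalence (TOST) row is left out. It does not replace checking the conditions of validity listed below.

```python
# Hypothetical helper: encodes only the "Compare locations" rows of the grid.
# Keys: (number of series, dependent measurements?)
# Values: (parametric test, nonparametric equivalent), as named in the grid.
LOCATION_TESTS = {
    ("one", False): ("One-sample t-test", "One sample Wilcoxon signed rank test"),
    ("two", False): ("t-test on two independent samples", "Mann-Whitney's test"),
    ("several", False): ("ANOVA", "Kruskal-Wallis test"),
    ("two", True): ("t-test on two paired samples", "Wilcoxon's test"),
    ("several", True): ("Repeated measures ANOVA", "Friedman's test"),
}


def suggest_location_test(n_series, dependent=False, parametric=True):
    """Return the test name the grid proposes for comparing locations."""
    if n_series < 1:
        raise ValueError("n_series must be at least 1")
    if n_series == 1:
        size = "one"
    elif n_series == 2:
        size = "two"
    else:
        size = "several"
    if size == "one" and dependent:
        raise ValueError("dependent measurements need at least two series")
    parametric_test, nonparametric_test = LOCATION_TESTS[(size, dependent)]
    return parametric_test if parametric else nonparametric_test


# Two groups of patients, independent samples, parametric conditions assumed.
print(suggest_location_test(2))
# Before/after measurements on the same patients, conditions not met.
print(suggest_location_test(2, dependent=True, parametric=False))
```

The `parametric` flag is meant to be set only after the relevant conditions of validity have been assessed for your data.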
Conditions of validity of parametric tests
The validity conditions we propose are rules of thumb. There are no precise rules in the literature. We strongly advise you to refer to the specific recommendations of your field.
1) Measurements are independent
2) The population from which the sample was extracted follows a normal distribution (assumed or verified)
3) Samples have equal variances
4) The residuals follow a normal distribution (assumed or verified)
5) At least 20 individuals per sample, or normality of the population of every sample verified or assumed
6) At least 20 individuals in the whole experiment, or normality of residuals assumed or verified
7) Every variable follows a normal distribution
8) At least 20 individuals in the sample (recommended)
9) Theoretical frequencies should be at least 5 in every cell of the table
10) Differences between series should follow normal distributions
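Condition 9 can be checked by hand before running a chi-square test on a contingency table. The sketch below assumes that the theoretical frequency of a cell is the usual expected count under independence (row total × column total / grand total) and that the table is non-empty. The function names and the two example tables are hypothetical and are not part of XLSTAT.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_condition_9(table, minimum=5):
    """True when every theoretical (expected) frequency is at least `minimum`."""
    return all(
        value >= minimum
        for row in expected_frequencies(table)
        for value in row
    )


# Made-up 2x2 contingency tables (rows and columns are two presence/absence variables).
large = [[12, 8], [5, 15]]
small = [[3, 1], [2, 4]]

print(expected_frequencies(large))  # [[8.5, 11.5], [8.5, 11.5]]
print(meets_condition_9(large))     # True
print(meets_condition_9(small))     # False
```

When the check fails, as for the second table, the grid suggests falling back on Fisher's exact test or the Monte Carlo method.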