Which descriptive statistics tool should you choose?

12/05/2017

A guide to choosing a descriptive statistics tool for your situation

Describing data is an essential part of statistical analysis: it provides a complete picture of the data before moving on to more advanced methods. The statistical methods used for this purpose are called descriptive statistics. They include both numerical tools (e.g. mean, mode, variance) and graphical tools (e.g. histogram, box plot), which summarize a data set and extract important information such as central tendency and dispersion. They can also be used to describe the association between several variables.
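As a rough illustration of the numerical tools mentioned above, the sketch below computes central tendency and dispersion measures with Python's standard `statistics` module. The grades are invented for the example, and the variable names are ours rather than anything taken from XLSTAT.

```python
import statistics

# Hypothetical grades for one classroom (illustrative data only).
grades = [10, 12, 14, 14, 15, 16, 17]

# Central tendency
mean_grade = statistics.mean(grades)
median_grade = statistics.median(grades)
mode_grade = statistics.mode(grades)

# Dispersion
grade_range = max(grades) - min(grades)
variance = statistics.variance(grades)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(grades)       # sample standard deviation
coeff_variation = std_dev / mean_grade   # dispersion relative to the mean

print("mean:", mean_grade, "median:", median_grade, "mode:", mode_grade)
print("range:", grade_range, "variance:", round(variance, 2))
print("std dev:", round(std_dev, 2), "CV:", round(coeff_variation, 3))
```

Note that `variance` and `stdev` are the sample versions; `pvariance` and `pstdev` would be the choice if the data were the whole population.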

In order to choose the right descriptive statistics tool, we need to consider the types and the number of variables we have as well as the objective of the analysis. Based on these three criteria we have generated a grid that will help you decide which tool to use according to your situation. 

The first column of the grid refers to data types:

  • Quantitative: containing variables that describe quantities of the objects of interest. The values are numbers. The weight of an infant is an example of a quantitative variable.
  • Qualitative: containing variables that describe qualities of the objects of interest. The values are called categories, also referred to as levels or modalities. The gender of an infant is an example of a qualitative variable. The possible values are the categories male and female.
  • Mixed: containing both types of variables.
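The three data types above can be sketched in code as a simple check on the values of each variable. This is only a heuristic under a stated assumption: a variable is treated as quantitative when all of its values are numbers, so categories coded as numbers (for example 1 and 2 for gender) would be misclassified. The function names and the infant data are hypothetical.

```python
def variable_type(values):
    """Return 'quantitative' if every value is a number, else 'qualitative'."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_numeric else "qualitative"


def dataset_type(columns):
    """Return 'quantitative', 'qualitative' or 'mixed' for a dict of columns."""
    kinds = {variable_type(values) for values in columns.values()}
    return kinds.pop() if len(kinds) == 1 else "mixed"


# Hypothetical infant data: one quantitative and one qualitative variable.
infants = {
    "weight_kg": [3.2, 3.8, 2.9],
    "gender": ["male", "female", "female"],
}

print(variable_type(infants["weight_kg"]))  # type of the weight variable
print(variable_type(infants["gender"]))     # type of the gender variable
print(dataset_type(infants))                # type of the whole data set
```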

The second column indicates the number of variables. The proposed tools describe either a single variable (univariate analysis), the relationship between two variables (bivariate analysis), or the relationships among several variables. The grid also includes a column with an example for each situation.
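To make the step from univariate to bivariate analysis concrete, here is a minimal sketch of one common bivariate measure, the Pearson correlation coefficient, written out by hand. The soil Pb and biomass values are invented for the example, and `pearson_r` is a helper name of ours, not an XLSTAT function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable is constant.
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical measurements: soil Pb content and plant biomass.
soil_pb = [10, 20, 30, 40, 50]
biomass = [9.0, 7.5, 6.1, 4.4, 3.0]

r = pearson_r(soil_pb, biomass)
print("Pearson r:", round(r, 3))
```

A value close to -1 would suggest that biomass decreases as Pb content increases; a scatterplot of the same two variables is the matching graphical tool.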

Grid

Please note that the list below is not exhaustive. However, it contains the most commonly used descriptive statistics, all available in XLSTAT. 

| Data type | Number of variables | Objective | Example | Numerical tool | Graphical tool |
|---|---|---|---|---|---|
| Quantitative | One variable (univariate analysis) | Estimate a frequency distribution | How many people per age class attended this event? (here the investigated variable is age in a quantitative form) | Frequency table | Histogram |
| Quantitative | One variable (univariate analysis) | Measure the central tendency of one sample | What is the average grade in a classroom? | Mean, median, mode | Box plot, scattergram, strip plot |
| Quantitative | One variable (univariate analysis) | Measure the dispersion of one sample | How widely or narrowly are the grades dispersed around the mean grade in a classroom? | Range, standard deviation, variance, coefficient of variation, quartiles | Box plot, scattergram, strip plot |
| Quantitative | One variable (univariate analysis) | Characterize the shape of a distribution | Is the employee wage distribution in a company symmetric? | Skewness and kurtosis coefficients | Histogram |
| Quantitative | One variable (univariate analysis) | Measure the position of a value within a sample | What data point can be used to split the sample into 95% of low values and 5% of high values? | Quantiles or percentiles | Box plot |
| Quantitative | One variable (univariate analysis) | Detect extreme values | Is a height of 184 cm an extreme value in this group of students? | | Box plot |
| Quantitative | Two variables (bivariate analysis) | Describe the association between two variables | Does plant biomass increase or decrease with soil Pb content? | Correlation coefficients | Correlation map, scatterplot |
| Quantitative | Several variables | Describe the association between multiple variables | How have life expectancy, fertility rate and population size evolved over the last 10 years in this country? | Correlation coefficients | Motion charts (up to 3 variables to describe over time), scatterplot or 3D scatterplot (up to 3 variables to describe) |
| Quantitative | Several variables | Describe the association between three variables under specific conditions | How can the proportions of three ice cream ingredients be visualized across several ice cream samples? | | Ternary diagram |
| Quantitative | Two matrices of several variables | Describe the association between two matrices | Does the evaluation of a series of products differ from one panel to another? | RV coefficient | |
| Qualitative | One variable (univariate analysis) | Compute the frequencies of different categories | How many clients said they were satisfied with the service and how many said they were not? | Frequency table | Bar chart, pie chart |
| Qualitative | One variable (univariate analysis) | Detect the most frequent category | Which is the most frequent hair color in this country? | Mode | Bar chart, pie chart |
| Qualitative | Two variables (bivariate analysis) | Measure the association between two variables | Does the presence of a trace element change according to the presence of another trace element? | Contingency table (or cross-tab) | 3D graph of contingency table, stacked or clustered bars |
| Mixed (quantitative & qualitative) | Two variables (bivariate analysis) | Describe the relationship between a binary and a continuous variable | Is the concentration of a molecule in rats linked to the rats' sex (M/F)? | Biserial correlation | Box plot |
| Mixed (quantitative & qualitative) | Two variables (bivariate analysis) | Describe the relationship between a categorical and a continuous variable | Does sepal length differ between three flower species? | Univariate descriptive statistics for the quantitative variable within each category of the qualitative variable | Box plot |
| Mixed (quantitative & qualitative) | Several variables | Describe the relationship between one categorical and two quantitative variables | Does the amount of money spent on this commercial website change according to the age class and the salary of the customers? | | Scatterplot (with groups) |
Source: Robert Gould and Colleen Ryan, Introductory Statistics: Exploring the World Through Data.
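The qualitative and mixed rows of the grid can be sketched with the Python standard library alone. The example below builds a frequency table and mode for one qualitative variable, a contingency table for two qualitative variables, and per-category summaries for a mixed pair. The survey records and all variable names are invented for illustration; this is not how XLSTAT computes these tables.

```python
import statistics
from collections import Counter, defaultdict

# Hypothetical records: (satisfied?, customer segment, amount spent).
records = [
    ("yes", "new", 20.0),
    ("no", "new", 35.0),
    ("yes", "returning", 50.0),
    ("yes", "returning", 40.0),
    ("no", "returning", 30.0),
    ("yes", "new", 25.0),
]

# One qualitative variable: frequency table and most frequent category (mode).
satisfaction_counts = Counter(sat for sat, _, _ in records)
most_frequent = satisfaction_counts.most_common(1)[0][0]

# Two qualitative variables: contingency table keyed by (satisfaction, segment).
contingency = Counter((sat, seg) for sat, seg, _ in records)

# Mixed data: descriptive statistics of the quantitative variable
# within each category of the qualitative variable.
amounts_by_segment = defaultdict(list)
for _, seg, amount in records:
    amounts_by_segment[seg].append(amount)

summary = {
    seg: {"mean": statistics.mean(vals), "median": statistics.median(vals)}
    for seg, vals in amounts_by_segment.items()
}

print(dict(satisfaction_counts), "most frequent:", most_frequent)
print(dict(contingency))
print(summary)
```

A bar chart of `satisfaction_counts`, stacked bars of `contingency`, and a box plot of the amounts per segment would be the graphical counterparts listed in the grid.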