Parallel Coordinates Visualization in Excel tutorial

2017-03-02

This tutorial shows how to draw parallel coordinates plots in Excel using XLSTAT. These plots are useful for describing the clusters obtained from a clustering analysis.

Dataset to create a Parallel Coordinates plot

An Excel sheet with both the data and the results can be downloaded by clicking here.

The data used in this tutorial have been extracted from a 1994 survey by the US Census Bureau. The data set is built so that half of the observations correspond to individuals with a revenue below 50k$, and the other half to individuals with a revenue greater than 50k$. For all the individuals in the sample, the country of origin is the USA.

Our goal is to visualize whether some of the descriptors (Age, Number of years of study, Race, Sex, Hours-per-week) influence the Revenue of the individuals.

Setting up the Parallel Coordinates plot dialog box

Once XLSTAT is activated, select the XLSTAT / Visualizing data / Parallel Coordinates command, or click on the corresponding button of the Visualizing Data toolbar (see below).

barpcor.gif

Once you have clicked on the button, the dialog box appears.

Select the data on the Excel sheet. This tool accepts a mix of numerical and nominal variables. The Groups information is used to color the lines.

We activated the Mean lines option to let XLSTAT display for each group a line that corresponds to the mean of the quantitative variables and to the mode of the nominal variables.
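To make the idea of a mean line concrete, here is a minimal Python sketch of the per-group summary described above: the mean for quantitative variables and the mode for nominal ones. It only illustrates the principle and is not XLSTAT's implementation; the function name `mean_lines` and the toy rows are hypothetical and are not taken from the census data.

```python
from collections import defaultdict
from statistics import mean, mode

# Hypothetical toy rows (not the census data used in this tutorial).
rows = [
    {"Revenue": "<=50k", "Age": 25, "Sex": "Female"},
    {"Revenue": "<=50k", "Age": 35, "Sex": "Male"},
    {"Revenue": "<=50k", "Age": 30, "Sex": "Female"},
    {"Revenue": ">50k", "Age": 45, "Sex": "Male"},
    {"Revenue": ">50k", "Age": 55, "Sex": "Male"},
]


def mean_lines(rows, group_key):
    """Return one summary row per group: mean for numbers, mode for labels."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    summary = {}
    for group, members in groups.items():
        line = {}
        for name in members[0]:
            if name == group_key:
                continue
            values = [member[name] for member in members]
            if all(isinstance(value, (int, float)) for value in values):
                line[name] = mean(values)   # quantitative variable
            else:
                line[name] = mode(values)   # nominal variable
        summary[group] = line
    return summary


profiles = mean_lines(rows, "Revenue")
for group, line in profiles.items():
    print(group, line)
```

Each entry of `profiles` corresponds to one mean line: a single summary value per axis for the group.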

The Rescale option puts all the variables on a common scale, which makes it possible to compare how the data are distributed across variables and makes the plot easier to read.
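The tutorial does not state which transformation the Rescale option applies. A common choice for parallel coordinates is min-max rescaling of each variable to the range [0, 1], and the sketch below assumes that choice. The function name `rescale` and the sample values are hypothetical.

```python
def rescale(values):
    """Min-max rescale a list of numbers to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # Constant variable: place every point in the middle of the axis.
        return [0.5 for _ in values]
    return [(value - low) / (high - low) for value in values]


# Hypothetical values measured on very different scales.
ages = [20, 35, 50]
hours = [10, 40, 70]

scaled_ages = rescale(ages)
scaled_hours = rescale(hours)
print(scaled_ages, scaled_hours)
```

After rescaling, both variables span the same vertical range, so their axes can be compared directly.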

pcor1.gif

Move on to the Chart tab, where you can decide how the plot will look. Select the option Display as many lines as possible.

pcor2.gif

Once you click the OK button, the chart is displayed on a new Excel sheet (because the Sheet option has been selected for outputs).
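To show how such a chart is built, the following Python sketch draws one polyline per observation across parallel vertical axes and colors it by group, writing the result as an SVG string. It is a simplified illustration under stated assumptions, not a reproduction of the XLSTAT output: it handles numerical variables only, needs at least two axes, and the function name `parallel_coordinates_svg`, the colors and the toy rows are all hypothetical.

```python
def parallel_coordinates_svg(rows, axes, group_key, colors, width=400, height=200):
    """Build an SVG string with one polyline per row, colored by group."""
    margin = 20
    step = (width - 2 * margin) / (len(axes) - 1)  # horizontal gap between axes
    # Per-axis minimum and maximum, used to rescale each variable.
    bounds = {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in axes}

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">']
    # One vertical line per variable.
    for i, _ in enumerate(axes):
        x = margin + i * step
        parts.append(
            f'<line x1="{x:.1f}" y1="{margin}" x2="{x:.1f}" y2="{height - margin}" stroke="black"/>'
        )
    # One polyline per observation.
    for row in rows:
        points = []
        for i, axis in enumerate(axes):
            low, high = bounds[axis]
            t = 0.5 if high == low else (row[axis] - low) / (high - low)
            x = margin + i * step
            y = (height - margin) - t * (height - 2 * margin)  # SVG y grows downward
            points.append(f"{x:.1f},{y:.1f}")
        joined = " ".join(points)
        color = colors[row[group_key]]
        parts.append(f'<polyline points="{joined}" fill="none" stroke="{color}"/>')
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical toy rows (not the census data used in this tutorial).
rows = [
    {"Revenue": "<=50k", "Age": 20, "Hours": 40},
    {"Revenue": ">50k", "Age": 50, "Hours": 60},
    {"Revenue": "<=50k", "Age": 35, "Hours": 20},
]
svg = parallel_coordinates_svg(
    rows, ["Age", "Hours"], "Revenue", {"<=50k": "steelblue", ">50k": "darkorange"}
)
print(svg)
```

Saving the printed string to a file with an .svg extension and opening it in a browser shows the three lines crossing the two axes.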

Interpreting a Parallel Coordinates plot

In the Parallel Coordinates plot, you can see that being a white man in the upper age bracket, with a higher number of years of study and a high number of hours worked per week, increases the likelihood of having a revenue above 50k$. However, the number of hours is not very discriminant, as the means of the two groups (below and above 50k$) are close.

pcor3.gif
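The visual judgment that a variable is "not very discriminant" can be expressed as a number: the gap between the two group means, taken as a fraction of the variable's overall range. The sketch below is one possible way to do this and is not a statistic reported by XLSTAT; the function name `mean_gap` and the values are hypothetical and are not the tutorial's results.

```python
from statistics import mean


def mean_gap(low_group, high_group):
    """Gap between two group means, as a fraction of the variable's overall range."""
    combined = low_group + high_group
    spread = max(combined) - min(combined)
    if spread == 0:
        return 0.0  # constant variable: the groups cannot differ
    return abs(mean(high_group) - mean(low_group)) / spread


# Hypothetical values for the two revenue groups.
years_low, years_high = [8, 9, 10], [14, 15, 16]
hours_low, hours_high = [30, 40, 50], [32, 42, 52]

gap_years = mean_gap(years_low, years_high)
gap_hours = mean_gap(hours_low, hours_high)
print(gap_years, gap_hours)
```

With these made-up values, the gap for years of study is much larger than the gap for hours, which is the pattern a reader would see as widely separated versus nearly overlapping mean lines.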

This video shows you how to run this tutorial.
