This tutorial shows how to perform a **Johnson transformation **on your data in **Excel** using the XLSTAT statistical software.

#### Included in

XLSTAT-Base XLSTAT-Sensory XLSTAT-MarketingXLSTAT-Forecast XLSTAT-Biomed XLSTAT-EcologyXLSTAT-Psy XLSTAT-Quality XLSTAT-Premium## Goal of this tutorial

Many statistical methods assume that either the original data or some outputs are normally distributed. While some phenomena naturally follow a normal distribution, this is most of the time, not the case. While this might not always be a problem for making decisions when the conditions on normality are mild or if the decisions are not critical, this can become a real problem if the decision can be of high impact. In such cases, a transformation is required. In this example, we will use the Johnson transformation.

## Dataset for variable transformation

An Excel sheet with both the data and the results can be downloaded by clicking on the button below:

Download the data

The dataset contains the measurements of waste in the production for 47 batches. We would like to perform a regression with several process variables but the hypothesis of Normality of the variable Level of waste is violated. This can be easily verified using the Normality test and the Histogram XLSTAT functions. The results are displayed below.

We, therefore, need to make a transformation of this variable before attempting a linear regression.

## Setting up a Johnson transformation within XLSTAT

Once XLSTAT is activated, select the **XLSTAT / Preparing data / Variables transformation command**.

The Variables transformation dialog box opens. In the **General **tab, select the variables you wish to transform in the Data field. In this example, we select the variable **Level of waste** in column B.

The most general transformation is an unbiased standardization (Standardize (n-1)). However, there are more transformations available when you activate the option **Other**.

In the **Transformations **tab, select the option **Johnson **transformation. Set the alpha equal to 0.05 and select the Shapiro-Wilk test.

For the Johnson transformation, the algorithm described by Chou et al. (1998) based on the percentiles has been implemented in XLSTAT. The method allows to both find the best form of the three possible Johnson distributions and to estimate good values of the parameters. While the authors use the Shapiro Wilk test to assess the normality, XLSTAT allows you to use this same test or the Anderson-Darling test.

Click on **OK** to start the computations.

## Results of the Johnson transformation

In the result sheet called **Variables transformation**, you will find the Transformed data table.

Let's check now the Normality tests and the histograms on the transformed data. The p-value of all four normality tests are higher than 0.05 thus we cannot reject the null hypothesis that the variable from which the sample was extracted follows a Normal distribution. The histogram also indicates that the transformed variable Level of waste follows a Normal distribution.