Data anonymization tutorial in Excel
This tutorial shows how to set up and perform data anonymization in Excel using the XLSTAT software.
Dataset for running anonymization in XLSTAT
The data used in this example is a sample of survey results. Rows represent respondents and columns represent private information gathered via the survey, such as postal code, study level and salary. The objective of this tutorial is to anonymize the survey results.
Setting up the data anonymization dialog box in XLSTAT
Once XLSTAT is open, go to the Preparing data menu and select Data anonymization.
The Data anonymization dialog box appears.
In the General tab, select the data on the Excel sheet that you want to transform. Select the Sheet option to display the results on a new sheet, and check the Variable labels option so that the first row of the data table is treated as labels. Then select column A in the Observation labels field.
To anonymize the variable labels (sex, postal code, study level and salary), activate the Anonymized labels option.
In the Options tab, select the Random method and check the Trim spaces option.
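To see why trimming spaces can matter before anonymizing, here is a minimal Python sketch. It assumes that trimming means removing leading and trailing whitespace from each value; the exact behavior of the XLSTAT option is not described in this tutorial, and the sample values are made up for illustration.

```python
# Hypothetical study-level values; two of them carry stray spaces.
levels = ["Bachelor", "Bachelor ", " Master", "Master"]

# Without trimming, each spelling counts as its own category.
distinct_raw = set(levels)

# With trimming (assumed here to be a strip of leading/trailing whitespace),
# spellings that differ only by surrounding spaces collapse into one category.
distinct_trimmed = {value.strip() for value in levels}

print(len(distinct_raw))      # 4
print(len(distinct_trimmed))  # 2
```

Under this assumption, untrimmed data would yield four anonymized codes for what are really two study levels.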
In the Missing data tab, choose the way you want to deal with missing values.
In the Outputs tab, select all the proposed outputs. Then, click on the OK button to start computations. The results are displayed on a new sheet named Data anonymization.
Interpreting the results of data anonymization in XLSTAT
The first output is a table summarizing the initial data, displayed in the same order as in the original data sheet.
The second table presents the anonymized variables. Qualitative ones have been replaced by character strings and quantitative ones have been shuffled. Variable labels have been replaced by random character strings.
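The effect of shuffling a quantitative variable can be sketched in Python as follows. This is only an illustration of the general idea, not XLSTAT's implementation; the function name `shuffle_column`, the seed, and the salary figures are hypothetical.

```python
import random

def shuffle_column(values, seed=None):
    """Return a shuffled copy of a column, leaving the input untouched."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical salary column, one value per respondent.
salaries = [32000, 45000, 51000, 38000, 60000]
shuffled = shuffle_column(salaries, seed=42)

# The same values are present, so column-level statistics are unchanged.
print(sorted(shuffled) == sorted(salaries))  # True
print(sum(shuffled) == sum(salaries))        # True
```

Because the values themselves are kept, the column's distribution (mean, variance, quantiles) is preserved, while a value can no longer be tied to the respondent it came from.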
The last table provides the correspondence between the initial and the anonymized qualitative variables.
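A correspondence table of this kind can be sketched in Python as a mapping from each initial category to a random code. The function name `build_correspondence`, the six-letter code format, and the sample values are assumptions made for illustration; the format XLSTAT actually uses for its random strings is not specified here.

```python
import random
import string

def build_correspondence(values, seed=None, length=6):
    """Map each distinct category to a unique random uppercase code."""
    rng = random.Random(seed)
    table = {}
    used = set()
    for value in values:
        if value in table:
            continue  # category already has a code
        code = "".join(rng.choices(string.ascii_uppercase, k=length))
        while code in used:  # redraw in the unlikely event of a collision
            code = "".join(rng.choices(string.ascii_uppercase, k=length))
        used.add(code)
        table[value] = code
    return table

# Hypothetical qualitative column.
study_levels = ["Bachelor", "Master", "Bachelor", "PhD", "Master"]
table = build_correspondence(study_levels, seed=1)
anonymized = [table[value] for value in study_levels]

# Inverting the table recovers the initial categories.
reverse = {code: value for value, code in table.items()}
restored = [reverse[code] for code in anonymized]
print(restored == study_levels)  # True
```

Since the table makes it possible to map the codes back to the original categories, it is probably best stored separately from any anonymized data that is shared.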