Removing duplicates in Excel

This tutorial will show you how to quickly remove duplicate rows in Excel using the XLSTAT software.

Dataset for removing duplicates

The data are fictitious and were created for this tutorial. They represent a sample of sales records from an online shop, including the order ID, the customer ID and the invoice amount.

Goal of this tutorial

Deduplication (deduping) is necessary when observations are mistakenly duplicated (or repeated) due to input errors. Here, we want to remove the duplicated rows from the data in order to obtain a table containing only the unique sales records.

Setting up a duplicate removal with XLSTAT

  1. Once XLSTAT is open, select the Data Management command under the Preparing data menu, as shown below. [Figure: Preparing Data menu in XLSTAT]
  2. The Data management dialog box appears. [Figure: Data management dialog box in XLSTAT]
  3. Select columns A, B and C in the Data field. Then select the Dedupe method. Because the column headers are included in our data selection, we check the Variable labels option.

Click on the OK button. An XLSTAT report will be generated in a new sheet named Dedupe.
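
For readers who want to cross-check the result outside XLSTAT, the same kind of operation can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows are invented (they are not the tutorial's dataset), a duplicate is assumed to be a row that matches an earlier row on every selected column, and the first occurrence is kept. How XLSTAT implements its Dedupe method internally is not described in this tutorial and may differ.

```python
def dedupe_rows(rows):
    """Return rows with exact duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # hashable key built from all selected columns
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample: (order ID, customer ID, invoice amount)
sales = [
    (1001, "C01", 25.0),
    (1002, "C02", 40.5),
    (1001, "C01", 25.0),  # repeat of the first row
    (1003, "C01", 12.0),
    (1002, "C02", 40.5),  # repeat of the second row
]

deduped = dedupe_rows(sales)
removed = len(sales) - len(deduped)
print(f"{removed} duplicated rows removed, {len(deduped)} unique rows kept")
```

Comparing whole rows rather than the order ID alone means that two records sharing an order ID but differing in amount are both kept, which is the cautious choice when the duplicates are assumed to come from input errors.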

Results of a duplicate removal

Three duplicated records were removed from the initial data. A comparison between the initial table and the deduped table, generated by XLSTAT, is shown below. [Figure: XLSTAT output: deduped table]
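
To check which records account for the removed rows, it can help to count how many extra copies of each record the initial table contains. The sketch below does this in plain Python; it is not XLSTAT output, the function name `duplicate_report` is hypothetical, and the sample rows are invented for illustration rather than taken from the tutorial's dataset.

```python
from collections import Counter


def duplicate_report(rows):
    """Map each repeated row to the number of extra copies beyond the first."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n - 1 for row, n in counts.items() if n > 1}


# Hypothetical sample: (order ID, customer ID, invoice amount)
initial = [
    (1001, "C01", 25.0),
    (1002, "C02", 40.5),
    (1001, "C01", 25.0),
    (1003, "C01", 12.0),
    (1002, "C02", 40.5),
    (1001, "C01", 25.0),
]

extra_copies = duplicate_report(initial)
total_removed = sum(extra_copies.values())
for row, n in extra_copies.items():
    print(row, "extra copies:", n)
print("rows a dedupe would remove:", total_removed)
```

The total of the extra copies should equal the difference in row count between the initial table and the deduped table.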
