This page describes how to remove duplicate rows in Excel, using three different methods.
If you want to remove duplicate cells (rather than entire rows of data), you may find the Remove Duplicate Cells page more straightforward.
We first show how to use Excel's Remove Duplicates Command to remove duplicate rows and then we show how to use use Excel's Advanced Filter to perform the same task. Finally, we show how to remove duplicate rows using Excel Formulas.
The Remove Duplicates command is located in the 'Data Tools' group, within the Data tab of the Excel ribbon.
To remove duplicate rows using this command:
You will be presented with the Remove Duplicates dialog box, as shown below:
This dialog box allows you to select which columns of your data set you want to be included in the comparison for duplicate information. In the example spreadsheet above, we only want a record to be removed if the contents of all three columns contain duplicate information. Therefore we leave all three fields selected within the dialog box.
Once you have ensured that the required fields are checked in the dialog box, click OK.
Excel will then delete the duplicate rows, as required and will present you with a message, informing you of the number of records removed and the number of unique records remaining (see below).
The Excel advanced filter has an option that allows you to filter unique records (rows of data) in a spreadsheet and copy the resulting filtered list to a new location.
This gives you a data set that contains the first occurrence of a duplicated row, but does not contain any further occurrences.
To remove duplicate rows using the Advanced Filter:
Select the Excel Advanced Filter option from the Data tab at the top of your Excel workbook(or in Excel 2003, this option is located in Data→Filter menu).
Within this dialog box:
The resulting spreadsheet, with the new data list in column E, is shown below:
It can be seen that the duplicate row 7 (for Laura CARTER, id: #31032) has been removed from the data in columns E-G.
Warning: This method will only work if the contents of your cells are less than 256 characters in length, as Excel functions cannot handle text strings that are longer than this.
In order to illustrate how to use Excel formulas to remove duplicate rows in an Excel spreadsheet, we will again use the simple example spreadsheet (repeated on the rightabove), that contains the personal data (forename, surname and ID Number) of nine individuals.
The first step of removing the duplicate rows is to combine the contents of the columns A-C into a single column. We will then highlight the rows corresponding to the duplicate values, before deleting these rows.
We first combine the data from columns A-C of the example spreadsheet, using the concatenation & operator in column D. The formula to be entered into cell D2 is:
Copying this formula down all rows gives the following spreadsheet:
Once the contents of columns A-C have been concatenated into column D, we need to find the duplicates in the combined column D.
As shown in the formula bar of the above spreadsheet, the format of the Countif function in cell E2 is:
|=COUNTIF( D$2:D3, D3 )|
=COUNTIF( D$2:D4, D4 )
=COUNTIF( D$2:D5, D5 )
Once we have used the Countif function to highlight the duplicates in column D of the example spreadsheet, we need to delete the rows for which the count is greater than 1.
The following steps show how to remove several duplicate rows at once, (after they have been highlighted using the Countif function):
You will be left with a spreadsheet in which the first occurrence of every row is hidden. I.e. only the duplicate rows are displayed.
Remove the filter and you will be left with the spreadsheet shown below, in which the duplicate row 7 has been removed.
You can now delete the columns containing your formulas (columns D and E in the example spreadsheet), to return to the original spreadsheet format.