This article is half-done without your Comment! *** Please share your thoughts via Comment ***
In this post, I am sharing two different SQL scripts to delete the all duplicate rows except the one in SQL Server.
Before a few days ago, I was working on a production report and found duplicates records in the few tables. I worked around this and created the different solutions for removing all duplicate rows.
Let’s first create a sample table and data:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CREATE TABLE tbl_RemoveDuplicate ( ID INTEGER PRIMARY KEY ,Name VARCHAR(150) ) GO INSERT INTO tbl_RemoveDuplicate VALUES (1,'ABC'),(2,'XYZ') ,(3,'XYZ'),(4,'RFQ') ,(5,'PQR'),(6,'EFG') ,(7,'EFG'),(8,'ABC') GO |
First solution is using WITH CTE:
1 2 3 4 5 6 7 8 9 10 |
;WITH cteDuplicate AS ( SELECT Name ,ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) RowNumber FROM tbl_RemoveDuplicate ) DELETE FROM cteDuplicate WHERE RowNumber > 1 GO |
Second solution is using SELF JOIN:
1 2 3 4 5 |
DELETE FROM A FROM tbl_RemoveDuplicate AS A INNER JOIN tbl_RemoveDuplicate AS B ON A.Name = B.Name AND A.ID > B.ID GO |
Check your table after remove the duplicates:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT *FROM tbl_RemoveDuplicate ORDER BY ID --Result: ID Name ----------- ------------ 1 ABC 2 XYZ 4 RFQ 5 PQR 6 EFG (5 row(s) affected) |