Skip to content

Manually Deleting Stinks!

2012 January 4
tags:
by Shannon Lowder

trash can / ゴミ箱

Go ahead and set up your TroubledTable by running the Create Troubled Table T-SQl Script.  Now take a look at your duplicate rows.

SELECT email, COUNT(*)
FROM troubledTable
GROUP BY email
HAVING COUNT(*) > 1

You’ve got 200 rows that have been duplicated over and over.  Not a pretty sight.  Imagine how long it you would take you to retrieve each distinct row, and manually clear out the duplicate rows.

SELECT * FROM troubledtable
WHERE email = 'elit.a.feugiat@Etiam.com'

Returns:

FirstName LastName email
Clarke Glass elit.a.feugiat@Etiam.com
Clarke Glass elit.a.feugiat@Etiam.com
Clarke Glass elit.a.feugiat@Etiam.com
Clarke Glass elit.a.feugiat@Etiam.com

You’d have to then construct two T-SQL statements for each duplicate (all 200 of them).

DELETE FROM troubledtable WHERE email = 'elit.a.feugiat@Etiam.com'

INSERT INTO troubledtable
(firstname, lastname, email)
VALUES
('Clarke','Glass','elit.a.feugiat@Etiam.com')

That would take a very long time!  Most of us would argue it would take too long, and look for another method of de-duping that data, right?  Now if you had only one or two dupes, you might still consider this method.  Let’s look at one more reason you might not want to do that.  Look at the execution plan for that delete.

That table scan could be an issue if your table is quite large (1 million+ rows).  While you could change the table scan into a index seek by adding an index, you’d then create an additional operation during the delete: an index update.  If your table is a high transaction volume table, you may not want to trade the performance loss of an index operation for the gain in performance on your look up.

You need to keep performance concerns in mind when developing your solutions.  That way you can have the best answer as well as the fastest answer!  Next time we’re going to take a look at using temp tables to get rid of those pesky dupes.  As always, if you have any questions, please let me know. I’m here to help!

Leave a Reply

Note: You can use basic XHTML in your comments. Your email address will never be published.

Subscribe to this comment feed via RSS