revraz Posted January 9, 2009 Share Posted January 9, 2009 Kinda stuck on how to approach this. Table Layout: id cid start duration 1 123 123123 1000 2 123 123123 800 3 123 123123 500 4 123 125125 1000 5 234 333444 500 6 234 333444 800 So what I need to do is build a query that will delete duplicate rows. Duplicate rows is defined as having the same cid and start values, and I need to keep the row with the highest duration. So in the above example, I need to delete ID 2, 3, 5. Can I do this with just SQL or do I need to use code? Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/ Share on other sites More sharing options...
PFMaBiSmAd Posted January 9, 2009 Share Posted January 9, 2009 Technique 1 at this link should work - http://www.xaprb.com/blog/2006/10/11/how-to-delete-duplicate-rows-with-sql/ Replace the MIN(id) as min_id with MAX(duration) to keep the rows you want. Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733423 Share on other sites More sharing options...
revraz Posted January 9, 2009 Author Share Posted January 9, 2009 Thanks for that, I'm glad I was thinking along that same line, but I was making it more complicated that it really was. Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733451 Share on other sites More sharing options...
revraz Posted January 9, 2009 Author Share Posted January 9, 2009 Does this look like it would be it (minus changing the select to delete). I am not 100% sure on what to use as the Group By select bad_rows.* from log as bad_rows inner join ( select cid, start, MAX(duration) as max_id from log group by start having count(*) > 1 ) as good_rows on good_rows.start = bad_rows.start and good_rows.max_id <> bad_rows.duration; Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733464 Share on other sites More sharing options...
revraz Posted January 9, 2009 Author Share Posted January 9, 2009 Actually I think I need to add a cid compare as well. select bad_rows.* from log as bad_rows inner join ( select cid, start, MAX(duration) as max_id from log group by start having count(*) > 1 ) as good_rows on good_rows.start = bad_rows.start and good_rows.cid = bad_rows.cid and good_rows.max_id <> bad_rows.duration; Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733465 Share on other sites More sharing options...
revraz Posted January 9, 2009 Author Share Posted January 9, 2009 Well changing the Group By from start to cid really gives different results, from around 200 rows to 900 rows. I have over 9000 rows, so gonna be tough to see which is right. Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733467 Share on other sites More sharing options...
revraz Posted January 9, 2009 Author Share Posted January 9, 2009 Since I want to filter out the dupes based on the CID and with dupe STARTS, I would think I would want to group by CID, anyone agree? Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733472 Share on other sites More sharing options...
PFMaBiSmAd Posted January 9, 2009 Share Posted January 9, 2009 The following worked for me on your example data - $query = " select bad_rows.* from log as bad_rows inner join ( select cid, start, MAX(duration) as max_dur from log group by cid,start having count(*) > 1 ) as good_rows on good_rows.cid = bad_rows.cid AND good_rows.start = bad_rows.start and good_rows.max_dur <> bad_rows.duration"; If you have the same duration values (cid,start, and duration all match) the query won't find them and delete them due to the <> comparison in the above. You would need to use a second query that just forms groups and deletes one of the rows when count(*) > 1 Quote Link to comment https://forums.phpfreaks.com/topic/140159-solved-trying-to-delete-dupe-rows-but-keep-1/#findComment-733477 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.