Jump to content

[SOLVED] Trying to Delete dupe rows but keep 1


revraz

Recommended Posts

Kinda stuck on how to approach this.

 

Table Layout:

 

id    cid    start        duration

1    123    123123    1000

2    123    123123    800

3    123    123123    500

4    123    125125    1000

5    234    333444    500

6    234    333444    800

 

So what I need to do is build a query that will delete duplicate rows.  Duplicate rows is defined as having the same cid and start values, and I need to keep the row with the highest duration.  So in the above example, I need to delete ID 2, 3, 5.

 

Can I do this with just SQL or do I need to use code?

Does this look like it would be it (minus changing the select to delete).  I am not 100% sure on what to use as the Group By

 

select bad_rows.*

from log as bad_rows

  inner join (

      select cid, start, MAX(duration) as max_id

      from log

      group by start

      having count(*) > 1

  ) as good_rows on good_rows.start = bad_rows.start

      and good_rows.max_id <> bad_rows.duration;

Actually I think I need to add a cid compare as well.

 

select bad_rows.*

from log as bad_rows

  inner join (

      select cid, start, MAX(duration) as max_id

      from log

      group by start

      having count(*) > 1

  ) as good_rows on good_rows.start = bad_rows.start

      and good_rows.cid = bad_rows.cid

      and good_rows.max_id <> bad_rows.duration;

The following worked for me on your example data -

 

$query = "
select bad_rows.*
from log as bad_rows
   inner join (
      select cid, start, MAX(duration) as max_dur
      from log
      group by cid,start
      having count(*) > 1
   ) as good_rows on good_rows.cid = bad_rows.cid AND good_rows.start = bad_rows.start
      and good_rows.max_dur <> bad_rows.duration";

 

If you have the same duration values (cid,start, and duration all match) the query won't find them and delete them due to the <> comparison in the above. You would need to use a second query that just forms groups and deletes one of the rows when count(*) > 1

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.