Update rows that are duplicated based on more than one column

binarymonkey · October 19, 2007

Hey everyone,

Have tried to look around for hours now for a solution to this, but have not found anything suitable yet!

Basically, I just want to modify a column in rows are duplicated.

Example table:

[pre]ID | ProductName | CatID | Hits

-------------------------------

1 | Prod 1 | 1 | 23

2 | Prod 1 | 1 | 34

3 | Prod 1 | 2 | 45

4 | Prod 2 | 3 | 56

5 | Prod 2 | 3 | 67

6 | Prod 3 | 4 | 78

-------------------------------[/pre]

The rule is, a row is duplicated when the same ProductName and CatID combination exist on another row, regardless of the ID or Hits values. I would like to get the following results:

[pre]ID | ProductName | CatID | Hits

-------------------------------

1 | Prod 1 | 1 | 23

2 | Prod 1 | 1 | 34

4 | Prod 2 | 3 | 56

5 | Prod 2 | 3 | 67

-------------------------------[/pre]

Note that I am not interested in how many times it has been duplicated, just the fact it is a duplicate. The key thing is, I need to get at the ID value.

Then, I would like to set Hits to 0 on the above, resulting in:

[pre]ID | ProductName | CatID | Hits

-------------------------------

1 | Prod 1 | 1 | 0

2 | Prod 1 | 1 | 0

3 | Prod 1 | 2 | 45

4 | Prod 2 | 3 | 0

5 | Prod 2 | 3 | 0

6 | Prod 3 | 4 | 78

-------------------------------[/pre]

Ideally I want to do this all in one query. So far I have this query:

SELECT ID, ProductName, CatID, Hits
FROM some_table
GROUP BY ProductName, CatID
HAVING COUNT(*) > 1

This will only return:

[pre]ID | ProductName | CatID | Hits

-------------------------------

1 | Prod 1 | 1 | 23

4 | Prod 2 | 3 | 56

-------------------------------[/pre]

Clearly, this is only showing the first duplicate for each!

I must be missing something really obvious and I'd really appreciate some help with this one, before I go insane!

Thank you very much!

Matthew

fenway · October 19, 2007

I'll skip the obvious question about duplicates....

You can probably "cheat" in this case... you can use the query you showed to get the ProductName/CatID tuple that have duplicates; but ask for MAX(hits) back. Then use this as a derived table, and join to the entire table -- and simply issue an update statement where the prod/cat match but the hits don't.

Sign In

Update rows that are duplicated based on more than one column

Recommended Posts

binarymonkey

Link to comment

Share on other sites

fenway

Link to comment

Share on other sites

Archived

Browse

Activity

Important Information