Jump to content

PDO/SQL Delete Duplicates


JayDz

Recommended Posts

Hello Everyone!

I would like to have a sql query which deletes all duplicate fields.
So when a user logs in to our site it sends a log to our database, I want to delete duplicates with a cron-job.
I dont know much about SQL or PDO so maybe some of you know the query to delete duplicates?
Heres what the it looks like (name = logs):
36f6c887584f36fe396c5a264bf8739b.png
 

Link to comment
Share on other sites

You don't specify what constitutes a duplicate so here is an example of how to do it which you can apply to your table. When you have deleted your duplicates, put a UNIQUE key on the relevant column/s to stop it happening again.

 

My table

mysql> SELECT * FROM articles ORDER BY category_id;
+------------+-------------+---------+
| article_id | category_id | title   |
|    (PK)    |             |         |
+------------+-------------+---------+
|          1 |           1 | Art 1 1 |
|          2 |           1 | Art 1 2 |
|          3 |           1 | Art 1 3 |
|          4 |           2 | Art 2 1 |
|          6 |           3 | Art 3 1 |
|          7 |           3 | Art 3 2 |
|          8 |           3 | Art 3 3 |
|          9 |           3 | Art 3 4 |
|          5 |           4 | Art 4 1 |
+------------+-------------+---------+
9 rows in set (0.00 sec)

Task: delete duplicate category_ids

 

The query uses a subquery to allocate a sequence number to each row withing each category so we can then delete those with a sequence number > 1

DELETE articles 
FROM articles
INNER JOIN 
	(
    SELECT
          article_id
	, @seq := IF(category_id<>@prevcat, 1, @seq+1) as seq
        , @prevcat := category_id as category_id
        FROM articles
	  JOIN (SELECT @prevcat:=null, @seq:=0) as init
        ORDER BY category_id
    ) as count USING (article_id)
WHERE seq > 1

results

mysql> SELECT * FROM articles;
+------------+-------------+---------+
| article_id | category_id | title   |
+------------+-------------+---------+
|          1 |           1 | Art 1 1 |
|          4 |           2 | Art 2 1 |
|          5 |           4 | Art 4 1 |
|          6 |           3 | Art 3 1 |
+------------+-------------+---------+
4 rows in set (0.00 sec)
Edited by Barand
  • Like 1
Link to comment
Share on other sites

So you are inserting a new row when the user logs in? Which results in duplicate rows over time. In that case you want to setup a unique key constraint, such as add the usern field as a Unique Key then use a INSERT ON DUPLICATE UPDATE query. Example

INSERT INTO logs (usern, passwd, id, `date`, active, number) VALUES ('$username', '$password', $id, '$auth', NOW(), '$active', $number)
ON DUPLICATE KEY UPDATE `date` = NOW();

The above query will attempt to insert a new row, if there is already a row with the same username then it will update that rows date value to the current date. If you want the existing row to be deleted and a new row to be inserted then use a REPLACE query

 

Or if you do not want the row to be updated or deleted then use a INSERT IGNORE query

Link to comment
Share on other sites

 

You don't specify what constitutes a duplicate so here is an example of how to do it which you can apply to your table. When you have deleted your duplicates, put a UNIQUE key on the relevant column/s to stop it happening again.

 

My table

 

​Do you know the query how I could do this? Im bad at SQL :s

 

 

So you are inserting a new row when the user logs in? Which results in duplicate rows over time. In that case you want to setup a unique key constraint, such as add the usern field as a Unique Key then use a INSERT ON DUPLICATE UPDATE query. Example

INSERT INTO logs (usern, passwd, id, `date`, active, number) VALUES ('$username', '$password', $id, '$auth', NOW(), '$active', $number)

ON DUPLICATE KEY UPDATE `date` = NOW();

The above query will attempt to insert a new row, if there is already a row with the same username then it will update that rows date value to the current date. If you want the existing row to be deleted and a new row to be inserted then use a REPLACE query

 

Or if you do not want the row to be updated or deleted then use a INSERT IGNORE query

I'm not looking to replace the current logging system, just a delete function sorry.

 

Link to comment
Share on other sites

You're not too good at reading either. I gave you a query that will do it. Apply the same method using your table and column names

Hey man, sorry but this is only for one column?

I want it to delete every duplicate (not date, active, number) so if the usern, passwd, matches but the auth doesn't match it should just be deleted

Link to comment
Share on other sites

JayDz,

 

Your request and follow up comments are not very specific and are leading to confusion. The reasons the others were suggesting adding a unique constraint to the table is they are under the assumption that you don't want "duplicate" records to begin with. But, I don't think we are really talking about duplicates. The way I understand it is you have a table that logs when users log in and you want to remove old records for users instead of maintaining a history.

 

So, I will ask a different question. Do you only want one record for each user to record the last successful login for each user? Or do you need to maintain some prior login records: e.g. a certain number or however many going back a certain number of days? If you only want to maintain one record for each user, then you do not need (or want) to implement a process to remove "duplicate" records. Instead, you would just do an "INSERT ON DUPLICATE KEY" query. That will either add a new record if one doesn't exist for the user or, if there is an existing record, it will overwrite it with the new data. That way you have just one query to insert/update the records in the table and there is no need for any secondary process to delete records. If you want something along the lines of the other scenarios I mentioned there are better solutions than what you are asking.

 

So, please describe what the purpose is of the table and what you are trying to accomplish in plain English rather than providing specifics of the mechanics that you think you want to accomplish.

Link to comment
Share on other sites

I got too many log points so I cannot be asked changing all of them, thats why I want this script to delete duplicates.
So I only want to delete the fields where the usern, passwd, auth, id is the same.

This way everytime a user logs in on one of our sites theres a log to our database which is what we want.

Link to comment
Share on other sites

you should only have one instance of the login code on any site.

 

what you are describing you want to do is react to a problem by cleaning up a mess in it after the fact. programming is a proactive process, not a reactive one.

I made this thread hoping someone of you knew what sql query to use, im not requesting to change the management of all my sites.

Im not going to change logins/systems, just looking for a query, and btw, my primary key is number

Link to comment
Share on other sites

I substituted your column and table names for you.

 

(This will remove duplicates of user/passwd/id and keep the latest)

DELETE logs 
FROM logs
INNER JOIN 
    (
    SELECT
          number
	, @seq := IF(CONCAT(usern, passwd, id)<>@prevvals, 1, @seq+1) as seq
        , @prevvals := CONCAT(usern, passwd, id) as prev
    FROM logs
	  JOIN (SELECT @prevvals:=null, @seq:=0) as init
    ORDER BY usern, passwd, id, date DESC
    ) as count USING (number)
WHERE seq > 1;
  • Like 1
Link to comment
Share on other sites

 

I substituted your column and table names for you.

 

(This will remove duplicates of user/passwd/id and keep the latest)

DELETE logs 
FROM logs
INNER JOIN 
    (
    SELECT
          number
	, @seq := IF(CONCAT(usern, passwd, id)<>@prevvals, 1, @seq+1) as seq
        , @prevvals := CONCAT(usern, passwd, id) as prev
    FROM logs
	  JOIN (SELECT @prevvals:=null, @seq:=0) as init
    ORDER BY usern, passwd, id, date DESC
    ) as count USING (number)
WHERE seq > 1;

Thanks man, appreciate it. I wanted usern, passwd, id, auth to be checked for duplicated so I added auth:

DELETE logs 
FROM logs
INNER JOIN 
    (
    SELECT
          number
, @seq := IF(CONCAT(usern, passwd, id, auth)<>@prevvals, 1, @seq+1) as seq
        , @prevvals := CONCAT(usern, passwd, id, auth) as prev
    FROM logs
 JOIN (SELECT @prevvals:=null, @seq:=0) as init
    ORDER BY usern, passwd, id, auth, date DESC
    ) as count USING (number)
WHERE seq > 1;

Although it gave the error:  #1054 - Unknown column 'number' in 'field list'

Number field looks like this:

 04866671e1be13d92d1e7cb23e812566.png

Link to comment
Share on other sites

Weird!

I started with

mysql> select * from logs order by usern,passwd,id,auth,date;
+--------+---------+------+------+------------+--------+--------+
| usern  | passwd  | id   | auth | date       | active | number |
+--------+---------+------+------+------------+--------+--------+
| aaaaaa | xxxxxxx |    1 | A    | 2015-09-04 |      1 |      4 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-01 |      1 |      1 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-02 |      1 |      2 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-03 |      1 |      3 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-05 |      1 |      5 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-01 |      1 |      6 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-02 |      1 |      7 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-03 |      1 |      8 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-04 |      1 |      9 |
| cccccc | zzzzzzz |    3 | A    | 2015-09-06 |      1 |     10 |
+--------+---------+------+------+------------+--------+--------+

then I ran your version of the query and was left with

+--------+---------+------+------+------------+--------+--------+
| usern  | passwd  | id   | auth | date       | active | number |
+--------+---------+------+------+------------+--------+--------+
| aaaaaa | xxxxxxx |    1 | A    | 2015-09-04 |      1 |      4 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-05 |      1 |      5 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-04 |      1 |      9 |
| cccccc | zzzzzzz |    3 | A    | 2015-09-06 |      1 |     10 |
+--------+---------+------+------+------------+--------+--------+
Link to comment
Share on other sites

 

Weird!

I started with

mysql> select * from logs order by usern,passwd,id,auth,date;
+--------+---------+------+------+------------+--------+--------+
| usern  | passwd  | id   | auth | date       | active | number |
+--------+---------+------+------+------------+--------+--------+
| aaaaaa | xxxxxxx |    1 | A    | 2015-09-04 |      1 |      4 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-01 |      1 |      1 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-02 |      1 |      2 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-03 |      1 |      3 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-05 |      1 |      5 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-01 |      1 |      6 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-02 |      1 |      7 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-03 |      1 |      8 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-04 |      1 |      9 |
| cccccc | zzzzzzz |    3 | A    | 2015-09-06 |      1 |     10 |
+--------+---------+------+------+------------+--------+--------+

then I ran your version of the query and was left with

+--------+---------+------+------+------------+--------+--------+
| usern  | passwd  | id   | auth | date       | active | number |
+--------+---------+------+------+------------+--------+--------+
| aaaaaa | xxxxxxx |    1 | A    | 2015-09-04 |      1 |      4 |
| aaaaaa | xxxxxxx |    1 | C    | 2015-09-05 |      1 |      5 |
| bbbbbb | yyyyyyy |    2 | A    | 2015-09-04 |      1 |      9 |
| cccccc | zzzzzzz |    3 | A    | 2015-09-06 |      1 |     10 |
+--------+---------+------+------+------------+--------+--------+

Nvm bro it worked perfectly, did something wrong myself.

Thanks man, so awesome! <3

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.