JayDz Posted September 10, 2015 Share Posted September 10, 2015 Hello Everyone!I would like to have a sql query which deletes all duplicate fields.So when a user logs in to our site it sends a log to our database, I want to delete duplicates with a cron-job.I dont know much about SQL or PDO so maybe some of you know the query to delete duplicates?Heres what the it looks like (name = logs): Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/ Share on other sites More sharing options...
Barand Posted September 10, 2015 Share Posted September 10, 2015 (edited) You don't specify what constitutes a duplicate so here is an example of how to do it which you can apply to your table. When you have deleted your duplicates, put a UNIQUE key on the relevant column/s to stop it happening again. My table mysql> SELECT * FROM articles ORDER BY category_id; +------------+-------------+---------+ | article_id | category_id | title | | (PK) | | | +------------+-------------+---------+ | 1 | 1 | Art 1 1 | | 2 | 1 | Art 1 2 | | 3 | 1 | Art 1 3 | | 4 | 2 | Art 2 1 | | 6 | 3 | Art 3 1 | | 7 | 3 | Art 3 2 | | 8 | 3 | Art 3 3 | | 9 | 3 | Art 3 4 | | 5 | 4 | Art 4 1 | +------------+-------------+---------+ 9 rows in set (0.00 sec) Task: delete duplicate category_ids The query uses a subquery to allocate a sequence number to each row withing each category so we can then delete those with a sequence number > 1 DELETE articles FROM articles INNER JOIN ( SELECT article_id , @seq := IF(category_id<>@prevcat, 1, @seq+1) as seq , @prevcat := category_id as category_id FROM articles JOIN (SELECT @prevcat:=null, @seq:=0) as init ORDER BY category_id ) as count USING (article_id) WHERE seq > 1 results mysql> SELECT * FROM articles; +------------+-------------+---------+ | article_id | category_id | title | +------------+-------------+---------+ | 1 | 1 | Art 1 1 | | 4 | 2 | Art 2 1 | | 5 | 4 | Art 4 1 | | 6 | 3 | Art 3 1 | +------------+-------------+---------+ 4 rows in set (0.00 sec) Edited September 10, 2015 by Barand 1 Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520630 Share on other sites More sharing options...
Ch0cu3r Posted September 10, 2015 Share Posted September 10, 2015 So you are inserting a new row when the user logs in? Which results in duplicate rows over time. In that case you want to setup a unique key constraint, such as add the usern field as a Unique Key then use a INSERT ON DUPLICATE UPDATE query. Example INSERT INTO logs (usern, passwd, id, `date`, active, number) VALUES ('$username', '$password', $id, '$auth', NOW(), '$active', $number) ON DUPLICATE KEY UPDATE `date` = NOW(); The above query will attempt to insert a new row, if there is already a row with the same username then it will update that rows date value to the current date. If you want the existing row to be deleted and a new row to be inserted then use a REPLACE query Or if you do not want the row to be updated or deleted then use a INSERT IGNORE query Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520632 Share on other sites More sharing options...
JayDz Posted September 11, 2015 Author Share Posted September 11, 2015 You don't specify what constitutes a duplicate so here is an example of how to do it which you can apply to your table. When you have deleted your duplicates, put a UNIQUE key on the relevant column/s to stop it happening again. My table Do you know the query how I could do this? Im bad at SQL :s So you are inserting a new row when the user logs in? Which results in duplicate rows over time. In that case you want to setup a unique key constraint, such as add the usern field as a Unique Key then use a INSERT ON DUPLICATE UPDATE query. ExampleINSERT INTO logs (usern, passwd, id, `date`, active, number) VALUES ('$username', '$password', $id, '$auth', NOW(), '$active', $number)ON DUPLICATE KEY UPDATE `date` = NOW(); The above query will attempt to insert a new row, if there is already a row with the same username then it will update that rows date value to the current date. If you want the existing row to be deleted and a new row to be inserted then use a REPLACE query Or if you do not want the row to be updated or deleted then use a INSERT IGNORE query I'm not looking to replace the current logging system, just a delete function sorry. Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520650 Share on other sites More sharing options...
Barand Posted September 11, 2015 Share Posted September 11, 2015 Do you know the query how I could do this? Im bad at SQL :s You're not too good at reading either. I gave you a query that will do it. Apply the same method using your table and column names Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520654 Share on other sites More sharing options...
JayDz Posted September 11, 2015 Author Share Posted September 11, 2015 You're not too good at reading either. I gave you a query that will do it. Apply the same method using your table and column names Hey man, sorry but this is only for one column? I want it to delete every duplicate (not date, active, number) so if the usern, passwd, matches but the auth doesn't match it should just be deleted Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520660 Share on other sites More sharing options...
Psycho Posted September 11, 2015 Share Posted September 11, 2015 JayDz, Your request and follow up comments are not very specific and are leading to confusion. The reasons the others were suggesting adding a unique constraint to the table is they are under the assumption that you don't want "duplicate" records to begin with. But, I don't think we are really talking about duplicates. The way I understand it is you have a table that logs when users log in and you want to remove old records for users instead of maintaining a history. So, I will ask a different question. Do you only want one record for each user to record the last successful login for each user? Or do you need to maintain some prior login records: e.g. a certain number or however many going back a certain number of days? If you only want to maintain one record for each user, then you do not need (or want) to implement a process to remove "duplicate" records. Instead, you would just do an "INSERT ON DUPLICATE KEY" query. That will either add a new record if one doesn't exist for the user or, if there is an existing record, it will overwrite it with the new data. That way you have just one query to insert/update the records in the table and there is no need for any secondary process to delete records. If you want something along the lines of the other scenarios I mentioned there are better solutions than what you are asking. So, please describe what the purpose is of the table and what you are trying to accomplish in plain English rather than providing specifics of the mechanics that you think you want to accomplish. Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520661 Share on other sites More sharing options...
JayDz Posted September 11, 2015 Author Share Posted September 11, 2015 I got too many log points so I cannot be asked changing all of them, thats why I want this script to delete duplicates.So I only want to delete the fields where the usern, passwd, auth, id is the same.This way everytime a user logs in on one of our sites theres a log to our database which is what we want. Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520665 Share on other sites More sharing options...
Barand Posted September 11, 2015 Share Posted September 11, 2015 What is the primary key on that table? Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520671 Share on other sites More sharing options...
mac_gyver Posted September 11, 2015 Share Posted September 11, 2015 I got too many log points ... you should only have one instance of the login code on any site. what you are describing you want to do is react to a problem by cleaning up a mess in it after the fact. programming is a proactive process, not a reactive one. Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520682 Share on other sites More sharing options...
JayDz Posted September 12, 2015 Author Share Posted September 12, 2015 you should only have one instance of the login code on any site. what you are describing you want to do is react to a problem by cleaning up a mess in it after the fact. programming is a proactive process, not a reactive one. I made this thread hoping someone of you knew what sql query to use, im not requesting to change the management of all my sites. Im not going to change logins/systems, just looking for a query, and btw, my primary key is number Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520716 Share on other sites More sharing options...
Barand Posted September 12, 2015 Share Posted September 12, 2015 I substituted your column and table names for you. (This will remove duplicates of user/passwd/id and keep the latest) DELETE logs FROM logs INNER JOIN ( SELECT number , @seq := IF(CONCAT(usern, passwd, id)<>@prevvals, 1, @seq+1) as seq , @prevvals := CONCAT(usern, passwd, id) as prev FROM logs JOIN (SELECT @prevvals:=null, @seq:=0) as init ORDER BY usern, passwd, id, date DESC ) as count USING (number) WHERE seq > 1; 1 Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520722 Share on other sites More sharing options...
JayDz Posted September 13, 2015 Author Share Posted September 13, 2015 I substituted your column and table names for you. (This will remove duplicates of user/passwd/id and keep the latest) DELETE logs FROM logs INNER JOIN ( SELECT number , @seq := IF(CONCAT(usern, passwd, id)<>@prevvals, 1, @seq+1) as seq , @prevvals := CONCAT(usern, passwd, id) as prev FROM logs JOIN (SELECT @prevvals:=null, @seq:=0) as init ORDER BY usern, passwd, id, date DESC ) as count USING (number) WHERE seq > 1; Thanks man, appreciate it. I wanted usern, passwd, id, auth to be checked for duplicated so I added auth: DELETE logs FROM logs INNER JOIN ( SELECT number , @seq := IF(CONCAT(usern, passwd, id, auth)<>@prevvals, 1, @seq+1) as seq , @prevvals := CONCAT(usern, passwd, id, auth) as prev FROM logs JOIN (SELECT @prevvals:=null, @seq:=0) as init ORDER BY usern, passwd, id, auth, date DESC ) as count USING (number) WHERE seq > 1; Although it gave the error: #1054 - Unknown column 'number' in 'field list' Number field looks like this: Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520754 Share on other sites More sharing options...
Barand Posted September 13, 2015 Share Posted September 13, 2015 Weird! I started with mysql> select * from logs order by usern,passwd,id,auth,date; +--------+---------+------+------+------------+--------+--------+ | usern | passwd | id | auth | date | active | number | +--------+---------+------+------+------------+--------+--------+ | aaaaaa | xxxxxxx | 1 | A | 2015-09-04 | 1 | 4 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-01 | 1 | 1 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-02 | 1 | 2 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-03 | 1 | 3 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-05 | 1 | 5 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-01 | 1 | 6 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-02 | 1 | 7 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-03 | 1 | 8 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-04 | 1 | 9 | | cccccc | zzzzzzz | 3 | A | 2015-09-06 | 1 | 10 | +--------+---------+------+------+------------+--------+--------+ then I ran your version of the query and was left with +--------+---------+------+------+------------+--------+--------+ | usern | passwd | id | auth | date | active | number | +--------+---------+------+------+------------+--------+--------+ | aaaaaa | xxxxxxx | 1 | A | 2015-09-04 | 1 | 4 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-05 | 1 | 5 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-04 | 1 | 9 | | cccccc | zzzzzzz | 3 | A | 2015-09-06 | 1 | 10 | +--------+---------+------+------+------------+--------+--------+ Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520760 Share on other sites More sharing options...
JayDz Posted September 13, 2015 Author Share Posted September 13, 2015 Weird! I started with mysql> select * from logs order by usern,passwd,id,auth,date; +--------+---------+------+------+------------+--------+--------+ | usern | passwd | id | auth | date | active | number | +--------+---------+------+------+------------+--------+--------+ | aaaaaa | xxxxxxx | 1 | A | 2015-09-04 | 1 | 4 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-01 | 1 | 1 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-02 | 1 | 2 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-03 | 1 | 3 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-05 | 1 | 5 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-01 | 1 | 6 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-02 | 1 | 7 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-03 | 1 | 8 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-04 | 1 | 9 | | cccccc | zzzzzzz | 3 | A | 2015-09-06 | 1 | 10 | +--------+---------+------+------+------------+--------+--------+ then I ran your version of the query and was left with +--------+---------+------+------+------------+--------+--------+ | usern | passwd | id | auth | date | active | number | +--------+---------+------+------+------------+--------+--------+ | aaaaaa | xxxxxxx | 1 | A | 2015-09-04 | 1 | 4 | | aaaaaa | xxxxxxx | 1 | C | 2015-09-05 | 1 | 5 | | bbbbbb | yyyyyyy | 2 | A | 2015-09-04 | 1 | 9 | | cccccc | zzzzzzz | 3 | A | 2015-09-06 | 1 | 10 | +--------+---------+------+------+------------+--------+--------+ Nvm bro it worked perfectly, did something wrong myself. Thanks man, so awesome! <3 Quote Link to comment https://forums.phpfreaks.com/topic/298121-pdosql-delete-duplicates/#findComment-1520780 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.