Jump to content

Deleting Dupe records form a large table


dmacman

Recommended Posts

Hi All,

 

I have a user table that I need to delete all records that are duplicates (based on the EmailAddress field). I have my site coded to not allow dupes (by query first then either update or insert), but I also have a page for employees to import records from a CSV with fgetscv that sometimes inserts records where that email address exists.

 

I tried about a dozen different approaches, and none worked.

 

Here is a sample of what I tried and end up with it timing out (The table has 1.8 million records).

 

$sql = "SELECT MIN(maxID) AS totalcount FROM max8 GROUP BY EmailAddress";
$result = mysql_query($sql) or die (mysql_error());
$num_rows = mysql_num_rows($result);
$unique_ids = array();
for($i=0;$i<$num_rows;$i++) {
    $unique_ids[] = mysql_result($result,$i,0);
}
mysql_free_result($result);
$id_list = implode(',',$unique_ids);
$sql = "DELETE FROM max8 WHERE maxID not in($id_list)";
$query = mysql_query($sql) or die (mysql_error()); 

mysql_close($conn);

echo 'All duplicates where deleted';

 

I also noticed some addressed this by making a temp table, then exporting a unique select query, then comparing and deleting, but that seems like a round-about way.

 

My end goal it to make a process page, that I can run and either delete all the dupes (or if it is a timeout issue) delete them state by state.

 

Thoughts?

 

Regards,

Don

What you are best doing is using temporary tables.

 

1. Create a temporary table that can store the users email address and a count of how many times it exists.

CREATE TEMPORARY TABLE userAddresses SELECT COUNT(id) AS totalAddresses, email FROM users GROUP BY email

 

2. You can then select from the temp table all the users who have an email address that exists more than once.

SELECT email FROM userAddresses WHERE totalAddresses > 1

 

3. This gives you the offending email addresses that you can then delete from your main table

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.