Jump to content

Cleaning up large Mysql tables


lesliea

Recommended Posts

Given the following information, which option requires the least overhead/time/processing power while maintaining some level of best practice regarding MySQL tables?

 

I have a users table with over 525,000 entries.  Each entry represents a user.  I also have a settings table with over 740,000 entries.  Each entry also represents a user.  Sometime in the past an administrator clearly performed a cleanup and deleted users from the users table without deleting their corresponding settings entry.  I've come up with two ways to clean up the settings table:

 

(I am selecting by the username since while users do have a userid, it's not used anywhere.  Go figure)

$settings_table = mysql_query("SELECT username FROM settings");
while($user = mysql_fetch_array($settings_table)){
  $user_query = mysql_query("SELECT username FROM users WHERE username='$user[username]'");
  if(mysql_num_rows($user_query) == 0) { mysql_query("DELETE FROM settings WHERE username='$user[username]'");
}

 

Alternatively I could do this:

$settings_table = mysql_query("SELECT username FROM settings");
$settings_array = array();
while($suser = mysql_fetch_array($settings_table)){
  $settings_array[] = $suser['username'];
}

$users_array = array();
$users_table = mysql_query("SELECT username FROM users");
while($uuser = mysql_fetch_array($users_table)){
  $users_array[] = $uuser['username'];
}

foreach($settings_array as $user) {
  if(!in_array($user, $users_array) { mysql_query("DELETE FROM settings WHERE username=$user"); }
}

 

As a third option, I could ignore the discrepancy on the grounds that it's not really causing any harm.  Suggestions?

 

I also have a similar question regarding an items table - users have been deleted without removing their items - only the items table is good 2.5 million rows long.  Suggestions here would be appreciated as well.

 

Thanks!

Link to comment
Share on other sites

Hi

 

You can use the following to get a list of the users on the settings table that are not on the users table.

 

SELECT settings.username
FROM settings
LEFT OUTER JOIN users
ON settings.username = users.username
WHERE users.username IS NULL

 

Or to just do the delete

 

DELETE FROM settings
WHERE username IN
(SELECT settings.username
FROM settings
LEFT OUTER JOIN users
ON settings.username = users.username
WHERE users.username IS NULL)

But a bit nervous about deleting records like that. Back things up first!

 

All the best

 

Keith

Link to comment
Share on other sites

Your transaction log will tend to get pretty large while running this query.  Especially, if you run it against the items table.  I'm not sure exactly how mySql handles this, but if you run into problems with the transaction log getting full and aborting the statement (thereby rolling back the transaction after running for 2 hours), you could try putting a limit on the inner query and running it repeatedly until it affects zero rows:

 

DELETE FROM settings
WHERE username IN
(SELECT settings.username
FROM settings
LEFT OUTER JOIN users
ON settings.username = users.username
WHERE users.username IS NULL) 
LIMIT 1000

play with the limit size to see where a good point is.

 

Also, i hope the username is indexed in the settings table, otherwise, you'll be doing table scans until the cows come home. It sounds like it should be the primary key on that table anyway. 

 

Also, if you are using innoDB tables (instead of myisam) take a look at foreign keys and cascading deletes. innoDB is the only table type that supports it. Basically, if you define a Foreign Key from users to settings with CASCADE DELETE; then a delete in the users table will "cascade" (automatically) delete the corresponding entry in the settings table. This is to prevent the problem in the future. You will not be able to define the foreign key until the table is cleaned up.

 

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.