lesliea Posted August 29, 2010 Share Posted August 29, 2010 Given the following information, which option requires the least overhead/time/processing power while maintaining some level of best practice regarding MySQL tables? I have a users table with over 525,000 entries. Each entry represents a user. I also have a settings table with over 740,000 entries. Each entry also represents a user. Sometime in the past an administrator clearly performed a cleanup and deleted users from the users table without deleting their corresponding settings entry. I've come up with two ways to clean up the settings table: (I am selecting by the username since while users do have a userid, it's not used anywhere. Go figure) $settings_table = mysql_query("SELECT username FROM settings"); while($user = mysql_fetch_array($settings_table)){ $user_query = mysql_query("SELECT username FROM users WHERE username='$user[username]'"); if(mysql_num_rows($user_query) == 0) { mysql_query("DELETE FROM settings WHERE username='$user[username]'"); } Alternatively I could do this: $settings_table = mysql_query("SELECT username FROM settings"); $settings_array = array(); while($suser = mysql_fetch_array($settings_table)){ $settings_array[] = $suser['username']; } $users_array = array(); $users_table = mysql_query("SELECT username FROM users"); while($uuser = mysql_fetch_array($users_table)){ $users_array[] = $uuser['username']; } foreach($settings_array as $user) { if(!in_array($user, $users_array) { mysql_query("DELETE FROM settings WHERE username=$user"); } } As a third option, I could ignore the discrepancy on the grounds that it's not really causing any harm. Suggestions? I also have a similar question regarding an items table - users have been deleted without removing their items - only the items table is good 2.5 million rows long. Suggestions here would be appreciated as well. Thanks! Quote Link to comment https://forums.phpfreaks.com/topic/212012-cleaning-up-large-mysql-tables/ Share on other sites More sharing options...
kickstart Posted August 29, 2010 Share Posted August 29, 2010 Hi You can use the following to get a list of the users on the settings table that are not on the users table. SELECT settings.username FROM settings LEFT OUTER JOIN users ON settings.username = users.username WHERE users.username IS NULL Or to just do the delete DELETE FROM settings WHERE username IN (SELECT settings.username FROM settings LEFT OUTER JOIN users ON settings.username = users.username WHERE users.username IS NULL) But a bit nervous about deleting records like that. Back things up first! All the best Keith Quote Link to comment https://forums.phpfreaks.com/topic/212012-cleaning-up-large-mysql-tables/#findComment-1104924 Share on other sites More sharing options...
lesliea Posted August 29, 2010 Author Share Posted August 29, 2010 You are brilliant. Thank you! Yes, definitely will back up first. I wouldn't like to explain to my "boss" (not getting paid, not technically my boss) how I managed to delete everyone's forum settings. Quote Link to comment https://forums.phpfreaks.com/topic/212012-cleaning-up-large-mysql-tables/#findComment-1104933 Share on other sites More sharing options...
DavidAM Posted August 29, 2010 Share Posted August 29, 2010 Your transaction log will tend to get pretty large while running this query. Especially, if you run it against the items table. I'm not sure exactly how mySql handles this, but if you run into problems with the transaction log getting full and aborting the statement (thereby rolling back the transaction after running for 2 hours), you could try putting a limit on the inner query and running it repeatedly until it affects zero rows: DELETE FROM settings WHERE username IN (SELECT settings.username FROM settings LEFT OUTER JOIN users ON settings.username = users.username WHERE users.username IS NULL) LIMIT 1000 play with the limit size to see where a good point is. Also, i hope the username is indexed in the settings table, otherwise, you'll be doing table scans until the cows come home. It sounds like it should be the primary key on that table anyway. Also, if you are using innoDB tables (instead of myisam) take a look at foreign keys and cascading deletes. innoDB is the only table type that supports it. Basically, if you define a Foreign Key from users to settings with CASCADE DELETE; then a delete in the users table will "cascade" (automatically) delete the corresponding entry in the settings table. This is to prevent the problem in the future. You will not be able to define the foreign key until the table is cleaned up. Quote Link to comment https://forums.phpfreaks.com/topic/212012-cleaning-up-large-mysql-tables/#findComment-1104956 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.