Jump to content

Any way to make my delete statement run more efficiently?


Recommended Posts

So I went to one of my websites today and found a message that my webhost had disabled my account.  Ugh..  sickening, as I make a very decent AdSense income.  I managed to email for an hour and they reinstated my account but said I must manage my database queries or the account will be gone.  I'll likely start looking into switching hosts..  which blows because I have almost 50 sites. 

 

Anyway...  The problem that is always caused is by simply deleting records over 30 days.  This sometimes takes a minute or two.. and sometimes, the process gets stuck (as verified when looking at the SPIDs in SSH).  Since I'm no MySQL expert, I wanted to ask if anyone knew of a better way to run my deletes?  All my websites use a 5 column, single table database.  Here's the query I currently run (from my php page):

<?
include 'config.php';
$result = mysql_query("DELETE FROM things WHERE date < DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 MONTH) ORDER BY date") 
or die(mysql_error()); 
?>

 

I had a thought that maybe I could rewrite this to delete 500 records at a time (for example) and keep looping until the > 30 days had been met. 

 

Is there an even better way to handle this problem that someone might suggest?  I'm being told my querys can't run for more than 20 seconds at a time (so this is a processor issue - not memory). 

 

Thank you. 

Why are you using ORDER BY date in that query? Using order by would only make sense if you wanted to delete the x oldest or the x newest rows.

 

Do you have an index on the date column?

 

How often are you running that query?

 

How many rows does it typically delete and how many rows are left that are newer than 30 days old? It might be easier to just make a new table with the rows you want to keep and rename/delete tables.

 

Edit: Also, is your date column a datetime value or just a date?

Why are you using ORDER BY date in that query? Using order by would only make sense if you wanted to delete the x oldest or the x newest rows.

 

Do you have an index on the date column?

 

How often are you running that query?

 

How many rows does it typically delete and how many rows are left that are newer than 30 days old? It might be easier to just make a new table with the rows you want to keep and rename/delete tables.

 

Edit: Also, is your date column a datetime value or just a date?

 

1)  Good point on the 'Order by'.  I'll remove that.

 

2)  I run the query once a night during a scheduled cron job.  New records are inserted via a separate cron job (not causing any issues) and old records deleted with this code. 

 

3)  Depending on the database, anywhere from 500 (these databases seem to have no problem) to 10,000 records (for my 100,000 + records database). 

 

4)  My 'Date' column has a type of timestamp

 

5)  Not sure about the date column being indexed (I didn't write this).  How can I check and should I do this? 

 

 

 

 

If you're running 50 sites, and these are generating any substantial revenue, you'd be better off on a dedicated server, I would think.

 

Yes you need an index on the 'date' column whatever it's named that you're using in the where clause as described.

Yes you also should lose the order by.

 

You can (in mysql) specify a limit to the query.  This might be a good idea as well, to lessen the impact of a giant delete.

 

Are these tables myisam or innodb?  With innodb tablespaces in particular space freed by deletions will not be reclaimed so your tablespaces may be getting larger and larger over time, despite this housekeeping.

If you're running 50 sites, and these are generating any substantial revenue, you'd be better off on a dedicated server, I would think.

 

Yes you need an index on the 'date' column whatever it's named that you're using in the where clause as described.

Yes you also should lose the order by.

 

You can (in mysql) specify a limit to the query.  This might be a good idea as well, to lessen the impact of a giant delete.

 

Are these tables myisam or innodb?  With innodb tablespaces in particular space freed by deletions will not be reclaimed so your tablespaces may be getting larger and larger over time, despite this housekeeping.

 

I think you may be absolutely right about dedicated hosting.  I was actually out looking just now.  Pricey stuff!  Any recommendations? 

 

Great info on the myisam/innodb!  I had no idea!  I'll look into how to check this next. 

 

-- Any idea how I would specify a limit on the delete statement above?  To me, that sounds like a superb workaround while I switch hosts. 

 

Thanks a bunch. 

Well the first thing you should do is add the index.  It's a simple syntax  CREATE INDEX ... http://dev.mysql.com/doc/refman/5.0/en/create-index.html

 

You can also use phpMyAdmin to add an index.

 

PFMaBiSmAd gave you the syntax to show the create table syntax which will show what engine type you have.

 

To add the LIMIT, just tack it on at the end of the DELETE query...  LIMIT 500.

 

You can check mysql_affected_rows and keep issuing the query until that returns 0.

 

 

 

 

Well the first thing you should do is add the index.  It's a simple syntax  CREATE INDEX ... http://dev.mysql.com/doc/refman/5.0/en/create-index.html

 

You can also use phpMyAdmin to add an index.

 

PFMaBiSmAd gave you the syntax to show the create table syntax which will show what engine type you have.

 

To add the LIMIT, just tack it on at the end of the DELETE query...  LIMIT 500.

 

You can check mysql_affected_rows and keep issuing the query until that returns 0.

 

I didn't realize his syntax would also show that.  I did run it and it show myisam. 

 

I've been adding the indexes to all the databases for the last 45 mins..  it does seem to make the query run much faster.  I'll create a local test environment tonight and add a few hundred thousand records to really test. 

 

Thanks for the tip on mysql_affected_rows.  This sounds like a great way to control processor usage so I'm going to look hard into this as well. 

Is your cron job(s) doing this all at the same time for all the tables? If so, your host is probably having an issue with the peak usage, not necessarily the average usage, and you could break up and distribute the delete operations over time to reduce the peak processing.

Is your cron job(s) doing this all at the same time for all the tables? If so, your host is probably having an issue with the peak usage, not necessarily the average usage, and you could break up and distribute the delete operations over time to reduce the peak processing.

 

No.. I have them all scheduled 10 minutes apart all night long. 

 

The duration of the query happens even when running the query right in MySQL, but.. adding index seems to have helped a lot with the smaller databases I checked.  I'll know more soon!  I'm going to make a really big db tonight. 

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.