There are a couple of great threads on SO about how to store complex recurring events for a calendar:
Calendar Recurring/Repeating Events - Best Storage Method
SQL Infinite Calendar Pattern
However, these are for storing events that can be read at any time.
I'm trying to repurpose these storage methods to run scripts at those times. Once. Reliably. And they should run late if they don't run on time. I might have thousands or tens of thousands of events, so I've read that I should not use crontab for this.
For example, I have a script that needs to send an email to a customer every Tuesday and Thursday at 2 PM. Thanks to those threads I now understand how to store that interval, and even how to query for it. What I can't figure out is how to reliably send the email to the customer only once.
Here's the best solution I've come up with:
I run a cron job every minute. It polls the database and finds events that should have run in the last 5 minutes and that haven't been run in the last 5 minutes. It adds each of those events to a queue (another table?) to be run by a separate script, then updates the event record to say it ran now.
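To make that concrete, here is a minimal sketch of the polling step in Python with SQLite. The table and column names (`events`, `queue`, `due_at`, `last_run`) are assumptions for illustration only; in particular, `due_at` stands in for the most recent scheduled occurrence, which in practice would be computed from the stored recurrence pattern.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"      # fixed-width format so string comparison matches time order
WINDOW = timedelta(minutes=5)  # the 5-minute look-back window


def create_schema(conn):
    # Hypothetical schema: due_at is the latest scheduled occurrence of the event.
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, due_at TEXT, last_run TEXT)")
    conn.execute("CREATE TABLE queue (event_id INTEGER, queued_at TEXT)")


def poll(conn, now):
    """Queue events due within the window that have not run within the window."""
    now_s = now.strftime(FMT)
    cutoff = (now - WINDOW).strftime(FMT)
    rows = conn.execute(
        "SELECT id FROM events "
        "WHERE due_at > ? AND due_at <= ? "
        "AND (last_run IS NULL OR last_run <= ?) "
        "ORDER BY id",
        (cutoff, now_s, cutoff),
    ).fetchall()
    for (event_id,) in rows:
        # Hand the event to the separate worker script via the queue table.
        conn.execute("INSERT INTO queue (event_id, queued_at) VALUES (?, ?)", (event_id, now_s))
        # Record that the event ran now.
        conn.execute("UPDATE events SET last_run = ? WHERE id = ?", (now_s, event_id))
    conn.commit()
    return [event_id for (event_id,) in rows]
```

In this sketch, an event whose `due_at` is older than the window is never selected.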
Issues:
If the cron job ever fails, or takes more than 5 minutes, it could miss events that fall outside of the 5-minute window.
What if an event is scheduled every 2 minutes?
How can I guarantee the event only fires once when scheduled, even if the interval between events is every minute?
How would you solve this?
Instead of storing the last time it ran, should I calculate and store the next time it should run and just query based on that?
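A rough sketch of what I mean by that, again in Python with SQLite. Everything here is an assumption for illustration: the `events` table, the `next_run` column, and a fixed `interval_s` column (seconds, must be greater than zero) standing in for the real recurrence pattern. The conditional `UPDATE` is meant to let only one poller claim a given occurrence, and occurrences missed while the poller was down are collapsed into a single late run.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # fixed-width format so string comparison matches time order


def create_schema(conn):
    # Hypothetical schema: interval_s replaces the full recurrence pattern.
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, next_run TEXT, interval_s INTEGER)"
    )


def claim_due_events(conn, now):
    """Return ids of events that are due, advancing next_run so each fires once."""
    now_s = now.strftime(FMT)
    claimed = []
    due = conn.execute(
        "SELECT id, next_run, interval_s FROM events WHERE next_run <= ? ORDER BY id",
        (now_s,),
    ).fetchall()
    for event_id, next_run, interval_s in due:
        # Skip past every occurrence up to now; a late poll yields one run, not many.
        upcoming = datetime.strptime(next_run, FMT)
        step = timedelta(seconds=interval_s)
        while upcoming <= now:
            upcoming += step
        # Claim only if next_run is unchanged, i.e. no other poller got there first.
        cur = conn.execute(
            "UPDATE events SET next_run = ? WHERE id = ? AND next_run = ?",
            (upcoming.strftime(FMT), event_id, next_run),
        )
        if cur.rowcount == 1:
            claimed.append(event_id)
    conn.commit()
    return claimed
```

Because the query is simply `next_run <= now`, there is no look-back window here: an event stays due until it is claimed, however late the poll happens.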
Thanks for any help!