
[SOLVED] CLI php: daemon script?


aunquarra


I need to make some updates to several db tables. Currently I make these updates once daily with a cron job that runs a 'wget' command to pull a PHP file, which does the work on the backend and displays debug info to the browser (or in this case, to wget).

 

The volume of data I'm dealing with has increased to the point where I really can't continue to handle an entire day's worth at once, so I thought I might break it down into smaller chunks (12-hour cycles, 6-hour cycles, even hourly). Then the thought occurred to me to drop Apache out of the equation entirely and run PHP from the shell as a daemon that runs continually, looking for data that needs updating and handling it as it's found...

 

So I guess my question is whether or not this is a good idea. I really need the flexibility and functionality PHP offers, so I can't really justify moving the system to C or Perl (I know I'm preaching to the choir here), but I'm not sure whether having a single PHP script running 24/7 is wise.

 

Obviously, if I did it, I would want to be extremely careful with memory management, but I try to do that anyway (I just don't trust Apache to give it back on its own).

 

Thanks in advance for your input.


I don't really see any problems with it. However, there are some things you should think about:

 

1. How efficient will the daemon be if it has to search for the changes itself? It might be a lot more efficient to provide it with a list of changes rather than have it 'search' for them.

 

2. Account for server reboots. If possible, set up the server to run the script at boot. Otherwise, set up a cron job that checks twice a day whether the script is still running.

 

3. Status reports.

 

4. Memory leaks. Be very, very careful. Implement memory_get_usage() checks, just to be sure (see the sketch after this list).

 

5. Race conditions. When running anything asynchronously, you have to beware of race conditions. This is especially a concern if you dispatch assignments from other scripts to the daemon, be it directly through sockets or indirectly via a storage medium.

 

All of these concerns (with the exception of no. 4) have to do with how you communicate with the daemon while it is running.
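
For what it's worth, numbers 2 and 4 can both be handled with a lock file plus a memory ceiling in the main loop. A rough, untested sketch (the lock file path, the 64 MB limit, and the doWork() function are just placeholders):

<?php
// Hypothetical daemon skeleton; adjust paths and limits to taste.
$lockFile = '/var/run/mydaemon.pid';

// Refuse to start if another instance already holds the lock.
$fp = fopen($lockFile, 'c');
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    die("Already running.\n");
}
ftruncate($fp, 0);
fwrite($fp, getmypid() . "\n");

$memoryLimit = 64 * 1024 * 1024; // bail out at ~64 MB

while (true) {
    doWork(); // whatever one pass of the daemon actually does

    if (memory_get_usage() > $memoryLimit) {
        // Exiting is safer than hunting a slow leak at runtime;
        // a cron job that relaunches the script picks things back up.
        break;
    }

    sleep(5); // don't spin when there is nothing to do
}

flock($fp, LOCK_UN);
fclose($fp);
?>

A cron entry that simply tries to launch the script every few hours then doubles as the "is it still running?" check, since a second copy will refuse to start.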

 

 

</€0.02>

 


I've written a few daemons in PHP and have never had an issue. Mind you, they've all been pretty simple.

 

There's also an HTTP server written in pure PHP (nanoweb); I've had quite a bit of success running it for a few months.

 

So yeah, it's possible, and even worthwhile looking into.


Triggers are actually only fired when something specific happens within the database. E.g., you can set a trigger to update column A whenever a new record is added to column B; they are not able to fire themselves at time intervals.


There is no right answer here. The bottom line is that it depends on what kind of data he is cleaning up in his databases. Triggers run when database events fire, and they can call stored procedures. If you have an event that fires at the right times (rarely but regularly), then triggers would be the better choice; otherwise cron/PHP might be better.

 

 


Well, I think I've decided to do it. I've been researching process control functions, and I'm really digging how PHP handles stuff (no surprise there).

 

Triggers are really unnecessary since each iteration of the process could take anywhere from half a minute to five minutes to complete, and there is new data to be handled almost every fifteen seconds. I could do a cron job... but you're right, I think this'll be more fun. Besides, it'll scale better.

 

Thanks for all the tips.


Even though this has been marked solved, I'm curious: will you have the daemon look for changes, or provide it with the changes? If you want a scalable solution, consider the second option.

 

Another option would be to fork individual changes (depending on the processing involved with a single change). That would certainly make it easy enough to provide the script with changes.
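
One way to 'provide' it with the changes would be a small queue table that the other scripts insert into and the daemon drains. A hypothetical sketch using the old mysql_* functions (the change_queue table, $refId, and processChange() are made-up names):

<?php
// Producer side: whatever script causes a change drops a row in the queue.
mysql_query("INSERT INTO change_queue (ref_id, created_at)
             VALUES (" . (int) $refId . ", NOW())");

// Daemon side: claim a batch, process it, then delete the handled rows.
$result = mysql_query("SELECT id, ref_id FROM change_queue ORDER BY id ASC LIMIT 100");
while ($row = mysql_fetch_assoc($result)) {
    processChange($row['ref_id']);
    mysql_query("DELETE FROM change_queue WHERE id = " . (int) $row['id']);
}
?>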


Well, just for background, what I need to do is look at several SQL servers, read new data from tables on each, then insert data into a separate SQL system based on the new data and some calculations, and once it's all done, flag the original data as old so it doesn't get read again.

 

The way I had planned to do it was to just loop through a list of the servers/tables it needs to pull data from, and on each one:

  1. Run a query for the new information (with a LIMIT 1000 statement, just to keep it bite-sized)

  2. Do the needed calculations

  3. Handle any necessary inserts into the separate system, and remember the mysql_insert_id()

  4. Mark them as done in the original tables by putting the mysql_insert_id() in there...

  5. Have it sleep() for a few seconds, just to keep crazy loops from chewing through the CPU if, for some unknown reason, there's no data.
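
In rough code (untested, with made-up table and column names, and the actual calculations stubbed out as doCalculations()), each pass over one source table would look something like:

<?php
// $src and $dst are mysql link resources for the source cluster and
// the central system; opening the connections is omitted here.
$result = mysql_query("SELECT * FROM source_data
                       WHERE processed_id IS NULL
                       LIMIT 1000", $src);                     // step 1

while ($row = mysql_fetch_assoc($result)) {
    $value = doCalculations($row);                             // step 2

    mysql_query("INSERT INTO central_data (value) VALUES ('"
                . mysql_real_escape_string($value, $dst) . "')", $dst);
    $newId = mysql_insert_id($dst);                            // step 3

    mysql_query("UPDATE source_data SET processed_id = " . (int) $newId .
                " WHERE id = " . (int) $row['id'], $src);      // step 4
}

sleep(5);                                                      // step 5
?>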

 

If there is a better model, I'm all ears. This is all new territory to me.


Basically the thought I have is that in some cases it might be better to let everybody "do their own dishes".

 

If it's a lot of work, I can understand you don't want the client to wait until the dishes are done. This is where I think the pcntl_* functions could be very useful:

 

<?php
$pid = pcntl_fork();

if ($pid == -1) {
    // Fork failed.
    throw new Exception('Application error.');
} elseif ($pid === 0) {
    // Child process: do the heavy lifting, then exit.
    $cleanUp = new messCleaner($someDataIndicatingWhatToCleanUp);
    $cleanUp->callTheMaid();
    die();
} else {
    // Parent process: resume normal script execution.
}
?>

 

I need to point out that I do not have any experience with using the pcntl_* functions.

 

I THINK it would be more efficient than invoking the CLI through exec(), because PHP doesn't have to reinitialize, but the only way to be sure would be a benchmark.
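
One caveat if the parent process keeps running after the fork: finished children have to be reaped or they linger as zombies. Going only by the manual (again, untested), something like this in the parent's loop should do it:

<?php
// Reap any children that have already exited, without blocking.
while (pcntl_waitpid(-1, $status, WNOHANG) > 0) {
    // $status holds the exit information of the reaped child.
}
?>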

 


Right. Currently, I'm looking for untouched data in four databases spanning three separate MySQL clusters.

 

It's crazy, I know, but that's one of the main purposes of the system I wrote: to centralize, organize, and standardize data from completely unique subsystems. Bleh.

