Refreshing news page with php cron


eightgames

All right, so here's the issue: I've created a news site that pulls in feeds from about 10 sites. The initial load is really slow but after that, you can hit refresh and the page loads fast because it has been cached. What I would like to do is set up a php cron so that the page is automatically refreshed "behind-the-scenes", exactly like this guy did: http://simplepie.org/support/viewtopic.php?pid=4025

 

Now here are the issues I've run into. When I attempted to create the following cron, it wouldn't work because my host does not support wget:

 

0,15,30,45 * * * * www-data wget -q --spider http://my-website.com

 

Instead, my host suggested using the following line:

 

php /home/yourusername/public_html/script.php (change 'yourusername' to your cpanel username)

 

So in the "script.php" file, I added this code:

 

<?php
shell_exec("wget -q --spider http://my-website.com");
?>

 

And received the following error through my e-mail:

 

sh: wget: command not found

X-Powered-By: PHP/4.4.7

Content-type: text/html

 

I'm new to PHP and cron jobs, but I'm assuming this message again means that wget is the problem. Another user suggested using fsockopen, fopen, or the cURL extension, and at this point I'm lost. Any advice on what I should be adding to the "script.php" file would be very much appreciated. Thanks.

Thanks for the fast response. Unfortunately, I'm brand new when it comes to any sort of server-side programming, which is why this has been so difficult. I've heard of cURL but have no idea where to begin. Would you mind posting an example that achieves the same as the code above but written with cURL?

Depending on what you actually want to do, there are many ways to do it...

php /home/yourusername/public_html/script.php

That will run one of your pages, but since it's the computer that's calling it, there's no point in echoing out any data; instead, you should save the results to a file somewhere.
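
For example (just a rough, untested sketch — the file paths are made up), script.php could fetch the page itself and write the result to a file rather than printing it:

<?php
// Fetch the page so it gets generated and cached
// (this relies on allow_url_fopen being enabled on the host)
$html = file_get_contents('http://my-website.com');

// Write the result to a file instead of echoing it
// (the path below is only an example)
$fp = fopen('/home/yourusername/last_fetch.html', 'w');
if ($fp) {
    fwrite($fp, $html !== false ? $html : '');
    fclose($fp);
}
?>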

 

According to 'man', this is what wget has to say about the --spider switch:

--spider
    When invoked with this option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. For example, you can use Wget to check your bookmarks:

        wget --spider --force-html -i bookmarks.html

    This feature needs much more work for Wget to get close to the functionality of real web spiders.

 

Now it might be too much for you to write a true spider, but a simpler option would be to keep an array of URLs to check. You could run 'ls >> urls.txt' to generate a list in a file and either copy it into your script or read it in.

 

Then you need to fetch each page. Without using cURL you can use various functions, e.g. file(), file_get_contents(), stream_get_contents(), fsockopen(), etc. (check the manual).
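
Something along these lines would do it (untested sketch; the file names and paths are only examples, and file_get_contents() needs allow_url_fopen enabled on the host):

<?php
// Read the list of URLs, one per line
$urls = array_map('trim', file('/home/yourusername/urls.txt'));

foreach ($urls as $i => $url) {
    if ($url == '') {
        continue;
    }

    // Fetch the page
    $html = file_get_contents($url);
    if ($html === false) {
        continue;
    }

    // Save the result to a file instead of echoing it
    $fp = fopen("/home/yourusername/cache/page$i.html", 'w');
    if ($fp) {
        fwrite($fp, $html);
        fclose($fp);
    }
}
?>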

 

The fsockopen() manual gives a good example; it works the way a browser does. Here's the example from the manual (http://uk3.php.net/manual/en/function.fsockopen.php):

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";

    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
?> 

However, if you're just wanting to check that the page exists, you won't need to fetch the whole thing, so cut the while loop short and read just enough to check that you haven't got a 404 or similar...
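
For instance (untested sketch), you could read just the status line of the response and look at the code:

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);

    // The first line of the response looks like "HTTP/1.1 200 OK"
    $status = fgets($fp, 128);
    fclose($fp);

    // The status code sits right after "HTTP/1.x "
    if (substr($status, 9, 3) == '200') {
        echo "page is there\n";
    } else {
        echo "problem: $status\n";
    }
}
?>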

 

Thanks for the detailed response. That's a lot to digest for a newbie, so I apologize if any of the following should be obvious.

 

I do understand that this line of code calls one of my webpages, which is what I want. I tried taking out the script.php and pointing the cron straight at the file that I want refreshed (example follows). In my e-mail, I received the entire HTML source for this page. I'm not sure how to have it call the page without actually saving the results anywhere. SimplePie, which the site uses, already caches the feeds, so I just need a PHP cron to access the page in order to trigger the caching automatically. Here's the cron I used that retrieved the HTML source:

 

php /home/yourusername/public_html/site_directory/index.php
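
I'm guessing I could just redirect the output so cron doesn't e-mail it, maybe something like the line below, but I'm not sure if that's the right approach:

php /home/yourusername/public_html/site_directory/index.php > /dev/null 2>&1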

 

As far as using the fsockopen example goes, where would I place that code? Does it go in the script.php that would be run by:

 

php /home/yourusername/public_html/script.php

 

Also, when you say to cut the while loop short, does that mean lowering the 128 value, or taking that part of the code out so that it looks like this:

 

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";

    fwrite($fp, $out);
    fclose($fp);
}
?>

 

Thanks for your help.

 

 
