Refreshing news page with php cron


eightgames

All right, so here's the issue: I've created a news site that pulls in feeds from about 10 sites. The initial load is really slow but after that, you can hit refresh and the page loads fast because it has been cached. What I would like to do is set up a php cron so that the page is automatically refreshed "behind-the-scenes", exactly like this guy did: http://simplepie.org/support/viewtopic.php?pid=4025

 

Now here are the issues I've run into. When I attempted to create the following cron, it wouldn't work because my host does not support wget:

 

0,15,30,45 * * * * www-data wget -q --spider http://my-website.com

 

Instead, my host suggested using the following line:

 

php /home/yourusername/public_html/script.php (change 'yourusername' to your cpanel username)

 

So in the "script.php" file, I added this code:

 

<?php
shell_exec("wget -q --spider http://my-website.com");
?>

 

And received the following error through my e-mail:

 

sh: wget: command not found

X-Powered-By: PHP/4.4.7

Content-type: text/html

 

I'm new to PHP and cron jobs, but I'm assuming this message again means that wget is the problem. Another user suggested using fsockopen, fopen, or the cURL extension, and at this point I'm lost. Any advice on what I should be adding to the "script.php" file would be very much appreciated. Thanks.

Thanks for the fast response. Unfortunately, I'm brand new when it comes to any sort of server-side programming, which is why this has been so difficult. I've heard of cURL but have no idea where to begin. Would you mind posting an example that achieves the same as the code above but written with cURL?

Depending on what you actually want to do, there are many ways to do it...

php /home/yourusername/public_html/script.php

That will run one of your pages, but since it's the computer that's calling it, there's no point in echoing out any data; instead, you should save the results to a file somewhere.
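
For example (just a rough, untested sketch — the file paths are made up), script.php could fetch the page itself and write the result to a file rather than printing it:

<?php
// Fetch the page so it gets generated and cached
// (this relies on allow_url_fopen being enabled on the host)
$html = file_get_contents('http://my-website.com');

// Write the result to a file instead of echoing it
// (the path below is only an example)
$fp = fopen('/home/yourusername/last_fetch.html', 'w');
if ($fp) {
    fwrite($fp, $html !== false ? $html : '');
    fclose($fp);
}
?>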

 

According to 'man', this is what wget has to say about the --spider switch:

--spider
    When invoked with this option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. For example, you can use Wget to check your bookmarks:

        wget --spider --force-html -i bookmarks.html

    This feature needs much more work for Wget to get close to the functionality of real web spiders.

 

Now it might be too much for you to write a true spider, but a simpler option would be to keep an array of URLs to check. You could run 'ls >> urls.txt' to generate a list in a file and either copy it into your script or read it in.

 

Then you need to fetch each page. Without using cURL you can use various functions, e.g. file(), file_get_contents(), stream_get_contents(), fsockopen(), etc. (check the manual).
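
Something along these lines would do it (untested sketch; the file names and paths are only examples, and file_get_contents() needs allow_url_fopen enabled on the host):

<?php
// Read the list of URLs, one per line
$urls = array_map('trim', file('/home/yourusername/urls.txt'));

foreach ($urls as $i => $url) {
    if ($url == '') {
        continue;
    }

    // Fetch the page
    $html = file_get_contents($url);
    if ($html === false) {
        continue;
    }

    // Save the result to a file instead of echoing it
    $fp = fopen("/home/yourusername/cache/page$i.html", 'w');
    if ($fp) {
        fwrite($fp, $html);
        fclose($fp);
    }
}
?>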

 

The fsockopen() manual gives a good example; it works the way a browser does. Here's the example from the manual (http://uk3.php.net/manual/en/function.fsockopen.php):

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";

    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
?> 

However, if you're just wanting to check that the page exists, you won't need to fetch the whole thing, so cut the while loop short and read just enough to check that you haven't got a 404 or similar...
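
For instance (untested sketch), you could read just the status line of the response and look at the code:

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);

    // The first line of the response looks like "HTTP/1.1 200 OK"
    $status = fgets($fp, 128);
    fclose($fp);

    // The status code sits right after "HTTP/1.x "
    if (substr($status, 9, 3) == '200') {
        echo "page is there\n";
    } else {
        echo "problem: $status\n";
    }
}
?>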

 

Thanks for the detailed response. That's a lot to digest for a newbie, so I apologize if any of the following should be obvious.

 

I do understand that this line of code calls one of my webpages, which is what I want. I tried taking out the script.php and pointing the cron straight at the file that I want refreshed (example follows). In my e-mail, I received the entire HTML source for this page. I'm not sure how to have it call the page without actually saving the results anywhere. SimplePie, which the site uses, already caches the feeds, so I just need a PHP cron to access the page in order to trigger the caching automatically. Here's the cron I used that retrieved the HTML source:

 

php /home/yourusername/public_html/site_directory/index.php
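
I'm guessing I could just redirect the output so cron doesn't e-mail it, maybe something like the line below, but I'm not sure if that's the right approach:

php /home/yourusername/public_html/site_directory/index.php > /dev/null 2>&1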

 

As far as using the fsockopen example goes, where would I place that code? Does it go in the script.php that would be run by:

 

php /home/yourusername/public_html/script.php

 

Also, when you say to cut the while loop short, does that mean lowering the 128 value, or taking that part of the code out so that it looks like this:

 

<?php
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";

    fwrite($fp, $out);
    fclose($fp);
}
?>

 

Thanks for your help.

 

 
