OK, so I have a few pages that parse data from a 3rd party website every time the page is loaded. I want to stop that and set up my website so that it only fetches from the 3rd party site a couple of times a day (to update the data) instead of on every page view. To do this, I want to make a copy of the 3rd party page, save it on my site, and then let my other page parse data from the copy every time it is opened.

 

I'm not sure how to go about doing this. I was going to have a friend run a cron job for me that would open a PHP page with code like this:

 

$contents = file_get_html('http://www.site.com/');
file_put_contents('test.html', $contents);
$contents = file_get_html('http://www.site.com/2');
file_put_contents('test2.html', $contents);
$contents = file_get_html('http://www.site.com/3');
file_put_contents('test3.html', $contents);
$contents = file_get_html('http://www.site.com/4');
file_put_contents('test4.html', $contents);

 

However, after I add like 5 sites to this, the page gives me this error:

 

Fatal error: Allowed memory size of 18874368 bytes exhausted (tried to allocate 40 bytes) in /etc/etc/etc/simple_html_dom.php on line 618

 

I've looked up this error, and everyone says to increase the memory size, but I'm going to end up with possibly 100+ sites on this thing and I don't think it should be necessary to increase the memory x20 for this type of job. Am I going about this the wrong way? If not, how can I prevent it from using memory so quickly? I'm not reusing the page after it has been copied, so I'm not sure why assigning another page to the same variable makes it run out of memory instead of reusing the same memory.
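
One likely explanation, assuming simple_html_dom behaves as its own documentation describes: file_get_html() builds a full DOM tree for every page, and that tree is not freed just because $contents is reassigned (the library's manual suggests calling clear() on the object for this reason). For a plain copy no parsing is needed at all, so a minimal sketch could use PHP's built-in file_get_contents() instead. The helper name copy_page() is made up for illustration.

```php
<?php
// Hypothetical helper: copy one page to a local file without building a DOM tree.
function copy_page($url, $filename)
{
    // file_get_contents() returns the raw HTML as a plain string, or false on failure.
    $html = file_get_contents($url);
    if ($html === false) {
        return false;
    }
    // The string is released when the function returns, so memory use stays
    // roughly at the size of one page no matter how many pages are copied.
    return file_put_contents($filename, $html) !== false;
}
```

It would be called once per page, for example copy_page('http://www.site.com/', 'test.html'). The page that reads the copy can still run simple_html_dom on the local file afterwards.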


Use a loop and store the data.

What good does it do you if you don't store the pages, if you're trying to minimize the number of retrievals?

$sites=array('http://www.site.com/','http://www.site.com/2','http://www.site.com/3','http://www.site.com/4');
foreach($sites as $idx=>$site)
{
      $filename="test{$idx}.html";
      $contents='';
      // use the cached copy while it is less than 4 hours old
      if(file_exists($filename) && (filemtime($filename) + (4 *3600)) > time())
        $contents=file_get_contents($filename);
      else {
         // no copy yet, or the copy is stale: fetch the page again and save it
         $contents=file_get_html($site);
         file_put_contents($filename,$contents);
      }
}

This should work.

Note the 4*3600: this is the amount of time (4 hrs, in seconds) that we keep using our cached files :)
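
The freshness test behind the 4*3600 figure can be pulled out into a small helper, which makes the arithmetic easier to check. This is only a sketch: the name cache_is_fresh() and the optional $now parameter are made up for illustration. A copy counts as fresh while its modification time plus the allowed age is still later than the current time.

```php
<?php
// Hypothetical helper: true when $filename exists and is younger than $maxAge seconds.
function cache_is_fresh($filename, $maxAge, $now = null)
{
    if ($now === null) {
        $now = time(); // default to the current Unix timestamp
    }
    // filemtime() returns the file's last modification time as a Unix timestamp.
    return file_exists($filename) && (filemtime($filename) + $maxAge) > $now;
}
```

With a 4 hour window, cache_is_fresh('test0.html', 4 * 3600) stays true for 14400 seconds after the file was last written and turns false after that, which is the point where the page should be fetched again.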

 

That looks like it would work for my "example", but the sites aren't set up as simply as site.com/1, site.com/2, etc. I just came up with that quickly to explain the situation.

 

The URLs of the pages I am pulling from can be words (basketball.html) or URL variables (schedule.php?ID=1231, schedule.php?ID=1934, etc.) and don't all come from the same domain, so I don't think a loop would work in this instance.
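
A loop does not need the URLs to follow a pattern. One possible sketch, assuming an associative array that maps each local file name to whatever URL it should mirror: the function name refresh_copies(), the file names, and the URLs below are all placeholders. Since the plan is to run this from a cron job a couple of times a day, the sketch simply refreshes every entry and keeps the old copy when a fetch fails.

```php
<?php
// Hypothetical list: local file name => source URL (any domain, any URL shape).
$pages = array(
    'basketball.html'    => 'http://www.site.com/basketball.html',
    'schedule-1231.html' => 'http://www.example.org/schedule.php?ID=1231',
    'schedule-1934.html' => 'http://www.example.org/schedule.php?ID=1934',
);

// Copies every page in $pages into $dir and returns the URLs that failed.
function refresh_copies(array $pages, $dir)
{
    $failed = array();
    foreach ($pages as $filename => $url) {
        // Raw fetch only; @ silences the warning so failures are reported via $failed.
        $html = @file_get_contents($url);
        if ($html === false) {
            $failed[] = $url; // leave any existing copy untouched
            continue;
        }
        file_put_contents($dir . '/' . $filename, $html);
        unset($html); // release the page before fetching the next one
    }
    return $failed;
}
```

The cron script would then call refresh_copies($pages, __DIR__) and could log or email whatever comes back in the returned array.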
