kaje Posted April 1, 2009

OK, so I have a few pages that parse data from a 3rd-party website every time the page is loaded. I want to prevent this and set up my site so that it doesn't parse the data from the 3rd-party site on every page load, but only a couple of times a day (to refresh the data). To do this, I want to make a copy of each 3rd-party page, save it on my site, and then let my other pages parse data from the local copy every time they are opened. I'm not sure how to go about doing this. I was going to have a friend run a cron job for me that would open a PHP page with code like this:

$contents = file_get_html('http://www.site.com/');
file_put_contents('test.html', $contents);

$contents = file_get_html('http://www.site.com/2');
file_put_contents('test2.html', $contents);

$contents = file_get_html('http://www.site.com/3');
file_put_contents('test3.html', $contents);

$contents = file_get_html('http://www.site.com/4');
file_put_contents('test4.html', $contents);

However, after I add about 5 sites to this, the page gives me this error:

Fatal error: Allowed memory size of 18874368 bytes exhausted (tried to allocate 40 bytes) in /etc/etc/etc/simple_html_dom.php on line 618

I've looked up this error, and everyone says to increase the memory limit, but I'm going to end up with possibly 100+ sites on this thing and don't think it's necessary to increase the memory twentyfold for this type of job. Am I going about this the wrong way? If not, how can I keep it from using memory so quickly? I'm not reusing the previous page's data, so I'm not sure why assigning another page to the same variable causes it to run out of memory rather than reusing the same memory after the page has been copied.
WolfRage Posted April 1, 2009

Try using unset() on the $contents variable and see if that releases the memory between uses.
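For reference, a rough sketch of that idea, assuming the simple_html_dom API (file_get_html() returns a DOM object, and the library's clear() method is the usual way to break its internal references before unsetting; the URLs here are just placeholders):

$pages = array(
    'http://www.site.com/'  => 'test.html',
    'http://www.site.com/2' => 'test2.html',
);

foreach ($pages as $url => $file) {
    $contents = file_get_html($url);       // simple_html_dom object
    file_put_contents($file, $contents);   // stringified to HTML, as in the original snippet
    $contents->clear();                    // assumption: clear() drops the DOM's internal references
    unset($contents);                      // then release the variable itself
}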
laffin Posted April 1, 2009

Use a loop and store the data. What good does it do you if you don't store the pages, if you're trying to minimize the number of retrievals?

$sites = array('http://www.site.com/', 'http://www.site.com/2', 'http://www.site.com/3', 'http://www.site.com/4');

foreach ($sites as $idx => $site) {
    $filename = "test{$idx}.html";
    $contents = '';
    if (file_exists($filename) && ((filemtime($filename) + (4 * 3600)) > time())) {
        // cached copy is less than 4 hours old, so use it
        $contents = file_get_contents($filename);
    } else {
        // cache is missing or stale, so fetch and save a fresh copy
        $contents = file_get_html($site);
        file_put_contents($filename, $contents);
    }
}

This should work. Note the 4 * 3600: that is how long (4 hours) the cached files are used before being refreshed.
kaje Posted April 1, 2009

Quoting WolfRage: "Try using unset() on the $contents variable and see if that releases the memory between uses."

Same error.
kaje Posted April 1, 2009

Quoting laffin's loop-and-cache suggestion above.

That looks like it would work for my example, but the sites aren't set up as neatly as site.com/1, site.com/2, etc. I just made that up quickly to explain the situation. The URLs of the pages I am pulling from can be file names (basketball.html) or query strings (schedule.php?ID=1231, schedule.php?ID=1934, etc.) and don't all come from the same domain, so I don't think a loop would work in this instance.
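For what it's worth, the loop itself doesn't depend on the URLs following a pattern; a hand-written map of local filenames to arbitrary URLs (hypothetical examples below) iterates just the same:

$sites = array(
    'basketball.html'    => 'http://www.example.com/basketball.html',
    'schedule-1231.html' => 'http://www.otherexample.com/schedule.php?ID=1231',
    'schedule-1934.html' => 'http://www.otherexample.com/schedule.php?ID=1934',
);

foreach ($sites as $filename => $url) {
    // same fetch/cache logic as in laffin's snippet above
}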
WolfRage Posted April 1, 2009

You could try sleeping the script between requests, to give it more time to free up resources.
kaje Posted April 2, 2009

I ended up fixing the issue by using file_get_contents() rather than file_get_html(). Thanks for the help, guys.
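A minimal sketch of how the finished cron script might look with that change (URLs and filenames are placeholders): file_get_contents() just pulls the raw HTML string, so no simple_html_dom object is ever built and memory use stays flat no matter how many pages are fetched.

// Run a couple of times a day via cron; the site's own pages then
// parse the saved local copies instead of hitting the 3rd-party site.
$sites = array(
    'test.html'  => 'http://www.site.com/',
    'test2.html' => 'http://www.site.com/2',
);

foreach ($sites as $filename => $url) {
    $html = file_get_contents($url);   // raw HTML string, no DOM object
    if ($html !== false) {
        file_put_contents($filename, $html);
    }
}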