Anidazen Posted January 20, 2007

Hey guys,

I asked about this a couple of days ago and got no response, so I'm re-posting and elaborating. Basically, here's the issue:

- I have a script that fetches several websites while a user waits. Due to the custom nature of each search, this information cannot be cached efficiently - it has to be real time.
- Doing standard cURL requests, fetching and parsing each website in sequence, produces an uncomfortable load time. A high-quality load bar can only buy you so much time!
- A very kind member of these forums (printf) helped with a class that did the job, but that was ages ago and printf no longer visits the boards (I believe). The problem is that the class is simply too unstable, with random timeouts occurring for one reason or another.

So I'm looking for a stable way to download more than one website at once in PHP. I really can't believe that wanting to do this is as rare as it seems to be - I'd have thought it would be mainstream.

Anyway - does anyone have any suggestions on how to do this? I am considering taking an AJAX-style approach, loading each request in individual frames and then passing the information either through the browser (JavaScript) or through the server (MySQL).

One glimmer of hope appears to be the "PECL HTTP" package, from this site: http://pecl.php.net/package/pecl_http. It says it supports parallel requests in PHP 5+. I don't know anything about this, and maybe somebody on this forum can give me some more info. (Does this mean separate, concurrent pages are possible?) There seems to be very, very little community-based information on this package, and the documentation is far from helpful.

Edit: Forgot to mention - is there some other technology that would be more suited to this task than PHP?

I know I've raised a lot of questions in one single post, but if people could give some help or advice on any of it, it would be appreciated.
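For what it's worth, the pecl_http 1.x extension exposes parallel requests through an HttpRequestPool object: every request attached to the pool is sent concurrently when you call send(). A minimal sketch, assuming the 1.x object API and using placeholder URLs, might look like this:

```php
<?php
// Sketch of parallel fetching with pecl_http 1.x (HttpRequestPool).
// The URLs are placeholders; error handling is omitted for brevity.
$pool = new HttpRequestPool(
    new HttpRequest('http://example.com/a', HttpRequest::METH_GET),
    new HttpRequest('http://example.org/b', HttpRequest::METH_GET)
);

$pool->send(); // sends every attached request concurrently

foreach ($pool as $request) {
    // Each finished request exposes its status code and response body.
    echo $request->getUrl(), ' -> ',
         $request->getResponseCode(), ', ',
         strlen($request->getResponseBody()), " bytes\n";
}
```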
ShogunWarrior Posted January 20, 2007

You could have a look at the cURL multi functions.

As for the other-technology question: yes, it could probably be done more efficiently in other languages, even if it's only Perl.
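In PHP that means the curl_multi_* family: you add ordinary cURL handles to a multi handle and drive them all at once, so the total wall-clock time is roughly that of the slowest single request rather than the sum of all of them. A rough sketch, with placeholder URLs and example timeout values:

```php
<?php
// Fetch several pages concurrently with curl_multi_*.
// $urls is a placeholder list; the timeouts are example values.
$urls = array('http://example.com/one', 'http://example.org/two');

$mh      = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // collect the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // don't wait forever on a dead host
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // hard cap per request
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run all transfers until none are still active.
$running = 0;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity on any handle instead of busy-looping
} while ($running > 0);

$results = array();
foreach ($handles as $url => $ch) {
    $results[$url] = curl_multi_getcontent($ch); // empty string if the fetch failed
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

// $results now maps each URL to its response body.
```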
printf Posted January 20, 2007

The class has been updated many times - I think it's on version 1.2 now. I know earlier versions had some problems, but without knowing what you're doing with the class it's difficult to figure out how I can make you a custom version for your use. The new version can fetch 1000 pages over 20 concurrent streams in less than 5 seconds. I have people using it with the extended XML class, fetching thousands of documents every hour. For Windows users I even added a service option: the class can listen on a given port and handle SOAP, XML, and HTTP requests. I have it running as a spider and it does around 400,000+ pages an hour, including full indexing with the extended extractor class (page, images, CSS, JavaScript). PM me and I will help you...

printf
Anidazen Posted January 20, 2007 (Author)

printf!

Awesome to see you're still around - I thought you'd left the boards. :)

PM incoming.
wpt394 Posted July 21, 2007

Does anyone know what class is being referred to here? I'm using multi cURL to get information from a bunch of webpages, but my request still takes 20 seconds or so. It would be nice to speed it up a little.
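One thing worth checking before blaming curl_multi itself: without per-handle timeouts, a single slow or dead host can hold the whole batch up until it times out on its own. Setting CURLOPT_CONNECTTIMEOUT / CURLOPT_TIMEOUT (as in the sketch above) and then timing each handle usually shows where the 20 seconds go. A small diagnostic sketch, assuming $handles maps URL => cURL handle as in the earlier example:

```php
<?php
// After the multi loop has finished, inspect each handle to see which
// URL dominated the total time.
foreach ($handles as $url => $ch) {
    $err  = curl_error($ch);                        // empty string if the fetch succeeded
    $took = curl_getinfo($ch, CURLINFO_TOTAL_TIME); // seconds spent on this transfer
    printf("%-40s %6.2fs %s\n", $url, $took, $err);
}
```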