
Fastest way to read webpages?


zssz


I am currently making a script that reads a webpage and extracts certain data from it. I know how to read a webpage and parse the data, but I can't find a fast way to obtain it.

 

I am cycling through ~5000 pages collecting data; it takes about a second per page to locate the data, parse it, and then print it out.

 

Right now I am using cURL to connect to the website.

Is it possible to get this time down by using something other than cURL?

 

Ideally, I want to be able to write this data to a file (or eventually a database) every hour or two. With the current method, it would take about 80-85 minutes to collect everything, which seems impractical.
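
For reference, here's roughly what my loop looks like right now (the URL pattern is a placeholder, and parse_page() stands in for my extraction code):

<?php
// Roughly my current loop: one cURL handle, reused, fetching pages one by one.
// The URL pattern is a placeholder and parse_page() stands in for my extraction code.
$urls = array();
for ($i = 1; $i <= 5000; $i++) {
    $urls[] = "http://example.com/page.php?id=$i";
}

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of echoing it
curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up on a slow page after 10 seconds

foreach ($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
    if ($html !== false) {
        echo parse_page($html); // locate, parse, and print the data
    }
}
curl_close($ch);
?>

Reusing the one handle at least lets cURL keep the connection alive between requests, but the fetches still happen one after another, which is where all the time goes.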


I think you need to make requests in parallel to speed things up.  For example, you could have 3 copies of your script.

 

Script 1: Processes urls 0, 3, 6, 9, ...

Script 2: Processes urls 1, 4, 7, 10, ...

Script 3: Processes urls 2, 5, 8, 11, ...

 

Then you'll get things done in 1/3 the time, as you have 3 requests active at any one time.
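
A rough sketch of that split, assuming you keep your URLs in a urls.txt file and launch the same worker script three times with a different offset (php worker.php 0, php worker.php 1, php worker.php 2):

<?php
// worker.php -- sketch only; urls.txt and the parsing step are assumptions.
// Each copy of the script claims every 3rd URL based on its offset argument.
$offset  = isset($argv[1]) ? (int)$argv[1] : 0;
$workers = 3;

$urls = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($urls as $i => $url) {
    if ($i % $workers != $offset) {
        continue; // another copy of the script handles this one
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);
    curl_close($ch);
    // ... locate and parse the data from $html here ...
}
?>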

 

There may be an interface in PHP to do this within one script, but I have never used one myself.
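
For what it's worth, there is one: the curl_multi_* functions let a single script keep several requests in flight at once. A minimal sketch with placeholder URLs (in practice you'd run the full list through in batches rather than all 5000 at once, to be kinder to the server):

<?php
// Minimal curl_multi sketch -- URLs are placeholders.
$urls = array(
    'http://example.com/page.php?id=1',
    'http://example.com/page.php?id=2',
    'http://example.com/page.php?id=3',
);

$mh      = curl_multi_init();
$handles = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all the transfers; they overlap instead of running back to back.
$running = 0;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for network activity instead of spinning
} while ($running > 0);

foreach ($handles as $ch) {
    $html = curl_multi_getcontent($ch); // body of each completed request
    // ... locate and parse the data from $html here ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
?>

The other upside over separate script copies is that everything stays in one process, so collecting the results into a single file or database write is straightforward.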
