

Here's the deal, it's a simple script:

 

1. Read a site link from a SQL table (sql1).

2. Mark the sql1 row as read.

3. Go to the site and capture several pieces of data (I used cURL).

4. Modify the captured data.

5. Write the data into another SQL table (sql2).

 

So I did this with a few links, but now I have to do it with 5-10 million links. What would be the best way to get performance and speed?
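
For reference, here's a minimal sketch of that single-threaded loop. The table and column names (links, results, is_read) are assumptions standing in for the real schema:

<?php
// A minimal sketch of the loop described above (single-threaded).
// Table/column names are assumptions, not the actual schema.
$db = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'pass');

// 1 - read unprocessed links from the first table
$rows = $db->query("SELECT id, url FROM links WHERE is_read = 0")
           ->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    // 2 - mark the row as read
    $db->prepare("UPDATE links SET is_read = 1 WHERE id = ?")
       ->execute([$row['id']]);

    // 3 - fetch the page with cURL
    $ch = curl_init($row['url']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $html = curl_exec($ch);
    curl_close($ch);

    if ($html === false) {
        continue; // failed download; retry logic omitted
    }

    // 4 - modify/extract the pieces of data (parsing omitted)
    $data = substr($html, 0, 255);

    // 5 - write the result into the second table
    $db->prepare("INSERT INTO results (link_id, data) VALUES (?, ?)")
       ->execute([$row['id'], $data]);
}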


Other than moving step 2 to the end so that you are assured that the process is complete, what's wrong with this approach?

 

(assuming that by an SQL file you mean a table)

I'm not sure how long it will take, and my script isn't multitasking.

  • Solution

You'll want to add some concurrency to the process. The easiest place is the downloading stage, using a library such as Rolling-CURL to download several URLs at once.
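
Rolling-CURL is built on PHP's curl_multi extension; a minimal sketch of that underlying mechanism looks like this (the URL list is illustrative, and with millions of rows you'd feed it in batches pulled from sql1 rather than all at once):

<?php
// Minimal sketch of concurrent downloads with PHP's curl_multi extension
// (the same mechanism Rolling-CURL wraps). $urls is illustrative.
$urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'];

$mh = curl_multi_init();
$handles = [];

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers at once, waiting for activity between iterations
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh);
    }
} while ($running && $status == CURLM_OK);

// Collect results and clean up
foreach ($handles as $url => $ch) {
    $html = curl_multi_getcontent($ch);
    // ... parse $html and write to sql2 here ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

Keeping a rolling window of a few dozen concurrent handles instead of adding everything up front is exactly the bookkeeping a wrapper like Rolling-CURL is meant to handle for you.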

 

If you're feeling adventurous, you could also look into pcntl_fork to create multiple processes that process URLs/results concurrently as well.
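
A minimal sketch of that, assuming the rows are split by id so each child takes every Nth row (process_slice is a hypothetical helper standing in for the fetch/parse/insert loop):

<?php
// Minimal sketch of fanning work out to child processes with pcntl_fork.
// The slicing scheme (every $workers-th row, offset by $i) is an
// assumption for illustration.
$workers = 4;
$pids = [];

for ($i = 0; $i < $workers; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("fork failed\n");
    } elseif ($pid === 0) {
        // Child: process its own slice of the URL list.
        // Open a fresh DB connection here; children must not reuse
        // the parent's connection handle.
        // process_slice($i, $workers);   // hypothetical helper
        exit(0);
    }
    $pids[] = $pid; // parent keeps forking
}

// Parent waits for all children to finish
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}

One caveat: each child must open its own database connection, since sharing the parent's handle across forks will corrupt the connection state.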
