
How can I route my script through proxies?


tibberous


I have a script that downloads a website every half-hour. The site is very large, I'm probably downloading 100 MB a day from it, and I'm afraid of my IP getting banned. It also kind of looks like I'm running a DoS attack against it, but I'm really just spidering it.


Is there some way I can reliably run all my connections through different proxy servers? I am using a curl alternative library to randomly forge my User-Agent header, but I'm making a single hit to every page on the site every half hour, so it's pretty easy to tell what is going on.

First of all, it would be great if you could check when each page was last modified. Servers usually report this in the Last-Modified response header, sometimes along with an ETag. Then only download a page if your copy is out of date. This could cut down on the bandwidth usage A LOT, unless every single page changes every half hour.
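To make the "only download if it changed" idea concrete, here is a rough sketch of a conditional GET. It's in Python rather than PHP just to keep it short and self-contained, and the function names (`build_conditional_request`, `fetch_if_changed`) are made up for illustration. It assumes the server honors `If-Modified-Since` and/or `If-None-Match`, and not every site does.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_fetch_epoch=None, etag=None):
    """Build a GET request that lets the server skip pages that have not changed.

    last_fetch_epoch: Unix timestamp of the copy we already have (hypothetical input).
    etag: the ETag value saved from the previous response, if any.
    """
    req = urllib.request.Request(url)
    if last_fetch_epoch is not None:
        # HTTP dates must be in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        req.add_header("If-Modified-Since", formatdate(last_fetch_epoch, usegmt=True))
    if etag is not None:
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(url, last_fetch_epoch=None, etag=None, timeout=30):
    """Return the page body, or None if the server answered 304 Not Modified."""
    req = build_conditional_request(url, last_fetch_epoch, etag)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Our stored copy is still current, so nothing was downloaded.
            return None
        raise
```

If the server plays along, an unchanged page costs only a few hundred bytes of headers instead of the full body. In PHP you would send the same two request headers with whatever HTTP library you are using.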

 

 

As for proxying in curl, try searching: http://www.google.com/search?q=PHP+curl+proxy&start=0&ie=utf-8&oe=utf-8
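For the rotation part, a minimal sketch of picking a random proxy per request might look like the following. Again this is Python, not PHP, and the proxy addresses and function names are placeholders I made up (the IPs are from a documentation-only range). You would need your own list of proxies you are actually allowed to use. With PHP's curl extension the equivalent knob is the `CURLOPT_PROXY` option.

```python
import random
import urllib.request

# Hypothetical proxy list: replace with proxies you are permitted to use.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:3128",
]


def pick_proxy(proxies, rng=random):
    """Choose one proxy at random; rng can be a seeded random.Random for testing."""
    return rng.choice(proxies)


def opener_for_proxy(proxy_url):
    """Build a urllib opener that sends both http and https traffic via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_random_proxy(url, proxies=PROXIES, timeout=30):
    """Fetch url through a randomly chosen proxy; returns (proxy_used, body)."""
    proxy = pick_proxy(proxies)
    opener = opener_for_proxy(proxy)
    with opener.open(url, timeout=timeout) as resp:
        return proxy, resp.read()
```

Keep in mind that a different IP on each hit doesn't hide the pattern of every page being requested every half hour, so spacing the requests out and skipping unchanged pages probably matters more than the proxies do.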

This topic is now archived and is closed to further replies.
