xgals Posted March 7, 2013
I am trying to parse a huge website. The scripts work fine, but there is one problem: somehow the server knows that the site is being parsed and keeps blocking my IP. I have to restart my router every time in order to proceed. Is there any way to avoid or bypass that programmatically?
Manixat Posted March 7, 2013 (edited)
Yes, don't parse. If people have put work into not letting you parse their site, they obviously don't want you to.
xgals (Author) Posted March 7, 2013
Quoting Manixat: "Yes, don't parse. If people have put work into not letting you parse their site, they obviously don't want you to."
I did not ask for moral advice, only technical.
Manixat Posted March 7, 2013
I suppose you won't get any immoral technical advice on this forum.
AyKay47 Posted March 7, 2013 (edited)
No advice on this topic will be given on this forum. Your IP is being blocked because what you are doing is most likely against their TOS.
ignace Posted March 7, 2013 (edited)
It could be blocked for several reasons, though; most likely you are making too many requests. Put in a decent sleep (1 second or more) after every X requests.
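A minimal sketch of the pause-after-X-requests idea, in case it helps. The thread never says which language the scripts are written in, so Python is an assumption here, and the names `throttled`, `batch_size`, and `pause_seconds` are made up for illustration. The generator hands out items one at a time and sleeps before starting each new batch, so the server sees short bursts separated by a rest.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    batch_size: int = 10,        # the "X" in "after X requests" (assumed default)
    pause_seconds: float = 1.0,  # "1 second or more", per the post above
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items unchanged, pausing once before each new batch begins."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for index, item in enumerate(items):
        # Pause before the first item of every batch except the very first,
        # so there is no pointless sleep at the start or after the last item.
        if index and index % batch_size == 0:
            sleep(pause_seconds)
        yield item
```

To use it, wrap the list of pages in the loop that does the fetching, for example `for url in throttled(urls, batch_size=5, pause_seconds=2.0): ...`, where `urls` is whatever list the script already iterates over. The `sleep` parameter exists only so the pause can be swapped out when testing.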