
Really Weird CURL Behaviour!


sloth456


This has been really frustrating me for about 2 days now.

 

    $url = "http://www.goldpoll.com";
    $agent = "Firefox/3.5.7";
    $referer = "";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow HTTP redirects (not JavaScript ones)
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);

    // Run the request before closing the handle.
    $source = curl_exec($ch);
    curl_close($ch);

    echo $source;

 

As you can see, all it does is scrape http://www.goldpoll.com. I'm running the scraper locally, and every time I run it my browser redirects to

 

localhost/public/j?kdwN+HG+V30X1eO0TNripy8=

 

The characters at the end are random every time. I thought that when echoing the source I was also echoing out some redirection code, so I commented the echo out, but I still get exactly the same thing happening.

I thought maybe some setting on my server wasn't right, so I changed the URL to google.com. It seems to work fine for Google.

I thought maybe Goldpoll was blocking my IP, but if I navigate there in my browser it works fine.

So I just don't get it; it's really confusing me. Does Goldpoll.com have some kind of advanced protection against scrapers?

 

Any help would be massively appreciated!


For me (at least on my *nix server) your code returns nothing. Reading the site's headers, I get this:

Array
(
    [0] => HTTP/1.0 200 OK
    [Content-type] => text/html
    [Cache-Control] => no-cache, no-store, must-revalidate, max-age=0
    [Expires] => Thu, 01 Jan 1970 00:00:00 GMT
    [Connection] => close
)
1

 

And this content:

<html><head><meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate, max-age=0"><meta http-equiv="Expires" content="Thu, 01 Jan 1970 00:00:00 GMT"></head><body><script language="JavaScript">var strbuf = new Array();strbuf[15]='y8';strbuf[14]='X';strbuf[13]='V';strbuf[12]='i';strbuf[11]='1';strbuf[10]='?mB';strbuf[9]='/j';strbuf[8]='=';strbuf[7]='hjl';strbuf[6]='2';strbuf[5]='kdp';strbuf[4]='k';strbuf[3]='js';strbuf[2]='19';strbuf[1]='D';strbuf[0]='Od';var arr=[9,10,3,5,13,2,4,1,14,12,0,11,6,7,15,8];var b='';for (q = 0;q<16;q++){b+=strbuf[arr[q]];}window.location.href=b;</script></body></html>
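The script in that response stitches its redirect target together from the `strbuf` pieces, in the order listed in `arr`. As a rough sketch, the same reassembly could be done in PHP on the fetched source. The function name `decodeJsRedirect` is hypothetical, and the patterns assume the exact `strbuf[N]='...'` and `arr=[...]` layout shown above, which the site may change at any time.

```php
<?php
// Hypothetical helper: rebuilds the URL that the script assigns to
// window.location.href. Assumes the strbuf[N]='...' / arr=[...] layout
// seen in the response above; returns null if that layout is not found.
function decodeJsRedirect(string $html): ?string
{
    // Collect every strbuf[index]='piece' assignment.
    if (!preg_match_all("/strbuf\[(\d+)\]\s*=\s*'([^']*)'/", $html, $m)) {
        return null;
    }
    $pieces = array_combine($m[1], $m[2]);

    // Read the ordering array, e.g. arr=[9,10,3,...].
    if (!preg_match('/arr\s*=\s*\[([\d,\s]+)\]/', $html, $order)) {
        return null;
    }

    // Concatenate the pieces in the order the script would.
    $url = '';
    foreach (explode(',', $order[1]) as $index) {
        $url .= $pieces[(int) trim($index)] ?? '';
    }
    return $url;
}
```

Applied to the response above, this should give `/j?mBjskdpV19kDXiOd12hjly8=`, a path without a host name. That is why a browser showing the echoed page on localhost ends up requesting it from localhost.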

 

What is with the JS? That is probably the cause. The script builds a relative URL and assigns it to window.location.href, so when you fetch the page with cURL and echo it, your browser runs the script and resolves that URL against localhost instead of goldpoll.com. It's not advanced security, just a poor site or not-so-efficient obfuscation.
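Since printing the fetched page lets the browser run whatever script it contains, one way to inspect a scrape without being redirected is to escape the markup before echoing it. A minimal sketch, assuming the source is already in a string. `showSource` is a hypothetical helper name, and `$sample` stands in for the value returned by `curl_exec`.

```php
<?php
// Hypothetical helper: makes fetched markup safe to print in a browser, so
// any <script> in it is shown as text instead of being executed.
function showSource(string $source): string
{
    return '<pre>' . htmlspecialchars($source, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Stand-in for the string returned by curl_exec().
$sample = '<script>window.location.href="/j?abc=";</script>';
echo showSource($sample);
```

Viewing the output this way shows what the server actually sent, which makes a JavaScript redirect like the one above easy to spot.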
