Really Weird CURL Behaviour!


sloth456

This has been really frustrating me for about 2 days now.

 

    $url="http://www.goldpoll.com";
    $agent="Firefox/3.5.7";
    $referer="";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the response instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow HTTP redirects
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);

    // Run the request before closing the handle.
    $source=curl_exec($ch);

    curl_close($ch);

    echo $source;

 

As you can see, all it does is scrape http://www.goldpoll.com. I'm running the scraper locally, and every time I run it my browser redirects to

 

localhost/public/j?kdwN+HG+V30X1eO0TNripy8=

 

The characters at the end are random every time. I thought that when echoing the source I might also be echoing some redirection code, so I commented the echo out, but I still get exactly the same thing happening.
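One way to test that theory is to escape the fetched markup before echoing it, so the browser displays it as text instead of running any script inside it. A minimal sketch follows; `renderAsText` is a hypothetical helper name, and the `$source` value is a stand-in rather than the real response:

```php
<?php
// Hypothetical helper: escape fetched markup so the browser shows it as text
// and never runs any <script> it contains.
function renderAsText(string $html): string
{
    return '<pre>' . htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</pre>';
}

// Stand-in for the $source returned by curl_exec(); not the real page.
$source = '<script>window.location.href="/j?abc";</script>';
echo renderAsText($source);
```

If the redirect stops once the output is escaped, the echoed page itself was probably triggering it.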

 

I thought maybe some setting on my server wasn't right, so I changed the URL to google.com. It seems to work fine for Google.

 

I thought maybe goldpoll is blocking my IP, but if I navigate there through my browser it works fine.

 

So I just don't get it; it's really confusing me. Does Goldpoll.com have some kind of advanced protection against scrapers?

 

Any help would be massively appreciated!


Your code for me (at least on my *nix server) returns nothing. Reading the headers of the site I get this:

Array
(
    [0] => HTTP/1.0 200 OK
    [Content-type] => text/html
    [Cache-Control] => no-cache, no-store, must-revalidate, max-age=0
    [Expires] => Thu, 01 Jan 1970 00:00:00 GMT
    [Connection] => close
)
1
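The reply does not say how the headers were read. A sketch of one way to get an array of that shape with cURL is below, assuming `CURLOPT_HEADER` is used to include the header block in the response; `parseHeaders` and `fetchHeaders` are hypothetical names:

```php
<?php
// Turn a raw header block into [0 => status line, 'Name' => 'value', ...].
function parseHeaders(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($raw)) as $line) {
        if (strncmp($line, 'HTTP/', 5) === 0) {
            $headers[] = $line;                 // status line, e.g. "HTTP/1.0 200 OK"
        } elseif (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[trim($name)] = trim($value);
        }
    }
    return $headers;
}

// Fetch a URL and return only its parsed response headers.
function fetchHeaders(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);     // include headers in the returned string
    $response = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);
    if ($response === false) {
        return [];
    }
    return parseHeaders(substr($response, 0, $headerSize));
}
```

Calling `print_r(fetchHeaders($url))` would then print the array in the format above.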

 

And this as the content:

<html><head><meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate, max-age=0"><meta http-equiv="Expires" content="Thu, 01 Jan 1970 00:00:00 GMT"></head><body><script language="JavaScript">var strbuf = new Array();strbuf[15]='y8';strbuf[14]='X';strbuf[13]='V';strbuf[12]='i';strbuf[11]='1';strbuf[10]='?mB';strbuf[9]='/j';strbuf[8]='=';strbuf[7]='hjl';strbuf[6]='2';strbuf[5]='kdp';strbuf[4]='k';strbuf[3]='js';strbuf[2]='19';strbuf[1]='D';strbuf[0]='Od';var arr=[9,10,3,5,13,2,4,1,14,12,0,11,6,7,15,8];var b='';for (q = 0;q<16;q++){b+=strbuf[arr[q]];}window.location.href=b;</script></body></html>

 

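What is with the JS? That is probably the cause. The script pieces together a relative URL (/j?...) and sets window.location.href to it, so it is apparently meant to redirect visitors to a local path on their site. When you pull the page with cURL and echo it from your own server, your browser runs the same script and resolves that path against localhost instead. It's not advanced security, just a poorly built site or not very effective obfuscation.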
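To see where the script points without running it in a browser, the fragments can be reassembled in PHP. This is only a sketch: `decodeRedirect` is a hypothetical helper, and it assumes the page keeps the exact `strbuf[N]='...'` and `arr=[...]` pattern shown in the response above.

```php
<?php
// Hypothetical helper: rebuild the redirect target from the obfuscated script.
function decodeRedirect(string $html): ?string
{
    // Collect every strbuf[N]='...' assignment into an index => fragment map.
    if (!preg_match_all("/strbuf\[(\d+)\]\s*=\s*'([^']*)'/", $html, $matches, PREG_SET_ORDER)) {
        return null;
    }
    $fragments = [];
    foreach ($matches as $match) {
        $fragments[(int) $match[1]] = $match[2];
    }

    // Read the ordering array, e.g. var arr=[9,10,3,...];
    if (!preg_match('/arr\s*=\s*\[([\d,\s]+)\]/', $html, $order)) {
        return null;
    }

    // Concatenate the fragments in the order the script would.
    $target = '';
    foreach (explode(',', $order[1]) as $index) {
        $target .= $fragments[(int) trim($index)] ?? '';
    }
    return $target;
}

// Small made-up sample in the same shape as the real response.
$sample = "<script>var strbuf = new Array();strbuf[1]='?x=1';strbuf[0]='/j';var arr=[0,1];</script>";
echo decodeRedirect($sample), PHP_EOL; // prints /j?x=1
```

Run against the content posted above, this yields /j?mBjskdpV19kDXiOd12hjly8=, a path relative to whichever host served the page. A scraper could request that path on goldpoll.com itself, though whether the site then serves the real page is untested here.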
