
cURL results not what expected


kireol


Hello,

 

I was playing with cURL on a website and it didn't work. So I switched to a different website and it worked just fine. Now that little part of my brain that is always working has started wondering why cURL wouldn't work with the first website I tried.

 

 

I'm hoping that someone here can help me figure out why this particular website won't work with my cURL script, and help me get it working.

 

If you open up a browser and enter http://eppraisal.com/index.aspx?a=616+Orchard+View&z=48073

 

that will load up a page and show you the values of a house.  If you view source, you can see the values of the house in there.

 

with my script...

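<?php
    // Page to fetch (the same URL that works in a browser).
    $LOGINURL = "http://eppraisal.com/index.aspx?a=616+Orchard+View&z=48073";
    $agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";
    // Cookie file name, relative to the working directory; leave empty to disable cookies.
    $cookies = '';

    $ch = curl_init();
    if ($cookies != '')
    {
        // On Windows, build a full path with forward slashes.
        if (substr(PHP_OS, 0, 3) == 'WIN')
                {$cookies = str_replace('\\','/', getcwd().'/'.$cookies);}
        curl_setopt($ch, CURLOPT_COOKIEJAR, $cookies);
        curl_setopt($ch, CURLOPT_COOKIEFILE, $cookies);
    }
    curl_setopt($ch, CURLOPT_URL, $LOGINURL);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follows HTTP-level redirects only
    $result = curl_exec($ch);
    if ($result === false)
        {$result = 'cURL error: ' . curl_error($ch);}
    echo $result;
?>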

 

I would expect the same page to come up. It does load a page, but not the same page a browser would. What am I missing?

 

Many thanks ahead of time.

 

-=Kireol


While I do agree that it loads a ton of JS files, I must be missing or not understanding something.

 

If you view the source of what my script gets and the source in a browser, they are radically different. The first 53 lines are the same, then they diverge completely. I don't see how the JavaScript would change the main HTML page unless something like a redirect happened. Correct me if I'm wrong please, I'm trying to figure this one out.


Thanks for the help so far.

 

 

Yes, I'm using the same agent. 

 

In my actual IE browser, once the page is loaded, I can use View | Source. There will be a line

 

                                    <td align="center"><span id="ctl00_ContentPlaceHolder1_MiddleValueRange" class="ValueRange">$288,725</span></td>

 

If I run my cURL code and spit the result out to a text file or to a browser, that line doesn't exist. None of the info for the house exists: the values, the number of bedrooms, the square footage, etc. It's completely different, but the header (the first 53 lines of HTML) is the same.
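A quick way to confirm whether a fetched page contains that value is to look the element up by its id instead of comparing the sources by eye. This is only a rough sketch: the helper name `extract_text_by_id` is made up for illustration, and it assumes the PHP DOM extension is available and that the id contains no double quotes.

```php
<?php
// Hypothetical helper: return the text of the element with the given id,
// or null if the id is not present in the HTML.
function extract_text_by_id(string $html, string $id): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumes $id contains no double quotes.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// The line from the browser's View | Source, used here as sample input.
$sample = '<td align="center"><span id="ctl00_ContentPlaceHolder1_MiddleValueRange" class="ValueRange">$288,725</span></td>';
$value = extract_text_by_id($sample, 'ctl00_ContentPlaceHolder1_MiddleValueRange');
echo $value === null ? "value not present\n" : "middle value: $value\n";
```

Running the same check on the `$result` returned by `curl_exec()` should show right away whether the script received the page with the house data or only the intermediate page.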


I suspect it's using JavaScript to do the redirect. When I view the site in Firefox, a box comes up asking me to wait, then I am sent to that second URL. The second URL seems to know what property I want, so that data is probably stored server side. And the second URL DOES appear to have prices in its source.


Ya, I noticed the same in FF. So, do you think that to handle that page correctly, I would have to walk through the HTML/JS that I'm getting and recreate all of that code/logic in PHP, and in theory it should give me a URL to the final page? Heh, doesn't sound worth it anymore. ;)

I thought curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); would follow the redirect anyway. Not sure why this website is different.

 

Oh well. Thanks though.


Your script doesn't have to walk through the JS, but you do, so that you know which requests to make. I expect they are the same requests every time. I've done something similar before. It's a huge lot of work, but it is possible. I even had to parse certain tokens out of some URLs to get it working properly.

 

But my script never understood the js itself.  It was me interpreting the js and telling it to blindly send off some requests that the js would have told it to do.
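Roughly, that "blindly send off the requests" approach could look like the sketch below. The function names `fetch_with_session` and `replay_requests` are made up for illustration, and the list of URLs is whatever you worked out by reading the JS or watching the browser; nothing here is specific to this site. The shared cookie file is an assumption about how the server remembers which property was requested.

```php
<?php
// Hypothetical helper: fetch one URL, reusing the same cookie file on every
// call so the server sees all requests as one session.
function fetch_with_session(string $url, string $cookieFile, string $referer = ''): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // HTTP-level redirects only
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // where cookies are saved
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // where cookies are read from
    if ($referer !== '') {
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: send the requests in the given order, passing the
// previous URL as the referer, and return the body of the last response.
function replay_requests(array $urls, callable $fetch): string
{
    $body = '';
    $previous = '';
    foreach ($urls as $url) {
        $body = $fetch($url, $previous);
        $previous = $url;
    }
    return $body;
}
```

You would call it with something like `replay_requests($urls, function ($url, $referer) { return fetch_with_session($url, 'cookies.txt', $referer); })`, where `$urls` holds the first page followed by the page the JS sends the browser to. Taking the fetcher as a callable keeps the ordering logic separate from cURL, so it can be tried out without touching the network.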

 

FOLLOWLOCATION doesn't work because the redirect is done in javascript, not at the HTTP level.
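If the redirect is written into the page in a simple form, the script can sometimes pull the target out of the HTML and request it itself. Here is a minimal sketch, assuming the page uses a meta refresh or a plain `location = "..."` style assignment; the function name `find_client_redirect` and the patterns are guesses for illustration, not something taken from this site, and a redirect built up dynamically in JS will not be caught.

```php
<?php
// Hypothetical helper: return the first client-side redirect target found
// in $html, or null if none of the simple patterns match.
function find_client_redirect(string $html): ?string
{
    $patterns = [
        // <meta http-equiv="refresh" content="0; url=next.aspx">
        '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*content=["\'][^"\']*url=([^"\'>\s]+)/i',
        // window.location = "..." or location.href = '...'
        '/(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']/i',
        // location.replace("...") or location.assign("...")
        '/location\.(?:replace|assign)\(\s*["\']([^"\']+)["\']\s*\)/i',
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $html, $m)) {
            // Targets inside HTML attributes may contain entities such as &amp;
            return html_entity_decode($m[1]);
        }
    }
    return null;
}
```

The returned target may be relative, so it would still need to be resolved against the URL of the page it came from before being passed to cURL.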

