PeerFly Posted February 9, 2009

Ok, here's what I am trying to do: I need to log into another website and download their reports. Seems easy, right? Well, it is... because I've done it with numerous other websites. However, this one is tricky.

The only way I can download the report at this website is by having the correct "key" at the end of the URL. The key is generated by their server and carries over with each link that you click on their site once logged in. Sort of like a session ID, but it's not a session ID at all. Anyway, this key is unique to the current cookie session for that user. If they were to log out, none of the links with that unique trailing "key" would work. They would have to log back in and use the new links. The trailing key is just a 10-digit number.

So, I've devised a small curl script that does everything... it logs into the site, grabs that trailing "key" from a hidden text field, and downloads the CSV report. However, when I look at the downloaded report, it shows a "This page does not exist" custom Apache page from their server. The problem is that it is either losing the cookie when it goes to download the report URL, or it logs out/back in when it goes to download the report URL.

Here's the code (I've removed the URLs):

    $graburl = **URL OF CSV REPORT**;
    $cookiefile = tempnam("/tmp", "cookies");
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $loginurl);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiefile);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiefile);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "$usernamefield=$username&$passwordfield=$password");
    ob_start();
    $output = curl_exec($ch);
    ob_end_clean();

    **ADD CODE**
    $link = '**URL OF REPORT PAGE TO FETCH THE KEY**';
    curl_setopt($ch, CURLOPT_URL, $link);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $page = curl_exec($ch);
    $sk = '%<input type="hidden" id="window_name" value="(.*)">%';
    preg_match_all($sk, $page, $results, PREG_PATTERN_ORDER);
    **ADD CODE**

    if ($output == '1') {
        curl_setopt($ch, CURLOPT_URL, "$graburl&session_key=".$results[1][0]);
        $putcsv = "report.csv";
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiefile);
        curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiefile);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $result = curl_exec($ch);
        file_put_contents("**MY URL**".$putcsv."", "$result");
    }
    curl_close($ch);

Note that this code works beautifully on the other sites I get reports from. It's just the extra code marked with "**ADD CODE**" and the "&session_key=".$results[1][0]" added to the graburl that is giving me problems. It may have something to do with grabbing two URLs in the same curl session; I can't figure it out.

Link to comment: https://forums.phpfreaks.com/topic/144540-php-curl-issue/
sKunKbad Posted February 9, 2009

If you've done this before, it seems that it should work. Are you on the same server? Are there any redirects involved? Perhaps FOLLOWLOCATION would help?
PeerFly Posted February 10, 2009

No, this isn't on the same server; I'm curling into another server. No redirects involved. If I make it print the $results[1][0] bit and exit before the rest of the script executes, I can tell that it captures the key, as it shows a new 10-digit unique code with each page refresh. So, it's grabbing the key just fine, but when the script goes to call that second URL within the same session, it either loses the cookie or gets a new one. It seems as if I need a method to call multiple URLs using the same curl session (and same cookies).
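For reference, the key-capturing step described above can be tightened a little. The sketch below is a hedged variant of the `preg_match_all` pattern from the first post, not a fix for the cookie problem itself. It assumes the hidden field really is `id="window_name"` and that the key is always exactly 10 digits, as described; the function name `extractSessionKey` and the sample HTML are made up for illustration.

```php
<?php
/**
 * Pull the 10-digit key out of the hidden "window_name" input.
 * Returns null when the field is missing, which usually means the
 * fetched page was not the logged-in report page.
 */
function extractSessionKey(string $html): ?string
{
    // \d{10} stops at the key itself; a greedy (.*) can run on to a
    // later "> on the same line and capture extra markup.
    $pattern = '%<input\s+type="hidden"\s+id="window_name"\s+value="(\d{10})"%i';
    if (preg_match($pattern, $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Hypothetical page fragment, for illustration only.
$page = '<form><input type="hidden" id="window_name" value="1234567890">'
      . '<input type="text" name="q" value=""></form>';

var_dump(extractSessionKey($page));          // string(10) "1234567890"
var_dump(extractSessionKey('<p>Login</p>')); // NULL
```

Checking for `null` before building the download URL makes it obvious when the key page came back as a login or error page instead of the report page.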
PeerFly Posted February 10, 2009

Anyone?
printf Posted February 10, 2009

Instead of using the cURL cookie file, read the header from the cURL login request, grab the Set-Cookie line from the header and attach that to your cURL download request. The server is most likely setting a browser ONLY based cookie, so you need to grab it from the header because it's not going to be placed in the cookie file!
PeerFly Posted February 10, 2009

The server is only setting a browser based cookie so your solution sounds very promising. Can you give me a quick example? How can I grab the cookie from the header and carry/attach it to the download request?
PeerFly Posted February 10, 2009

I think I just figured it out... I will post here if I'm right.
PeerFly Posted February 10, 2009

Printf, what should the contents of CURLOPT_COOKIE look like? Does it start with the Set-Cookie or does it even include that? I think I've got it figured out, but I'm having trouble carrying the cookie in CURLOPT_COOKIE once I've grabbed it from the header.
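On the CURLOPT_COOKIE question: the option takes what a browser would send in its `Cookie:` request header, that is, `name=value` pairs separated by a semicolon and a space, without the `Set-Cookie:` prefix and without attributes such as `path`, `domain` or `expires`. A minimal sketch of turning raw response headers into that string follows. The function name `cookieHeaderFromResponse` and the sample headers are assumptions for illustration, and the sketch ignores expiry and domain matching entirely.

```php
<?php
/**
 * Build a CURLOPT_COOKIE value ("name=value; name2=value2") from raw
 * response headers. Attributes (path, domain, expires, secure) are
 * dropped; a later Set-Cookie for the same name replaces an earlier one.
 */
function cookieHeaderFromResponse(string $rawHeaders): string
{
    $cookies = [];
    // Capture only the leading name=value pair of each Set-Cookie line.
    $pattern = '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawHeaders, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $cookies[$match[1]] = trim($match[2]);
        }
    }
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Hypothetical login response headers, for illustration only.
$sample = "HTTP/1.1 302 Found\r\n"
        . "Set-Cookie: sid=deleted; expires=Wed, 13-Feb-2008 10:53:30 GMT; path=/\r\n"
        . "Set-Cookie: sid=abc123; path=/; secure\r\n"
        . "Set-Cookie: lang=en\r\n"
        . "Location: /home\r\n";

echo cookieHeaderFromResponse($sample), "\n"; // sid=abc123; lang=en
```

The result would then be passed along as `curl_setopt($ch, CURLOPT_COOKIE, cookieHeaderFromResponse($headers));` on the follow-up request.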
PeerFly Posted February 11, 2009

Anyone know more about this?
PeerFly Posted February 11, 2009

Ok, this is what it's come down to... I am able to browse to multiple pages on their server using the same cookie, no problem. I am able to grab that key from a hidden text field and append it to the URL of the CSV report, no problem. However, when I parse the results and put the contents into a file, it always brings back the same error you would get if you tried accessing the page with a different cookie or a wrong trailing 10-digit key. However, it's all correct! If I do this manually on their site, it works fine, so I know it isn't anything to do with them. I'm absolutely stuck.
printf Posted February 11, 2009

Run this script, changing the settings at the top, then zip up the out.txt file and PM me a link to download it. Afterwards I will post some code to handle fetching your CSV file...

    $path = 'http://www.site.com/login.asp';
    $post = $usernamefield . '=' . $username . '&' . $passwordfield . '=' . $password;

    $io = curl_init ();
    curl_setopt ( $io, CURLOPT_URL, $path );
    curl_setopt ( $io, CURLOPT_TIMEOUT, 4 );
    curl_setopt ( $io, CURLOPT_ENCODING, '' );
    curl_setopt ( $io, CURLOPT_MAXREDIRS, 3 );
    curl_setopt ( $io, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt ( $io, CURLOPT_RETURNTRANSFER, true );
    curl_setopt ( $io, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)' );
    curl_setopt ( $io, CURLOPT_HEADER, true );
    curl_setopt ( $io, CURLOPT_CUSTOMREQUEST, 'POST' );
    curl_setopt ( $io, CURLOPT_POSTFIELDS, $post );

    /* save the raw headers and body of the login response */
    file_put_contents ( './out.txt', trim ( curl_exec ( $io ) ) );
    curl_close ( $io );
PeerFly Posted February 11, 2009

PM sent. I appreciate your help, printf.
printf Posted February 11, 2009

One question... The file download request... Is it also over HTTPS, i.e. (https)://www.site.com/path/file.php?(params)
PeerFly Posted February 11, 2009

Yes, that's correct. HTTPS.
printf Posted February 12, 2009

I wish you had given me all the correct URLs, but hopefully you can figure it out.

    <?php

    function curlExecute ( $options )
    {
        $io = curl_init ();
        curl_setopt ( $io, CURLOPT_HEADER, true );
        curl_setopt ( $io, CURLOPT_HTTPHEADER, array ( 'Expect:' ) );
        curl_setopt ( $io, CURLOPT_TIMEOUT, 4 );
        curl_setopt ( $io, CURLOPT_ENCODING, '' );
        curl_setopt ( $io, CURLOPT_MAXREDIRS, 5 );
        curl_setopt ( $io, CURLOPT_FOLLOWLOCATION, true );
        curl_setopt ( $io, CURLOPT_RETURNTRANSFER, true );
        curl_setopt ( $io, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)' );

        if ( $options['request_type'] == 'post' )
        {
            curl_setopt ( $io, CURLOPT_POST, true );
            curl_setopt ( $io, CURLOPT_POSTFIELDS, $options['post_fields'] );
            curl_setopt ( $io, CURLOPT_URL, $options['post_url'] );
        }
        else
        {
            if ( ! empty ( $options['get_fields'] ) )
            {
                $options['get_url'] .= '?';

                foreach ( $options['get_fields'] AS $k => $v )
                {
                    $options['get_url'] .= $k . '=' . urlencode ( $v ) . '&';
                }
            }

            /* strip only a trailing '&'; an unconditional substr ( ..., 0, -1 ) would
               also cut the last character of the URL when get_fields is empty */
            curl_setopt ( $io, CURLOPT_URL, rtrim ( $options['get_url'], '&' ) );
        }

        if ( true === $options['use_ssl'] )
        {
            /* certificate checks are switched off here */
            curl_setopt ( $io, CURLOPT_SSL_VERIFYHOST, false );
            curl_setopt ( $io, CURLOPT_SSL_VERIFYPEER, false );
        }

        if ( true === $options['use_cookie'] )
        {
            if ( ! file_exists ( $options['cookie_file'] ) )
            {
                file_put_contents ( $options['cookie_file'], '' );
            }

            curl_setopt ( $io, CURLOPT_COOKIEJAR, $options['cookie_file'] );
            curl_setopt ( $io, CURLOPT_COOKIEFILE, $options['cookie_file'] );
        }

        $out = curl_exec ( $io );
        curl_close ( $io );

        /* split at the first blank line: header block, then everything after it */
        return explode ( "\r\n\r\n", $out, 2 );
    }

    /* 1... login page */

    $options = array (
        'request_type' => 'post',
        'use_ssl'      => true,
        'use_cookie'   => true,
        'cookie_file'  => './cookie.txt',
        'post_fields'  => array ( 'login_type' => '', 'login_name' => 'data', 'login_password' => 'data' ),
        'get_fields'   => array (),
        'post_url'     => 'https://login.azoogleads.com/affiliate/login/process_login',
        'get_url'      => '' /* don't include the '?' */
    );

    list ( $header, $body ) = curlExecute ( $options );

    /* 2... grab key page */

    $options = array (
        'request_type' => 'get',
        'use_ssl'      => true,
        'use_cookie'   => true,
        'cookie_file'  => './cookie.txt',
        'post_fields'  => array (),
        'get_fields'   => array (),
        'post_url'     => '',
        'get_url'      => 'http://site.com/affiliate_files.php' /* don't include the '?' */
    );

    list ( $header, $body ) = curlExecute ( $options );

    /* do your regex to get the key in the hidden field */

    /* 3... download the file */

    $options = array (
        'request_type' => 'get',
        'use_ssl'      => true,
        'use_cookie'   => true,
        'cookie_file'  => './cookie.txt',
        'post_fields'  => array (),
        'get_fields'   => array ( 'session_key' => 'session_key_value' ),
        'post_url'     => '',
        'get_url'      => 'http://site.com/downloads.php' /* don't include the '?' */
    );

    list ( $header, $body ) = curlExecute ( $options );

    /* echo $body; */

    ?>
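The GET branch of the helper above assembles the query string by hand. A hedged alternative is PHP's built-in `http_build_query`, which URL-encodes the values and, combined with an early return, leaves the URL untouched when there are no parameters, so there is no trailing separator to trim. `buildGetUrl` is a hypothetical helper name, and the URLs are the placeholders from the post.

```php
<?php
/**
 * Append query parameters to a base URL.
 * With no parameters the URL is returned unchanged.
 */
function buildGetUrl(string $baseUrl, array $fields): string
{
    if (empty($fields)) {
        return $baseUrl;
    }
    // Use '&' if the base URL already carries a query string.
    $separator = (strpos($baseUrl, '?') === false) ? '?' : '&';
    return $baseUrl . $separator . http_build_query($fields, '', '&');
}

// Placeholder URLs taken from the post above.
echo buildGetUrl('http://site.com/affiliate_files.php', []), "\n";
echo buildGetUrl('http://site.com/downloads.php', ['session_key' => '1234567890']), "\n";
```

Printing the final URL for each of the three requests before calling `curl_exec` is a cheap way to confirm that the script requests exactly what a browser would.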
PeerFly Posted February 12, 2009

Ok, this is definitely progress. I've added the code and echoed all 3 of the header/bodies.

The first print out is:

    HTTP/1.1 302 Found
    Date: Thu, 12 Feb 2009 10:53:30 GMT
    Server: Apache/2.0.59
    X-Powered-By: PHP/5.1.5
    Set-Cookie: affiliate_login_credentials=deleted; expires=Wed, 13-Feb-2008 10:53:30 GMT; path=/; domain=azoogleads.com
    Set-Cookie: affiliate_login_credentials=1018ca0891839f257f1b5ab3246a986d56974; path=/; domain=azoogleads.com; secure
    Location: /affiliate/home/welcome_page
    Content-Length: 0
    Content-Type: text/html

The second print out (the key page) shows the following header as well as the body:

    HTTP/1.1 200 OK
    Date: Thu, 12 Feb 2009 11:03:38 GMT
    Server: Apache/2.0.59
    X-Powered-By: PHP/5.1.5
    Transfer-Encoding: chunked
    Content-Type: text/html

The third print out (which should be the report) shows nothing. If I place the third $body into a file_put_contents for the CSV report, the file is empty. So it seems to be giving the same result...
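One thing that may explain why the first printout is the 302 block: with CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION both enabled, cURL returns the header block of every response in the redirect chain, one after another, ahead of the final body. Splitting once on the first blank line therefore yields the first hop's headers and leaves the later header blocks at the front of the "body". The sketch below peels the blocks off one at a time; `splitResponse` is a hypothetical name, the sample response is invented, and the loop assumes the real body never starts with the text `HTTP/`.

```php
<?php
/**
 * Split a raw cURL response (CURLOPT_HEADER on) into header blocks and body.
 * Returns [array of header blocks, body string].
 */
function splitResponse(string $raw): array
{
    $headers = [];
    $rest = $raw;
    // Each hop in a redirect chain contributes one "HTTP/..." header block.
    while (strncmp($rest, 'HTTP/', 5) === 0) {
        $parts = explode("\r\n\r\n", $rest, 2);
        $headers[] = $parts[0];
        $rest = $parts[1] ?? '';
    }
    return [$headers, $rest];
}

// Invented two-hop response, for illustration only.
$raw = "HTTP/1.1 302 Found\r\nLocation: /affiliate/home/welcome_page\r\nContent-Length: 0\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
     . "<html>welcome</html>";

list($headers, $body) = splitResponse($raw);
echo count($headers), "\n"; // 2
echo $body, "\n";           // <html>welcome</html>
```

An alternative worth trying is `curl_getinfo($ch, CURLINFO_HEADER_SIZE)` before closing the handle, then cutting the response at that offset with `substr`.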
PeerFly Posted February 12, 2009

If I instruct the script to create a new cookie file with each grab (cookie1, cookie2, cookie3), it creates 2 cookie files that contain totally different information. This must be the problem. How can I make the second cookie append to the first cookie file instead of overwriting it with new information?

Here is the first cookie:

    .azoogleads.com TRUE / TRUE 0 affiliate_login_credentials 3c94f2944ffba8be017af32715fe13f383531

Here is the second cookie:

    login.azoogleads.com FALSE / FALSE 1234437466 requested_controller affiliatestats
    login.azoogleads.com FALSE / FALSE 1234437466 requested_action SubReport

Note that if I do it with just one cookie file, only the first cookie data is present in the file (cookie1). Nothing is added to the file; it's just overwritten.
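The cookie lines above look like the Netscape cookie-file format that CURLOPT_COOKIEJAR writes: seven tab-separated fields per line (domain, include-subdomains flag, path, secure flag, expiry timestamp, name, value). Read that way, the login cookie has its secure flag set to TRUE and an expiry of 0 (a session cookie), so cURL should only send it over HTTPS to hosts under azoogleads.com; that would matter if any follow-up request went over plain http or to a different host. A rough sketch for inspecting a jar is below. `parseCookieJar` is a hypothetical helper and the jar text is rebuilt from the lines quoted in the post, with tabs assumed between fields.

```php
<?php
/**
 * Parse the text of a Netscape-format cookie jar into an array of cookies.
 */
function parseCookieJar(string $jar): array
{
    $cookies = [];
    foreach (preg_split('/\r?\n/', $jar) as $line) {
        $httpOnly = false;
        // curl marks HttpOnly cookies with this prefix; other '#' lines are comments.
        if (strncmp($line, '#HttpOnly_', 10) === 0) {
            $line = substr($line, 10);
            $httpOnly = true;
        } elseif ($line === '' || $line[0] === '#') {
            continue;
        }
        $f = explode("\t", $line);
        if (count($f) !== 7) {
            continue; // not a cookie line
        }
        $cookies[] = [
            'domain'   => $f[0],
            'secure'   => $f[3] === 'TRUE',
            'expires'  => (int) $f[4],
            'name'     => $f[5],
            'value'    => $f[6],
            'httponly' => $httpOnly,
        ];
    }
    return $cookies;
}

// Jar text rebuilt from the post, assuming tab separators.
$jar = "# Netscape HTTP Cookie File\n"
     . ".azoogleads.com\tTRUE\t/\tTRUE\t0\taffiliate_login_credentials\t3c94f2944ffba8be017af32715fe13f383531\n"
     . "login.azoogleads.com\tFALSE\t/\tFALSE\t1234437466\trequested_controller\taffiliatestats\n";

foreach (parseCookieJar($jar) as $c) {
    echo $c['domain'], ' ', $c['name'], ' secure=', $c['secure'] ? 'yes' : 'no', "\n";
}
```

Comparing the domain and secure columns against the scheme and host of each request URL shows which cookies cURL would actually send on that request.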
PeerFly Posted February 12, 2009

I wish these forums gave more time to modify recent posts. Anyway, I changed the file_put_contents code for the cookie file to:

    if ( ! file_exists ( $options['cookie_file'] ) )
    {
        file_put_contents ( $options['cookie_file'], '' );
    }
    else
    {
        file_put_contents ( $options['cookie_file'], '', FILE_APPEND );
    }

You would think that would append the other cookie to the file, but it looks like it doesn't even get added at all. I don't get it.
PeerFly Posted February 12, 2009

Disregard the above post... I didn't realize that no content was being added by the file_put_contents. Here's what I changed:

    if ( ! file_exists ( 'cookie.txt' ) )
    {
        file_put_contents ( 'cookie.txt', '' );
        curl_setopt ( $io, CURLOPT_COOKIEJAR, 'cookie.txt' );
        curl_setopt ( $io, CURLOPT_COOKIEFILE, 'cookie.txt' );
    }
    else
    {
        file_put_contents ( 'cookie2.txt', '' );
        curl_setopt ( $io, CURLOPT_COOKIEJAR, 'cookie2.txt' );
        curl_setopt ( $io, CURLOPT_COOKIEFILE, 'cookie2.txt' );
        $content = file_get_contents( 'cookie2.txt' );
        file_put_contents ( 'cookie.txt', '$content', FILE_APPEND );
        curl_setopt ( $io, CURLOPT_COOKIEJAR, 'cookie.txt' );
        curl_setopt ( $io, CURLOPT_COOKIEFILE, 'cookie.txt' );
    }

After all that, it's giving me the same result as your original code, printf. Nothing is appended to the cookie file and no CSV report is being made. I've gotta hit the sack; thanks for all your help. Hopefully you or someone else can find a solution to this issue. It looks as though it is something to do with cookies for sure.
PeerFly Posted February 13, 2009

Anyone have any idea on how to make this work?
printf Posted February 13, 2009

If I could see what you are actually doing, I could help you so much better. Right now I am just playing the guessing game. I know I can fix it, because there's no way a server can block login access if you play the browser game; I mean, act like a real web browser. If you want me to fix it without having to keep guessing, send me the login info, the URL to log in (login type affiliate or advertiser), the URL of the page where you grab that hidden input key, and the URL of the file download. I promise I will not look at or touch anything not related to downloading the file you want. That's all I can offer you, because it's the only way I can see exactly what is happening, and it will allow me to fix the problem without guessing. If you want to do that, send me that information at (jbricci(AT)gmail.com) and I will have a working script for you later today.