mejpark Posted June 13, 2009

Hello. I am building a basic link checker at work using cURL. My application has a function called getHeaders() that returns an array of HTTP headers:

    function getHeaders($url) {
        if (function_exists('curl_init')) {
            // create a new cURL resource
            $ch = curl_init();

            // set URL and other appropriate options
            $options = array(
                CURLOPT_URL            => $url,
                CURLOPT_HEADER         => true,  // include response headers in the output
                CURLOPT_NOBODY         => true,  // no body: cURL sends a HEAD request
                CURLOPT_FOLLOWLOCATION => true,  // follow Location: redirects
                CURLOPT_RETURNTRANSFER => true   // return the response instead of printing it
            );
            curl_setopt_array($ch, $options);

            // perform the request (the returned header text is discarded here)
            curl_exec($ch);

            // curl_getinfo() returns transfer information (status code, timings),
            // not the raw header lines
            $headers = curl_getinfo($ch);

            // close cURL resource, and free up system resources
            curl_close($ch);
        } else {
            echo "<p>Error: <a href=\"http://uk.php.net/manual/en/book.curl.php\">cURL</a> is not installed on the web server. Unable to continue.</p>";
            return false;
        }
        return $headers;
    }

    print_r(getHeaders('mail.google.com'));

Which yields the following results:

    Array
    (
        [ur1] => http://mail.google.com
        [content_type] => text/html; charset=UTF-8
        [http_code] => 404
        [header_size] => 338
        [request_size] => 55
        [filetime] => -1
        [ssl_verify_result] => 0
        [redirect_count] => 0
        [total_time] => 0.128
        [namelookup_time] => 0.042
        [connect_time] => 0.095
        [pretransfer_time] => 0.097
        [size_upload] => 0
        [size_download] => 0
        [speed_download] => 0
        [speed_upload] => 0
        [download_content_length] => 0
        [upload_content_length] => 0
        [starttransfer_time] => 0.128
        [redirect_time] => 0
    )

(In case you're wondering, I changed the 'url' key to stop the forum interpreting it as BB Code.)

I've tested it with several long links, and the function acknowledges redirects for all of them except mail.google.com, it seems. For fun, I passed the same URL (mail.google.com) to the W3C link checker, which produced:

Results
Links
Valid links!
List of redirects
The links below are not broken, but the document does not use the exact URL, and the links were redirected.
It may be a good idea to link to the final location, for the sake of speed.
warning Line: 1 http://mail.google.com/mail/ redirected to https://www.google.com/accounts/ServiceLogin?service=mail&passive=true&rm=false&continue=http%3A%2F%2Fmail.google.com%2Fmail%2F%3Fui%3Dhtml%26zy%3Dl&bsv=zpwhtygjntrz&scc=1&ltmpl=default&ltmplcache=2
Status: 302 -> 200 OK
This is a temporary redirect. Update the link if you believe it makes sense, or leave it as is.
Anchors
Found 0 anchors.
Checked 1 document in 4.50 seconds.

Which is correct, as the address above is where I am redirected to when I enter mail.google.com into my browser.

What cURL options would I need to use to make my function return 200 for mail.google.com? Why does the function above return a 404 status code rather than a 302?

TIA
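
One way to investigate, sketched under assumptions rather than offered as a confirmed answer: CURLOPT_NOBODY makes cURL send a HEAD request, and the redirect_count of 0 in the output above suggests the 404 was the server's first response, not the end of a redirect chain. Some servers answer HEAD differently from GET, so comparing the two may narrow it down. The helper names parseStatusChain() and getStatusChain() below are hypothetical, and CURLOPT_MAXREDIRS is an added option that does not appear in the original function.

```php
<?php
// Hypothetical helper: extracts one status code per hop from the raw header
// text that cURL returns when CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION are set.
function parseStatusChain(string $rawHeaders): array
{
    $codes = array();
    foreach (preg_split('/\r?\n/', $rawHeaders) as $line) {
        // Status lines look like "HTTP/1.1 302 Found" or "HTTP/2 200"
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Hypothetical variant of getHeaders(): $useHead = true mirrors the original
// (HEAD request); false sends a normal GET so the two results can be compared.
function getStatusChain(string $url, bool $useHead = true)
{
    if (!function_exists('curl_init')) {
        return false;
    }
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => true,
        CURLOPT_NOBODY         => $useHead,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10,   // assumption: cap the redirect chain
        CURLOPT_RETURNTRANSFER => true,
    ));
    $raw  = curl_exec($ch);
    $info = curl_getinfo($ch);
    if ($raw === false) {
        return false;
    }
    // header_size covers the headers of every hop, so this drops any body
    $headerText = substr($raw, 0, $info['header_size']);
    return array(
        'chain'     => parseStatusChain($headerText),
        'final'     => $info['http_code'],
        'redirects' => $info['redirect_count'],
    );
}
```

Calling getStatusChain('http://mail.google.com/mail/', true) and then again with false as the second argument would show whether the request method changes the status chain. What the server actually returns in each case is not verified here.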