champrock Posted June 17, 2008
Hi, I have a list of websites and I want to check whether they are working. Is there any way to configure cURL to fetch only the HTTP status code from the response headers? I don't want to download the full page, as that would consume a lot of bandwidth. I believe "http_code" does the trick, but only after the full page has been fetched, and I don't want to fetch the full page. Any suggestions or comments on how this can be done? Thanks a lot.
champrock Posted June 17, 2008
(I will be trying to use the multithreaded option in cURL to process several requests at once, to save time.)
champrock Posted June 17, 2008
Is this even possible?
mikeschroeder Posted June 17, 2008
function check_link($link)
{
    $main = array();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $link);
    curl_setopt($ch, CURLOPT_HEADER, 1);         // include the response headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, 1);         // do not request the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
    curl_setopt($ch, CURLOPT_NETRC, 1);          // omit if you know no urls are FTP links...
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    // curl_exec() prints its output, so capture it with output buffering
    ob_start();
    curl_exec($ch);
    $stuff = ob_get_contents();
    ob_end_clean();
    curl_close($ch);
    // first line of the output is the status line, e.g. "HTTP/1.1 200 OK"
    $parts = explode("\n", $stuff, 2);
    $main = explode(" ", rtrim($parts[0]), 3);
    return $main;
}

$start_time = microtime(true);
$db_starttime = time();
$return = check_link($_GET['url']);
$end_time = microtime(true);
// $return[0] is the protocol information
// $return[1] is the status code
// $return[2] is the reason phrase
$load_time = round($end_time - $start_time, 2);
print_r($return);
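If only the numeric status is needed, cURL can report it directly instead of having the status line split by hand. The following is a minimal sketch, not the poster's code: `head_status()` and its `$timeout` parameter are hypothetical names, and it assumes the target server answers HEAD requests the same way it answers GET.

```php
<?php
// Hypothetical helper (not from the thread): returns the HTTP status code
// for $url, or 0 if no HTTP response was received at all.
function head_status($url, $timeout = 10)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: no body is transferred
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return output instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // report the status of the final hop
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

A call such as `head_status('http://www.phpfreaks.com')` would then return an integer like 200 or 404. Some servers reject HEAD or answer it differently from GET, so an unexpected code may be worth re-checking with a normal request.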
champrock Posted June 17, 2008
Does this only get the headers? I don't want to download the full file, because that means a lot of bandwidth usage. curl_exec will get the full page, right? Or am I missing something here?
mikeschroeder Posted June 18, 2008
Well, personally I probably would have just made the function return $stuff, by commenting out the parts where I break it apart, since $stuff contains the raw output from curl. Then I would have displayed it with print_r():

function check_link($link)
{
    $main = array();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $link);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_NETRC, 1); // omit if you know no urls are FTP links...
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    ob_start();
    curl_exec($ch);
    $stuff = ob_get_contents();
    ob_end_clean();
    curl_close($ch);
    // $parts = explode("\n", $stuff, 2);
    // $main = explode(" ", $parts[0], 3);
    return $stuff;
}

$link = 'http://www.phpfreaks.com';
$returnData = check_link($link);
print_r($returnData);

But regardless, the function ONLY returns the headers, just as you asked for in your original post. Expect a response similar to this:

HTTP/1.1 200 OK
Date: Wed, 18 Jun 2008 15:47:46 GMT
Server: Apache/1.3.37 (Unix) PHP/5.2.3 mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a
X-Powered-By: PHP/5.2.3
Set-Cookie: PHPSESSID=63b59aa92e02bfe3bce4b575e05ce992; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Type: text/html
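When CURLOPT_FOLLOWLOCATION is on, the raw output can contain one header block per redirect hop, so the first line is not necessarily the final status. As a rough sketch, the status codes can be pulled out of the raw text with a regular expression. The helper name `status_codes()` and the sample header text are made up for illustration and are not part of the thread.

```php
<?php
// Hypothetical helper: extract every HTTP status code from raw header text.
// Each redirect hop contributes its own "HTTP/x.y NNN ..." status line.
function status_codes($rawHeaders)
{
    // /m lets ^ match at the start of every line, not just the first.
    preg_match_all('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})/m', $rawHeaders, $matches);
    return array_map('intval', $matches[1]);
}

// Made-up sample: one redirect followed by the final response.
$sample = "HTTP/1.1 301 Moved Permanently\r\nLocation: http://www.example.com/\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

$codes = status_codes($sample);
print_r($codes); // the last element is the status of the final hop
```

Taking the last element of the returned array gives the status of the page that was finally reached; an empty array means no status line was found in the text.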
champrock Posted June 18, 2008
I think I need to clarify my problem a bit. I don't want the entire page to be downloaded to my server either. It's not about outputting only the headers; I want the server to fetch only the headers. For example, if the file size is 10 MB, then according to the cURL code provided above, cURL will download the full file to the server and then output only the headers. My problem is that I don't want the server to download 10 MB! I just need the headers.
mikeschroeder Posted June 18, 2008
Just in case you were right, which in this case you are not, I generated a 100 MB .bin file on one of my servers. I then tested my curl function against it, resulting in just the headers being returned and a total execution time of 0.11 seconds.

HTTP/1.1 200 OK
Date: Wed, 18 Jun 2008 17:56:32 GMT
Server: Apache/2
Last-Modified: Wed, 18 Jun 2008 17:55:45 GMT
ETag: "467070-5f5e100-44ff491e28a40"
Accept-Ranges: bytes
Content-Length: 100000000
Vary: Accept-Encoding,User-Agent
Content-Type: application/octet-stream

0.11 Seconds

So, to clarify again, it is not downloading the entire 100 MB file. It is ONLY returning the headers.
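Since the original question also mentioned processing several requests at once, the same header-only check can be combined with PHP's curl_multi functions. This is only a sketch under assumptions: `check_many()` and `$timeout` are hypothetical names, URLs are assumed to be unique (they are used as array keys), and error handling is kept to a minimum.

```php
<?php
// Hypothetical helper: HEAD-check many URLs concurrently.
// Returns an array mapping each URL to its status code (0 = no HTTP response).
function check_many(array $urls, $timeout = 10)
{
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only, no body
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print anything
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait up to 1 second for activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the status code of each finished transfer.
    $results = array();
    foreach ($handles as $url => $ch) {
        $results[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
    }
    return $results;
}
```

For a long list it may be kinder to the remote servers, and to local socket limits, to pass the URLs in batches (for example with `array_chunk()`) rather than all at once.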