pmaiorana Posted May 18, 2007

I'm attempting to write a script that pulls a list of URLs from my database, checks to see what the last modified date of that URL is, and updates the database with each URL's new last modified timestamp. It's working in most cases right now, but it seems fsockopen can't handle redirects very well according to some google searches. This is especially problematic because most of the URLs I'm checking are RSS feeds and a lot of them redirect to feedburner now for analytics.

I'm hoping someone can help me deal with the redirect issue. Can someone help me through this? I'm not a developer, I've just hacked this together -- so, talk sllloooowwllllyyyy.

Here's the script:

<?
$user = "user";
$password = "password";
$database = "database";
$host = "mysql.domain.com";

mysql_connect($host,$user,$password);
@mysql_select_db($database) or die( "Unable to select database");

function get_raw_header($host,$doc)
{
    $httpheader = '';
    $fp = fsockopen ($host, 80, $errno, $errstr, 30);
    if (!$fp)
    {
        echo $errstr.' ('.$errno.')';
    }else{
        fputs($fp, 'GET '.$doc.' HTTP/1.0'."\r\n".'Host: '.$host."\r\n\r\n");
        while(!feof($fp))
        {
            $httpresult = fgets ($fp,1024);
            $httpheader = $httpheader.$httpresult;
            if (ereg("^\r\n",$httpresult))
                break;
        }
        fclose ($fp);
    }
    return $httpheader;
}

function get_header_array($url)
{
    $url = ereg_replace('http://','',$url);
    $endHostPos = strpos($url,'/');
    if(!$endHostPos) $endHostPos = strlen($url);
    $host = substr($url,0,$endHostPos);
    $doc = substr($url,$endHostPos,strlen($url)-$endHostPos);
    if($doc == '') $doc = '/';
    $raw = get_raw_header($host,$doc);
    $tmpArray = explode("\n",$raw);
    for ($i=0;$i<sizeof($tmpArray); $i++)
    {
        @list($name, $value) = explode(':', $tmpArray[$i], 2);
        $array[trim($name)]=trim($value);
    }
    return $array;
}

//select every row in the database
$query = "SELECT * FROM database";
$result = mysql_query($query) or die(mysql_error());

//for each row in the database...
while($row = mysql_fetch_array($result)){
    $remote_file = $row['url'];
    $id = $row['id'];
    $array = get_header_array($remote_file);
    $timestamp = date('Y-m-d H:i:s',strtotime($array['Last-Modified']));
    $insertquery = "UPDATE database SET timestamp = '$timestamp' WHERE id = '$id'";
    $insert = mysql_query($insertquery) or die(mysql_error());
    echo $row['name'] . ': ' . $timestamp . '<br>';
}

mysql_close();
?>
phast1 Posted May 18, 2007

Can you be more specific about what the problem is? I know URLs that redirect are the problem, but what happens when the script comes across a redirect, compared to what you want it to do?

That said, I'm sure the fsockopen() function isn't going to follow redirects automatically. You would probably need to view an HTTP header that contains a redirect and see what the code needs to look for (sorry, I can't remember this at the moment), then add some code that detects it using the strstr() function or similar and restarts the processing with the new URL.
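As a rough illustration of the "detect the redirect and restart with the new URL" idea, here is a minimal sketch. It assumes a redirect shows up as a 3xx status line plus a Location header, which is how HTTP redirects normally look. The function names redirect_target() and fetch_final_header() are made up for this example, and $fetch stands for any function that takes a URL and returns the raw header text.

```php
<?php
// Sketch only: these helper names are hypothetical, not part of the original script.

// Return the Location target if the raw header is a 3xx redirect, else null.
function redirect_target($rawHeader)
{
    // The status line looks like "HTTP/1.1 301 Moved Permanently".
    if (!preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $rawHeader, $m)) {
        return null;
    }
    $status = (int) $m[1];
    if ($status < 300 || $status >= 400) {
        return null;
    }
    // Header names are case-insensitive, so match "Location:" loosely.
    if (preg_match('/^Location:[ \t]*(\S+)/mi', $rawHeader, $m)) {
        return $m[1];
    }
    return null;
}

// Keep fetching until a non-redirect header arrives or $maxHops is exceeded.
// $fetch is any callable that takes a URL and returns the raw header string.
function fetch_final_header($url, $fetch, $maxHops = 5)
{
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $rawHeader = $fetch($url);
        $next = redirect_target($rawHeader);
        if ($next === null) {
            return $rawHeader; // not a redirect: this is the header we want
        }
        $url = $next; // restart the processing with the new URL
    }
    return ''; // too many redirects; the caller should treat this as a failure
}
```

In the script above, $fetch could be a small wrapper that splits the URL into host and path and calls get_raw_header(). The hop limit is there so two URLs that redirect to each other cannot loop forever. This sketch does not handle a relative Location value (one without the http://host part), which some servers send.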
pmaiorana Posted May 18, 2007

When everything is working correctly, the database's "timestamp" column gets populated with the URL's last modified date. When a redirect is encountered, the timestamp gets populated as all zeroes.

I think I understand your recommended course of action conceptually, but I wouldn't know where to start to actually implement something like that. Do you have any code snippets you could share?
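One plausible reason for the zeroes: a redirect response often carries no Last-Modified header at all, so $array['Last-Modified'] is empty, strtotime() cannot parse it, and the script writes a meaningless date anyway. A minimal sketch of a guard, assuming the header array has the same name => value shape that get_header_array() returns. The name last_modified_or_null() is hypothetical, and it uses gmdate() (UTC) where the script uses date() (server local time).

```php
<?php
// Sketch only: last_modified_or_null() is a hypothetical helper.

// Return the Last-Modified header as 'Y-m-d H:i:s' in UTC, or null when the
// header is missing or cannot be parsed.
function last_modified_or_null(array $headers)
{
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive, so compare without regard to case.
        if (strcasecmp((string) $name, 'Last-Modified') !== 0) {
            continue;
        }
        $time = strtotime((string) $value);
        if ($time === false) {
            return null; // header present but not a usable date
        }
        return gmdate('Y-m-d H:i:s', $time);
    }
    return null; // header not present, e.g. on a redirect response
}
```

With something like this in the loop, the UPDATE query would only run when the result is not null, so a redirect or a missing header leaves the old timestamp alone.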
phast1 Posted May 18, 2007

I think the first step would be to determine the exact difference between an HTTP header response that works as expected and one that doesn't. You could add a line to your existing code to echo the raw header info that your script is seeing, and then run some tests like that:

while(!feof($fp))
{
    $httpresult = fgets ($fp,1024);
    $httpheader = $httpheader.$httpresult;
    if (ereg("^\r\n",$httpresult))
        break;
}
fclose ($fp);
echo $httpheader; // added: print the raw header so you can inspect it

Once you know what the exact difference is, you can start figuring out how to detect this condition in your code.
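To make that comparison easier to read than two raw dumps, here is a small sketch that boils a header down to the parts that matter for this problem. The function summarize_header() is hypothetical, and both sample headers are invented for illustration (they are not output from real feeds).

```php
<?php
// Sketch only: summarize_header() and both sample headers are made up.

// Reduce a raw header to its status line, whether Last-Modified is present,
// and the Location value if there is one.
function summarize_header($rawHeader)
{
    $lines = preg_split('/\r?\n/', trim($rawHeader));
    $summary = array(
        'status_line'       => $lines[0],
        'has_last_modified' => false,
        'location'          => null,
    );
    foreach (array_slice($lines, 1) as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "Name: value" line
        }
        list($name, $value) = explode(':', $line, 2);
        $name = strtolower(trim($name));
        if ($name === 'last-modified') {
            $summary['has_last_modified'] = true;
        } elseif ($name === 'location') {
            $summary['location'] = trim($value);
        }
    }
    return $summary;
}

// Invented examples of a normal response and a redirect response.
$working  = "HTTP/1.1 200 OK\r\nLast-Modified: Fri, 18 May 2007 12:00:00 GMT\r\n\r\n";
$redirect = "HTTP/1.1 302 Found\r\nLocation: http://feeds.example.net/feed\r\n\r\n";

print_r(summarize_header($working));
print_r(summarize_header($redirect));
```

Running this on the headers of a URL that works and one that gives zeroes should show whether the failing one has a 3xx status line, a Location value, and no Last-Modified.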