I'm attempting to write a script that pulls a list of URLs from my database, checks each URL's last modified date, and updates the database with that URL's new last modified timestamp. It works in most cases right now, but according to some Google searches, fsockopen can't handle redirects very well.

 

This is especially problematic because most of the URLs I'm checking are RSS feeds, and a lot of them now redirect to FeedBurner for analytics.

 

I'm hoping someone can help me deal with the redirect issue. I'm not a developer, I've just hacked this together, so talk sllloooowwllllyyyy. Here's the script:

 

<?

$user = "user";
$password = "password";
$database = "database";
$host = "mysql.domain.com";
mysql_connect($host,$user,$password);
@mysql_select_db($database) or die( "Unable to select database");

function get_raw_header($host,$doc)
{
$httpheader = '';
$fp = fsockopen ($host, 80, $errno, $errstr, 30);
if (!$fp)
{
	echo $errstr.' ('.$errno.')';
}else{

	fputs($fp, 'GET '.$doc.' HTTP/1.0'."\r\n".'Host: '.$host."\r\n\r\n");

	while(!feof($fp))
	{
		$httpresult = fgets ($fp,1024);
		$httpheader = $httpheader.$httpresult;
		if (ereg("^\r\n",$httpresult))
		break;
	}

	fclose ($fp);
}
return $httpheader;
}

function get_header_array($url)
{
$url = ereg_replace('http://','',$url);
$endHostPos = strpos($url,'/');
if(!$endHostPos) $endHostPos = strlen($url);
$host = substr($url,0,$endHostPos);
$doc = substr($url,$endHostPos,strlen($url)-$endHostPos);
if($doc == '') $doc = '/';
$raw = get_raw_header($host,$doc);
$tmpArray = explode("\n",$raw);
for ($i=0;$i<sizeof($tmpArray); $i++)
{
	@list($name, $value) = explode(':', $tmpArray[$i], 2);
	$array[trim($name)]=trim($value);
}
return $array;
}

//select every row in the database
$query = "SELECT * FROM database"; 
$result = mysql_query($query) or die(mysql_error());

//for each row in the database...
while($row = mysql_fetch_array($result)){

	$remote_file = $row['url'];
	$id = $row['id'];
	$array = get_header_array($remote_file);
	$timestamp = date('Y-m-d H:i:s',strtotime($array['Last-Modified']));
	$insertquery = "UPDATE database SET timestamp = '$timestamp' WHERE id = '$id'";
	$insert = mysql_query($insertquery) or die(mysql_error());

	echo $row['name'] . ': ' . $timestamp . '<br>';

}

mysql_close();

?>

 


Can you be more specific about what the problem is? I know URLs that redirect are the problem, but what happens when the script comes across a redirect, compared to what you want it to do?

 

That said, I'm sure fsockopen() isn't going to follow redirects automatically. You would probably need to look at an HTTP header that contains a redirect and see what the code needs to look for (sorry, I can't remember this at the moment), then add some code that detects it using strstr() or similar and restarts the processing with the new URL.
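As a rough illustration of the detection step described above: a redirect response normally starts with a 3xx status line and carries a Location header naming the new URL. The sketch below assumes the raw header text has already been read from the socket; the function name find_redirect_target() and the example.com URLs are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper: returns the redirect target found in a raw HTTP
// response header block, or null if the response is not a redirect.
function find_redirect_target(string $rawHeader): ?string
{
    $lines = preg_split('/\r?\n/', trim($rawHeader));

    // The status line looks like "HTTP/1.1 301 Moved Permanently".
    if (!preg_match('#^HTTP/\d(?:\.\d)?\s+(\d{3})#', $lines[0], $m)) {
        return null;
    }
    $status = (int) $m[1];
    if (!in_array($status, [301, 302, 303, 307, 308], true)) {
        return null;
    }

    // Header names are case-insensitive, so match "Location:" loosely.
    foreach (array_slice($lines, 1) as $line) {
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Made-up sample response, shaped like what a redirecting feed might send.
$raw = "HTTP/1.0 301 Moved Permanently\r\n"
     . "Location: http://feeds.example.com/myfeed\r\n\r\n";
var_dump(find_redirect_target($raw));
```

If the function returns a URL, the script would run its header lookup again with that URL instead of the original one; if it returns null, the response can be processed as usual.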

 

When everything is working correctly, the database's "timestamp" column gets populated with the URL's last modified date. When a redirect is encountered the timestamp gets populated as all zeroes.
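One likely explanation, offered as an assumption rather than a diagnosis: a redirect response usually has no Last-Modified header, so $array['Last-Modified'] is empty, strtotime() cannot parse it, and the UPDATE writes a meaningless date. A minimal guard could look like the sketch below. The helper name last_modified_to_mysql() is hypothetical, and it uses gmdate() (UTC) where the original script uses date().

```php
<?php
// Hypothetical helper: turn a Last-Modified header value into a
// 'Y-m-d H:i:s' string in UTC, or null when the header is missing
// or cannot be parsed.
function last_modified_to_mysql(?string $headerValue): ?string
{
    if ($headerValue === null || trim($headerValue) === '') {
        return null; // header absent, e.g. on a redirect response
    }
    $unixTime = strtotime($headerValue);
    if ($unixTime === false) {
        return null; // value present but not a recognizable date
    }
    return gmdate('Y-m-d H:i:s', $unixTime);
}

var_dump(last_modified_to_mysql('Wed, 21 Oct 2015 07:28:00 GMT'));
var_dump(last_modified_to_mysql(null));
```

In the update loop, the script could call this helper on the Last-Modified value and skip the UPDATE for that row whenever it returns null, so a redirect no longer overwrites a good timestamp.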

 

I think I understand your recommended course of action conceptually, but I wouldn't know where to start to actually implement something like that. Do you have any code snippets you could share?

I think the first step would be to determine the exact difference between an HTTP header response that works as expected and one that doesn't. You could add a line to your existing code that echoes the raw header info your script is seeing, and then run some tests like that:

 

	while(!feof($fp))
	{
		$httpresult = fgets ($fp,1024);
		$httpheader = $httpheader.$httpresult;
		if (ereg("^\r\n",$httpresult))
		break;
	}

	fclose ($fp);
	// temporary debugging output: show the raw header the script received
	echo $httpheader;

 

Once you know what the exact difference is, you can start figuring out how to detect this condition in your code.
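For what it's worth, here is one way the "restart with the new URL" idea could be sketched once the redirect can be detected. Everything here is an assumption for illustration: resolve_final_header() is a made-up name, $fetchHeader stands in for whatever function returns the raw header text for a URL, the example.com URLs are placeholders, and the code assumes the Location header holds an absolute URL. A hop limit keeps two URLs that redirect to each other from looping forever.

```php
<?php
// Hypothetical helper: keep fetching headers until a non-redirect response
// arrives. Follows at most $maxHops redirects, then gives up with null.
function resolve_final_header(string $url, callable $fetchHeader, int $maxHops = 5): ?string
{
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $raw = $fetchHeader($url);

        // 301, 302, 303, 307 and 308 are the redirect status codes handled here.
        $isRedirect  = preg_match('#^HTTP/\S+\s+30[12378]\b#', $raw) === 1;
        $hasLocation = preg_match('/^Location:\s*(\S+)/mi', $raw, $m) === 1;

        if (!$isRedirect || !$hasLocation) {
            return $raw; // final response, or a redirect that cannot be followed
        }
        $url = $m[1]; // assumes Location contains an absolute URL
    }
    return null; // too many redirects
}

// Stand-in fetcher with canned responses, so the sketch runs without a network.
$fakeFetch = function (string $url): string {
    $responses = [
        'http://example.com/feed' =>
            "HTTP/1.0 302 Found\r\nLocation: http://feeds.example.com/feed\r\n\r\n",
        'http://feeds.example.com/feed' =>
            "HTTP/1.0 200 OK\r\nLast-Modified: Mon, 01 Jan 2007 00:00:00 GMT\r\n\r\n",
    ];
    return $responses[$url];
};

echo resolve_final_header('http://example.com/feed', $fakeFetch);
```

In the real script the callable would wrap the existing socket code. Note that the original code only strips 'http://' and always connects on port 80, so a redirect to an https:// address would need separate handling.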

 
