I'm trying to let search engines index pages on my site that require a cookie. (I want the code to detect search bots and let them bypass the restriction.) Here is my code. It doesn't work.

 

	
if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){
  //User has the GoogleBot user agent, but is it a real google bot?
  $host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
  if ( substr($host, (strlen($host)-13)) == 'googlebot.com' )
  {
  }
  	//real bot

  else
  	//fake bot or general access to page
if(!isset($_COOKIE['legal'])) {  	
header("Location: /index.php");	 
}

if($_COOKIE['legal'] == "no")
		{
		header("Location: /index.php");	
		}		

 

Even if I get this to work, it will only work for Google.

Is there a better way to go about this? I don't need to log the activity of the search bots. I just want them to gain access to the site.


Yes. If the bot is legitimate, I want the code to do nothing. If the bot is not legitimate, or if someone tries to access the page without the proper cookie value, I want the code to redirect back to 'index.htm' for a login form.

 

<?php
if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)
//User has the GoogleBot user agent, but is it a real google bot?
$host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
if ( substr($host, (strlen($host)-13)) == 'googlebot.com' ){  // condition (2)
	//real bot
} 
else if (!isset($_COOKIE['legal'])) {     //condition (3)
	header("Location: /index.php");   
}

//The code below is executed even if conditions (1) and (2) are true.
if($_COOKIE['legal'] == "no"){ //condition (4)
	header("Location: /index.php");   
}

// you're missing a } here

OK, I tried that, but it bypassed the cookie security on the site. Pages displayed even when the cookie didn't exist.

<?php 	

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)
//User has the GoogleBot user agent, but is it a real google bot?
$host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
if ( substr($host, (strlen($host)-13)) == 'googlebot.com' ){  // condition (2)
	//real bot
} 
else if (!isset($_COOKIE['legal'])) {     //condition (3)
	header("Location: /index.php");   
}

//The code below is executed even if conditions (1) and (2) are true.
if($_COOKIE['legal'] == "no"){ //condition (4)
	header("Location: /index.php");   
}

}

?>

 

I want condition 1 and 2 to be checked. If true then I want the page to be viewable without any more condition checks. If false then I want both condition 3 and 4 to be checked. Do I need to put condition 4 into an ELSE IF statement?

 

Basically, you need to enclose them in {} after the else:

<?php
if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)
	//User has the GoogleBot user agent, but is it a real google bot?
	$host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
	if ( substr($host, (strlen($host)-13)) == 'googlebot.com' ){  // condition (2)
		//real bot
	}
	else {
		if (!isset($_COOKIE['legal'])) {     //condition (3)
			header("Location: /index.php");
		}

		if($_COOKIE['legal'] == "no"){ //condition (4)
			header("Location: /index.php");
		}
	}
}
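
The flow being asked for (a verified bot skips every cookie check, everyone else goes through conditions (3) and (4)) may be easier to see when the decision is separated from the redirect. The following is only a sketch under that assumption: `decide_access` and its parameters are hypothetical names, not code from this thread, and the bot check is passed in as a plain boolean.

```php
<?php
// Sketch only: decide_access() is an illustrative name, not code from the thread.

/**
 * Returns 'allow' or 'redirect'.
 * $isVerifiedBot: outcome of conditions (1) and (2) combined.
 * $cookies: normally $_COOKIE, passed in so the function can be tested.
 */
function decide_access(bool $isVerifiedBot, array $cookies): string
{
    // Conditions (1) and (2) both true: no further checks.
    if ($isVerifiedBot) {
        return 'allow';
    }
    // Condition (3): the cookie is missing.
    if (!isset($cookies['legal'])) {
        return 'redirect';
    }
    // Condition (4): the cookie exists but is set to "no".
    if ($cookies['legal'] === 'no') {
        return 'redirect';
    }
    return 'allow';
}
```

On the page itself, a 'redirect' result would be followed by header("Location: /index.php") and then exit, because header() alone does not stop the rest of the script from running and sending the page body.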

 

OK, I rewrote it like this:

 
<?php

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)
	//User has the GoogleBot user agent, but is it a real google bot?
	$host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
	if ( substr($host, (strlen($host)-13)) == 'googlebot.com' ){  // condition (2)
		//real bot
	}
}
else if (!isset($_COOKIE['legal'])) {     //condition (3)
	header("Location: /index.php");
}
else if($_COOKIE['legal'] == "no"){ //condition (4)
	header("Location: /index.php");
}

?>

 

It seems to work, but Google is still failing to index pages.

Any advice?

 

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)

 

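That if is technically wrong. strstr returns either a string or FALSE, never TRUE, so the loose comparison relies on type juggling, and a falsy return value such as "0" would give a false negative. (strpos has the same trap, since it can return 0.)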

 

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) !== FALSE ){  // condition (1)

 

That should produce the right result for that part. Whether that is the actual problem, I do not know. I just saw that it was wrong =).
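
To make the difference between the loose and the strict comparison concrete, here is a small sketch. `ua_mentions_googlebot` is a hypothetical helper name, and strpos is used in place of strstr only because the check needs a yes/no answer rather than the matched text.

```php
<?php
// Sketch only: ua_mentions_googlebot() is an illustrative helper name.

function ua_mentions_googlebot(string $userAgent): bool
{
    // strpos() returns an integer offset (possibly 0) or false,
    // so the result must be compared strictly against false.
    return strpos($userAgent, 'Googlebot') !== false;
}

// A match at offset 0 still counts as a match.
var_dump(ua_mentions_googlebot('Googlebot/2.1'));  // bool(true)
var_dump(ua_mentions_googlebot('Mozilla/5.0'));    // bool(false)

// strstr() returns the string "0" here, which is falsy under "== true".
var_dump(strstr('a0', '0') == true);               // bool(false)
var_dump(strstr('a0', '0') !== false);             // bool(true)
```

With a needle like "Googlebot" the loose comparison happens to give the right answer, since the returned string is never empty or "0", which fits with the change not fixing the indexing problem.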

OK, I tried that and the pages still failed to index.

Nothing on my site is being indexed by Google. (I'm using Google Webmaster Tools to submit a sitemap and view diagnostics.)

Is it possible to recreate this problem locally, using a value other than Googlebot, so that I can test my code? That way I could see whether my code is working correctly.
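
One way to test this locally is to move the check into a function that takes the user agent, the IP address, and the DNS lookups as arguments, so canned values can stand in for a real Googlebot visit. The sketch below assumes that approach. `is_verified_googlebot` and the two lookup parameters are hypothetical names, and the forward-lookup confirmation is an extra step that does not appear in the code posted above.

```php
<?php
// Sketch only: the function name and the injected lookups are assumptions
// made for local testing, not code from the thread.

function is_verified_googlebot(
    string $userAgent,
    string $ip,
    callable $reverseLookup,   // in production: 'gethostbyaddr'
    callable $forwardLookup    // in production: 'gethostbynamel'
): bool {
    // Condition (1): the user agent claims to be Googlebot.
    if (strpos($userAgent, 'Googlebot') === false) {
        return false;
    }
    // gethostbyaddr() returns the IP unchanged when the lookup fails,
    // and false on malformed input.
    $host = $reverseLookup($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }
    // Condition (2): the leading dot rejects hosts like "notgooglebot.com".
    $suffix = '.googlebot.com';
    if (substr($host, -strlen($suffix)) !== $suffix) {
        return false;
    }
    // Added step (assumption): the host name must resolve back to the same IP.
    $ips = $forwardLookup($host);
    return is_array($ips) && in_array($ip, $ips, true);
}

// Local test with canned DNS answers instead of real lookups.
$fakeReverse = function (string $ip) {
    return 'crawl-203-0-113-7.googlebot.com';
};
$fakeForward = function (string $host) {
    return ['203.0.113.7'];
};
var_dump(is_verified_googlebot('Googlebot/2.1', '203.0.113.7', $fakeReverse, $fakeForward));
```

On the live page, the call would pass $_SERVER['HTTP_USER_AGENT'] ?? '' and $_SERVER['REMOTE_ADDR'] together with 'gethostbyaddr' and 'gethostbynamel'. Visiting the page with a spoofed Googlebot user agent from an ordinary connection exercises the "fake bot" path, which should still be redirected.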
