
My Rewrite Condition Doesn't Work (Htaccess)?


spike-spiegel

I'm trying to allow only the bots below to access my site content. However, I get a 403 error every time I try to submit a sitemap to Google Webmaster Tools.

When I delete the rules, the error goes away. It looks as if Googlebot is the wrong name for the bot, but I don't know.

What I need to do is allow Googlebot, Bing, Yahoo and AltaVista, then block any other bot that tries to access the website.

My rules are as follows:

 

 

RewriteEngine on
RewriteBase /

# (There are other rules here, just system related.)

RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot-2.1 [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot-Mozilla-2.1 [OR]
RewriteCond %{HTTP_USER_AGENT} Google-AdSense-2.1 [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot-Image [OR]
RewriteCond %{HTTP_USER_AGENT} AdsBot-Google [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} AltaVista [OR]
RewriteCond %{HTTP_USER_AGENT} Yahoo-Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} Alexa-1 [OR]
RewriteCond %{HTTP_USER_AGENT} Alexa-2 [OR]
RewriteCond %{HTTP_USER_AGENT} MSN-1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
RewriteRule .* - [F,L]

Edited by spike-spiegel

Probably because your rewrite rules do the opposite of what you described. They return 403 Forbidden to all of those bots and let everything else through.
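To see why, it may help to cut the rules down to two of the names from the question. Conditions joined with [OR] make the rule fire when any one of them matches, and the [F] flag answers with 403 Forbidden. The commented-out second half is only a sketch of the naive inversion, included to show why simply flipping the conditions is not enough:

```apache
# Fires when the user agent contains "Googlebot" OR "msnbot";
# [F] then returns 403 Forbidden to exactly those crawlers.
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot
RewriteRule .* - [F,L]

# Naive inversion (left commented out): negate each pattern with "!"
# and drop [OR] so the conditions are ANDed. This forbids every user
# agent that is neither Googlebot nor msnbot, ordinary browsers included.
# RewriteCond %{HTTP_USER_AGENT} !Googlebot
# RewriteCond %{HTTP_USER_AGENT} !msnbot
# RewriteRule .* - [F,L]
```

So the inverted form also needs some way to tell bots apart from regular visitors before it blocks anything.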


Thanks for the reply. Yes, I thought about that before, but while searching on Google I saw some people saying to put ^ before the bot name (e.g. ^Googlebot) in order to block a bot, so...

So how do I let those bots through and block every other one?

Edited by spike-spiegel
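A note on the ^ prefix mentioned above: in a RewriteCond pattern, ^ anchors the match to the start of the string and does not negate anything. Negation is written with a leading !. Here is a short sketch of the difference, using the Googlebot name from the question. The two rule sets are alternatives for illustration and are not meant to be used together:

```apache
# Alternative A: "^" is an anchor.
# This forbids only user agents that START with "Googlebot".
# Googlebot commonly identifies itself as
#   Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
# which starts with "Mozilla", so this pattern would not match it.
RewriteCond %{HTTP_USER_AGENT} ^Googlebot
RewriteRule ^ - [F]

# Alternative B: "!" negates the pattern.
# This forbids every user agent that does NOT contain "Googlebot".
# RewriteCond %{HTTP_USER_AGENT} !Googlebot
# RewriteRule ^ - [F]
```

Either way, a pattern that matches combined with [F] blocks the request. Neither prefix turns the rule into an "allow".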

Well, there's a fairly simple question you have to ask yourself: how do you know whether the visitor is a bot? Because the user-agent string contains the word "bot"? Use that test to restrict your block to bots, while still letting through the bots you do want. A structure like this:

# If the visitor is not a bot, skip the next rule.
# ("bot" is the test suggested above; adjust it to your own definition.)
RewriteCond %{HTTP_USER_AGENT} !bot [NC]
RewriteRule ^ - [S=1]

# The visitor is a bot: forbid it unless it is one of the good bots.
# Negated conditions without [OR] are ANDed.
RewriteCond %{HTTP_USER_AGENT} !good-bot
RewriteCond %{HTTP_USER_AGENT} !other-good-bot
# etc.
RewriteRule ^ - [F,L]

Edited by requinix
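As a more concrete sketch of that structure, here is one possible version. It assumes a visitor counts as a bot when its user agent contains "bot", "crawl", "spider" or "slurp" (a heuristic picked for illustration, not something specified in this thread), and that the allow-list is Googlebot, msnbot/bingbot and Slurp:

```apache
RewriteEngine On

# Step 1: visitors whose user agent does not look like a crawler skip
# the next rule (S=1) and are served normally.
# The keyword list is an assumption for illustration only.
RewriteCond %{HTTP_USER_AGENT} !(bot|crawl|spider|slurp) [NC]
RewriteRule ^ - [S=1]

# Step 2: only crawler-like user agents reach this rule. Forbid the
# request unless the user agent matches one of the allowed names.
# Conditions without [OR] are ANDed, so all three must hold to block.
RewriteCond %{HTTP_USER_AGENT} !Googlebot [NC]
RewriteCond %{HTTP_USER_AGENT} !(msnbot|bingbot) [NC]
RewriteCond %{HTTP_USER_AGENT} !Slurp [NC]
RewriteRule ^ - [F]
```

Any crawler whose user agent contains none of the step 1 keywords passes through untouched, so the keyword list decides how strict the block really is.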
