session_start() ruining search engine indexing


jakebur01

Hey guys, I need some help. Google is trying to index my pages with session IDs attached to them.

 

Ex. =  show_cat.php?catid=P70&PHPSESSID=aeg3klbcaeoirjerfodfifs

 

I don't know if this is right, but I put these two lines in my robots.txt and it still didn't work.

Disallow: /*?PHPSESSID=
Disallow: /*?*PHPSESSID=

 

I am using session_start(); at the beginning of all my pages. What do I need to do? I will be toast if I don't change something.

 

Jake

[quote=Web Server Administrator's Guide to the Robots Exclusion Protocol]Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".[/quote]

 

http://www.robotstxt.org/wc/exclusion-admin.html

 

 

There are a slew of suggestions in the comments section of the article linked above. These two ini_set() calls look like the best solution.

 

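// Call both before session_start().
ini_set('session.use_trans_sid', '0'); // stop PHP from appending PHPSESSID to URLs

ini_set('url_rewriter.tags', ''); // leave no HTML tags for the URL rewriter to touch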

 

You can also use get_browser() (it needs the browscap setting configured in php.ini) to match "Google" or "MSIECrawler" in the user agent string (you can probably find others), and apply the two settings above only when the visitor is a robot.
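
If you don't have browscap set up, a plain substring check on the user agent does roughly the same job. This is only a rough sketch: looks_like_robot() and start_session_for() are names I made up, the marker list holds just the two strings mentioned above, and how the two ini settings behave can depend on your PHP version, so check the manual for yours.

```php
<?php
// Substrings taken from the post above; illustrative, not an exhaustive bot list.
const ROBOT_MARKERS = ['Google', 'MSIECrawler'];

/**
 * Return true if the user agent contains one of the markers (case-insensitive).
 * Hypothetical helper, not part of PHP.
 */
function looks_like_robot(string $userAgent): bool
{
    foreach (ROBOT_MARKERS as $marker) {
        if (stripos($userAgent, $marker) !== false) {
            return true;
        }
    }
    return false;
}

/**
 * Start the session, switching off URL-based session IDs for robots first.
 * Hypothetical helper, not part of PHP.
 */
function start_session_for(string $userAgent): void
{
    if (looks_like_robot($userAgent)) {
        // Both settings have to be changed before session_start() runs.
        ini_set('session.use_trans_sid', '0');
        ini_set('url_rewriter.tags', '');
    }
    session_start();
}

// Typical use at the top of each page, in place of a bare session_start():
// start_session_for($_SERVER['HTTP_USER_AGENT'] ?? '');
```

Keep in mind the user agent header can be faked or missing, so treat this as a best-effort filter rather than a guarantee.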

 

Bots:

http://www.i-asap.net/crawlersdb.php

http://www.botsvsbrowsers.com/
