GoogleBots, MSNBots, Bots in general.


5 replies to this topic

#1 Hardbyte

Hardbyte
  • Members
  • Advanced Member
  • 30 posts
  • LocationMidlands, UK

Posted 06 July 2006 - 08:57 AM

Hi, and sorry for possibly posting in the wrong section.

Just wondering, do these spider bots still follow links in the page code even if I have a script that hides all links (or the whole page) when the visitor is a bot? What I'm trying to ask is: do they read through the code like a text file, or do they work as if they were visiting with a normal browser?

Any help would be appreciated.

Thanks.


#2 Orio

Orio
  • Staff Alumni
  • Advanced Member
  • 2,491 posts

Posted 06 July 2006 - 09:00 AM

Just like a normal browser... They can't see what you write between the <?php and ?> tags, just the output.
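
To illustrate the point, here is a minimal sketch in Python (the same idea applies to a PHP script). It assumes a hypothetical `render_page` function standing in for the server-side script, and a tiny link collector standing in for a crawler. The markup and the `/report?id=1` URL are made up for the example. The "crawler" only ever sees the HTML that was sent, so a link the script leaves out is never found.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, roughly what a crawler does."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def render_page(is_bot: bool) -> str:
    """Hypothetical stand-in for the server-side script's output."""
    html = "<p>A comment</p>"
    if not is_bot:
        # The link is only written to the output for non-bot visitors.
        html += '<a href="/report?id=1">Report a comment</a>'
    return html


def links_seen(html: str) -> list:
    """Return the links a client could discover in the given output."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

With this sketch, `links_seen(render_page(True))` finds no links at all, while `links_seen(render_page(False))` finds the report link.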

Orio.

#3 Hardbyte

Hardbyte
  • Members
  • Advanced Member
  • 30 posts
  • LocationMidlands, UK

Posted 06 July 2006 - 09:57 AM

Hi and thanks for your reply.

On another note: if I were to allow Mozilla as an agent, then the bots could still get through. For example:

MOZILLA/5.0 (COMPATIBLE; YAHOO! SLURP; HTTP://HELP.YAHOO.COM/HELP/US/YSEARCH/SLURP) has visited ....
MOZILLA/5.0 (COMPATIBLE; GOOGLEBOT/2.1; +HTTP://WWW.GOOGLE.COM/BOT.HTML) has visited ....

So should I just search the user agent for "Yahoo" and "Google" and block those, for example?
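
A minimal sketch of that kind of check, written in Python for illustration (the logic ports directly to PHP). The marker list is an assumption: `googlebot` and `slurp` come from the log lines above, `msnbot` from the thread title, and the Firefox string is only an illustrative browser agent. Matching on the crawler's own name rather than on "Mozilla" matters, because both bots and browsers start with that token.

```python
# Hypothetical list of substrings that identify crawlers; extend as needed.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")


def is_known_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains any known crawler marker."""
    ua = user_agent.lower()  # logs may be upper-case, so compare case-insensitively
    return any(marker in ua for marker in BOT_MARKERS)


# The two agents quoted in the post, plus an illustrative browser agent.
yahoo = "MOZILLA/5.0 (COMPATIBLE; YAHOO! SLURP; HTTP://HELP.YAHOO.COM/HELP/US/YSEARCH/SLURP)"
google = "MOZILLA/5.0 (COMPATIBLE; GOOGLEBOT/2.1; +HTTP://WWW.GOOGLE.COM/BOT.HTML)"
firefox = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4"
```

Here `is_known_bot(yahoo)` and `is_known_bot(google)` are true, while `is_known_bot(firefox)` is false even though all three strings begin with "Mozilla".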

Thanks

#4 SharkBait

SharkBait
  • Members
  • Advanced Member
  • 845 posts
  • LocationMetro Vancouver, BC

Posted 06 July 2006 - 03:01 PM

If you're looking to block crawlers, take a look at this: http://webtools.live...m/se_robots.php

You can place a robots.txt text file in the root directory of your site. If it contains rules, well-behaved bots are designed to read and follow them before crawling your site.
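
As a rough illustration, the text file in question is conventionally named robots.txt. The sketch below uses Python's standard `urllib.robotparser` to show how a compliant bot would interpret such a file. The `/report/` path and the file contents are assumptions made up for the example, not rules taken from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /report/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /report/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a rule-following crawler would conclude before fetching each URL.
report_allowed = parser.can_fetch("Googlebot", "/report/comment")
index_allowed = parser.can_fetch("Googlebot", "/index.php")
```

Under these rules `report_allowed` is false and `index_allowed` is true. Nothing enforces this on the server side, though: the file only tells a bot what it is asked not to fetch.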

#5 .josh

.josh
  • Staff Alumni
  • .josh
  • 14,871 posts

Posted 06 July 2006 - 03:09 PM

Although I personally wouldn't rely 100% on that ^. Those are rules that bots are "supposed" to follow, but that doesn't mean whoever programs a bot has to make it obey them. I would implement that, but keep to your original plan of checking for bots in your script.

#6 Hardbyte

Hardbyte
  • Members
  • Advanced Member
  • 30 posts
  • LocationMidlands, UK

Posted 06 July 2006 - 03:24 PM

Thank you for the replies.

I agree with Crayon, as I've already got meta tags (nofollow etc.) and a robots.txt on the site, and the bots are not following these "rules".

I don't mind them visiting the site, but they are triggering my "Report a comment" link and I'm receiving loads of emails.

My script seems to be working: it blocks the bots I have in an array, and when an agent that is not in the array visits, it emails me to tell me what the agent is (obviously while allowing IE, Firefox, and other common browsers). At the moment I'm just having to find out which ones are the bad bots.
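
For anyone reading later, the approach described above can be sketched roughly as follows, in Python for illustration. This is not the actual script: the marker lists and the `Verdict` names are assumptions, and the email step is left to the caller. Bots are checked before browsers because crawler agents also contain "Mozilla" and sometimes "compatible".

```python
from enum import Enum


class Verdict(Enum):
    BLOCK = "block"    # agent is in the known-bot array
    ALLOW = "allow"    # agent looks like a common browser
    REPORT = "report"  # unknown agent: notify the site owner


# Hypothetical lists; the real arrays would grow as new agents turn up.
BLOCKED_BOTS = ("googlebot", "slurp", "msnbot")
BROWSER_MARKERS = ("msie", "firefox")


def classify(user_agent: str) -> Verdict:
    """Sort a User-Agent string into block, allow, or report."""
    ua = user_agent.lower()
    if any(bot in ua for bot in BLOCKED_BOTS):
        return Verdict.BLOCK
    if any(browser in ua for browser in BROWSER_MARKERS):
        return Verdict.ALLOW
    # Neither a listed bot nor a recognised browser: flag it for review.
    return Verdict.REPORT
```

A caller would send the notification email only when `classify` returns `Verdict.REPORT`, then move that agent into one of the two lists once it has been identified.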

Thanks again.



