Jump to content


Photo

htaccess script


  • Please log in to reply
4 replies to this topic

#1 Ninjakreborn

Ninjakreborn
  • Members
  • PipPipPip
  • Information Technology Specialist
  • 3,922 posts
  • Age:33

Posted 02 October 2006 - 05:53 PM

When I first learnt htaccess I found this.  Something I ran across, I put it on my website for awhile for testing purposes.  Is this helpful in any way, it's suppose to catch a lot of virus's and stuff, and keep them from visiting your site, or stripping content, I don't knwo the positives or negatives of using it though.  If it is good, does anyone have any good advice on what I can add to it.

# This is something I am testing for a little while, it's suppose to totally
# Prevent quite a number of bad bot's from entering the site to do various "bad"
# things.  This is again just a test and can be removed if it causes problems, or 
# have more added to it later, if there are new bot names discovered.

RewriteEngine On 
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR] 
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR] 
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR] 
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR] 
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR] 
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR] 
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR] 
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR] 
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR] 
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR] 
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR] 
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Zeus 
RewriteRule ^.* - [F,L]
# End test scripts

------

Business Website: http://www.infotechnologist.biz

Personal Website: http://www.joyelpuryear.com

Blog Site: http://www.realmofwriting.com
Services: Web development, application development, mobile development, and custom development. All services listed on my website.


#2 redbullmarky

redbullmarky
  • Staff Alumni
  • Advanced Member
  • 2,863 posts
  • LocationBedfordshire, England

Posted 02 October 2006 - 06:31 PM

in effect, its a replacement for a 'robots.txt' file. the problem with robots.txt files is they offer a 'guide' to bots, rather than any strict rules. So when a bot's purpose is to scan your site for email addresses, contact forms and suchlike, all for the purpose of telling you about how cheap they can sell viagra, rules arent really that important - so enter the htaccess file and Apache's mod_rewrite, which makes sure these rules are properly enforced.
The biggest headache with this method is that new bots are on the go all the time, and keeping track of their names (via HTTP_USER_AGENT) can be an overwhelming job. I personally have a script that is responsible for handling incoming request to my site - which along with dealing with quotes, etc, deals with filtering specific things like this.
"you have to keep pissing in the wind to learn how to keep your shoes dry..."

I say old chap, that is rather amusing!

#3 Ninjakreborn

Ninjakreborn
  • Members
  • PipPipPip
  • Information Technology Specialist
  • 3,922 posts
  • Age:33

Posted 02 October 2006 - 06:34 PM

Is it a third party script or one you ended up programming from scratch.  Where would I start in programming a custom one. 

Oh and another question, is there a way to wire up a .htaccess as a .php file, but sending the header as htaccess, just an idea, like conditional htaccess statements.

------

Business Website: http://www.infotechnologist.biz

Personal Website: http://www.joyelpuryear.com

Blog Site: http://www.realmofwriting.com
Services: Web development, application development, mobile development, and custom development. All services listed on my website.


#4 redbullmarky

redbullmarky
  • Staff Alumni
  • Advanced Member
  • 2,863 posts
  • LocationBedfordshire, England

Posted 02 October 2006 - 07:04 PM

one i did myself. recently i've been guided towards more 'MVC' design - so having one entry point into my site. in this single 'controller' script, i have a part that filters all user input - GPC - as well as access from 'bots' that notoriously end up with me getting tonnes of junk mail. They're not too hard to write and generally just means that the bot will get a blank page. its just a simple one to check the USER_AGENT and/or IP and check for its presence in my database. call it a ban list if you like, just like many forum systems use.

htaccess files are dealt with by apache and before anything else, so in effect have little to do with PHP itself. I can't think of any practical reasons why you'd want to do this. generally, htaccess files for me are used to tweak PHP settings (especially error reporting) without messing around with my global PHP.INI file, and to provide mod_rewrite to tidy up URLs and their parameters.

Cheers
Mark
"you have to keep pissing in the wind to learn how to keep your shoes dry..."

I say old chap, that is rather amusing!

#5 Ninjakreborn

Ninjakreborn
  • Members
  • PipPipPip
  • Information Technology Specialist
  • 3,922 posts
  • Age:33

Posted 02 October 2006 - 07:15 PM

Hmm, I will keep that in mind, thanks for the advice it was appreciated.

------

Business Website: http://www.infotechnologist.biz

Personal Website: http://www.joyelpuryear.com

Blog Site: http://www.realmofwriting.com
Services: Web development, application development, mobile development, and custom development. All services listed on my website.





0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users