php_tom Posted July 12, 2007

Hey, everyone. I've been working on Aquarium, a PHP bad-words filter. Check out the site at http://aquarium-filter.sourceforge.net. On that page you can try out the filter, or download the source and check that out...

I realize there are other text filters out there. My reasons for making this one were:

1. I can never find a bad-words file to use, so I finally made one. It's included in the source (encrypted so no one can read it easily). Maybe others will find this useful.
2. I was trying to make something that filters words similar to bad words, not just the bad words themselves. So, for example, if 'badword' were in the bad-words list, it would be nice to have 'b@dword', 'bdword', and 'badwrd' filtered too.

I'd love to hear some feedback from you all on the engine. I realize that it does not filter words with whitespace in them, e.g. 'ba dw ord'; that's something I'm working on...

Try out the demo on the SourceForge page and post your comments here. If you find a bad word that doesn't get filtered, I'd like to know... maybe you can run PHP's base64_encode() on it (so the forum admins don't get angry) and post it here. Thanks!

Link to comment https://forums.phpfreaks.com/topic/59726-php-bad-words-filter/
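One way to catch near-misses such as 'b@dword', 'bdword', and 'badwrd' is to compare each input word against the list by edit distance. The following is only a minimal sketch under assumptions, not Aquarium's actual code: the function name `looksLikeBadWord` and the `$maxDistance` parameter are hypothetical, and the type declarations assume PHP 7 or later. It relies on PHP's built-in `levenshtein()`.

```php
<?php
// Hypothetical helper (not taken from Aquarium): returns true when $word is
// within $maxDistance single-character edits of any entry in $badWords.
function looksLikeBadWord(string $word, array $badWords, int $maxDistance = 1): bool
{
    $word = strtolower($word);
    foreach ($badWords as $bad) {
        // levenshtein() counts insertions, deletions, and substitutions.
        if (levenshtein($word, strtolower($bad)) <= $maxDistance) {
            return true;
        }
    }
    return false;
}

// Usage with the placeholder word from the post above.
$list = ['badword'];
var_dump(looksLikeBadWord('b@dword', $list));  // one substitution
var_dump(looksLikeBadWord('bdword', $list));   // one deletion
var_dump(looksLikeBadWord('goodword', $list)); // too far from 'badword'
```

Note that this scans the whole list for every word, so it gets slower as the dictionary grows, and a larger `$maxDistance` makes false positives on innocent words more likely.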
agentsteal Posted July 12, 2007

Cross Site Scripting: There is Cross Site Scripting if you submit code.
php_tom (Author) Posted July 13, 2007

heehee... my bad. I fixed that (at least) now.
BillyBoB Posted July 13, 2007

Aquarium filtered 7 words in 0.03 seconds and found 4 bad words. Stats: 212 words per second, 57.1% bad words.
Filtered Text: ******* Ass Nigger Sucks Dick Fucking

Various words didn't get filtered; only one did.
php_tom (Author) Posted July 13, 2007

There was a small bug where capitalized words got through. It should be fixed now... Two of the words you posted are still not filtered because they aren't in the library... One will certainly be added; the other I'm not sure about, since I could see innocent uses of the same word. Maybe I'll add a 'filter strength' setting... Anyway, thanks.
source Posted July 14, 2007

"Aquarium filtered 0 words in 0.01 seconds and found 0 bad words. Stats: 0 words per second, Warning: Division by zero in /home/groups/a/aq/aquarium-filter/htdocs/process.php on line 18 0% bad words. Filtered Text:"

when I enter <"
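The warning shows up right before the "0% bad words" figure, which suggests the percentage is computed by dividing by the word count, and that count is zero for input such as <". The code of process.php is not shown in this thread, so the following is only a sketch of a guard under that assumption; the function name `badWordPercent` is hypothetical.

```php
<?php
// Hypothetical helper (process.php itself is not shown in the thread):
// percentage of bad words, rounded to one decimal place.
function badWordPercent(int $badCount, int $wordCount): float
{
    // No words means nothing to divide by; report 0% instead of warning.
    if ($wordCount === 0) {
        return 0.0;
    }
    return round(100 * $badCount / $wordCount, 1);
}

echo badWordPercent(0, 0), "% bad words\n";
echo badWordPercent(4, 7), "% bad words\n";
```

The same kind of check would apply to the words-per-second figure if the elapsed time can ever be measured as zero.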
php_tom (Author) Posted July 14, 2007

Great, thanks. Dumb bug... it's fixed now!
Gath Posted July 14, 2007

Filtered Text: **** fucktard *****

But then I inserted it like this:

**** ****tard ***** fuck *****

And the previous word got filtered, but not the "simple" one.

Filtered Text: ****g fuck
Filtered Text: ****g **** **** fuck *****

I used the F word in all of them; the last one had an "i" at the end, just like the first had a "g".
LiamProductions Posted July 15, 2007

Pretty neat program. That would be good for forum use or guestbook use.
fast4god Posted August 1, 2007

Hey, Tom

Just a quick note to let you know I like this program. I adapted it and used it in my new wiki engine: http://www.fast.st/zapwiki/welcome/index.php?p=solution.1006

It seems to work great, though as noted a couple of posts up, sometimes the results are a little erratic. Let me know if you make any improvements!

Cheers, Dan
MikeDXUNL Posted August 2, 2007

fuckass and assfuck are useable
Technex Posted August 5, 2007

Aquarium filtered 1 words in 0.01 seconds and found 0 bad words. Stats: 71 words per second, 0% bad words.
Filtered Text: lolshit

:S
john010117 Posted August 6, 2007

Eh, it doesn't filter these:

fuck3d
sh17
@ss
@ssh013
b17ch

But then, I don't know if it's supposed to filter "1337" words.
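A common way to handle "1337" spellings is to map look-alike characters back to letters before the dictionary lookup. This is only a sketch and not how Aquarium works: the function name `normalizeLeet` and the substitution table are assumptions, and the table is deliberately small. It uses PHP's `strtr()` with an array of replacement pairs.

```php
<?php
// Hypothetical helper: lowercases $word and replaces common look-alike
// characters with the letters they usually stand for.
function normalizeLeet(string $word): string
{
    // Assumed substitution table; real-world usage is ambiguous
    // (for instance '1' can stand for either 'i' or 'l').
    $map = [
        '@' => 'a',
        '4' => 'a',
        '3' => 'e',
        '1' => 'i',
        '0' => 'o',
        '5' => 's',
        '$' => 's',
        '7' => 't',
    ];
    return strtr(strtolower($word), $map);
}

echo normalizeLeet('b@dw0rd'), "\n";
echo normalizeLeet('H3LL0'), "\n";
```

Because one character can stand for several letters, a single fixed table will miss some spellings. Combining the normalized word with an edit-distance check is one possible way to catch the rest.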
kaliza Posted August 6, 2007

Uh, mm, how come you encrypted the bad-words file instead of keeping it open for adjustments? One option could be a request to add a bad word, which gets submitted for your approval; when it gets approved, it gets added to the list, so the more you use it the more it filters.
php_tom (Author) Posted August 6, 2007

Hey, thanks for all the suggestions, guys. I'm working on a new version which can handle 1337, and things like "ba dw or d" or "word1word2". It will also look at the context a word is in, e.g. the words in the sentence "he's a badwording badword, I hate the badword!" would get filtered, but the sentence "Jesus rode into Jerusalem on an ass" would not (because of word strength and frequency).

About the bad-words file: I keep it encrypted because I don't want someone to find a list of filthy language on my server in plain text. The 'encryption' (in case you haven't figured it out from the code) is simply base64_encode(base64_encode(theEntireFileAsAString));

I'd like to make the filter smarter rather than the dictionary larger, because even though I'm using a hashtable-type lookup with the dictionary, more words in the dictionary will still slow down the algorithm...

Please keep the suggestions coming; having input from the kind of people who might use this code is useful. Thanks!
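The double base64 step described above, together with a hashtable-style lookup, could look roughly like the sketch below. The function names `encodeWordList` and `decodeWordList` are hypothetical, and the one-word-per-line file layout is an assumption, since the thread does not say how the file is structured. Keep in mind that base64 is an encoding rather than encryption: anyone can reverse it, so it only hides the list from a casual glance.

```php
<?php
// Hypothetical helpers; the newline-separated layout is an assumption.
function encodeWordList(array $words): string
{
    // Same double encoding as described in the post above.
    return base64_encode(base64_encode(implode("\n", $words)));
}

function decodeWordList(string $encoded): array
{
    // Strict mode makes base64_decode() return false on invalid input.
    $inner = base64_decode($encoded, true);
    $plain = ($inner === false) ? false : base64_decode($inner, true);
    if ($plain === false) {
        return [];
    }
    // array_flip() turns the words into array keys, so membership can be
    // tested with isset() instead of scanning the whole list.
    return array_flip(explode("\n", $plain));
}

$lookup = decodeWordList(encodeWordList(['badword', 'worseword']));
var_dump(isset($lookup['badword']));
var_dump(isset($lookup['fineword']));
```

With the words stored as array keys, an exact-match lookup costs about the same regardless of dictionary size; it is fuzzy matching against every entry that gets slower as the list grows.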
teng84 Posted August 10, 2007

Like I said in phphelp, you can filter bad words that easy; just think like you're the user. You can do something like the following:

Shi_@
F_U_C_ then you know
A*S* then you know
H*O*L* then you know

or something like a combination of numbers or letters.
curtis_b Posted August 13, 2007

donkeycocklovingslutmouth doesn't filter.
BillyBoB Posted August 13, 2007

Aquarium filtered 9 words in 0.1 seconds and found 4 bad words. Stats: 87 words per second, 44.4% bad words.
Filtered Text: Nigger **** Wetback **** **** ****** Honky Spic Chink
mattd8752 Posted August 14, 2007

<!--abc-->

That just returns blank; it's not even found in the source. Perhaps you can get rid of the tag filtering and just replace < and > with their HTML entities, &lt; and &gt;.
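Escaping instead of stripping tags can be done with PHP's built-in `htmlspecialchars()`. The sketch below shows how the submitted text might be prepared for display; the wrapper name `escapeForDisplay` is hypothetical and the UTF-8 charset is an assumption about the page. It would also address the cross-site scripting issue reported earlier in the thread, since submitted markup is shown as text rather than interpreted by the browser.

```php
<?php
// Hypothetical wrapper: converts <, >, &, and both quote characters to
// HTML entities so user input is displayed literally.
function escapeForDisplay(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo escapeForDisplay('<!--abc-->'), "\n";
echo escapeForDisplay('<"'), "\n";
```

Escaping is best applied at output time, after filtering, so the filter itself works on the original words.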