cobusbo Posted January 12, 2015

Hi, I'm trying to get rid of deprecated functions such as eregi_replace, but I'm currently experiencing a few problems. I have the following function:

    function filterBadWords($str)
    {
        $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error());
        $replacements = "#~";
        while($row = mysql_fetch_assoc($result1)) {
            $str = eregi_replace($row['word'], str_repeat('#~', strlen($row['word'])), $str);
        }
        return $str;
    }

and tried changing it into:

    function filterBadWords($str)
    {
        $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error());
        $replacements = "(G)";
        while($row = mysql_fetch_assoc($result1)) {
            $str = preg_replace($row['word'], str_repeat('(G)', strlen($row['word'])), $str);
        }
        return $str;
    }

but now I'm getting errors like:

    preg_replace(): Delimiter must not be alphanumeric or backslash

Any help with this?
Jacques1 Posted January 12, 2015

PCRE regexes need delimiters, like so: '/.../' (note the slashes).

Besides that, the whole approach looks rather problematic. If you do a substring search, what happens with “Charles Dickens”? Is this name politically incorrect, and will it be censored? Of course censorship itself is crap, but that's a different discussion ...
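
A minimal sketch of the delimiter fix described above, not the poster's final code. It assumes a hypothetical banned word `darn` and sample text in place of the database rows, and it adds `preg_quote()` so that any regex metacharacters in a stored word are treated literally. The `i` modifier stands in for the case-insensitivity that `eregi_replace` provided. Note that this is still a substring match, so the false-positive concern raised in the post applies.

```php
<?php
// Hypothetical inputs; in the thread these come from the StringyChat_WordBan table.
$word = 'darn';
$text = 'Well, darn it.';

// Wrap the word in '/' delimiters. preg_quote() escapes regex metacharacters,
// and its second argument also escapes the delimiter itself.
$pattern = '/' . preg_quote($word, '/') . '/i';

// Replace each match with one '#~' per character of the banned word.
$masked = preg_replace($pattern, str_repeat('#~', strlen($word)), $text);

echo $masked, "\n"; // Well, #~#~#~#~ it.
```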
cobusbo Posted January 12, 2015 (Author)

So how would you approach this scenario where I want to use it as a profanity filter?
CroNiX Posted January 12, 2015 (edited)

I've always found those to be next to useless. You can always alter how you spell things and still get the intention of the word across. Are you going to be able to filter all possibilities, including misspellings and phonetic spellings? No. I don't know which would be less offensive: to be called a dick, dik, diq, etc. All have the same intention.
cobusbo Posted January 12, 2015 (Author)

It's true, but it will help with the basic words. I'm going to add something like a timeout ban if such a word is detected.
Jacques1 Posted January 12, 2015

A ban? You mean a kid who likes Charles Dickens would be thrown out of your chat? C'mon, you can't be serious.

Even the best filter has false positives, meaning a legitimate user making legitimate statements will be erroneously flagged. If you automatically ban them, sorry, then your application sucks.
cobusbo Posted January 12, 2015 (Author)

Well, the filter will only apply to messages, not names, so a username like Charles Dickens would be accepted, but not a mention of it in a message. My hosting rules recommend that I use some kind of method to keep chats clean. Can someone just point me in the right direction?
Jacques1 Posted January 12, 2015 (edited)

I'm not talking about usernames, I'm talking about messages. Substrings like “dick” or “ass” or whatever can occur in all kinds of legitimate texts which have absolutely nothing to do with profanity.

So you need this to make your host happy? Then I suggest you get a proper host which doesn't have such stupid rules. You want this chat to be a success, right? Then you can't just ban innocent users who haven't done anything. Nobody will accept this, especially when there are tons of other chats to choose from.

If, for some strange reason, you're stuck with your current host, then do the bare minimum to formally comply with the rules. Ask them what exactly they expect from you. This certainly does not involve banning users or preventing chats about Charles Dickens.
Barand Posted January 12, 2015

We used to have a PC filter where I worked, so messages containing words like "tart" or "Scunthorpe" (a town in the UK) were flagged for moderation. The recipient was notified that they had received a message with banned content and, after reviewing it, an administrator might or might not release the message.
CroNiX Posted January 12, 2015 (Solution)

You do realize this has been done many times before you attempted it? Have you tried googling for "php swear filter" or anything? There are lots of working libraries out there, and they are probably a lot better than what you can build from scratch, because the problem is a bit more complex than I think you are considering.

You don't want the filter to work on substrings, for the reason that Jacques1 stated. "I assume you're talking about me" is a legitimate, clean piece of text. The "ass" in "assume" should not be filtered. So you probably want it to work on individual words. That gets a bit complex for the regex, because "you're an ass." (with a period after "ass") is harder to match than "I think you're an ass dude.", where "ass" is by itself. I'm sure there are regex features that will do all of that, but trying to think of all the main scenarios and then coding for them will be a bit of a challenge unless you are really good at regex.

I'd suggest looking for something off-the-shelf and ready to go, all tried and tested.
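
One way to approach the whole-word matching described above is the PCRE word-boundary assertion `\b`. The following is only a sketch under assumptions: `maskWord()` is a hypothetical helper, the `*` mask is chosen for illustration, and `\b` only behaves as intended for banned words that start and end with a letter, digit, or underscore. It does nothing about deliberate misspellings.

```php
<?php
// Hypothetical helper: masks $word in $text only where it stands alone as a word.
function maskWord(string $text, string $word): string
{
    // \b asserts a boundary between a word character and a non-word character,
    // so punctuation or whitespace after the word still counts as a boundary.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';

    return preg_replace($pattern, str_repeat('*', strlen($word)), $text);
}

// "ass" inside "assume" is followed by a word character, so it is left alone.
echo maskWord("I assume you're talking about me", 'ass'), "\n";

// The period after "ass" is a non-word character, so the word is masked.
echo maskWord("you're an ass.", 'ass'), "\n";
```

Both of the example sentences from the post are handled by the same pattern, which suggests the trailing-period case may be less of an obstacle than feared.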
cobusbo Posted January 12, 2015 (Author)

Thank you, I totally see your point. I will search for a good one, but in the meantime I found a solution for the problem:

    function filterBadWords($str)
    {
        $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error());
        $replacements = "#~";
        while($row = mysql_fetch_assoc($result1)) {
            $str = preg_replace('/\b' . $row['word'].'\b/ie', str_repeat('#~', strlen($row['word'])), $str);
        }
        return $str;
    }
CroNiX Posted January 12, 2015

The /e modifier has been deprecated as of PHP 5.5: http://php.net/manual/en/migration55.deprecated.php
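
Since the replacement in the function above is a plain string rather than PHP code, the `e` modifier can simply be dropped. The sketch below shows one possible version without it. It is an illustration, not the poster's final code: the banned words are passed in as a hypothetical `$bannedWords` array instead of being read with the `mysql_*` functions (which were themselves deprecated in PHP 5.5 and removed in PHP 7.0), and `preg_quote()` is added as a precaution.

```php
<?php
// Sketch of filterBadWords() without the /e modifier.
// $bannedWords is an assumed parameter standing in for the rows of StringyChat_WordBan.
function filterBadWords(string $str, array $bannedWords): string
{
    foreach ($bannedWords as $word) {
        // Whole-word, case-insensitive match; no /e needed for a literal replacement.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $str = preg_replace($pattern, str_repeat('#~', strlen($word)), $str);
    }

    return $str;
}

// Example with a hypothetical word list.
echo filterBadWords('What a darn mess, Darn it. Darning socks is fine.', ['darn']), "\n";
```

If the replacement ever needs to be computed per match, `preg_replace_callback()` is the documented substitute for `/e`.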