
eregi_replace to preg_replace error


cobusbo
Solved by CroNiX


Hi, I'm trying to get rid of deprecated functions like eregi_replace, but I'm currently experiencing a few problems.

 

I have the following function

function filterBadWords($str)
{
	
	
    $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error()); 
    $replacements = "#~";
    
    while($row = mysql_fetch_assoc($result1))
    {
          $str = eregi_replace($row['word'], str_repeat('#~', strlen($row['word'])), $str);
    }  
    
    return $str;
}

and tried changing it into

function filterBadWords($str)
{
	
	
    $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error()); 
    $replacements = "(G)";
    
    while($row = mysql_fetch_assoc($result1))
    {
          $str = preg_replace($row['word'], str_repeat('(G)', strlen($row['word'])), $str);
    }  
    
    return $str;
}

but now I'm getting errors like

 

preg_replace(): Delimiter must not be alphanumeric or backslash

 Any help regarding this?


PCRE regexes need delimiters. Like so: '/.../' (note the slashes).
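To make the delimiter point concrete, here is a minimal sketch rather than a drop-in fix. It assumes PHP 7 or later, and the function name `maskSubstrings` and the array parameter are hypothetical stand-ins for the database loop in the question. The pattern is wrapped in `/` delimiters, `preg_quote()` escapes any regex metacharacters in the word, and the `i` modifier restores the case-insensitivity that eregi_replace had.

```php
<?php
// Hypothetical stand-alone version: the banned words come from an array
// instead of the StringyChat_WordBan table.
function maskSubstrings(string $str, array $words): string
{
    foreach ($words as $word) {
        // '/' is the delimiter; preg_quote() escapes it and any other
        // regex metacharacters that might appear inside the word.
        $pattern = '/' . preg_quote($word, '/') . '/i';
        $str = preg_replace($pattern, str_repeat('#~', strlen($word)), $str);
    }
    return $str;
}

echo maskSubstrings('What a DAMN mess', ['damn']), "\n"; // What a #~#~#~#~ mess
```

This still matches substrings anywhere in the text, exactly like the original eregi_replace call did.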

 

Besides that, the whole approach looks rather problematic. If you do a substring search, what happens with “Charles Dickens”? Is this name politically incorrect and will be censored?

 

Of course censorship itself is crap, but that's a different discussion ...

So how would you approach this scenario where I want to use it as a profanity filter?


I've always found those to be next to useless. You can always alter how you spell things and still get the intention of the word to come across. Are you going to be able to filter all possibilities, including misspellings and phonetics? No. I don't know what would be less offensive: to be called a dick, dik, diq, etc. All have the same intention.

Edited by CroNiX

That's true, but it will help with the basic words. I'm going to add something like a timeout ban if such a word is detected.


A ban? You mean a kid who likes Charles Dickens would be thrown out of your chat? C'mon, you can't be serious.

 

Even the best filter has false positives, meaning a legitimate user making legitimate statements will be erroneously flagged. If you automatically ban them, sorry, then your application sucks.


Well, the filter will only apply to messages, not names, so a username like Charles Dickens would be accepted, but a message that mentions it would not. My hosting rules say I must use some kind of method to keep chats clean. Can someone just point me in the right direction?


I'm not talking about usernames, I'm talking about messages. Substrings like “dick” or “ass” or whatever can occur in all kinds of legitimate texts which have absolutely nothing to do with profanity.
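A quick way to see the false positives is to run a plain substring check over an innocent sentence. This is only an illustrative sketch (PHP 7 or later assumed): the word list and the function name `findSubstringHits` are hypothetical, with the words taken from the examples in this thread.

```php
<?php
// Hypothetical word list, taken from the examples in this thread.
$banned = ['dick', 'ass'];

// Returns the banned words found anywhere inside $message,
// including inside longer, harmless words.
function findSubstringHits(string $message, array $banned): array
{
    $hits = [];
    foreach ($banned as $word) {
        // stripos() is a case-insensitive substring search.
        if (stripos($message, $word) !== false) {
            $hits[] = $word;
        }
    }
    return $hits;
}

// Both words are "found", although the sentence is perfectly clean:
// "dick" inside "Dickens" and "ass" inside "assume".
print_r(findSubstringHits('I assume you have read Charles Dickens', $banned));
```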

 

So you need this to make your hoster happy? Then I suggest you get a proper hoster which doesn't have such stupid rules. You want this chat to be a success, right? Then you can't just ban innocent users who haven't done anything. Nobody will accept this, especially when there are tons of other chats to choose from.

 

If, for some strange reason, you're stuck with your current hoster, then do the bare minimum to formally comply with the rules. Ask them what exactly they expect from you. This certainly does not involve banning users or preventing chats about Charles Dickens.

Edited by Jacques1

  • Solution

You do realize this has been done many times before you attempted it? Have you tried googling for "php swear filter" or anything? There are lots of libraries out there that already are working, and probably a lot better than you can do trying from scratch because the problem is a bit more complex than I think you are considering.

 

You don't want the filter to work on substrings for the reason that Jacques1 stated. "I assume you're talking about me" is a legitimate, clean piece of text. "ass" in assume should not be filtered. So you probably want it to work on individual words. That gets a bit complex for the regex because "you're an ass." (with a period after ass) is harder to match than "I think you're an ass dude." where ass is by itself. I'm sure there are regex modifiers that will do all of that, but trying to think of all main scenarios and then coding for it will be a bit of a challenge unless you are really good at regex. I'd suggest looking for something off-the-shelf and ready to go, all tried and tested.
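For the whole-word case, one possible sketch uses the PCRE word-boundary assertion `\b`, which also covers the trailing-period example. The helper name `maskWholeWord` is hypothetical, PHP 7 or later is assumed, and this is not a replacement for a tested library.

```php
<?php
// Hypothetical helper: masks $word only where it stands alone as a word.
function maskWholeWord(string $str, string $word): string
{
    // \b matches between a word character and a non-word character
    // (or the edge of the string), so punctuation right after the
    // word does not prevent a match.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
    return preg_replace($pattern, str_repeat('#~', strlen($word)), $str);
}

echo maskWholeWord("I assume you're talking about me", 'ass'), "\n"; // unchanged
echo maskWholeWord("you're an ass.", 'ass'), "\n";                   // you're an #~#~#~.
echo maskWholeWord("I think you're an ass dude.", 'ass'), "\n";      // I think you're an #~#~#~ dude.
```

Note that `\b` only behaves this way when the banned word itself starts and ends with a word character (letter, digit or underscore), and without the `u` modifier it treats only ASCII characters as word characters. Deliberate misspellings still slip through.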


Thank you, I totally see your point. I will search for a good one, but in the meantime I found a solution for the problem.

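function filterBadWords($str)
{
    $result1 = mysql_query("SELECT word FROM StringyChat_WordBan") or die(mysql_error());
    $replacement = "#~";

    while($row = mysql_fetch_assoc($result1))
    {
          // \b restricts the match to whole words and the i modifier makes it case-insensitive.
          // preg_quote() escapes regex metacharacters (and the / delimiter) in the banned word.
          // The e modifier is left out: it is deprecated as of PHP 5.5 and would evaluate
          // the replacement string as PHP code instead of inserting it.
          $str = preg_replace('/\b' . preg_quote($row['word'], '/') . '\b/i', str_repeat($replacement, strlen($row['word'])), $str);
    }

    return $str;
}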
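As an alternative sketch, the per-word loop can be folded into a single pattern and the mask sized by a callback. Everything here is an assumption for illustration: the name `filterBadWordsOnePass` is hypothetical, the `$words` array stands in for the rows read from StringyChat_WordBan, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical stand-alone variant: $words replaces the database rows.
function filterBadWordsOnePass(string $str, array $words): string
{
    if ($words === []) {
        return $str; // nothing to filter
    }

    // Escape each word, then join them into one alternation: \b(?:w1|w2|...)\b
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/i';

    // The callback receives the matched text in $m[0] and returns
    // one '#~' pair per matched character.
    return preg_replace_callback($pattern, function (array $m) {
        return str_repeat('#~', strlen($m[0]));
    }, $str);
}

echo filterBadWordsOnePass('You ASS, I assume', ['ass', 'dick']), "\n"; // You #~#~#~, I assume
```

With one pattern the text is scanned once rather than once per banned word, which may matter if the word list grows.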
