
simple regex question


dijona


I've got a PHP function that tries to make searches safer by stripping most non-essential characters, but I wanted to keep one character in there: the & sign.

 

originally I had my regex pattern as:

 

$patterns = '/[^a-zA-Z0-9\s]/';

 

but when I searched for Echo & the Bunnymen (for example) it returned search results for Echo.

 

then I thought I'd add the & like so:

 

$patterns = '/[^a-zA-Z0-9&\s]/'; // added the &

 

but that still doesn't do it. so how can I make sure users search for the whole string 'Echo & the Bunnymen'?

 

https://forums.phpfreaks.com/topic/202104-simple-regex-question/

This worked for me: preg_replace('/[^a-zA-Z0-9&\s]/i', '', 'Echo! & t\/h/\e $Bunnymen%*()@ ');

 

The result is Echo & the Bunnymen

 

Perhaps the problem is with the ampersand itself when being passed to the search in your code? I suggest htmlentities() or htmlspecialchars(). It's also possible that the ampersand means bitwise AND or something in whatever context you're using it.

 

 

thanks beta0x64 - I'm still new to this but I was just using echo & the bunnymen as an example - not something to specifically escape. There are lots of bands that use an & in their name so I'm trying for a somewhat more global solution. I'd like to stick with preg_replace rather than htmlentities() or htmlspecialchars() because this function is already being used for most of my safe-search duties and working fine.

I'm not seeing a problem. Using '/[^a-zA-Z0-9&\s]/' as the search parameter seems to work fine for me:

 

echo preg_replace('/[^a-zA-Z0-9&\s]/', '', "--Echo & The Bunnymen--");
//Output: Echo & The Bunnymen
echo preg_replace('/[^a-zA-Z0-9&\s]/', '', "--Hall & Oats--");
//Output: Hall & Oats

 

You should double check the actual input data. Are you positive that the input is EXACTLY the ampersand and not the HTML code for an ampersand, i.e. "&amp;"?
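
A quick way to see the difference is to run both forms of the input through the same pattern. This is only a sketch with made-up sample strings; the point is that the pattern keeps the "&" but strips the ";" of an entity, so "&amp;" comes out as "&amp".

```php
<?php
// Same pattern as in the original post.
$pattern = '/[^a-zA-Z0-9&\s]/';

// Hypothetical inputs: a raw ampersand versus its HTML entity.
$raw     = preg_replace($pattern, '', 'Echo & the Bunnymen');
$encoded = preg_replace($pattern, '', 'Echo &amp; the Bunnymen');

echo $raw, PHP_EOL;     // the raw ampersand passes through unchanged
echo $encoded, PHP_EOL; // the ";" is stripped, leaving "&amp" behind
```

If the second form is what reaches the function, the search term no longer matches the stored name even though the ampersand itself was allowed.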

Well, now the problem becomes a little more interesting.

 

The easy solution would be to simply allow the semi-colon as well. But, then that would allow a semi-colon anywhere in the name.

 

Assuming you do not want to allow the semi-colon and that you want to maintain "&amp;" (instead of converting to just "&"), you could do this:

 

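$patterns = array(
    "/&amp;/",            // 1. HTML entity -> bare ampersand
    "/[^a-z0-9&\s]/i",    // 2. strip everything outside the whitelist
    "/&/"                 // 3. bare ampersand -> HTML entity
);

$replacements = array(
    "&",
    "",
    "&amp;"
);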

echo preg_replace($patterns, $replacements, "--Echo &amp; The Bunnymen--");

 

Basically, it does the following in order:

 

1. Converts any "&amp;" to just "&"

2. Removes any character that is not a-z (upper or lower case), 0-9, & (ampersand), or white space

3. Converts any ampersands ("&") back to "&amp;"
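
The three steps above could be wrapped in a small helper so the rest of the search code only calls one function. This is a sketch, not the poster's actual code: the name clean_search_term() and the sample inputs are made up for illustration.

```php
<?php
// clean_search_term() is a hypothetical helper name, not from the thread.
function clean_search_term($term)
{
    $patterns = array(
        '/&amp;/',          // 1. HTML entity -> bare ampersand
        '/[^a-z0-9&\s]/i',  // 2. strip everything outside the whitelist
        '/&/',              // 3. bare ampersand -> HTML entity
    );
    $replacements = array('&', '', '&amp;');

    // With array arguments, preg_replace applies each pair in order
    // to the result of the previous one.
    return preg_replace($patterns, $replacements, $term);
}

$first  = clean_search_term('--Echo &amp; The Bunnymen--');
$second = clean_search_term('Hall & Oates; extra');

echo $first, PHP_EOL;
echo $second, PHP_EOL;
```

Note that a bare "&" in the input also ends up as "&amp;", so the output is consistent whichever form the user's browser sent.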

Thanks so much for your help, that's just what I needed.

 

Now, if it's not too much trouble.. ;) how would i write the rule for the band 50/50 where obviously the problem is the slash (this time an actual slash, not the html code!!) I really need to write letters to all bands and ask them to use plain ol' vanilla characters. Once I figure that out (how to escape slashes) I think I can follow through to other examples on my own.

A better question is what is in the data that you need to get rid of? Why are you working with dirty data?

 

Anyway, to allow a forward slash, just use "\/". I know it looks funny, but the backslash tells the regex engine to treat the slash as a literal character rather than as the pattern delimiter.

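$patterns = array(
    "/&amp;/",              // 1. HTML entity -> bare ampersand
    "/[^a-z0-9&\/\s]/i",    // 2. whitelist now also allows a forward slash
    "/&/"                   // 3. bare ampersand -> HTML entity
);

$replacements = array(
    "&",
    "",
    "&amp;"
);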
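
If the escaped slash is hard to read, another option is to pick a different delimiter, since PCRE patterns in PHP accept most non-alphanumeric characters as delimiters. The sketch below compares the two spellings of the whitelist step only; the variable names and the sample input are made up for illustration.

```php
<?php
// Two equivalent spellings of the whitelist step.
$withEscapedSlash = '/[^a-z0-9&\/\s]/i'; // "/" delimiter, so the slash is escaped
$withHashDelim    = '#[^a-z0-9&/\s]#i';  // "#" delimiter, so the slash needs no escape

$input = '--50/50--'; // hypothetical search term

$a = preg_replace($withEscapedSlash, '', $input);
$b = preg_replace($withHashDelim, '', $input);

echo $a, PHP_EOL;
echo $b, PHP_EOL;
```

Both patterns keep the slash and drop the dashes, so the choice is purely about readability.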

 

You know there are going to be other exceptions too. It might be better if you post some of the input data showing what the problem is. There might be a better solution.

 

Ones that might be problematic are ones with accented characters such as "Björk". The script above will convert that to "Bjrk". I don't know of a modifier that is "accent insensitive" (if there is such a word).
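
One possible way around the accent problem, assuming the input is valid UTF-8 and PHP's PCRE was built with Unicode support, is to whitelist by Unicode property instead of by ASCII range: \p{L} matches any letter and \p{N} any number when the pattern uses the u modifier. This is a sketch and not something tested against the original poster's data; the "\u{...}" string escape needs PHP 7 or later.

```php
<?php
// ASCII-only whitelist from the thread versus a Unicode-aware variant.
$asciiPattern   = '/[^a-z0-9&\/\s]/i';
$unicodePattern = '/[^\p{L}\p{N}&\/\s]/u'; // u modifier: treat pattern and subject as UTF-8

// "\u{00F6}" is o with diaeresis, written as an escape so the file encoding does not matter.
$name = "--Bj\u{00F6}rk--";

$asciiResult   = preg_replace($asciiPattern, '', $name);
$unicodeResult = preg_replace($unicodePattern, '', $name);

echo $asciiResult, PHP_EOL;   // accented letter is stripped
echo $unicodeResult, PHP_EOL; // accented letter is kept
```

This keeps accented letters rather than ignoring accents, so a search for "Bjork" would still not match "Björk" without some extra normalisation step.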
