dbrimlow Posted June 7, 2007

As a self-acknowledged idiot when it comes to regex, I spent a good half hour or so yesterday searching here and on Google for a simple way to convert "&" to "&amp;" so that the database results text on my firm's dynamically generated web pages would adhere to web standards for XHTML. I couldn't find one.

But then I realized that I already use a great email filter function that I found on the manual site, and it includes a section that does this conversion, so I tried it:

    // Convert ampersands to named or numbered entities.
    // Use regex to skip any that might be part of existing entities.
    function makeAmpersandEntities($str, $useNamedEntities = 1) {
        return preg_replace("/&(?![A-Za-z]{0,4}\w{2,3};|#[0-9]{2,5};)/m", $useNamedEntities ? "&amp;" : "&#38;", $str);
    }

It worked like a charm when I called the function while initializing the variables for my select command. The complete variable setup is below:

    // Array keys are quoted: a bare comment1 is parsed as a constant, not a string.
    $description = $result_row['comment1'] . $result_row['comment2']
        . $result_row['comment3'] . $result_row['comment4']
        . $result_row['comment5'] . $result_row['comment6']
        . $result_row['comment7'] . $result_row['comment8'];
    $description = makeAmpersandEntities($description);
    $description = strtolower($description);
    $description = ucfirst($description);

Now, since I am an idiot, I would like to actually UNDERSTAND the preg_replace call it uses. I THINK I understand what it does. Do I have the following right?

&(?![A-Za-z]{0,4}\w{2,3}; - find the ampersand and ignore any alpha characters immediately after it that occur at least 0 times but not more than 4 times, AND any word characters that occur at least 2 times but not more than 3 times?

The pipe - | - says to check both of the sub-patterns within the parentheses (the one before it and the one after it).

Is this - #[0-9]{2,5} - find the hash mark and at least 2 but no more than 5 digits?

Lastly, I simply can't "get" why the multiline modifier - /m - is necessary, unless it is there because this was originally part of an email filter, so it WOULD need to apply to any instance of a new line - \n - within the email message.

Thanks to anyone who answers.

Idiot Dave

The Little Guy Posted June 7, 2007

Something like this?

    <?php
    $var = "I like pizza & pasta";
    echo preg_replace("/&/", "&amp;", $var);
    ?>

dbrimlow Posted June 8, 2007

Thanks. I wanted to try something simple and elegant like that, but it doesn't take into account any pre-existing instances of "&amp;" or "&#38;". Your solution would recode an existing "&amp;" as "I like pizza &amp;amp; pasta", or an existing "&#38;" as "I like pizza &amp;#38; pasta". This is why the check for characters after the & is necessary.

(Nice websites, BTW. Clean CSS and markup.)

Dave

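The double-encoding problem described above can be reproduced with a minimal sketch. The helper name naiveAmpersand() is hypothetical and simply wraps the replace-everything pattern from the earlier reply:

```php
<?php
// Hypothetical helper: replaces every ampersand, with no check for existing entities.
function naiveAmpersand($str)
{
    return preg_replace('/&/', '&amp;', $str);
}

$plain   = 'I like pizza & pasta';     // raw ampersand
$encoded = 'I like pizza &amp; pasta'; // already an entity

echo naiveAmpersand($plain), "\n";   // I like pizza &amp; pasta
echo naiveAmpersand($encoded), "\n"; // I like pizza &amp;amp; pasta
```

The second call shows why text that may already contain entities needs a pattern that looks at what follows the ampersand.
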
sKunKbad Posted June 9, 2007

    <?php
    $var = "I like pizza & pasta";
    echo preg_replace("/ & /", "&amp;", $var);
    ?>

Can't you just add spaces around the ampersand and get what you want?

dbrimlow Posted June 9, 2007

I don't know, it seems like it should work. I'll give it a test run.

Meanwhile, I would still like to know if I deciphered the preg_replace("/&(?![A-Za-z]{0,4}\w{2,3};|#[0-9]{2,5};)/m" correctly.

neel_basu Posted June 19, 2007

I think urlencode() would do this.

Quote (sKunKbad):

    <?php
    $var = "I like pizza & pasta";
    echo preg_replace("/ & /", "&amp;", $var);
    ?>

    can't you just add spaces around the ampersand and get what you want?

Do something like this. Simple.

Azu Posted June 19, 2007

Quote (sKunKbad):

    <?php
    $var = "I like pizza & pasta";
    echo preg_replace("/ & /", "&amp;", $var);
    ?>

    can't you just add spaces around the ampersand and get what you want?

That won't work if somebody uses an ampersand that is not surrounded by spaces. And if you don't put the spaces there, it will mess up other entities. So there is no simple regex solution for this. There are some existing functions that will do the job easily, though, like the one neel listed above.

And dbrim, I think you understood it correctly. The multi-line modifier is probably there so that if an entity starts on one line and ends on another, the regex will not goof up.

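The points raised so far can be checked with a short sketch. One assumption to flag: the thread mentions urlencode(), which percent-encodes text for URLs, so the built-in used here for comparison is htmlspecialchars() with its double_encode argument set to false (available since PHP 5.2.3). That swap is not something the posters suggested, and the sample string is made up for illustration:

```php
<?php
// Made-up sample: one bare ampersand with no spaces, plus two existing entities.
$input = 'Q&A, pizza &amp; pasta, &#38;';

// The spaces-only pattern finds nothing here, so the bare "&" in "Q&A" survives.
$spacesOnly = preg_replace('/ & /', '&amp;', $input);

// urlencode() targets URLs: "&" becomes "%26", not "&amp;".
$urlEncoded = urlencode('&');

// Fourth argument false = leave existing entities alone instead of re-encoding them.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);

echo $spacesOnly, "\n"; // Q&A, pizza &amp; pasta, &#38;
echo $urlEncoded, "\n"; // %26
echo $escaped, "\n";    // Q&amp;A, pizza &amp; pasta, &#38;
```

Note that htmlspecialchars() also escapes <, > and quotes, so it only fits when the text is not supposed to contain markup.
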
neel_basu Posted June 19, 2007

    preg_replace("/&/", "&amp;", $var);

dbrimlow Posted June 19, 2007

Neel, it didn't work. As Azu pointed out, the ampersands aren't always surrounded by spaces. The majority of the errors in that field come from some underpaid 9-to-5 data entry person more concerned with punching entries out fast than with proper content, so the &s tend to get tacked onto a word or placed right after one.

Besides, I had to create a function to do this because I only want to make the & conversion for one dynamic variable (pulled from the DB).

Wildbug Posted June 19, 2007

(?!) is a negative lookahead assertion. It means: find the preceding pattern only when it's NOT followed by the stuff in (?!). So "&(?!amp;)" means find "&" unless it's the start of "&amp;".

The pipe (|) means "or." "/abc|123/" matches "abc" or "123". "/abc(?:123|xyz)/" finds "abc123" or "abcxyz".

{2,5} is the quantifier "at least two, no more than 5." {0,4} is just what you think it is; keep the explicit 0, since older PCRE versions treat {,4} as literal text rather than as a quantifier.

/m is unnecessary: it only changes how ^ and $ match, and this pattern uses neither. See Pattern modifiers.

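To see the lookahead at work, here is a small sketch that runs the pattern from the first post with and without /m. The sample string and variable names are made up for illustration:

```php
<?php
// Pattern from makeAmpersandEntities() in the first post, with and without /m.
$withoutModifier = '/&(?![A-Za-z]{0,4}\w{2,3};|#[0-9]{2,5};)/';
$withModifier    = '/&(?![A-Za-z]{0,4}\w{2,3};|#[0-9]{2,5};)/m';

// Bare ampersands (tacked onto a word and free-standing) mixed with existing entities.
$input = 'Q&A, pizza &amp; pasta, &#38; &copy; fish & chips';

$withoutM = preg_replace($withoutModifier, '&amp;', $input);
$withM    = preg_replace($withModifier, '&amp;', $input);

echo $withoutM, "\n";
// Q&amp;A, pizza &amp; pasta, &#38; &copy; fish &amp; chips

// /m only affects ^ and $, which this pattern does not use, so both results match.
var_dump($withM === $withoutM); // bool(true)
```

Only the ampersands in "Q&A" and "fish & chips" are rewritten; "&amp;", "&#38;" and "&copy;" each satisfy one branch of the lookahead and are left alone.
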
The Little Guy Posted June 19, 2007

Give this a try:

    <?php
    $var = "I like pizza & pasta";
    echo preg_replace("/^&{1}$/", "&amp;", $var);
    ?>

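A quick check of the anchored pattern above, as a sketch with made-up inputs: because ^ and $ tie the match to the start and end of the string, it only fires when the whole string is a single ampersand.

```php
<?php
// Anchored pattern from the post above: start of string, one "&", end of string.
$pattern = '/^&{1}$/';

$lone     = preg_replace($pattern, '&amp;', '&');
$sentence = preg_replace($pattern, '&amp;', 'I like pizza & pasta');

echo $lone, "\n";     // &amp;
echo $sentence, "\n"; // I like pizza & pasta
```

The sentence comes back unchanged, so this variant does not cover ampersands embedded in longer text.
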