brianlange Posted April 8, 2011
I need to match a word with a regular expression and make sure the word is not within HTML tags. Specifically, the regex should look for anchor tags, but it should probably check for any HTML tags. -Brian
salathe Posted April 9, 2011
Unless you're reading in a plain text file, there will always be HTML tags somewhere around your chosen word, unless you mean something much more specific, such as tags wrapped literally and only around the word itself. Could you elaborate a little more on precisely what you want to accept as matching and what you want to disallow from matching? Also, if you've made a start on matching the word but cannot quite adapt that to looking for tags, post any regex that you have at the moment so we can help you adapt it.
brianlange Posted April 11, 2011
It would be sufficient to determine whether the word is in anchor tags. Right now I have an array of regular expressions and an array of replacement strings, and I use preg_replace to perform the replacement. I have about 700 words in my search array.
$search = array("/\bAAA tenant\b/i", "/\babandonment\b/i");
$replacement = array('<a href="/words/aaa-tenant">AAA tenant</a>', '<a href="/words/abandonment">abandonment</a>');
$content = preg_replace($search, $replacement, $content);
-Brian
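To make the problem concrete: when preg_replace is given arrays, it applies the patterns one after another, and each pattern runs on the output of the previous one. A later pattern can therefore match inside a link that an earlier pattern just inserted, including inside its href. The sketch below reuses the two patterns from the post and adds a third term, "tenant", which is purely hypothetical and only there to trigger the overlap.

```php
<?php
// The two patterns from the post, plus a hypothetical third term ("tenant")
// that overlaps with the phrase "AAA tenant".
$search = array('/\bAAA tenant\b/i', '/\babandonment\b/i', '/\btenant\b/i');
$replacement = array(
    '<a href="/words/aaa-tenant">AAA tenant</a>',
    '<a href="/words/abandonment">abandonment</a>',
    '<a href="/words/tenant">tenant</a>',
);

$content = 'An AAA tenant pays rent.';

// Patterns are applied in order, each one on the result of the previous one.
$result = preg_replace($search, $replacement, $content);

// Count the "tenant" links that ended up inside the "AAA tenant" link:
// one in its href attribute and one in its link text.
$nested = substr_count($result, '<a href="/words/tenant">');

echo $result, "\n";
echo 'tenant links inserted inside the first link: ', $nested, "\n";
```

With roughly 700 terms, this kind of overlap is likely to occur somewhere, which is why the replacement needs to skip text that is already inside a tag or an anchor.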
brianlange Posted April 11, 2011
I added /\b[^<]word\b/i to my regular expression. It works if the word is adjacent to the > of the tag. But I am also trying to match phrases, so <a href="">Regular word [b]matching words[/b] </a> is still an issue.
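One possible way around this, offered as a sketch and not as an answer given in the thread, is to stop excluding a single character before the word and instead let one combined pattern match the things to skip first: a whole anchor element, or any other tag. Only if neither matches does the pattern try the terms. Because preg_replace_callback scans the content once, links it inserts are never rescanned. The function name link_terms and the $links map are hypothetical; the map keys are assumed to be lowercase, and the markup is assumed to be simple enough that anchors do not nest and attributes do not contain ">".

```php
<?php
// Hypothetical map of lowercase term => URL; the real list has about 700 entries.
$links = array(
    'aaa tenant'  => '/words/aaa-tenant',
    'abandonment' => '/words/abandonment',
);

function link_terms($content, array $links)
{
    // Longest terms first, so a phrase wins over a shorter term it contains.
    $terms = array_keys($links);
    usort($terms, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each term for use inside a pattern delimited by "~".
    $quoted = array();
    foreach ($terms as $term) {
        $quoted[] = preg_quote($term, '~');
    }

    // Alternatives are tried in order at each position:
    //   1. a complete <a ...>...</a> element (skipped as a whole)
    //   2. any other single tag (skipped)
    //   3. one of the terms, on word boundaries
    $pattern = '~<a\b[^>]*>.*?</a>|<[^>]+>|\b(?:' . implode('|', $quoted) . ')\b~is';

    return preg_replace_callback($pattern, function ($m) use ($links) {
        if ($m[0][0] === '<') {
            return $m[0]; // existing anchor or tag: leave untouched
        }
        // Keep the casing found in the text; look the URL up by lowercase key.
        $url = $links[strtolower($m[0])];
        return '<a href="' . $url . '">' . $m[0] . '</a>';
    }, $content);
}

$html = '<p>The AAA tenant claimed abandonment. '
      . 'See <a href="/x">Regular word abandonment here</a>.</p>';
echo link_terms($html, $links), "\n";
```

Unlike the replacement strings in the earlier post, this keeps the casing found in the text instead of substituting a canonical spelling. For markup that is not this regular, walking the text nodes with a DOM parser is likely to be more robust than any regex.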