binxalot Posted July 12, 2011
Hi, I've been reading this forum for over an hour now and I can't find anything along the lines of what I'm trying to do, which is to remove specific phrases and pronouns from a string, leaving only adjectives, adverbs, etc.
The problem I'm having is that when I use $blob = str_replace($knownWords[$b], ' ', $blob); (where $blob is the string of text and $knownWords is the list of words I want to replace with spaces), the match is not limited to whole words. If one of the known words is "on top of", for instance, then all letters in the string with "on", "top" or "of" get removed.
So I looked into this type of string replacement: $blob = ereg_replace('~\b'.$knownWords[$b].'\b~', " ", $blob); but this seems to skip all of the words and doesn't remove anything.
Has anyone ever dealt with a situation like this before? I see lots of posts about finding specific words, finding text between tags, or removing special characters, but when I search the forum for removing or even finding words in a phrase I get 2 hits. Is this not the right direction to be looking in?
The reason I'm going about it this way is that I need to build up a list of adjectives from the ones left behind in large strings of phrases. I can't go about it in reverse because I can't know ahead of time which adjectives will be used.
Link to topic: https://forums.phpfreaks.com/topic/241764-removing-entire-phrases-from-a-string/
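For readers following along, here is a minimal sketch of the substring behaviour described in the question. The sample sentence and the variable name $stripped are made up for illustration, and the replacement is an empty string rather than a space so the result is easier to read.

```php
<?php
// str_replace() has no notion of word boundaries: it replaces every
// occurrence of the search string, including ones inside longer words.
$blob = 'see her, not herself';          // hypothetical sample text
$stripped = str_replace('her', '', $blob);

// Both the standalone "her" and the "her" inside "herself" are gone.
echo $stripped, "\n";
```

This is why a plain str_replace() is a poor fit when only whole words or phrases should go.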
xyph Posted July 12, 2011
ereg_replace doesn't use delimiters. Try using preg_replace instead.
premiso Posted July 12, 2011
Quoting xyph: "ereg_replace doesn't use delimiters. Try using preg_replace instead."
ereg_replace is also deprecated. Stick to preg_replace for that reason alone.
binxalot (Author) Posted July 12, 2011
The solution was this:
$word_escaped = preg_quote($knownWords[$b], '~'); // $knownWords is the array of phrases
$pattern = '~\b' . $word_escaped . '\b~'; // mystery pattern...
$blob = preg_replace($pattern, "", $blob, -1); // removes all of the words; the -1 is probably not needed
Why this works I have no idea, but it does. It removes phrases from a string while leaving behind words that merely contain part of a phrase. So if you have text like "herself" but you're looking to remove the word "her", then only the standalone word "her" is removed and "herself" is left unchanged.
cags Posted July 13, 2011
The \b matches word boundaries, that is to say the position between a 'word' character (letter, digit or underscore) and a non-word character. The preg_quote function simply escapes the special characters in your phrase, so that if the phrase contained, for example, \b, it would match the two literal characters \ and b rather than another word boundary.
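Putting the two pieces above together, a loop over the phrase list might look like the sketch below. The function name remove_phrases and the sample sentences are hypothetical, the type declarations assume PHP 7 or later, and this is one way to arrange the code from the earlier post, not necessarily how the original script was laid out.

```php
<?php
// Hypothetical helper: removes each known phrase only where it stands
// as a whole word or phrase, using \b on both sides.
function remove_phrases(string $blob, array $knownWords): string
{
    foreach ($knownWords as $phrase) {
        // preg_quote() escapes regex metacharacters in the phrase;
        // the second argument also escapes the '~' delimiter.
        $pattern = '~\b' . preg_quote($phrase, '~') . '\b~';

        // preg_replace() returns null on error; keep the old text then.
        $blob = preg_replace($pattern, '', $blob) ?? $blob;
    }
    return $blob;
}

// "her" is removed, "herself" is left alone.
echo remove_phrases('she hurt herself with her bag', ['her']), "\n";
// The whole phrase "on top of" is removed as a unit.
echo remove_phrases('put it on top of the piano', ['on top of']), "\n";
```

Note that the surrounding spaces are left in place, so removed phrases leave a double space behind; collapsing whitespace would be a separate step.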
binxalot (Author) Posted July 14, 2011
I see, makes sense, thanks for clarifying, but what does the tilde (~) do?
cags Posted July 14, 2011
The tildes in this case are delimiters. They don't have to be tildes; the delimiter can be one of many characters. I forget what the exact requirements are, but I believe it's practically anything that isn't a word character. As a rule of thumb, the delimiter should simply be a character that is unlikely to appear in the pattern you are attempting to match.
The delimiters are there to separate the pattern you are matching from the modifiers. In your case you don't have any modifiers, but you can have, for example, $pattern = '~\b' . $word_escaped . '\b~i'; to make the pattern case-insensitive. You can find a list of the supported modifiers on the PCRE pages of the manual.
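To illustrate the point about delimiters and modifiers, here is a small sketch that builds the same pattern twice, once with ~ and once with # as the delimiter, both with the i modifier. The sample text and variable names are invented for the example. For reference, the PHP manual describes a delimiter as any non-alphanumeric, non-backslash, non-whitespace character.

```php
<?php
$word = 'Her';   // hypothetical search word

// Same pattern with two different delimiters. The second argument to
// preg_quote() should match whichever delimiter is in use.
$tilde = '~\b' . preg_quote($word, '~') . '\b~i';
$hash  = '#\b' . preg_quote($word, '#') . '\b#i';

$text = 'HER hat, her coat, herself';

// The trailing i makes both patterns case-insensitive, so "HER" and
// "her" are replaced while "herself" is left alone.
$a = preg_replace($tilde, '*', $text);
$b = preg_replace($hash, '*', $text);

echo $a, "\n";
echo $b, "\n";
```

Both calls should give the same result, since the choice of delimiter does not change what the pattern matches.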