a1amattyj Posted July 8, 2011

Hello,

I'm trying to get all the content within paragraph tags. I'm using:

preg_match("#<p>(.*)</p>#i", $contents, $match);

There are around 8 paragraphs on this page, but it only gives me two results, spanning from the first <p> tag to the last </p> tag. I'm not sure what I should change (.*) to.

Thanks
AyKay47 Posted July 8, 2011

preg_match will only match one set. Use preg_match_all to get every match.
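To illustrate the difference between the two functions, here is a minimal sketch using the pattern from the question. The sample `$contents` string is made up for illustration (the original page is not shown in the thread), and each paragraph sits on its own line so that greediness does not come into play yet.

```php
<?php
// Hypothetical sample input: one paragraph per line.
$contents = "<p>one</p>\n<p>two</p>\n<p>three</p>";

// preg_match stops after the first match and returns 1 (or 0 if nothing matched).
// $match[0] is the full match, $match[1] is the captured group.
$count = preg_match("#<p>(.*)</p>#i", $contents, $match);

// preg_match_all keeps scanning and returns the number of full matches.
// $matches[1] holds the captured group of every match.
$total = preg_match_all("#<p>(.*)</p>#i", $contents, $matches);

print_r($match);
print_r($matches[1]);
```

With preg_match, the result array only ever describes a single match, no matter how many paragraphs the input contains.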
Psycho Posted July 8, 2011

Well, you also have the problem that your regex is "greedy". Even if you used preg_match_all(), it would only return one result: everything from the very first "<p>" to the very last "</p>". You need to make the expression "non-greedy" so that it returns each instance of "<p>" up to the next "</p>".

It is the asterisk (*) that behaves greedily in that expression. One way to make it non-greedy is to add a question mark (?) after the asterisk. Also, I think you have to escape the forward slash. I would also add some handling for the opening paragraph tag in case it carries additional attributes, such as a class or style attribute. Try this:

preg_match_all("#<p[^>]*>(.*?)<\/p>#i", $contents, $matches);
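A small sketch of the greedy versus non-greedy behaviour may help here. The `$contents` string and the variable names `$greedy` and `$lazy` are assumptions made up for the example; only the patterns come from the thread.

```php
<?php
// Hypothetical sample input, with attributes and mixed-case tags.
$contents = '<p>first</p><p class="note">second</p><P style="color:red">third</P>';

// Greedy: .* runs on to the LAST </p>, so the whole string is one match.
preg_match_all("#<p[^>]*>(.*)</p>#i", $contents, $greedy);

// Non-greedy: .*? stops at the NEAREST </p>, giving one match per paragraph.
preg_match_all("#<p[^>]*>(.*?)</p>#i", $contents, $lazy);

echo count($greedy[0]), " greedy match(es)\n";
print_r($lazy[1]);
```

The `[^>]*` part is what lets the opening tag carry attributes, and the `i` modifier is what lets `<P>` match as well as `<p>`.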
AyKay47 Posted July 8, 2011

Good catch, mjdamato, thank you. The only thing I'll add is that the forward slash does not need to be escaped, since the OP is using the # delimiter:

preg_match_all("#<p[^>]*>(.*?)</p>#i", $contents, $matches);

Nothing major, though.
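The point about delimiters can be checked with a quick sketch. The sample input and the variable names are hypothetical; the idea is simply that a character only needs escaping when it is also the pattern delimiter.

```php
<?php
// Hypothetical sample input.
$contents = "<p>a</p><p>b</p>";

// With # as the delimiter, / is an ordinary character and needs no escape.
preg_match_all('#<p[^>]*>(.*?)</p>#i', $contents, $hash);

// Escaping it anyway is harmless: \/ still matches a literal slash.
preg_match_all('#<p[^>]*>(.*?)<\/p>#i', $contents, $hashEscaped);

// With / as the delimiter, the slash inside the pattern must be escaped.
preg_match_all('/<p[^>]*>(.*?)<\/p>/i', $contents, $slash);

print_r($hash[1]);
```

All three calls should produce the same result, which is why the escaped version works but is not required with the # delimiter.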
a1amattyj (Author) Posted July 8, 2011

Thank you guys, very informative!
.josh Posted July 8, 2011

You may also need to add the 's' modifier so that the dot matches newlines. Line breaks arguably shouldn't be inside p tags, but it's perfectly valid for them to be there, and some people add them to make the source code more readable in their editor.
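As a rough illustration of what the 's' modifier changes, the sketch below runs the same pattern with and without it. The multi-line sample paragraph and the variable names are made up for the example.

```php
<?php
// Hypothetical sample input: the first paragraph contains a line break.
$contents = "<p>line one\nline two</p><p>short</p>";

// Without s, the dot does not match a newline, so the first paragraph is missed.
preg_match_all("#<p[^>]*>(.*?)</p>#i", $contents, $without);

// With s (PCRE_DOTALL), the dot also matches newlines.
preg_match_all("#<p[^>]*>(.*?)</p>#is", $contents, $with);

print_r($without[1]);
print_r($with[1]);
```

Without the modifier, any paragraph that wraps across lines is silently dropped from the results, which is easy to miss on a page with several paragraphs.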