DJphp, posted December 2, 2007

Hi,

I am successfully scraping URLs from web pages to create an RSS feed. Now I need to grab the first post from each link found. I can do this, except that I am having trouble with a regexp. The text I receive looks like:

<!-- message -->
<div class="idview">
Some text to capture and more text.
<div style="margin:20px; margin-top:5px; ">
some more text
<div> more text /div>
And Even More TEXT ! where the ! stops the regexp
<!-- / message -->

I need to capture everything between the tags <!-- message --> and <!-- / message -->.

My initial regexp looks like:

$patternDescriptions = "/(<!-- message -->[^!]+).*/i";

but if there is an exclamation mark in the text I need, the pattern matching stops there and I lose text.

Any help to grab all of the text between the message tags would be appreciated.

thanks,
DJphp
DJphp, posted December 2, 2007

ha! solved it. My regexp now looks like:

"/(<!-- message -->[^!]+)[^-]+[^-]+[^ ]+[^\/]+[^ ]+[^m]+[^e]+[^s]+[^s]+[^a]+[^g]+[^e]+[^ ]+[^-]+[^-]+[^>]+/i";

to match from <!-- message --> to <!-- / message -->

DJphp
Orio, posted December 2, 2007

Keep it simple man...

<?php
$data = <<<DATA
<!-- message -->
<div class="idview">
Some text to capture and more text.
<div style="margin:20px; margin-top:5px; ">
some more text
<div> more text /div>
And Even More TEXT ! where the ! stops the regexp
<!-- / message -->
DATA;

// .*? is lazy, so it stops at the first closing marker.
// The "s" modifier lets "." match newlines; "i" makes the match case-insensitive.
preg_match("/<!-- message -->(.*?)<!-- \/ message -->/is", $data, $match);
echo $match[1];
?>

Orio.
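Since the goal was to pull the first post from each scraped link, the same pattern can be wrapped in small helpers. This is only a rough sketch: `extract_message` and `extract_all_messages` are made-up names, not anything from the thread, and the nullable return type assumes PHP 7.1 or later. It also shows what happens when the markers are missing, which the one-liner above does not check.

```php
<?php
// Hypothetical helper (not from the thread): returns the text between the
// message markers, or null when the markers are not found.
function extract_message(string $html): ?string
{
    // .*? is lazy, so the match ends at the first closing marker.
    // s: "." also matches newlines; i: case-insensitive matching.
    $pattern = '/<!-- message -->(.*?)<!-- \/ message -->/is';
    if (preg_match($pattern, $html, $match) === 1) {
        return trim($match[1]);
    }
    return null;
}

// Hypothetical helper: collects every message block found on a page.
function extract_all_messages(string $html): array
{
    preg_match_all('/<!-- message -->(.*?)<!-- \/ message -->/is', $html, $matches);
    return array_map('trim', $matches[1]);
}

// Made-up sample page with two message blocks, one containing "!".
$page = "<!-- message -->\nFirst post!\n<!-- / message -->\n"
      . "<!-- message -->\nSecond post\n<!-- / message -->";

var_dump(extract_message($page));
var_dump(extract_all_messages($page));
```

Because the quantifier is lazy, the first helper returns only the first block even when a page holds several, and the exclamation mark no longer matters since no character is excluded from the capture.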