mmem700 Posted July 5, 2008

I'm new to regexp, so I'm having a really tough time with this one. I need a regexp that will parse an HTML comment *that contains a certain keyword* into 3 parts:

(1) The part before the comment tag
(2) The comment tag itself
(3) The part after the comment tag

I've tried using this regexp with preg_match():

/(.*)(<!--.*MyKeyword.*-->)(.*)/s

but it *does not* work with this HTML:

<p>stuff before</p>
<!-- <p>stuff</p> SomeKeyword <p>stuff</p> -->
<!-- <p>stuff</p> MyKeyword <p>stuff</p> -->
<p>stuff after</p>

My 3 matches come out looking like this:

(1) <p>stuff before</p>
(2) <!-- <p>stuff</p> SomeKeyword <p>stuff</p> --> <!-- <p>stuff</p> MyKeyword <p>stuff</p> -->
(3) <p>stuff after</p>

It's picking up the first "<!--" and then expanding the second match (2) to the "-->" that occurs *after* the second comment tag, because of the greedy ".*". What I really need is for the matches to look like this:

(1) <p>stuff before</p> <!-- <p>stuff</p> SomeKeyword <p>stuff</p> -->
(2) <!-- <p>stuff</p> MyKeyword <p>stuff</p> -->
(3) <p>stuff after</p>

How do I do this? Thanks in advance!

https://forums.phpfreaks.com/topic/113312-solved-parsing-html-comments-but-harder-cant-figure-this-one-out/
sasa Posted July 5, 2008

try

<?php
$text = '<p>stuff before</p> <!-- <p>stuff</p> SomeKeyword <p>stuff</p> --> <!-- <p>stuff</p> MyKeyword <p>stuff</p> --> <p>stuff after</p>';
preg_match('/^(.*)(<!--.*?MyKeyword.*?-->)(.*)$/s', $text, $out);
print_r($out);
?>
mmem700 Posted July 6, 2008

Many thanks for the suggestion. This is definitely closer, but I'm still getting a false match...

$x = <<<eof
<table border="0" align="center" cellpadding="0" cellspacing="0" class="ps_ListItem_TABLE">
<tr>
<td width="120" align="center" valign="middle" class="ps_ListItem_TD-Pic">
<a href="__BlurbProductURL__"><img src="__BlurbPicURL__" border="0" class="ps_ListItem_IMG"></a>
</td>
<!-- <p class="Style_1"> </p> -->
<td width="365" align="left" valign="top" class="ps_ListItem_TD-Desc">
<a href="__BlurbProductURL__" class="ps_ListItem_Title_A">
<p class="ps_ListItem_Title_P">__BlurbBanner__</p>
</a>
<p class="ps_ListItem_Body_P">__BlurbDesc__</p>
<!-- <p class="Style_1"></p><p class="SomeOtherClass"></p> -->
<!-- -->
</td>
</tr>
</table>
eof;

$re = '/^(.*)(<!--.*?MyKeyword.*?-->)(.*)$/s';
$MatchCount = preg_match($re, $x, $Matches);

Yields this...

$Matches[1]
<table border="0" align="center" cellpadding="0" cellspacing="0" class="ps_ListItem_TABLE"> <tr> <td width="120" align="center" valign="middle" class="ps_ListItem_TD-Pic"> <a href="__BlurbProductURL__"><img src="__BlurbPicURL__" border="0" class="ps_ListItem_IMG"></a> </td>

$Matches[2]
<!-- <p class="Style_1"> </p> --> <td width="365" align="left" valign="top" class="ps_ListItem_TD-Desc"> <a href="__BlurbProductURL__" class="ps_ListItem_Title_A"> <p class="ps_ListItem_Title_P">MyKeyword</p> </a> <p class="ps_ListItem_Body_P">__BlurbDesc__</p> <!-- <p class="Style_1"></p><p class="SomeOtherClass"></p> -->

$Matches[3]
<!-- --> </td> </tr> </table>

The requirement is that all matches of MyKeyword occur within HTML comment tags. The problem is that $Matches[2] is matching the instance of "MyKeyword" which is not within a comment tag (highlighted in red). It seems I somehow have to tell the regexp "don't look past the next '-->' for the next match". Thoughts? Ideas?
mmem700 Posted July 6, 2008

Correction to above code... (sorry)

$x = <<<eof
<table border="0" align="center" cellpadding="0" cellspacing="0" class="ps_ListItem_TABLE">
<tr>
<td width="120" align="center" valign="middle" class="ps_ListItem_TD-Pic">
<a href="__BlurbProductURL__"><img src="__BlurbPicURL__" border="0" class="ps_ListItem_IMG"></a>
</td>
<!-- <p class="Style_1"> </p> -->
<td width="365" align="left" valign="top" class="ps_ListItem_TD-Desc">
<a href="__BlurbProductURL__" class="ps_ListItem_Title_A">
<p class="ps_ListItem_Title_P">MyKeyword</p>
</a>
<p class="ps_ListItem_Body_P">__BlurbDesc__</p>
<!-- <p class="Style_1"></p><p class="SomeOtherClass"></p> -->
<!-- -->
</td>
</tr>
</table>
eof;
sasa Posted July 6, 2008

try

<?php
$x = <<<eof
<table border="0" align="center" cellpadding="0" cellspacing="0" class="ps_ListItem_TABLE">
<tr>
<td width="120" align="center" valign="middle" class="ps_ListItem_TD-Pic">
<a href="__BlurbProductURL__"><img src="__BlurbPicURL__" border="0" class="ps_ListItem_IMG"></a>
</td>
<!-- <p class="Style_1"> </p> -->
<td width="365" align="left" valign="top" class="ps_ListItem_TD-Desc">
<a href="__BlurbProductURL__" class="ps_ListItem_Title_A">
<p class="ps_ListItem_Title_P">MyKeyword</p>
</a>
<p class="ps_ListItem_Body_P">__BlurbDesc__</p>
<!-- <p class="Style_1"></p><p class="SomeOtherClass"></p> -->
<!-- b <p class="ps_ListItem_Title_P">MyKeyword</p> -->
</td>
</tr>
</table>
eof;
preg_match('/^(.*)(<!--.*?(?!-->)MyKeyword.*?-->)(.*)$/s', $x, $out);
print_r($out);
?>
mmem700 Posted July 7, 2008

Thanks very much!! I do appreciate it.
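
A note for anyone landing on this thread later. In the pattern '/^(.*)(<!--.*?(?!-->)MyKeyword.*?-->)(.*)$/s', the lookahead (?!-->) is tested only at the single position right before MyKeyword, so it does not stop the preceding .*? from running past an earlier "-->". The sample in that post adds a MyKeyword inside the last comment, and the leading greedy (.*) prefers the last place the rest can match, which is why it comes out right there. One common way to express "don't look past the next -->" is to apply the lookahead before every character consumed inside the comment. The sketch below assumes PHP 7.1 or later; the function name splitOnKeywordComment and both sample strings are made up for illustration and are not from the posts above.

```php
<?php
// Hypothetical helper (not from the thread): split $html around the last
// HTML comment that contains $keyword. Returns [before, comment, after],
// or null when the keyword does not occur inside any comment.
function splitOnKeywordComment(string $html, string $keyword): ?array
{
    // (?:(?!-->).)*? consumes one character at a time, and only while the
    // next three characters are not "-->", so the match cannot leave the
    // comment it started in.
    $inside = '(?:(?!-->).)*?';
    $re = '/^(.*)(<!--' . $inside . preg_quote($keyword, '/') . $inside . '-->)(.*)$/s';

    if (preg_match($re, $html, $m) !== 1) {
        return null;
    }
    return [$m[1], $m[2], $m[3]];
}

// Keyword appears only between two comments: no comment should match.
$outsideOnly = '<td><!-- <p>a</p> --><p>MyKeyword</p><!-- <p>b</p> --></td>';
// Keyword appears inside the second comment.
$insideToo = '<p>before</p> <!-- SomeKeyword --> <!-- <p>x</p> MyKeyword --> <p>after</p>';

var_dump(splitOnKeywordComment($outsideOnly, 'MyKeyword'));
print_r(splitOnKeywordComment($insideToo, 'MyKeyword'));
```

For the first string the helper returns null, whereas the pattern from the post above still matches it, spanning from the first "<!--" to the end of the second comment. If several comments contain the keyword, the greedy (.*) at the front means the last one is the one captured.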