trace, posted April 22, 2009:

I've been Googling for a regexp to match PHP blocks in an HTML file, but searching for "php regexp html" doesn't turn up much that is useful. I still believe this regexp has been written at least once.

The idea is to strip the PHP out of an HTML file but store the stripped-out blocks for later. I would need something like /<\?php (.*?)\?>/, but I would like it to match all possible kinds of PHP blocks; see http://www.php.net/manual/en/language.basic-syntax.phpmode.php

It should also handle strings and other constructs that might contain the string "?>". I don't even know in which ways a block of PHP can contain "?>" without closing the block.
trace, posted April 22, 2009:

For example,

<html><body> <?php echo 'some string ?>';?> <script language="php">echo $foobar;</script> </body></html>

would result in

<html><body> <?php ?> <script language="php"></script> </body></html>

and the PHP blocks would be returned in an array.
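One possible way to get this result without a hand-written regexp is to let PHP's own tokenizer find the block boundaries, since it already knows that "?>" inside a string literal does not close a block. The following is only a minimal sketch: the function name strip_php_blocks is hypothetical, it assumes PHP 7.1 or later with the tokenizer extension enabled, and it handles only the tags the running PHP version recognises. The <script language="php"> form was removed in PHP 7.0, so on current versions the tokenizer treats it as plain HTML and this sketch leaves it untouched.

```php
<?php
// Hypothetical helper (not from the thread): returns [html, blocks], where
// html is the source with every PHP block emptied and blocks holds the
// removed code in document order.
function strip_php_blocks(string $source): array
{
    $html = '';
    $blocks = [];
    $current = null; // null while we are outside a PHP block

    foreach (token_get_all($source) as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        $id = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_INLINE_HTML) {
            $html .= $text;                 // plain HTML is copied through
        } elseif ($id === T_OPEN_TAG || $id === T_OPEN_TAG_WITH_ECHO) {
            $html .= $text;                 // keep the opening tag itself
            $current = '';
        } elseif ($id === T_CLOSE_TAG) {
            $html .= $text;                 // keep the closing tag itself
            $blocks[] = $current;
            $current = null;
        } else {
            $current .= $text;              // code inside the block is stored
        }
    }

    if ($current !== null) {
        $blocks[] = $current;               // file ended inside an open block
    }

    return [$html, $blocks];
}

$source = '<html><body> <?php echo \'some string ?>\';?> </body></html>';
[$html, $blocks] = strip_php_blocks($source);
echo $html, "\n";
print_r($blocks);
```

For this input the HTML comes back with an empty `<?php ?>` block, and the array holds the single entry `echo 'some string ?>';`. Because the tokenizer does the parsing, comments and heredocs follow PHP's real rules rather than an approximation of them.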
RussellReal, posted April 25, 2009:

'/(<\?(?:php)?.*?\?>)/i'

Of course, that would give wrong results if for whatever reason you're trying to do something like echo '?>'; in your PHP, but that's rare, I suppose. You could use some sort of lookahead to be sure you get the very last one, or at the very least an atomic group. Good luck.

Oh, and you're using PHP? Use preg_match_all instead of preg_match for greedy searches.
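To make the caveat concrete, here is a small sketch that runs the pattern above against two made-up inputs (the variable names and sample strings are illustrative assumptions, not from the thread). It shows the match being cut short when the closing-tag characters appear inside a string literal. It also shows that without the `s` modifier the dot does not match newlines, so a block spread over several lines is not found at all.

```php
<?php
// The pattern from the post above, unchanged.
$pattern = '/(<\?(?:php)?.*?\?>)/i';

// Case 1: the closing-tag characters appear inside a string literal.
$source = '<p><?php echo \'a ?> b\'; ?></p><?php echo 2; ?>';
preg_match_all($pattern, $source, $matches);
print_r($matches[1]); // two matches; the first one stops inside the string

// Case 2: a block that spans several lines.
$multiline = "<?php\necho 1;\n?>";
$withoutS = preg_match_all($pattern, $multiline, $m1);
$withS = preg_match_all('/(<\?(?:php)?.*?\?>)/is', $multiline, $m2);
var_dump($withoutS, $withS); // 0 matches without "s", 1 match with it
```

Adding the `s` modifier fixes the multi-line case, but the string-literal case needs either a pattern that skips over quoted strings or a real tokenizer.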
.josh, posted April 25, 2009:

RussellReal wrote: "oh and you're using php? use preg_match_all instead of preg_match for greedy searches"

If by greedy you mean that preg_match returns the first match whereas preg_match_all returns every match it finds, then sure, you can call that a greedy search. It might confuse people, though, because that term is normally used to describe quantifiers, and that is not the same kind of greedy.

OP: I'm not really sure what your overall goal is here, but if you're scraping pages, the PHP will not be in what you fetch, since the server runs it before sending the page. It would only be there if it was somehow commented out or sitting outside of PHP tags to begin with.
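The two meanings can be seen side by side in a short sketch (the sample string and variable names are made up for illustration). Greedy versus lazy is a property of the quantifier and changes how much a single match consumes. First match versus all matches is a property of the function being called.

```php
<?php
$source = '<?php a ?> x <?php b ?>';

// Greedy quantifier: ".*" runs on to the last closing tag in the string.
preg_match('/<\?php.*\?>/', $source, $greedy);

// Lazy quantifier: ".*?" stops at the first closing tag.
preg_match('/<\?php.*?\?>/', $source, $lazy);

// Same lazy pattern, but preg_match_all collects every match.
$count = preg_match_all('/<\?php.*?\?>/', $source, $all);

var_dump($greedy[0]); // the whole sample string
var_dump($lazy[0]);   // only the first block
var_dump($count);     // 2
```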
RussellReal, posted April 25, 2009:

Oh, that's because the first language I learned regex in was mIRC scripting (mSL). There the regex had modifiers, much like /i or /s, and one of them was /g for a 'greedy' or 'global' search, so I just got used to calling it a greedy search. My bad.