[SOLVED] preg_match, preg_match_all and preg_grep


diamondnular

Hi folks,

 

Sorry if this is a beginner question. I am trying to understand how these functions work, and the examples on php.net are not enough for me. What are the main differences between them? For example, with $html below, which function should I use to find all HTML tags (<b>, </b>, <img ... />)? Also, if I want to replace the img tag (<img ... />), is str_replace the best function to use?

$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

 

Thanks a bunch,

 

D.

From php.net:

 

preg_match — Perform a regular expression match

preg_match_all — Perform a global regular expression match

preg_grep — Return array entries that match the pattern

 

$html is not an array, so you would not use preg_grep. You want to find all occurrences, so you would use preg_match_all since it indicates that it is global.
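
A minimal sketch of that difference, using the $html string from the question. The tag pattern and the $lines array are assumptions made up for illustration, not something from php.net:

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";
$pattern = '/<[^>]+>/'; // assumed pattern: "<", one or more non-">" characters, ">"

// preg_match stops after the first match and returns 1 (found) or 0 (not found).
$found = preg_match($pattern, $html, $first);
// $first[0] now holds the first tag only.

// preg_match_all keeps scanning and returns the number of matches.
$count = preg_match_all($pattern, $html, $all);
// $all[0] now holds every tag, in order of appearance.

// preg_grep filters an array, keeping the entries that match (keys are preserved).
$lines = ['<b>bold</b>', 'plain text', '<img src="x.jpg" />']; // hypothetical input
$withTags = preg_grep($pattern, $lines);

print_r($first);
print_r($all[0]);
print_r($withTags);
```

So preg_match answers "is there a match, and what is the first one?", preg_match_all collects every match in one string, and preg_grep only makes sense when the input is already an array of strings.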

 

Use str_replace if the text you are searching for is static (a fixed string); if it varies, use preg_replace.
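
As a rough illustration of the two cases, again with the $html from the question. The '[image]' replacement text and the img pattern are assumptions chosen for the example:

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

// Static search text: str_replace is enough when the exact tag is known in advance.
$static = str_replace("<img src='test.jpg' />", '[image]', $html);

// Varying search text: preg_replace handles any <img ...> tag, whatever its attributes.
// Assumed pattern: "<img", a word boundary, any non-">" characters, ">"; "i" = case-insensitive.
$dynamic = preg_replace('/<img\b[^>]*>/i', '[image]', $html);

echo $static, "\n";
echo $dynamic, "\n";
```

Both calls give the same result here, but only the preg_replace version would still work if the src attribute changed.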

http://www.php.net/preg-match

 

http://us2.php.net/preg-match-all

 

http://us3.php.net/preg-grep

 

They all have self-explanatory descriptions. I hate to be a jerk, but do your research. :\

 

I had already read all of the above, and other pages as well, before asking here (as I said, the examples there are not enough for me). And I think I have a general idea of how they work. Thanks for your input anyway :).


Thanks Effigy, that is why I am playing with preg_match_all now. I was just hoping to learn from your experience. I have to admit, though, that PHP regex really drives me crazy.

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html>click me</a>";

preg_match_all("/<[\/*]\w>/", $html, $matches, PREG_SET_ORDER);
?>

I thought [\/*] means zero or more \/, so both <b> and </b> would match. But when I ran the code, the output was just </b>, </c>, </a>. What am I doing wrong here?

Got it now. To pull out all the tags (<...>):

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html><img src='test.jpg' />click me</a>";

preg_match_all("/<[^>]*>/", $html, $matches, PREG_PATTERN_ORDER);

print_r($matches);
?>

[^>] means any character other than >, so there is no additional > inside the match. Why does this page (http://php-regex.blogspot.com/2008/01/php-regex-cheat-sheet.html) say that ^ is the start of the subject? Can anybody correct me?

 

To pull out the img tag only (which is what I am trying to do):

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html><img src='test.jpg' />click me</a>";

// Match tags that end in "/>" (self-closing tags; here that is only the img tag).
preg_match_all("/<[^>]*\/>/", $html, $matches, PREG_PATTERN_ORDER);

print_r($matches);
?>

 

Does anybody know which regex reference I should use for preg_match? The page above does not seem to work for me (or maybe I am just misreading it).

 

D.


My mistake: in my earlier pattern, [\/*] should read [\/]*. Now I think I understand a little more about regex :).

the [\/*] means zero or more \/

 

A character class ([...]) matches any one item that it contains. Since you have the quantifier inside the class, where * is just a literal character, this class will match exactly one / or one *.
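
A small sketch of the difference, run against the $html from the post above. The third pattern, with \/?, is an assumed alternative that was not part of the original question:

```php
<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html>click me</a>";

// Quantifier inside the class: exactly one character, either "/" or a literal "*".
preg_match_all('/<[\/*]\w>/', $html, $inside);

// Quantifier after the class: zero or more "/" characters.
preg_match_all('/<[\/]*\w>/', $html, $outside);

// Assumed alternative: "\/?" allows at most one slash, closer to what a closing tag looks like.
preg_match_all('/<\/?\w>/', $html, $optional);

print_r($inside[0]);   // closing tags only
print_r($outside[0]);  // opening and closing single-letter tags
print_r($optional[0]); // same matches as $outside for this input
```

Note that none of these patterns picks up <a href=howdy.html>, because \w> allows only a single word character before the closing >.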

 

Why this page (http://php-regex.blogspot.com/2008/01/php-regex-cheat-sheet.html) says that ^ is start of the subject?

 

Because character classes have their own set of metacharacters. Outside of a character class, ^ serves as an anchor; as the first character inside one, it negates the class.

 

If you want to match all beginning and end tags /<[^>]+>/ will suffice.
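
To see both meanings of ^ side by side, here is a minimal sketch using the $html from the first post. The anchored pattern is an assumption added only for contrast:

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

// Outside a class, ^ anchors the match to the start of the subject,
// so only the leading <b> can match.
preg_match_all('/^<[^>]+>/', $html, $anchored);

// As the first character inside a class, ^ negates it:
// [^>] means "any character except >".
preg_match_all('/<[^>]+>/', $html, $tags);

print_r($anchored[0]);
print_r($tags[0]);
```

The anchored version finds one tag and the unanchored version finds all five, including the <a ...> and <img ... /> tags with attributes.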


Got it now, thanks Effigy. Do you know of a good and full reference for PHP REGEX?

 

D.

Not PHP specifically; however, PHP uses PCRE (Perl Compatible Regular Expressions) which are practically the standard. The PHP docs do cover these, but not well in my opinion. I prefer http://www.regular-expressions.info.

 

Thanks Effigy.

 

D.

Archived

This topic is now archived and is closed to further replies.
