Hi folks,

 

Sorry if this is a newbie question. I am trying to understand how preg_match, preg_match_all, and preg_grep work, and the examples on php.net are not enough for me. What are the main differences between these functions? For example, given $html below, which function should I use to find all HTML tags (<b>, </b>, <img ... />)? Also, if I want to replace the img tag (<img ... />), is str_replace the best function to use?

$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

 

Thanks a bunch,

 

D.

From php.net:

 

preg_match — Perform a regular expression match

preg_match_all — Perform a global regular expression match

preg_grep — Return array entries that match the pattern

 

$html is not an array, so you would not use preg_grep. You want to find all occurrences, so you would use preg_match_all since it indicates that it is global.
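To make the difference concrete, here is a minimal sketch of the three functions side by side. The pattern and the $lines array are made up for this illustration and are not taken from the manual.

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";
$pattern = '/<[^>]+>/'; // "<", then one or more non-">" characters, then ">"

// preg_match stops after the first match and returns 1 or 0.
$found = preg_match($pattern, $html, $first);

// preg_match_all keeps scanning and returns the number of matches.
$count = preg_match_all($pattern, $html, $all);

// preg_grep filters an array and returns the entries that match,
// keeping their original keys.
$lines = array('plain text', '<b>bold</b>', 'more text');
$withTags = preg_grep($pattern, $lines);

print_r($first);    // only the first tag
print_r($all[0]);   // every tag, in order
print_r($withTags); // only the array entry that contains a tag
```

With the default flags, $all[0] holds the full matches, which is usually the part you want when collecting tags.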

 

Use str_replace if the string is static; if not, use preg_replace.
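As a rough sketch of that rule applied to the img tag from the question (the '[image]' replacement text and the img pattern are assumptions chosen for the example):

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

// str_replace works when the exact text to replace is known in advance.
$static = str_replace("<img src='test.jpg' />", '[image]', $html);

// preg_replace works when only the shape is known, e.g. any <img ...> tag.
// The i modifier makes the match case-insensitive.
$dynamic = preg_replace('/<img\b[^>]*>/i', '[image]', $html);

echo $static, "\n";
echo $dynamic, "\n";
```

Both calls give the same result here, but only the preg_replace version would still work if the src attribute changed.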

http://www.php.net/preg-match

 

http://us2.php.net/preg-match-all

 

http://us3.php.net/preg-grep

 

They all have self-explanatory descriptions. I hate to be a jerk, but do your research. :\

 

I had already read all of the above, and more, before asking here (as I said, the examples there are not enough for me), and I think I have a general idea of how they work. Thanks for your input anyway :).

Thanks Effigy, that is why I am playing with preg_match_all now. I was just hoping to learn from your experience. I have to admit that PHP regex really drives me crazy.

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html>click me</a>";

preg_match_all("/<[\/*]\w>/", $html, $matches, PREG_SET_ORDER);
?>

The [\/*] means zero or more \/, so both <b> and </b> should match. But when I ran the code, the output was just </b>, </c>, </a>. What am I doing wrong here?

Got it now. To pull out everything of the form <...>:

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html><img src='test.jpg' />click me</a>";

preg_match_all("/<[^>]*>/", $html, $matches, PREG_PATTERN_ORDER);

print_r($matches);
?>

[^>] means there is no additional > inside the match. Why does this page (http://php-regex.blogspot.com/2008/01/php-regex-cheat-sheet.html) say that ^ is the start of the subject? Can anybody correct me?

 

In order to pull out the img tag only (which is what I am trying to do)

<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html><img src='test.jpg' />click me</a>";

preg_match_all("/<[^>]*\/>/", $html, $matches, PREG_PATTERN_ORDER);

print_r($matches);
?>

 

Does anybody know which regex reference I should use for preg_match? The page above does not seem to work for me (or maybe I am just missing something?).

 

D.

My mistake: the [\/*] in my first pattern should read [\/]*. Now I think I understand a little more about regex :).

the [\/*] means zero or more \/

 

A character class ([...]) matches any one item that it contains, and, since you have the quantifier inside of the class, this pattern will match / or * once.
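A small sketch of that behaviour, using the $html from the earlier post. The second pattern, with \/? for an optional slash, is one possible fix and not necessarily the only one.

```php
<?php
$html = "<b>bold text</b><c>bold text</c><a href=howdy.html>click me</a>";

// [\/*] is a class: exactly one character, either "/" or "*".
// A slash (or asterisk) is therefore required, so only closing tags match.
preg_match_all('/<[\/*]\w>/', $html, $classMatches);

// \/? makes the slash optional, so opening tags match too.
// \w is a single word character, so <a href=howdy.html> is still skipped.
preg_match_all('/<\/?\w>/', $html, $optionalMatches);

print_r($classMatches[0]);    // </b>, </c>, </a>
print_r($optionalMatches[0]); // <b>, </b>, <c>, </c>, </a>
```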

 

Why this page (http://php-regex.blogspot.com/2008/01/php-regex-cheat-sheet.html) says that ^ is start of the subject?

 

Because character classes have their own set of metacharacters. Outside of a character class ^ serves as an anchor, inside it serves as negation.

 

If you want to match all beginning and end tags, /<[^>]+>/ will suffice.
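To see both roles of ^ in one place, here is a short sketch against the $html from the original question (the variable names are just for illustration):

```php
<?php
$html = "<b>bold text</b><a href=howdy.html><img src='test.jpg' />click me</a>";

// Outside a class, ^ anchors the match to the start of the subject.
$atStart = preg_match('/^<b>/', $html);   // 1: the string starts with <b>
$notAtStart = preg_match('/^<a/', $html); // 0: <a occurs, but not at the start

// Inside a class, a leading ^ negates it: [^>] is "any character except >".
preg_match_all('/<[^>]+>/', $html, $tags);

print_r($tags[0]); // all five tags, opening, closing, and self-closing
```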

Got it now, thanks Effigy. Do you know of a good, complete reference for PHP regex?

 

D.

Not PHP specifically; however, PHP uses PCRE (Perl Compatible Regular Expressions), which is practically the standard. The PHP docs do cover it, but not well in my opinion. I prefer http://www.regular-expressions.info.

 

Thanks Effigy.

 

D.
