
preg_match_all() string markup?


Fog Juice


Hey all,

 

I'm wondering if someone can point me to a good resource that will help me understand preg_match_all(). What I'm confused about is the string markup part and how it finds whatever string I'm looking for. For example, if I used preg_match_all('/<a([^>]+)\>(.*?)\<\/a\>/i', $this->markup, $links); that is supposed to give me <a> and </a>, but I'm not exactly sure what all of the symbols mean. Does anyone have a resource to help me learn about this markup?

 

 

Thank you.


All of those symbols are regular expression (regex) syntax.

QUANTIFIERS

* + ? {s,n}

quantifiers say how many times the thing right before them may occur

 

* is 0 or more

+ is 1 or more

? is 0 or 1

{s,n} is at least s and at most n

 

for example: /o{1,2}/

that regex matches 'o' or 'oo'

because it's looking for 'o' to occur 1 to 2 times (with no anchors it will also find those inside a longer string)
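To see the four quantifiers side by side, here is a small sketch. The helper name matching_subjects and the sample strings are made up for illustration; the ^ and $ anchors are added so each pattern has to cover the whole string, which the bare /o{1,2}/ above does not require.

```php
<?php
// Hypothetical helper: return the subjects in which the pattern finds a match.
function matching_subjects(string $pattern, array $subjects): array
{
    $matched = [];
    foreach ($subjects as $subject) {
        if (preg_match($pattern, $subject) === 1) {
            $matched[] = $subject;
        }
    }
    return $matched;
}

$subjects = ['', 'o', 'oo', 'ooo'];
$patterns = ['/^o*$/', '/^o+$/', '/^o?$/', '/^o{1,2}$/'];

foreach ($patterns as $pattern) {
    // Print each pattern with the sample strings it accepts.
    echo $pattern, ' => ', json_encode(matching_subjects($pattern, $subjects)), PHP_EOL;
}

// Without anchors, /o{1,2}/ still finds matches inside 'ooo':
// first 'oo', then the remaining 'o'.
preg_match_all('/o{1,2}/', 'ooo', $found);
echo json_encode($found[0]), PHP_EOL;
```

Only /^o*$/ and /^o?$/ accept the empty string, since they are the two quantifiers that allow zero occurrences.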

 

. is a wildcard in regex; it matches any single character except a newline (unless you specify the 's' modifier, in which case it matches newlines too)

 

for example: /.{5}/

that regex matches any run of 5 characters (as long as none of them is a newline)

E.G. '12ggh', '11111', etc
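A quick way to check the newline behaviour is to run the same pattern with and without the 's' modifier. This is only a sketch; the sample strings and variable names are invented for the example.

```php
<?php
$text = "abc\ndef"; // hypothetical two-line sample: 3 chars, newline, 3 chars

// Without 's', '.' cannot cross the newline, so no run of 5 exists.
$plain = preg_match('/.{5}/', $text, $m1);

// With 's', '.' also matches "\n", so the first 5 characters match.
$dotAll = preg_match('/.{5}/s', $text, $m2);

// On a longer single-line string, only the first 5 characters are taken.
preg_match('/.{5}/', '12ggh-extra', $m3);

var_dump($plain, $dotAll, $m2, $m3);
```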

 

/<a([^>]+)\>(.*?)\<\/a\>/i

this regex first looks for a literal

<a

then it enters a capturing group that matches anything which ISN'T '>', 1 or more times. The group ends as soon as the engine reaches a '>' (precisely because the group only accepts characters that are NOT '>'), and the pattern then EXPECTS that '>' right after the group. The backslash in \> is optional, since '>' has no special meaning in a regex.

 

so now we're at the >. The regex engine has matched <a .......> so far and now enters another capturing group looking for anything (the '.' wildcard) 0 or more times. The ? after the * makes the quantifier lazy (non-greedy). By default .* is greedy: it grabs as much as it can, which pushes it to the end of the line, and then it backs up one character at a time until the rest of the pattern matches. With the ? appended it does the opposite: it starts with as little as possible and only takes one more character each time the rest of the pattern fails to match.

 

so .*? means match anything, but as little as possible. Basically, after each character it takes, it looks ahead to see whether it can move on in the pattern.
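The difference between .* and .*? is easiest to see on a string with two closing tags. A minimal sketch, with a made-up sample string and a simpler <b> pattern so only the quantifier changes:

```php
<?php
$html = '<b>one</b> and <b>two</b>'; // hypothetical sample

// Greedy: .* runs to the end, then backs up to the LAST </b>.
preg_match_all('/<b>(.*)<\/b>/', $html, $greedy);

// Lazy: .*? stops at the FIRST </b> it can reach, so each tag pair matches separately.
preg_match_all('/<b>(.*?)<\/b>/', $html, $lazy);

print_r($greedy[1]); // one element: 'one</b> and <b>two'
print_r($lazy[1]);   // two elements: 'one' and 'two'
```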

 

it's looking for </a> to exit the capturing group (because of the i modifier at the end, </A> counts too). Once it finds it, it stores everything before that '<' as the second captured group and completes the pattern. After a successful match the regex engine stores the result and then tries to find the next one, since preg_match_all is a global search.
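Putting it together, here is roughly what ends up in $links. This is a sketch: the $markup string is invented and stands in for $this->markup from the question, and it relies on the default result ordering (PREG_PATTERN_ORDER), where index 0 holds the full matches and 1 and 2 hold the two capturing groups.

```php
<?php
// Hypothetical sample markup standing in for $this->markup.
$markup = '<p><a href="/home">Home</a> | '
        . '<A HREF="/about" class="nav">About <b>us</b></A></p>';

$count = preg_match_all('/<a([^>]+)\>(.*?)\<\/a\>/i', $markup, $links);

echo $count, " links found", PHP_EOL;

// $links[0]: the full matches, from <a through </a>
// $links[1]: group 1, everything between "<a" and the first ">"
// $links[2]: group 2, everything between that ">" and "</a>"
foreach ($links[0] as $i => $whole) {
    echo $whole, PHP_EOL;
    echo '  attributes: ', $links[1][$i], PHP_EOL;
    echo '  link text:  ', $links[2][$i], PHP_EOL;
}
```

Two things to keep in mind with this pattern: [^>]+ needs at least one character, so a bare <a> with no attributes is skipped, and there is nothing after <a to stop it from also matching other tags whose names start with "a".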


or he coulda just googled regex..? and got the same results

I know. All I'm saying is you should have started your post by saying all those symbols are regex syntax before diving head first into the explanation, because when you use the word "regex" in the middle of an explanation, it's a bit confusing to people who don't know regex.
