preg_match


supermerc

Hey

 

I want to run a preg_match_all search on my document and extract two things, but I don't understand what to put in the pattern to make it extract them.

 

I had one that worked which was

 

$pattern = "/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i";

 

But this only gets me the image link; there's something else further along that I also need.

 

For example, the complete string looks something like

 

src="/images/potion_red.gif" style="border:0px;margin:0px;width:23px;height:24px;" ONMOUSEOVER="itempopup(event,'8986689')"

 

What I would want to extract from that is potion_red.gif AND 8986689

 

Please help me!


$str = <<<EOF
src="/images/potion_red.gif" style="border:0px;margin:0px;width:23px;height:24px;" ONMOUSEOVER="itempopup(event,'8986689')"
EOF;

// Group 1 captures the file name, group 2 the numeric id.
$pattern = '#src=[\'"][^\'"]+/(.+?\.(?:png|jpe?g|gif))[\'"].+?ONMOUSEOVER=[\'"]itempopup\(event,\'(\d+)\'\)[\'"]#i';
preg_match_all($pattern, $str, $matches);
// $matches[1] holds every file name and $matches[2] every id, in match order.
echo $matches[1][0] . ' - ' . $matches[2][0]; // potion_red.gif - 8986689

 

OK, here's the logic. Start off with src= then either ' or ". Now here's the trick: we want to stay within the src quotes; otherwise, if we aren't careful, we might end up matching a file name ending in png, gif, or jpeg further down the string (if one exists). We do this by following [\'"] with [^\'"]+. So at this point, it will match: /images/potion_red.gif

 

However, this is too much info. So after [^\'"]+, we require a /, then use a lazy quantifier to creep forward, matching everything up to (and including) a required dot followed by png, gif, or jpe?g, and finally the closing quote (be it ' or "). So while this causes some backtracking early on, it helps keep the match within the quotes for accuracy.
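To make the difference concrete, here is a minimal sketch comparing the pattern from the question with just the src portion of the pattern above. The input string is hypothetical: it adds a made-up title attribute that mentions a second image name, which is the situation the explanation warns about.

```php
<?php
// Hypothetical input: a second image name appears later on the same line.
$str = 'src="/images/potion_red.gif" title="see potion_blue.gif"';

// Pattern from the question: the greedy .* runs on to the LAST image extension.
$greedy = "/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i";
preg_match($greedy, $str, $m);
$greedyCapture = $m[1];

// Only the src portion of the pattern from the answer.
$scoped = '#src=[\'"][^\'"]+/(.+?\.(?:png|jpe?g|gif))[\'"]#i';
preg_match($scoped, $str, $m);
$scopedCapture = $m[1];

echo $greedyCapture, "\n"; // /images/potion_red.gif" title="see potion_blue.gif
echo $scopedCapture, "\n"; // potion_red.gif
```

One thing to be aware of: [^\'"]+ needs at least one character before the /, so a src with no directory part, or one whose only slash is the first character (such as "/potion_red.gif"), would not match this pattern as written.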

 

Next, I make the assumption that ONMOUSEOVER is on the same line as src, so I make the pattern match anything (other than a newline, as we are not using the s modifier after the closing delimiter) and be lazy about it: .+? until we reach ONMOUSEOVER. Unfortunately, we can't use the [^\'"]+ safeguard we had in place for src, because the ONMOUSEOVER value can contain both single and double quotes, so we just lazily match along to find the rest (capturing the digits we find).
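Since the question is about running preg_match_all over a whole document, here is a rough sketch of how the same pattern might be used to collect every file name and id pair. The two-line input is hypothetical (the second file name and id are made up), and the PREG_SET_ORDER flag is my choice for readability, not something the pattern requires. The last few lines illustrate the same-line assumption described above.

```php
<?php
// Hypothetical document with two items; the second file name and id are invented.
$html = <<<EOF
<img src="/images/potion_red.gif" style="border:0px;" ONMOUSEOVER="itempopup(event,'8986689')">
<img src="/images/sword_iron.png" style="border:0px;" ONMOUSEOVER="itempopup(event,'1234567')">
EOF;

$pattern = '#src=[\'"][^\'"]+/(.+?\.(?:png|jpe?g|gif))[\'"].+?ONMOUSEOVER=[\'"]itempopup\(event,\'(\d+)\'\)[\'"]#i';

// PREG_SET_ORDER groups the captures per match instead of per capture group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$items = [];
foreach ($matches as $match) {
    $items[] = ['file' => $match[1], 'id' => $match[2]];
}
print_r($items);

// Same-line assumption: without the s modifier, . does not match a newline,
// so an element whose attributes are split across lines is not found.
$split = "src=\"/images/potion_red.gif\"\nONMOUSEOVER=\"itempopup(event,'8986689')\"";
$splitCount = preg_match_all($pattern, $split, $unused);

// Appending s after the closing delimiter lets . cross the newline.
$splitCountDotAll = preg_match_all($pattern . 's', $split, $unused);

echo $splitCount, ' ', $splitCountDotAll, "\n"; // 0 1
```

If the s modifier is added, keep in mind that the lazy .+? can then run across several lines until it finds an ONMOUSEOVER, which may pair a src with the handler of a later element.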

 
