

Anidazen

Advanced Regex problem. First vs last match!


Hello.

When trying to run regular expressions, it seems sometimes the expression doesn't want to pick the first instance as it should.

Example would be: "/blah.*?([0-9]*)/is" on the following:
blah 25 blahblah blah 62 blah blah blahblah 33

In this example, it would return either 62 or 33, but not 25, which would match if there were nothing after it. This is incredibly frustrating and is of course breaking my script. How can I fix this? Why does this happen?



Thanks in advance.

Your regex did not return anything for me. How about this?

/blah\s*(\d+)/is

I'm surprised that expression returned any numbers at all for that example.

The reason it's not doing what you want is the asterisk after the number class: ([0-9]*). What your entire expression means is "find 'blah' followed by zero or more of anything without being greedy, followed by zero or more digits." But both of those "zero or more" quantifiers are satisfied by the zero-length position just beyond "blah", so the match ends right there with an empty capture. To the regular expression engine it's as if there were an invisible character between the "h" in "blah" and the space following it. This is analogous to the word boundary assertion (\b), which matches at the position between a word character and a non-word character.
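
To make the zero-length match concrete, here is a minimal sketch in Python. The thread doesn't say which language the script is written in, so Python's re module is an assumption here, with re.I | re.S standing in for the /is flags; for this pattern the quantifiers behave the same way as in PCRE-style engines. The variable names and the "+" variant are only for illustration and are not from the original posts.

```python
import re

text = "blah 25 blahblah blah 62 blah blah blahblah 33"

# Original pattern: the lazy .*? is happy with zero characters, and
# [0-9]* is happy with zero digits, so the match stops right after "blah".
original = re.compile(r"blah.*?([0-9]*)", re.I | re.S)
m = original.search(text)
print(repr(m.group(0)))  # the whole match
print(repr(m.group(1)))  # the captured "number"

# Illustrative variant: requiring at least one digit with + forces the
# lazy .*? to expand until it actually reaches a digit.
with_plus = re.compile(r"blah.*?([0-9]+)", re.I | re.S)
m_plus = with_plus.search(text)
print(repr(m_plus.group(1)))
```

With the asterisk, the capture group is empty because nothing forces the engine to move past the end of "blah". With the plus sign, the first capture is the first number in the string.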

Effigy's suggestion will work fine for you, unless you require that there be a space between "blah" and the trailing number, in which case you'd need to use "\s+" instead of "\s*".
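
The difference between "\s*" and "\s+" can be sketched the same way. Again, Python is an assumed stand-in for whatever language the original script uses, and the "blah25" test string is made up for illustration.

```python
import re

text = "blah 25 blahblah blah 62 blah blah blahblah 33"

# Suggested pattern: optional whitespace, then at least one digit.
optional_space = re.compile(r"blah\s*(\d+)", re.I | re.S)
first = optional_space.search(text).group(1)
all_numbers = optional_space.findall(text)
print(first, all_numbers)

# Stricter pattern: at least one whitespace character is required.
required_space = re.compile(r"blah\s+(\d+)", re.I | re.S)

# Hypothetical input with no space between "blah" and the number.
no_gap = "blah25"
print(optional_space.search(no_gap) is not None)  # \s* accepts it
print(required_space.search(no_gap) is not None)  # \s+ rejects it
```

On the example string from the first post, both patterns find the same numbers; they only differ on input where the number directly follows "blah".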
