
I'm VERY CLOSE to breaking point, REGEX


Solved by requinix


Hello

 

So I'm now approaching breaking point because of this regex. I cannot understand why such a simple thing will not work. My interpretation of making a subgroup optional is adding a question mark ? to the end of it, which would mean: if it's there, return the submatch in the match; if it's not there, don't worry.

 

What's actually happening is that if it's there, it's not returning the subgroup match in the whole match result, the same as if it's not there.

 

If the match exists, I want the subgroup returned in the overall match. It is returning the whole match, but the subgroup is not included.

 

Here is a simple example to see. I can't believe this isn't working. I have now spent over 5 hours trying every different combination on such a simple thing. It's not long before I smash the place up.

<?php

$string = 'I HAVE HAD ENOUGH OF THIS SHIT';

preg_match("/(I HAVE HAD ENOUGH).+(THIS SHIT)?/i",$string,$match);

echo '<pre>';
var_dump($match);
echo '</pre>';

?>
Edited by RuleBritannia
https://forums.phpfreaks.com/topic/283549-im-very-close-to-breaking-point-regex/

.+ is greedy. It will match as much as it possibly can before the engine even bothers to attempt matching anything else. As such, it will go all the way to the end of the string.

Now at the end, it will attempt to match the optional group. It can't, of course, because there's nothing left to match against, but it's optional so that's okay.
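To see this for yourself, here is a minimal sketch. It is not the original pattern exactly: I have wrapped .+ in its own capture group (an addition for illustration only, as are the names $subject and $m) so the dump shows what the greedy part swallowed.

```php
<?php
// Illustrative sketch: $subject and $m are hypothetical names, and the extra
// group around .+ is added only so we can inspect what it consumed.
$subject = 'I HAVE HAD ENOUGH OF THIS SHIT';

// Greedy .+ runs to the end of the string, leaving nothing for the optional group.
preg_match('/(I HAVE HAD ENOUGH)(.+)(THIS SHIT)?/i', $subject, $m);

var_dump($m[2]);        // string(13) " OF THIS SHIT"
var_dump(isset($m[3])); // bool(false): the optional group never took part
```

By default preg_match() leaves trailing groups that did not participate out of the result array, which is why the third group is simply missing rather than empty.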

 

So changing .+ to .+? would make it lazy, but the result is still exactly the same.

  • Solution

No, the result is not exactly the same:

array(2) {
  [0]=>
  string(30) "I HAVE HAD ENOUGH OF THIS SHIT"
  [1]=>
  string(17) "I HAVE HAD ENOUGH"
}
became

array(2) {
  [0]=>
  string(18) "I HAVE HAD ENOUGH "
  [1]=>
  string(17) "I HAVE HAD ENOUGH"
}
Ungreediness means matching as little as possible. Since you used .+? and not .*?, all it has to do is match a single character. It does (a space), then tries the optional group, fails to match (which doesn't matter since it's optional), and ends.

 

You need more. Perhaps you are trying to match the entire string? You need a $ anchor to make sure it matches against the entire string. (A ^ would be a good idea too.)

preg_match("/^(I HAVE HAD ENOUGH).+?(THIS SHIT)?$/i",$string,$match);
array(3) {
  [0]=>
  string(30) "I HAVE HAD ENOUGH OF THIS SHIT"
  [1]=>
  string(17) "I HAVE HAD ENOUGH"
  [2]=>
  string(9) "THIS SHIT"
}
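As a rough check that the anchored pattern behaves as an optional group should, the sketch below runs it against one subject that ends with the phrase and one that does not. The helper name matchOptionalTail and the second test string are made up for illustration; only the pattern comes from the post above.

```php
<?php
// Hypothetical helper wrapping the anchored pattern from the post above.
function matchOptionalTail(string $subject): array
{
    // ^ and $ force the match to span the whole subject, so the lazy .+?
    // has to keep expanding until the optional group (or the end) fits.
    preg_match('/^(I HAVE HAD ENOUGH).+?(THIS SHIT)?$/i', $subject, $m);
    return $m;
}

$with    = matchOptionalTail('I HAVE HAD ENOUGH OF THIS SHIT');
$without = matchOptionalTail('I HAVE HAD ENOUGH OF IT');

var_dump($with[2] ?? null);    // string(9) "THIS SHIT"
var_dump($without[2] ?? null); // NULL: the group is optional, the match still succeeds
var_dump($without[0]);         // string(23) "I HAVE HAD ENOUGH OF IT"
```

One thing to keep in mind: because .+? must consume at least one character, a subject that is exactly 'I HAVE HAD ENOUGH' with nothing after it will not match this pattern at all.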

Hello

 

Your explanation is good. But if lazy = as ungreedy as possible, and .+ is 1 or more (lazy = 1), doesn't that mean .* being 0 or more (lazy = 0) wouldn't be right either?

So it seems the problem is the beginning and end anchors?

 

It now seems to work. Thanks a lot for your help. I have used a lot of regex without ^ and $ as start/finish, so I guess something like this was destined to happen.

if lazy = as ungreedy as possible, and .+ is 1 or more (lazy = 1), doesn't that mean .* being 0 or more (lazy = 0) wouldn't be right either?

So it seems the problem is the beginning and end anchors?

What?

.*  - as much as possible but does not have to match anything
.+  - as much as possible and has to match something
.*? - as little as possible and does not have to match anything
.+? - as little as possible but has to match something
Interestingly enough, compare the usage of the words "and" and "but".
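The four cases in that list can be tried side by side. This is only a small sketch using a instead of . and made-up subjects ('aaa' and 'bbb') so the difference is easy to see; the variable names are illustrative.

```php
<?php
// Each quantifier applied to the same subject; names here are illustrative.
$quantifiers = ['a*', 'a+', 'a*?', 'a+?'];
$results = [];

foreach ($quantifiers as $q) {
    preg_match('/' . $q . '/', 'aaa', $m);
    $results[$q] = $m[0]; // the text the quantifier actually consumed
}

var_dump($results);
// a*  => "aaa"  as much as possible
// a+  => "aaa"  as much as possible
// a*? => ""     as little as possible, and nothing is allowed
// a+? => "a"    as little as possible, but at least one character

// "Does not have to match anything" vs "has to match something":
$starOnNoA = preg_match('/a*/', 'bbb'); // 1: matches the empty string
$plusOnNoA = preg_match('/a+/', 'bbb'); // 0: no a to match, so it fails
var_dump($starOnNoA, $plusOnNoA);
```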

 

Yes, this is exactly how I understood it after your first post, but my later explanation was incorrect: when I said "0 or more", I should have written "nothing and more".

 

It also seems that, while using the start and end anchors works on this test question, it doesn't work on my overall problem -_-. I think I'm going to bed. Thanks for your help, I appreciate it.

Okay well then maybe you should post what your actual problem is. "Asking questions 101 (circa 1969): State the actual, complete problem, not some subset or tangent of the problem. 'Dumbing it down' rarely works out. If you had the ability to accurately 'dumb it down,' you likely wouldn't be stuck with trying to fix the problem in the first place!"
