RuleBritannia, posted November 3, 2013 (edited)

Hello,

I'm approaching breaking point because of this regex. I cannot understand why such a simple thing will not work. My interpretation of making a subgroup optional is adding a question mark ? to the end of it, which would mean: if it's there, return the sub-match in the match; if it's not there, don't worry.

What's actually happening is that the subgroup is never returned in the match result, whether it's there or not. If the text exists, I want the subgroup returned in the overall match. The whole match is being returned, but the subgroup is not included.

Here is a simple example. I can't believe this isn't working. I have now spent over 5 hours trying every different combination on such a simple thing, and it's not long before I smash the place up.

    <?php
    $string = 'I HAVE HAD ENOUGH OF THIS SHIT';
    preg_match("/(I HAVE HAD ENOUGH).+(THIS SHIT)?/i", $string, $match);
    echo '<pre>';
    var_dump($match);
    echo '</pre>';
    ?>

Edited November 3, 2013 by RuleBritannia
requinix, posted November 3, 2013

.+ is greedy. It will match as much as it possibly can before the engine even bothers to attempt matching anything else. As such, it will go all the way to the end of the string.

Now at the end, it will attempt to match the optional group. It can't, of course, because there's nothing left to match against, but it's optional so that's okay.
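A minimal sketch of the greedy behaviour described above, reusing the subject string and pattern from the first post. The variable name $greedy is an assumption made for illustration, not something from the thread.

```php
<?php
// Subject string and pattern taken from the first post.
$string = 'I HAVE HAD ENOUGH OF THIS SHIT';

// Greedy .+ consumes everything up to the end of the string, so the
// optional group finds nothing left to match and is simply skipped.
preg_match('/(I HAVE HAD ENOUGH).+(THIS SHIT)?/i', $string, $greedy);

var_dump($greedy[0]);         // the overall match
var_dump(count($greedy));     // how many entries preg_match filled in
var_dump(isset($greedy[2]));  // whether the optional group captured anything
```

The match still succeeds, which is why nothing looks wrong at first glance: the only symptom is that index 2 is missing from the result array.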
RuleBritannia (author), posted November 3, 2013

requinix wrote: ".+ is greedy. It will match as much as it possibly can before the engine even bothers to attempt matching anything else. As such, it will go all the way to the end of the string. Now at the end, it will attempt to match the optional group. It can't, of course, because there's nothing left to match against, but it's optional so that's okay."

So changing .+ to .+? would make it lazy, but the result is still exactly the same.
requinix, posted November 3, 2013 (marked as solution)

No, the result is not exactly the same:

    array(2) {
      [0]=>
      string(30) "I HAVE HAD ENOUGH OF THIS SHIT"
      [1]=>
      string(17) "I HAVE HAD ENOUGH"
    }

became

    array(2) {
      [0]=>
      string(18) "I HAVE HAD ENOUGH "
      [1]=>
      string(17) "I HAVE HAD ENOUGH"
    }

Ungreediness means matching as little as possible. Since you used .+? and not .*?, all it has to do is match a single character. It does (a space), then tries the optional group, fails to match (which doesn't matter since it's optional), and ends.

You need more. Perhaps you are trying to match the entire string? You need a $ anchor to make sure it matches against the entire string. (A ^ would be a good idea too.) With the $ in place, the lazy .+? has to keep extending until the rest of the pattern can reach the end of the string, and that is what gives the optional group a chance to match.

    preg_match("/^(I HAVE HAD ENOUGH).+?(THIS SHIT)?$/i",$string,$match);

    array(3) {
      [0]=>
      string(30) "I HAVE HAD ENOUGH OF THIS SHIT"
      [1]=>
      string(17) "I HAVE HAD ENOUGH"
      [2]=>
      string(9) "THIS SHIT"
    }
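The difference between the lazy pattern with and without anchors can be sketched as follows. The patterns and the first subject string come from the thread; the second subject ($other) and the variable names are hypothetical, added only to show that the group really stays optional once the anchors are in place.

```php
<?php
$string = 'I HAVE HAD ENOUGH OF THIS SHIT';

// Lazy and unanchored: .+? is satisfied with one character (the space),
// the optional group fails at "OF", and the match stops there.
preg_match('/(I HAVE HAD ENOUGH).+?(THIS SHIT)?/i', $string, $lazy);

// Lazy and anchored: $ cannot match until the end of the string, so the
// engine keeps extending .+? until the optional group can take over.
preg_match('/^(I HAVE HAD ENOUGH).+?(THIS SHIT)?$/i', $string, $anchored);

// Hypothetical subject without the optional phrase: the anchored pattern
// still matches, and the group is left out of the result.
$other = 'I HAVE HAD ENOUGH OF IT';
preg_match('/^(I HAVE HAD ENOUGH).+?(THIS SHIT)?$/i', $other, $without);

var_dump($lazy[0]);
var_dump($anchored[2] ?? null);
var_dump(count($without));
```

Note that the anchors only help when the optional phrase sits at the very end of the subject; text following it would need a different pattern.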
RuleBritannia (author), posted November 3, 2013

requinix wrote: "Ungreediness means matching as little as possible. Since you used .+? and not .*?, all it has to do is match a single character. [...] You need a $ anchor to make sure it matches against the entire string. (A ^ would be a good idea too.)"

Hello,

Your explanation is good. But if lazy = as ungreedy as possible, and .+ is 1 or more (lazy = 1), doesn't that mean .* being 0 or more (lazy = 0) wouldn't be right either? So it seems the problem is the beginning and end anchors?

It now seems to work. Thanks A LOT for your help. I have used a lot of regex without ^ and $ as start/finish, so I guess something like this was destined to happen.
requinix, posted November 3, 2013 (edited)

RuleBritannia wrote: "if lazy = ungreedy as possible, and .+ is 1 or more(lazy = 1), doesnt that mean .* being 0 or more (lazy = 0), that wouldnt be right either. So it seems the problem is the beginning and end delimiters?"

What?

.* - as much as possible but does not have to match anything
.+ - as much as possible and has to match something
.*? - as little as possible and does not have to match anything
.+? - as little as possible but has to match something

Interestingly enough, compare the usage of the words "and" and "but".

Edited November 3, 2013 by requinix
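The four quantifier forms listed above can be compared side by side with a short sketch. The three-character subject 'abc' and the variable names are assumptions chosen only for illustration.

```php
<?php
// Hypothetical subject, chosen only to make the differences visible.
$subject = 'abc';

// Run each quantifier form on its own and record what it matched.
$results = [];
foreach (['.*', '.+', '.*?', '.+?'] as $quantified) {
    preg_match('/' . $quantified . '/', $subject, $m);
    $results[$quantified] = $m[0];
}
var_dump($results);

// On an empty subject, only the * forms can succeed, because + has to
// match at least one character.
$starOnEmpty = preg_match('/.*/', '', $m);
$plusOnEmpty = preg_match('/.+/', '', $m);
var_dump($starOnEmpty, $plusOnEmpty);
```

With nothing after the quantifier, the lazy forms stop as early as they are allowed to: .*? matches an empty string and .+? matches a single character. They only grow when a later part of the pattern, such as a $ anchor, forces them to.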
RuleBritannia (author), posted November 3, 2013 (edited)

requinix wrote: ".* - as much as possible but does not have to match anything; .+ - as much as possible and has to match something; .*? - as little as possible and does not have to match anything; .+? - as little as possible but has to match something."

Yes, this is exactly how I understood it after your first post, but my later explanation was incorrect when I said 0 or more. I should have written nothing and more.

It also seems that, while using the start and end anchors works on this test case, it doesn't work on my overall problem. I think I'm going to bed. Thanks for your help, I appreciate it.

Edited November 3, 2013 by RuleBritannia
.josh, posted November 3, 2013

RuleBritannia wrote: "It seems also that, whilst using start and end delimiters here, it works on this test question, but on my overall problem this doesnt work on , I think im going to bed, thanks for your help, I appreciate it."

Okay, well then maybe you should post what your actual problem is.

"Asking questions 101 (circa 1969): State the actual, complete problem, not some subset or tangent of the problem. 'Dumbing it down' rarely works out. If you had the ability to accurately 'dumb it down,' you likely wouldn't be stuck with trying to fix the problem in the first place!"