shane18 Posted September 24, 2009

Can someone explain to me in detail how this:

<?php
$CC = ".Username Reason is blah blah blah";
preg_match("/^([\.!#@\+^\$-_~]{1})(.+?)(.+?)$/", $CC, $CCM);
echo "<pre>";
print_r($CCM);
echo "</pre>";
?>

produces this outcome:

Array
(
    [0] => .Username Reason is blah blah blah
    [1] => .
    [2] => U
    [3] => sername Reason is blah blah blah
)

I know how to make this work the way I want it to, but that is not the question; I am trying to learn how the engine works inside and out. This is my last piece of the puzzle.

Link to comment https://forums.phpfreaks.com/topic/175314-solved-regex-engine-question/
Garethp Posted September 24, 2009

Ok, well, [0] will always be the entire string matched.

[1] is what was matched by the first bracket, which happened to be ([\.!#@\+^\$-_~]{1}). That means one character (the {1} means one and only one) from the class: . ! # @ + ^ or ~, or anything in the range from $ to _ (the hyphen between \$ and _ makes a range rather than three separate characters). In this case it was the dot.

[2] is what was matched by the second bracket, (.+?), which means: match any character (other than a newline), one or more times, but as few times as you can get away with.

[3] is the third bracket, which is the same as above.

Now, [2] matched only one character because there was another (.+?) right after it to hand the work to. Because it's lazy, it said "Well, it's your job to match now, I'm gonna sit down and have a cup of coffee", and passed the job on as soon as it could. [3] is lazy too, but the only thing after it is $, which matches only at the end of the string, so [3] had to keep taking characters until it had matched the rest.
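The lazy behaviour described above can be checked with a short sketch. It is only an illustration: the prefix class is simplified to a hypothetical [.!] so that the quantifiers are the only thing being compared, and the variable names are made up for this example.

```php
<?php
// Subject string from the original post.
$subject = ".Username Reason is blah blah blah";

// Assumption: [.!] is a simplified stand-in for the original prefix class.

// Lazy followed by lazy: group 2 stops after one character,
// group 3 has to expand until '$' can match at the end.
preg_match('/^([.!])(.+?)(.+?)$/', $subject, $lazy);
// $lazy[2] is "U", $lazy[3] is "sername Reason is blah blah blah"

// Greedy followed by lazy: group 2 takes everything, then gives back
// exactly one character so that group 3 can match at all.
preg_match('/^([.!])(.+)(.+?)$/', $subject, $greedy);
// $greedy[2] is "Username Reason is blah blah bla", $greedy[3] is "h"

// Lazy up to an explicit delimiter: group 2 expands only until the first space.
preg_match('/^([.!])(.+?) (.+)$/', $subject, $split);
// $split[2] is "Username", $split[3] is "Reason is blah blah blah"

print_r($lazy);
print_r($greedy);
print_r($split);
```

The third pattern is one possible way to get the split the original poster probably wanted; giving the lazy group something concrete to stop at (here a space) is what makes it useful.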
nrg_alpha Posted September 24, 2009

Shane18,

To further expand on the explanation, I advise you to have a look at this thread, which explains .+ and .+? (in particular, read posts #11 and #14).

Also note that in your pattern you used {1} (called an interval) after the character class (character class = [...] notation). This is not necessary, as a character class already matches exactly one character: [abc] checks for an a, b or c at the current location in the source string, just as [abc]{1} does. Intervals are more useful for things like {1,} (minimum one, no maximum; equivalent to the + quantifier) or {2,7} (minimum 2, maximum 7). Simply using {1} is pointless, as whatever precedes it is matched exactly once by default, so the pattern #sle{1}pt# is the same as #slept#; in both cases a single 'e' is understood automatically.

As well, with regards to character classes, it is important to understand that most meta characters lose their special meaning within a character class. (Meta characters are characters that have special meanings; the dot, for example, matches any single character other than a newline by default.) Some meta characters retain their special meaning depending on their location within the class, but the dot is not one of them: for a literal dot in a character class you don't need to escape it, and its position doesn't matter.

Notice, however, the location of your hyphen (-) in the class; this is where location becomes crucial. If you want a literal hyphen, list it as the very first or very last character in the character class, otherwise you are creating a range instead.
So in your case, you have \$-_, which creates a range from the dollar sign to the underscore (ASCII 0x24 to 0x5F). That range includes the digits, the uppercase letters and a good deal of punctuation, so the class matches far more than intended (much like [a-z] matches everything from a through z). Relocate the hyphen to the start or end of the class; with no characters on both sides of it, the regex engine cannot read it as a range and treats it as a literal.
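To make the two points from this post concrete, here is a small sketch. The "fixed" class is one hypothetical way to rewrite the original (hyphen moved to the end, unneeded escapes dropped), and the test characters and variable names are chosen only for illustration.

```php
<?php
// Class as written in the original pattern: \$-_ is a range from '$' (0x24) to '_' (0x5F).
$asWritten = '/^[\.!#@\+^\$-_~]$/';

// Hypothetical corrected class: the hyphen sits at the end, so it is a literal.
// Inside a class the dot, plus and dollar sign need no escaping.
$fixed = '/^[.!#@+^$_~-]$/';

$results = [];
foreach (['.', '-', '7', 'A', '?', 'a'] as $char) {
    // Each entry holds [match with original class, match with fixed class].
    $results[$char] = [preg_match($asWritten, $char), preg_match($fixed, $char)];
    printf("%s  as written: %d  fixed: %d\n", $char, $results[$char][0], $results[$char][1]);
}

// The {1} interval adds nothing: both patterns accept the same string.
$withInterval = preg_match('/^sle{1}pt$/', 'slept');
$withoutInterval = preg_match('/^slept$/', 'slept');
printf("with {1}: %d  without: %d\n", $withInterval, $withoutInterval);
```

With the class as written, '7', 'A' and '?' all match because they fall inside the accidental range; with the hyphen at the end, only the listed characters match.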