Mycotheologist Posted July 16, 2012

I was just wondering why (.*?) works the way it does. For example, if I do preg_match('/WORD1(.*?)WORD2/', $subject); it will match anything in between WORD1 and WORD2. I know that . is a wildcard, so it matches any character. From what I've read, * matches zero or more of the preceding character, so that makes it match whole strings rather than a single character. It's the ? that confuses me: from what I've read, ? matches zero or one of the preceding character. What purpose does that serve then? What would happen if I omitted the ?
ignace Posted July 16, 2012

The question mark after * makes it ungreedy. If you leave it out, the pattern will match everything between the first WORD1 and the last WORD2. If you have multiple occurrences of WORD2, the ungreedy version only matches up to the first WORD2.

preg_match('!WORD1(.*?)WORD2!', 'WORD1foobar stands for FTP Operation Over Big Address Records.WORD2Explanation of these and more acronyms can be found atWORD2', $matches);
print_r($matches); // ungreedy (foobar stands for FTP Operation Over Big Address Records.)

preg_match('!WORD1(.*)WORD2!', 'WORD1foobar stands for FTP Operation Over Big Address Records.WORD2Explanation of these and more acronyms can be found atWORD2', $matches);
print_r($matches); // greedy (foobar stands for FTP Operation Over Big Address Records.WORD2Explanation of these and more acronyms can be found at)
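The "zero or one" reading of ? from the question is also correct; which meaning applies depends on what comes directly before the ?. After a character or group it is a quantifier (optional), while after another quantifier such as * it switches that quantifier to ungreedy. A minimal sketch of both roles, with made-up sample strings and variable names chosen only for illustration:

```php
<?php
// Role 1: "?" as a quantifier. The preceding "u" may appear zero or one time.
$optional = [];
foreach (['color', 'colour', 'colouur'] as $word) {
    $optional[$word] = preg_match('/^colou?r$/', $word);
}
print_r($optional); // color => 1, colour => 1, colouur => 0

// Role 2: "?" directly after "*". It no longer means "optional";
// it makes the star take as few characters as possible.
$subject = 'WORD1 a WORD2 b WORD2';
preg_match('/WORD1(.*)WORD2/', $subject, $greedy);
preg_match('/WORD1(.*?)WORD2/', $subject, $lazy);

var_export($greedy[1]); // ' a WORD2 b '
echo "\n";
var_export($lazy[1]);   // ' a '
echo "\n";
```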
silkfire Posted July 16, 2012

Mycotheologist, let me explain what ungreedy means. A ? placed right after a quantifier (normally as in .*?) means a non-greedy match.

If this is your text:

WORD1 wohhahah WORD2 tatata WORD2

And you omit the ?, the .* in preg_match will "eat up" (greedy) all characters from WORD1 to the last WORD2 found (matched part in brackets):

[WORD1 wohhahah WORD2 tatata WORD2]

If you have the ?, it will be non-greedy ("nice") and only eat up characters until the first collision with WORD2:

[WORD1 wohhahah WORD2] tatata WORD2

Does that make sense?
ragax Posted July 29, 2012

I find it least confusing to explain what you are telling the regex engine to do.

Dot-star (.*) tells the regex engine: "Match any character, zero or more times, as many times as possible". The dot-star will bulldoze its way to the end of the subject (or of the current line, since by default the dot does not match a newline). Then, if needed to allow a match, it will backtrack, one character at a time.

Dot-star-question-mark (.*?) tells the regex engine: "Match any character, zero or more times, as few times as possible". The engine will start out by matching zero characters. Then, because it cannot return a match (since "WORD2" has not been found), it will match one more character, then one more, and so on.

For more details, you may like to check out my tut about the degrees of regex greed, and Jan's page on repetition.

This is a very cool but crucial concept to grasp, so please don't hesitate to ask for clarifications.
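The two strategies described above can be imitated by hand. The sketch below is only a simulation under stated assumptions, not how PCRE is actually implemented: simulateDotStar() is a hypothetical helper, and it assumes the subject starts with WORD1 and contains no newlines. It tries capture lengths in ascending order for the lazy case and descending order (backtracking) for the greedy case, then compares the result with what preg_match returns.

```php
<?php
// Hypothetical helper: pick the length of (.*) or (.*?) in WORD1(...)WORD2
// by trying candidate lengths in the order the quantifier would prefer.
function simulateDotStar(string $subject, bool $lazy): ?string
{
    $start = strlen('WORD1');          // position right after WORD1
    $max = strlen($subject) - $start;  // the most characters .* could take
    // Lazy: 0, 1, 2, ... characters. Greedy: max, max-1, ... (backtracking).
    $lengths = $lazy ? range(0, $max) : range($max, 0);
    foreach ($lengths as $len) {
        // Does WORD2 follow immediately after the characters taken so far?
        if (substr($subject, $start + $len, strlen('WORD2')) === 'WORD2') {
            return substr($subject, $start, $len);
        }
    }
    return null; // no candidate length allows the overall match
}

$subject = 'WORD1 wohhahah WORD2 tatata WORD2';
$simLazy = simulateDotStar($subject, true);
$simGreedy = simulateDotStar($subject, false);

// Compare with the real engine (anchored at the start, like the simulation).
preg_match('/^WORD1(.*?)WORD2/', $subject, $realLazy);
preg_match('/^WORD1(.*)WORD2/', $subject, $realGreedy);

var_dump($simLazy === $realLazy[1]);     // bool(true)
var_dump($simGreedy === $realGreedy[1]); // bool(true)
```

Both strategies look at the same candidate lengths; the only difference is the order in which they are tried, which is why the lazy version stops at the first WORD2 and the greedy one at the last.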
Berre Posted August 2, 2012

<a href="website.com">Click</a> or just visit <a href="example.com">my example page</a>

Let's say you want to retrieve all the a tags, in this example 2. Each match is listed on its own line below. The greedy version (without ?) will match the entire string, because it doesn't stop at the first place where it could end. The second version (with ?) stops matching as early as it can, and will therefore make two matches.

Example 1:
Regex: /<a.*>.*<\/a>/
Matches (1, the entire string):
<a href="website.com">Click</a> or just visit <a href="example.com">my example page</a>

Example 2:
Regex: /<a.*?>.*?<\/a>/
Matches (2):
<a href="website.com">Click</a>
<a href="example.com">my example page</a>
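The two examples can be checked with preg_match_all, which returns the number of full matches it found. This is a small sketch using the same sample string and patterns; the variable names are chosen here only for illustration.

```php
<?php
$html = '<a href="website.com">Click</a> or just visit <a href="example.com">my example page</a>';

// Example 1: greedy. One match that spans the entire string.
$greedyCount = preg_match_all('/<a.*>.*<\/a>/', $html, $greedyMatches);

// Example 2: ungreedy. One match per a tag.
$lazyCount = preg_match_all('/<a.*?>.*?<\/a>/', $html, $lazyMatches);

echo $greedyCount, "\n"; // 1
echo $lazyCount, "\n";   // 2
print_r($lazyMatches[0]); // the two a tags, in order
```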