sKunKbad Posted May 21, 2016

I'm on Ubuntu 16.04 with PHP7, and I have not encountered this problem in other environments. The following script fails (white screen of death) unless I subtract a character from $string. What is going on?

<?php
$string = "# MAKE SURE TO LEAVE THE NEXT TWO LINES HERE. # BEGIN DENY LIST -- # END DENY LIST -- asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd fsdfsdfsdfsdf asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd fsdfsdfsdfsdf asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd fsdfsdfsdfsdf asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd asdasdasdasdasd fsdfsdfsdfsdf asdasdasdasdasd asdasdasdasdasd ass";
$insert = 'Whatever';
$pattern = '/(?<=# BEGIN DENY LIST --)(.|\n)*(?=# END DENY LIST --)/';
// Within the string, replace the denial list with the new one
$string = preg_replace( $pattern, $insert, $string );
echo $string;
sKunKbad Posted May 21, 2016

I ended up coming up with a solution that uses explode and str_replace, but it relies on the DENY LIST comments being at the top of the file. I thought about using exec and sed, but I took the easy way out for now. Still curious as to what is up with preg_replace. It doesn't seem like it's very reliable if it can't handle a big string.
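One way to see what preg_replace() is doing in a case like this: it returns NULL when PCRE gives up, and echo prints NULL as an empty string, which would look like a blank page. The sketch below wraps the call and reports the error code from preg_last_error(). The helper name replace_or_explain() is hypothetical and not from the thread, and whether the original script actually hits one of these limits depends on the PHP and PCRE build.

```php
<?php
// Hypothetical helper (not from the thread): run preg_replace() and raise
// an exception naming the PCRE error instead of silently returning NULL.
function replace_or_explain(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);

    if ($result === null) {
        // Map the numeric code from preg_last_error() to its constant name.
        $names = [
            PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
            PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
            PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
            PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
            PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
            PREG_JIT_STACKLIMIT_ERROR  => 'PREG_JIT_STACKLIMIT_ERROR',
        ];
        $code = preg_last_error();
        throw new RuntimeException(
            'preg_replace failed: ' . ($names[$code] ?? "code $code")
        );
    }

    return $result;
}
```

Calling replace_or_explain($pattern, $insert, $string) in place of the bare preg_replace() call would surface a limit error as an exception rather than an empty page.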
Jacques1 Posted May 22, 2016 (edited)

You're using the worst possible regex for the input, so it's only natural that your script blows up.

Since you're using a greedy quantifier in the middle part, the entire input after “BEGIN DENY LIST” is consumed. Then the regex engine has to go all the way back to “END DENY LIST”, character by character, each time checking the lookahead. If you analyze the regex with a tool like RegexBuddy, you can actually see the excessive backtracking and the large number of required steps.

If the deny list is very small compared to the part after the “END DENY LIST”, try a nongreedy quantifier (like “*?”). Or simply use strpos() and strrpos(). Regular expressions aren't the solution to everything.
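The strpos()/strrpos() suggestion could look roughly like the sketch below. The helper name replace_between() and the sample input are made up for illustration. Using strrpos() for the end marker mirrors the greedy regex, which also stops at the last “END DENY LIST”.

```php
<?php
// Hypothetical helper: replace everything between two markers without regex.
function replace_between(string $text, string $begin, string $end, string $insert): string
{
    $start = strpos($text, $begin);
    if ($start === false) {
        return $text; // begin marker missing: leave the text unchanged
    }
    $start += strlen($begin); // first byte after the begin marker

    // strrpos() finds the LAST occurrence of the end marker.
    $stop = strrpos($text, $end);
    if ($stop === false || $stop < $start) {
        return $text; // end marker missing or located before the begin marker
    }

    // Swap the bytes between the markers for the new content.
    return substr_replace($text, $insert, $start, $stop - $start);
}

// Made-up sample input for illustration.
$text = "# BEGIN DENY LIST --\ndeny from 1.2.3.4\n# END DENY LIST --\nrest of file";
$replaced = replace_between($text, '# BEGIN DENY LIST --', '# END DENY LIST --', 'Whatever');
```

This does a fixed amount of string searching regardless of how long the text after the markers is, so there is nothing to backtrack over.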
sKunKbad Posted May 22, 2016

Everything before "END DENY LIST --" ends up getting tossed out and dynamically rebuilt, so I just used explode:

$arr = explode('END DENY LIST --', $string);
$string = $new_deny_list . $arr[1];

I have a copy of RegexBuddy that's probably almost a decade old, and a book on regex, so I should probably go find them. In the interest of trying to understand what you're suggesting, I just found a site that does online regex analysis, https://regex101.com/

Do I understand correctly that the nongreedy quantifier would simply add a question mark after my asterisk, like this:

(?<=# BEGIN DENY LIST --)(.|\n)*?(?=# END DENY LIST --)

In the interest of learning, what would your regex look like if you had to use regex?
requinix Posted May 22, 2016

Ungreedy will help but it will still crash on large enough inputs (that is, a long enough distance between the BEGIN and END). The problem is you're putting the quantifier on a capturing group, and the PCRE library will smash the stack trying to remember everything.

1. If you don't need to capture what's in there, don't use a capturing group.
2. Alternating .|\n is like using the /s flag but worse.

'/(?
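A minimal sketch that applies the two points above, assuming the same markers as the original script: no capturing group, the /s flag so that "." also matches newlines, and a lazy quantifier. This is an illustration only and not necessarily the pattern from the post above. The sample input is made up.

```php
<?php
// Sketch: no group around the quantified part, /s lets "." match "\n",
// and "*?" stops at the first END marker instead of backtracking from the end.
$pattern = '/(?<=# BEGIN DENY LIST --).*?(?=# END DENY LIST --)/s';

// Made-up sample input for illustration.
$text = "# BEGIN DENY LIST --\ndeny from 1.2.3.4\ndeny from 5.6.7.8\n# END DENY LIST --\nrest of file";

// Limit of 1: only the first BEGIN/END pair is rewritten.
$result = preg_replace($pattern, "\nWhatever\n", $text, 1);

if ($result === null) {
    // preg_replace() returns NULL on a PCRE error; report the code
    // instead of echoing an empty string.
    throw new RuntimeException('preg_replace failed, code ' . preg_last_error());
}
```

Whether a pattern like this survives very large inputs still depends on the PCRE build and on limits such as pcre.backtrack_limit, so the NULL check is worth keeping.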