Cobra23 Posted September 9, 2018 (edited)

Hello,

Can you please help with 3 regexes I have? I am inexperienced with this, but they do seem to work fine. What I do not understand is whether they avoid a ReDOS attack, as I do not know how to test them.

<?php
preg_match("/^[A-Za-z0-9.\-\,\!\'\s\r?\n]{2,100}+$/", $mycontent);
preg_match("/^[A-Za-z0-9\\!\\@\\)\\-\\_\\#]{8,10}$/D", $mycontent);
preg_replace("/[^A-Za-z0-9\\<\\>\\.\\/\\,\\'\\;\\:\\&\\!\\%\\s]/", "", $mycontent);
?>

Are the two backslashes acceptable in this? Or is it designed wrong?

Edited September 9, 2018 by Cobra23

Link to comment: https://forums.phpfreaks.com/topic/307674-help-on-regexs-to-avoid-redos-attack/
requinix Posted September 10, 2018

<?php
preg_match('/^[A-Za-z0-9.\-,!\'\s\r?\n]{2,100}+$/', $mycontent);
preg_match('/^[A-Za-z0-9!@)\-_#]{8,10}$/D', $mycontent);
preg_replace('/[^A-Za-z0-9<>.\/,\';:&!%\s]/', "", $mycontent);
?>

That's a cleaner version. Backslashes are not like salt and pepper. They do not enhance the flavoring.

I can't figure out the context of these regexes. I need context to be able to say if there's anything wrong with them.
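To see why the extra backslashes were harmless rather than wrong, here is a minimal sketch comparing the second regex in both spellings. The sample strings are made up for illustration. In a double-quoted PHP string each `\\` collapses to a single backslash, and PCRE treats a backslash before a non-alphanumeric character as that literal character, so both patterns should accept and reject the same inputs.

```php
<?php
// Double-quoted: PHP collapses each "\\" to one backslash before PCRE runs.
$original = "/^[A-Za-z0-9\\!\\@\\)\\-\\_\\#]{8,10}$/D";
// Single-quoted, cleaned-up form from the reply above.
$cleaned  = '/^[A-Za-z0-9!@)\-_#]{8,10}$/D';

// Shows the pattern text that PCRE actually receives (single backslashes).
echo $original, PHP_EOL;

// Hypothetical sample inputs: two valid, one too short, one with a space.
$samples = ['abc123!@', 'pass_word#', 'short', 'has space1'];

foreach ($samples as $sample) {
    printf(
        "%-12s original=%d cleaned=%d\n",
        $sample,
        preg_match($original, $sample),
        preg_match($cleaned, $sample)
    );
}
```

The two columns should agree on every row. The cleaned form is simply easier to read and to audit.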
Cobra23 Posted September 10, 2018 (edited)

Thank you very much. I think I used double backslashes because it was crashing or not working due to some of the special characters, and I got carried away using the same thing on the others.

I can see that you have a single backslash before the 3 below:

/,-

Apart from the above 3 and:

\s \r \n \d

Are there any other special characters that require a backslash to avoid crashing? Or is there an online reference for this?

Edited September 10, 2018 by Cobra23
requinix Posted September 10, 2018

There's a backslash for hyphens because inside []s they mean a character range (like A-Z). The one on the forward slash is because that character is being used as the regex delimiter so the one inside must be escaped. The comma doesn't have one - look again.

Most characters in a [] set don't need to be escaped. Even ones that normally are special, like periods or parentheses. It's basically just the hyphen (special inside a [], not special outside), the delimiter (special everywhere), and ] when intended as a character (to avoid confusion with the one ending the set).

Going back to the original question, the problem happens with inefficient regexes that cause the engine to backtrack. Backtracking, and the overall matter of how the engine tries to make a match, is a complicated subject that I don't want to try to explain here, but that means it's harder to explain what you need to avoid. For the most part, repeating something that's being repeated itself is risky. The {2,100}+ may look like it's doing that, but actually it's a particular syntax used primarily to increase performance.

They look fine to me. Really, this "ReDOS" thing only happens with non-trivial regular expressions. Stuff more complicated than those three.
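The escaping rules described above can be checked with a small sketch. The pattern and inputs are invented for illustration and are not taken from the thread's three regexes. It escapes only the hyphen, the delimiter, and the closing bracket, leaves the usual metacharacters bare, and then shows what an unescaped hyphen does.

```php
<?php
// Inside [] only the hyphen, the delimiter, and ] need a backslash here.
// Period, parentheses, *, + and ? are all literal inside the set.
$set = '/^[a-z\-\/\].()*+?]+$/';

var_dump(preg_match($set, 'a-b/c].()*+?')); // int(1): every character is in the set
var_dump(preg_match($set, 'a b'));          // int(0): space is not in the set

// An unescaped hyphen between two characters forms a range.
// [+-9] covers ASCII 43 ('+') through 57 ('9'), which includes the comma.
var_dump(preg_match('/^[+-9]$/', ','));     // int(1)
var_dump(preg_match('/^[+\-9]$/', ','));    // int(0): only '+', '-' and '9'
```

By the same rule, the `?` in `\r?\n` inside the first regex's set is a literal question mark, not an "optional" marker, so that set also accepts `?` as a character.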
Cobra23 Posted September 10, 2018 (edited)

Thank you for the very clear explanation.

I can understand how preg_match would keep repeating as it looks for matches in long textareas, especially those used for messages or with content editors used for a summary section or bio. I don't know of a quicker solution than preg_match for validation, as I am filtering, sanitizing and using preg_replace before it.

As with lengths similar to or even much bigger than the {2,100}+, I am also using strlen before it, so I thought that having its min/max length in the preg_match as well would help performance (but I believe it's not required if I'm using strlen before it).

Is there something better than preg_match for long textareas like messages or content editors, as I wouldn't want it to become slow or stall?

Edited September 10, 2018 by Cobra23
requinix Posted September 10, 2018

...bold?

I'm confused. It sounds like you think I told you not to use regular expressions. Or that someone told you not to. That's wrong. They're fine to use as long as you're careful about how you write them.

If you write anything complicated, test it with very large inputs that should match and some that should not match. You'll probably find out real quick whether there's a problem.
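One way to follow that advice is a rough timing probe like the sketch below. `probe_regex()` is a hypothetical helper, not something from the thread, and the nested-quantifier pattern `/^(a+)+$/` is a textbook risky shape added only for contrast with the first regex. What the nested case reports (a `false` result with a non-zero `preg_last_error()` code, or just a longer time) is likely to depend on the PHP version and on settings such as `pcre.backtrack_limit` and `pcre.jit`, so treat the output as a signal and not as a benchmark.

```php
<?php
// Hypothetical helper (not from the thread): time one preg_match() call
// and report whether PCRE gave up with an error.
function probe_regex(string $pattern, string $subject): array
{
    $start  = microtime(true);
    $result = preg_match($pattern, $subject); // 1, 0, or false on error
    $ms     = (microtime(true) - $start) * 1000;

    return ['result' => $result, 'error' => preg_last_error(), 'ms' => $ms];
}

// First regex from the thread, in its cleaned-up form.
$simple = '/^[A-Za-z0-9.\-,!\'\s\r?\n]{2,100}+$/';
// Textbook nested-quantifier pattern, included only for contrast.
$nested = '/^(a+)+$/';

$cases = [
    'simple, should match'  => [$simple, str_repeat('a', 100)],
    'simple, far too long'  => [$simple, str_repeat('a', 1000000)],
    'simple, bad last char' => [$simple, str_repeat('a', 99) . '<'],
    'nested, bad last char' => [$nested, str_repeat('a', 25) . '!'],
];

foreach ($cases as $label => [$pattern, $subject]) {
    $r = probe_regex($pattern, $subject);
    printf(
        "%-22s result=%s error=%s time=%.3f ms\n",
        $label,
        var_export($r['result'], true),
        $r['error'] === PREG_NO_ERROR ? 'none' : (string) $r['error'],
        $r['ms']
    );
}
```

The three "simple" rows should come back almost instantly with no error, even for the million-character input, because the anchored set consumes at most 100 characters and then fails without retrying. If a pattern of your own behaves more like the "nested" row as the input grows, that is the sign to rewrite it.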