Cobra23

Help on regexes to avoid a ReDoS attack


Hello,

Can you please help with 3 regexes I have? I am inexperienced with this, but they do seem to work fine. What I do not understand is whether they avoid a ReDoS attack, as I do not know how to test them.

<?php
preg_match("/^[A-Za-z0-9.\-\,\!\'\s\r?\n]{2,100}+$/", $mycontent);

preg_match("/^[A-Za-z0-9\\!\\@\\)\\-\\_\\#]{8,10}$/D", $mycontent);

preg_replace("/[^A-Za-z0-9\\<\\>\\.\\/\\,\\'\\;\\:\\&\\!\\%\\s]/", "", $mycontent);
?>

Are the two backslashes acceptable in this? Or is it designed wrong?

Edited by Cobra23

<?php
preg_match('/^[A-Za-z0-9.\-,!\'\s\r?\n]{2,100}+$/', $mycontent);

preg_match('/^[A-Za-z0-9!@)\-_#]{8,10}$/D', $mycontent);

preg_replace('/[^A-Za-z0-9<>.\/,\';:&!%\s]/', "", $mycontent);
?>

That's a cleaner version. Backslashes are not like salt and pepper. They do not enhance the flavoring.
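
To see that the doubled backslashes change nothing, here is a minimal sketch comparing the original second pattern with the cleaned one. The sample inputs are made up for illustration. In a double-quoted PHP string "\\!" reaches the regex engine as \!, which inside a [] set is just a literal "!".

```php
<?php
// Original pattern (double-quoted, doubled backslashes) and the cleaned version.
$doubled = "/^[A-Za-z0-9\\!\\@\\)\\-\\_\\#]{8,10}$/D";
$clean   = '/^[A-Za-z0-9!@)\-_#]{8,10}$/D';

// Hypothetical sample inputs, chosen only for illustration.
$samples = ['abc123!@', 'pass_word#', 'short', 'has space1', "abc123!@\n"];

foreach ($samples as $s) {
    $a = preg_match($doubled, $s);
    $b = preg_match($clean, $s);
    // Both columns should agree on every row.
    printf("%-14s doubled=%d clean=%d\n", json_encode($s), $a, $b);
}
```

Both patterns should accept and reject the same strings, so the extra backslashes are only noise.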

I can't figure out the context of these regexes. Need context to be able to say if there's anything wrong with them.

Thank you very much. I think I used double backslashes because it was crashing or not working with some of the special characters, and I got carried away and used the same thing on the others. I can see that you have a single backslash before the 3 below:

/,-

apart from the above 3 and:

\s \r \n \d

Are there any other special characters that require the backslash to avoid crashing? Or is there an online reference for this?

Edited by Cobra23

There's a backslash for hyphens because inside []s they mean a character range (like A-Z). The one on the forward slash is because that character is being used as the regex delimiter so the one inside must be escaped. The comma doesn't have one - look again.

Most characters in a [] set don't need to be escaped. Even ones that normally are special, like periods or parentheses. It's basically just the hyphen (special inside a [], not special outside), the delimiter (special everywhere), the backslash itself, ^ when it's the first character (where it negates the set), and ] when intended as a character (to avoid confusion with the one ending the set).
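
A quick sketch of those rules, using made-up patterns and inputs rather than anything from the original post:

```php
<?php
// Inside [], characters like . ( ) + * ? are literal, so no backslashes are needed.
$specials = '/^[.()+*?]+$/';
var_dump(preg_match($specials, '(.+)?') === 1);  // every character is in the set
var_dump(preg_match($specials, 'abc') === 0);    // letters are not in the set

// An unescaped hyphen between two characters makes a range.
$range   = '/^[+-9]+$/';   // "+" through "9" in ASCII, which includes "," "." and "/"
$literal = '/^[+\-9]+$/';  // only "+", "-" and "9"
var_dump(preg_match($range, '1,5') === 1);
var_dump(preg_match($literal, '1,5') === 0);
var_dump(preg_match($literal, '+-9') === 1);

// "]" and the delimiter "/" need a backslash when meant as characters.
$closers = '/^[\]\/]+$/';
var_dump(preg_match($closers, ']/') === 1);
```

Each var_dump should print bool(true) if the rules hold as described.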

Going back to the original question, the problem happens with inefficient regexes that force the engine to backtrack excessively. Backtracking, and the overall matter of how the engine tries to make a match, is a complicated subject that I don't want to try to explain here, but that means it's harder to explain what you need to avoid. For the most part, repeating something that's being repeated itself is risky, like (a+)+. The {2,100}+ may look like it's doing that, but actually the trailing + makes it a possessive quantifier: once the set has matched, the engine won't backtrack into it. It's a syntax used primarily to increase performance.
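
As a rough illustration of the difference, this sketch runs a nested-repetition pattern and a possessive one against the same input. The helper name and the input are hypothetical. What the nested pattern reports depends on the PCRE version and on settings such as pcre.backtrack_limit and pcre.jit, so no output is shown for it.

```php
<?php
// Hypothetical helper: runs preg_match and reports the outcome as a short label.
function describe_match(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // false means the engine gave up, for example PREG_BACKTRACK_LIMIT_ERROR.
        return 'error code ' . preg_last_error();
    }
    return $result === 1 ? 'match' : 'no match';
}

// Made-up input: a run of "a" followed by one character that cannot match.
$subject = str_repeat('a', 30) . '!';

// Risky shape: a repeated group whose contents are themselves repeated.
$nested = '/^(a+)+$/';

// Same shape as the {2,100}+ in this thread: one set with a possessive quantifier.
$possessive = '/^[A-Za-z0-9]{2,100}+$/';

echo 'nested:     ', describe_match($nested, $subject), "\n";
echo 'possessive: ', describe_match($possessive, $subject), "\n";
```

The possessive pattern should simply report no match, since it consumes the letters once and never retries. The nested one can never report a match on this input, but it may hit the backtrack limit before it finds that out.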

They look fine to me. Really, this "ReDOS" thing only happens with non-trivial regular expressions. Stuff more complicated than those three.

Thank you for the very clear explanation.

I can understand how preg_match could keep retrying while looking for matches in long textareas, especially those used for messages or in content editors for a summary section or bio. I don't know of a quicker solution than preg_match for validation, as I am already filtering, sanitizing and using preg_replace before it. For lengths similar to or even much bigger than the {2,100}+, I am also using strlen before it, so I thought that having its min/max length in the preg_match as well would help performance (but I believe it's not required if I'm using strlen before it).

Is there something better than preg_match for long textareas like messages or content editors, as I wouldn't want it to become slow or stall?
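
For reference, the strlen-then-preg_match order described above could look roughly like this. The function name, limits and character set are assumptions for illustration, not the actual code in use.

```php
<?php
// Hypothetical helper: cheap length check first, then the character check.
function is_valid_message(string $text, int $min = 2, int $max = 100): bool
{
    $length = strlen($text);  // counts bytes, not characters
    if ($length < $min || $length > $max) {
        return false;         // the regex never runs on oversized input
    }
    // "++" is a possessive quantifier, so the engine does not backtrack into the set.
    return preg_match('/^[A-Za-z0-9.\-,!\'\s]++$/D', $text) === 1;
}

var_dump(is_valid_message('Hello, world!'));        // allowed characters, valid length
var_dump(is_valid_message('a'));                    // too short
var_dump(is_valid_message(str_repeat('a', 101)));   // too long
var_dump(is_valid_message('Hi <b>'));               // "<" is not in the set
```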

Edited by Cobra23

...bold?

I'm confused. It sounds like you think I told you not to use regular expressions. Or that someone told you not to. That's wrong. They're fine to use as long as you're careful about how you write them. If you write anything complicated, test it with very large inputs that should match and some that should not match. You'll probably find out real quick whether there's a problem.
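
A minimal sketch of that kind of test, using the first pattern from this thread. The helper name and the input sizes are assumptions picked for illustration.

```php
<?php
// Hypothetical helper: times one preg_match call and records any engine error.
function time_preg_match(string $pattern, string $subject): array
{
    $start   = microtime(true);
    $result  = preg_match($pattern, $subject);
    $elapsed = microtime(true) - $start;

    return [
        'result'  => $result,            // 1, 0, or false on error
        'seconds' => $elapsed,
        'error'   => preg_last_error(),  // PREG_NO_ERROR when all went well
    ];
}

$pattern = '/^[A-Za-z0-9.\-,!\'\s\r?\n]{2,100}+$/';

// Made-up inputs: one that should match, two that should not.
$inputs = [
    'should match'       => str_repeat('a', 100),
    'far too long'       => str_repeat('a', 1000000),
    'bad last character' => str_repeat('a', 99) . '<',
];

foreach ($inputs as $label => $input) {
    $r = time_preg_match($pattern, $input);
    printf("%-20s result=%s seconds=%.6f error=%d\n",
        $label, var_export($r['result'], true), $r['seconds'], $r['error']);
}
```

If any row takes noticeably longer than the others, or reports a non-zero error, that pattern deserves a closer look.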
