natasha_thomas Posted May 11, 2011

Folks,

Requirement: I want an .htaccess-level solution that returns a 404 when the URL contains any character other than those allowed in the rewrite rule below:

RewriteRule ^([a-zA-Z0-9-!@#$^&*:"<>/?]{4,})\.html$ search.php?q=$1 [QSA,L]

In other words, I want to show a 404 when the URL contains anything other than a-zA-Z0-9-!@#$^&*:"<>/?

What I have done:

RewriteRule ^([a-zA-Z0-9-!@#$^&*:"<>/?]{4,})\.html$ search.php?q=$1 [QSA,L]

Problem: It does not work with special characters; it works only with English letters and digits in the URL.

Cheers
Natasha T
gizmola Posted May 11, 2011

Really what you want is a fall-through rule for the 404 that follows the working rule and otherwise broadly matches your pattern. The existing rule will match what you want, and anything else will be redirected by the fall-through rule.

Trying to do a negative match is quite tricky and not really a strength of regex. In this case, what you're really saying is:

- I could have a valid character OR not
- and some number of invalid characters
- AND some valid characters OR not
- etc.

This is probably why you're having a hard time crafting something that works.
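A minimal sketch of the fall-through idea described above, assuming Apache mod_rewrite in a per-directory .htaccess file. The first rule is copied unchanged from the original post. The second rule, including its `\.html$` pattern and the `R=404` flag, is an assumption for illustration and not something posted in the thread. It relies on mod_rewrite dropping the substitution and stopping rewriting when the `R` status code is outside the 3xx range.

```apache
RewriteEngine On

# Working rule (copied as posted): names built only from the allowed
# characters, at least four of them, are rewritten to the search script.
RewriteRule ^([a-zA-Z0-9-!@#$^&*:"<>/?]{4,})\.html$ search.php?q=$1 [QSA,L]

# Fall-through rule (assumed): any .html request the rule above did not
# match reaches this line and is answered with a 404. The "-" means
# "no substitution".
RewriteRule \.html$ - [R=404,L]
```

Note that this sketch would also return 404 for real .html files on disk and for names shorter than four characters. Placing a `RewriteCond %{REQUEST_FILENAME} !-f` line directly before the second rule is one way to spare existing files, if that matters for your site.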
natasha_thomas (Author) Posted May 11, 2011

Again, many thanks, Gizmola. Can you tell me what the purpose of "{4,})" is in the above rewrite rule?

Cheers
gizmola Posted May 11, 2011

Yeah, that's a quantifier for the character class that precedes it (the stuff inside the [] is called a character class). The character class defines the characters and ranges of characters that the regex is trying to match. The {} after it quantifies how many times it is looking to match. There are a lot of different variations of these quantifiers. This site is a great reference: http://www.regular-expressions.info/reference.html

That particular quantifier means "must match at least 4 times".

The closing paren closes out a capture group for the pattern:

([a-zA-Z0-9-!@#$^&*:"<>/?]{4,})

So everything inside the () will be captured together. This is what gets substituted for the $1 in the rewrite. Since it's the first captured group (in this case the only captured group), it becomes $1:

search.php?q=$1
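To make the quantifier and the capture group concrete, here is a small annotated sketch. It assumes a per-directory .htaccess file, where the leading slash is stripped before the pattern is tested. The shortened character class and the request names (`shoes.html`, `abc.html`) are hypothetical and chosen only to keep the example readable.

```apache
RewriteEngine On

# Quantifier variations that could follow the character class:
#   {4,}   at least four characters, no upper limit (as in the posted rule)
#   {4}    exactly four characters
#   {2,8}  between two and eight characters

# Simplified class (letters and digits only) for illustration.
RewriteRule ^([a-zA-Z0-9]{4,})\.html$ search.php?q=$1 [QSA,L]

# Request shoes.html: the group captures "shoes", so $1 = shoes
#                     and the request is rewritten to search.php?q=shoes
# Request abc.html:   only three characters before ".html", so {4,}
#                     is not satisfied and the rule does not match
```

Because the parentheses wrap both the character class and its quantifier, `$1` holds the whole matched name rather than a single character.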