How much should you validate a First Name?

 

This is my code...

// Validate First Name.
if (empty($trimmed['firstName'])) {
    $errors['firstName'] = 'Please enter your First Name.';
} else {
    if (preg_match('#^[A-Z \'.-]{2,20}$#i', $trimmed['firstName'])) {
        $firstName = $trimmed['firstName'];
    } else {
        $errors['firstName'] = 'First Name must be 2-20 characters (A-Z \' . -)';
    }
}

 

I did this for fear that leaving things wide open would be a security risk.

 

And in the U.S. at least, the above would really cover all valid First Names.

 

Thoughts?

 

 

 

Debbie

 

 

https://forums.phpfreaks.com/topic/256303-validating-first-name/

/^[A-Z][a-zA-Z -]+$/

 

I think this restricts it to letters, spaces, and dashes, and the first letter must be a capital.

 

You're missing the point...

 

I am wondering if my Regex is TOO RESTRICTIVE.  (Your code makes things even worse and would definitely fail...)

 

 

Debbie
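To make the "too restrictive" point concrete, here is a minimal sketch that runs both patterns from the thread against a few sample names. The sample names are hypothetical and chosen only for illustration; the two patterns are copied from the posts above.

```php
<?php
// Pattern suggested in the reply above: capital first, then letters, spaces, dashes.
$suggested = '/^[A-Z][a-zA-Z -]+$/';
// Pattern from the original post.
$original = '#^[A-Z \'.-]{2,20}$#i';

// Hypothetical sample names, chosen for illustration only.
$samples = ['Debbie', 'debbie', "O'Brien", 'Mary-Ann'];

$results = [];
foreach ($samples as $name) {
    $results[$name] = [
        'suggested' => preg_match($suggested, $name) === 1,
        'original'  => preg_match($original, $name) === 1,
    ];
    echo $name, ': suggested=', var_export($results[$name]['suggested'], true),
        ', original=', var_export($results[$name]['original'], true), "\n";
}
```

The suggested pattern turns away a lowercase entry and any name with an apostrophe, both of which the original pattern accepts.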

 

 

After doing some extensive research: it would not be wise to use a regex for names, because it will give you false positives.

 

Someone might spell their name a certain way and have it kick back "this is not valid".

 

What if someone's name is something like...

a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't

 

That is why I chose to originally use Regex.

 

 

Debbie

 

 

Do you understand what it means to escape the values for a query [whether it is with mysql_real_escape_string() or with prepared statements]? If the input is properly escaped, a name such as that won't do anything - that literal string would simply be inserted into the table. Granted, it's a stupid name, but it won't cause any harm as long as you are handling it correctly. So, your regex is not providing any security; it is only creating the possibility that you might reject a valid name.
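As a rough illustration of that claim, the sketch below binds the hostile "name" as a parameter in a prepared statement. Everything about the setup is an assumption made for the example: the thread never says which driver or schema is in use, so this uses PDO with an in-memory SQLite database, a hypothetical users table, and a hypothetical first_name column. (The old mysql extension that provided mysql_real_escape_string() was removed in PHP 7.0, which is one more reason to prefer prepared statements today.)

```php
<?php
// Assumed setup: in-memory SQLite via PDO, with a hypothetical "users" table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)');

// The "name" from the thread.
$firstName = "a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't";

// The value is bound to a placeholder, so it is treated as data, not as SQL.
$stmt = $pdo->prepare('INSERT INTO users (first_name) VALUES (:first_name)');
$stmt->execute([':first_name' => $firstName]);

// The table still exists and holds the literal string.
$stored = $pdo->query('SELECT first_name FROM users')->fetchColumn();
$count  = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();

var_dump($stored === $firstName);
var_dump($count);
```

The row comes back exactly as typed, and the users table is untouched.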

Do you understand what it means to escape the values for a query [whether it is with mysql_real_escape_string() or with prepared statements]?

 

In simple terms, escaping form values is supposed to prevent special characters like the single or double quote from creating unexpected results in your query.  (Maybe you have a better definition?!)

 

My understanding is that one of the key reasons - if not the only reason - to use Prepared Statements is that it basically eliminates the risk that any strange or special characters could be entered into a form and then cause havoc in your query.  (Again, feel free to correct me if I am wrong.)

 

 

If the input is properly escaped, a name such as that won't do anything - that literal string would simply be inserted into the table. Granted, it's a stupid name, but it won't cause any harm as long as you are handling it correctly. So, your regex is not providing any security; it is only creating the possibility that you might reject a valid name.

 

Okay.

 

But Regex also prevents someone from accidentally or purposely entering some obnoxious name like D33bb13, right?

 

 

Debbie

 

Debbie, your original question was "how much should you validate a first name". The answer is, all you can do is filter out ridiculous fake names, as your regex does, and filter out things like fields that only contain spaces, as trim() will do. When it comes down to it, in this case, it would be very hard to prevent any sort of injection with simple first name checks, as the characters involved in SQL injection can also appear in a name. The important thing is that once the field is passed to your back-end form handling code, it is properly sanitized and escaped. Since you are using prepared statements (this is the best SQL security in my opinion), the string will be internally sanitized and prepared for query use. So if you are indeed using prepared statements, your data will be secure, as this is the nature of prepared statements.

I did this for fear that leaving things wide open would be a security risk.

 

But Regex also prevents someone from accidentally or purposely entering some obnoxious name like D33bb13, right?

 

Before you code anything you need to know what the requirements are. You are changing the requirements between posts. The escaping of the input prevents any security risk (which is what you were first asking about). But, now you are saying you want to prevent names that you feel are obnoxious. Only YOU can define the specific rules for what YOU feel is an obnoxious name. The main thing to keep in mind with creating such a process is that you have the risk of creating false positives, i.e. excluding the real name of a person.

 

Regex operations are typically costly and should only be used when necessary. If you really feel that you need to restrict names then state what your rules are. FYI: Your current expression would reject many valid names - most notably those with accented characters such as ä. The US is FULL of people from around the world who would have such names. Personally, I would not go to the trouble to try and do these validations as it, in my opinion, is just a waste of time. If someone wants their name to be Zippity Do Da on my site, what do I care? Of course, if they are making a CC purchase they will have to provide the valid name for the account.
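If the validation is kept, one way to stop rejecting accented names is a Unicode-aware character class. The sketch below is only an assumption about what such a rule could look like, not a recommendation from the thread: $unicodePattern and the matches() helper are hypothetical, \p{L} matches a letter in any script, \p{M} matches combining marks, and the u modifier makes PCRE treat the subject as UTF-8 (so the input must be valid UTF-8).

```php
<?php
// Pattern from the original post: ASCII letters only.
$asciiPattern = '#^[A-Z \'.-]{2,20}$#i';
// Hypothetical alternative: any Unicode letter or combining mark, UTF-8 mode.
$unicodePattern = '#^[\p{L}\p{M} \'.-]{2,20}$#u';

// Hypothetical helper: true when the whole name matches the pattern.
function matches(string $pattern, string $name): bool
{
    return preg_match($pattern, $name) === 1;
}

// "\u{00EB}" is ë and "\u{00E9}" is é, written as escapes so the
// example does not depend on the encoding of the source file.
$names = ['Debbie', "O'Brien", "Zo\u{00EB}", "Jos\u{00E9}", 'D33bb13'];

foreach ($names as $name) {
    echo $name,
        ': ascii=', var_export(matches($asciiPattern, $name), true),
        ', unicode=', var_export(matches($unicodePattern, $name), true), "\n";
}
```

Both patterns still turn away D33bb13, so the "obnoxious name" rule survives; only the accented names change from rejected to accepted.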

So by virtue of using Prepared Statements, I should be entirely safe from SQL Injections and other SQL-based attacks regardless of whether I am using Regex or not on my Name fields?

 

And it sounds like you are saying drop the Regex altogether and maybe just check the field length with strlen(), right?

 

 

Debbie

 

JavaScript validation is mainly used to give the user a real-time response as to whether the data entered is considered valid. It should never be relied on as a means of filtering and/or sanitizing data sent to the server, as it can simply be disabled. I believe Psycho's point is that your regex is so bland that it will still allow many ridiculous names and, as I said before, SQL injection strings through to the server, so there is not much point in spending resources on a regex that will not achieve your goal. A few JavaScript checks for things like correct length can be run and shown to the user as they type, but spending too much time and resources on validating with JavaScript is not recommended.

Who said anything about JavaScript?

 

My point was simply that the requirements for this changed in the thread from implementing a security measure to preventing "obnoxious" names. If the requirement is to prevent SQL Injection, properly escaping the input (either via mysql_real_escape_string(), prepared statements, etc.) will prevent SQL Injection attacks. Adding logic to prevent certain inputs does not improve security (with regard to SQL Injection). If the requirement is to prevent "obnoxious" names then more detailed requirements need to be defined to create appropriate code.

 

However, you also need to consider "how" the input will be used. If the input is stored and then later displayed in the HTML content, then you will need to run the input through htmlspecialchars() when generating the output. Otherwise, you are at risk of XSS attacks - which is an entirely different thing than SQL Injection.
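A minimal sketch of that output-side escaping follows. The hostile value is hypothetical, and passing 'UTF-8' as the third argument is an assumption about the page's character set.

```php
<?php
// Hypothetical stored value containing markup an attacker might submit.
$firstName = '"><script>alert(1)</script>';

// Escape at output time; ENT_QUOTES also converts single quotes,
// which matters inside attribute values.
$safe = htmlspecialchars($firstName, ENT_QUOTES, 'UTF-8');

// The browser now shows the text instead of running it.
echo '<input name="firstName" type="text" value="' . $safe . '" />', "\n";
```

The stored value is left alone; only the copy that goes into the page is escaped, which is why this step belongs with the output code and not with the database code.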

Who said anything about JavaScript?

 

Not me!

 

 

My point was simply that the requirements for this changed in the thread from implementing a security measure to preventing "obnoxious" names.

 

Fair enough, but to me, they are interrelated.

 

 

If the requirement is to prevent SQL Injection, properly escaping the input (either via mysql_real_escape_string(), prepared statements, etc.) will prevent SQL Injection attacks.

 

So my code should be safe with Regex, right?

 

 

Adding logic to prevent certain inputs does not improve security (with regard to SQL Injection). If the requirement is to prevent "obnoxious" names then more detailed requirements need to be defined to create appropriate code.

 

However, you also need to consider "how" the input will be used. If the input is stored and then later displayed in the HTML content, then you will need to run the input through htmlspecialchars() when generating the output. Otherwise, you are at risk of XSS attacks - which is an entirely different thing than SQL Injection.

 

Where I need to output things I try to use code like this...

<!-- First Name -->
<li>
	<label for="firstName"><b>*</b>First Name:</label>
	<input id="firstName" name="firstName" type="text" maxlength="20"
		value="<?php if(isset($firstName)){echo htmlspecialchars($firstName, ENT_QUOTES);} ?>" /><!-- Sticky Field -->
	<?php
		if (!empty($errors['firstName'])){
			echo '<span class="error">' . $errors['firstName'] . '</span>';
		}
	?>
</li>

 

 

Debbie

 

My point was simply that the requirements for this changed in the thread from implementing a security measure to preventing "obnoxious" names.

 

Fair enough, but to me, they are interrelated.

Well, then, you would be wrong. Sanitizing/Escaping an input to prevent SQL Injection or other types of errors is very different from validating that the input is of the format that you expect. A value such as "D33bb13" would have absolutely no security risk with respect to a name value. YOU keep changing what you are talking about. Figure out what you are trying to achieve and stay on point.

 

If the requirement is to prevent SQL Injection, properly escaping the input (either via mysql_real_escape_string(), prepared statements, etc.) will prevent SQL Injection attacks.

 

So my code should be safe with Regex, right?

As long as your input is being properly escaped, the RegEx is doing nothing to make the input more or less safe. It is an arbitrary validation that you are implementing. If you feel that it is necessary, then by all means use it. But, it doesn't do anything to improve security. Validations such as these can be necessary/warranted - such as checking that the input for a phone number contains 10 digits.

 

Where I need to output things I try to use code like this...

That will work. But, I personally hate putting PHP logic inside my markup. My preference would be to simply define $firstName in the "logic" of my code and then simply output it within the HTML markup.

 

In the logic of the code

$firstName = isset($firstName) ? htmlspecialchars($firstName, ENT_QUOTES) : '';
$firstNameError = (!empty($errors['firstName'])) ? "<span class='error'>{$errors['firstName']}</span>" : '';

 

In the display/output of the code

   <!-- First Name -->
   <li>
      <label for="firstName"><b>*</b>First Name:</label>
      <input id="firstName" name="firstName" type="text" maxlength="20" value="<?php echo $firstName; ?>" />
      <!-- Sticky Field -->
      <?php echo $firstNameError; ?>
   </li>

 

makes the code much more readable, IMHO. But, use whatever works for you.

My point was simply that the requirements for this changed in the thread from implementing a security measure to preventing "obnoxious" names.

 

Fair enough, but to me, they are interrelated.

Well, then, you would be wrong. Sanitizing/Escaping an input to prevent SQL Injection or other types of errors is very different from validating that the input is of the format that you expect.

 

Except they are not mutually exclusive.

 

If someone entered something like...

a';DROP TABLE users;

 

Then depending on my query, that could cause big problems.

 

And my Regex would catch that, even though it is more focused on catching things like numbers versus letters.

 

 

A value such as "D33bb13" would have absolutely no security risk with respect to a name value.

 

I never said it would.

 

 

If the requirement is to prevent SQL Injection, properly escaping the input (either via mysql_real_escape_string(), prepared statements, etc.) will prevent SQL Injection attacks.

 

Right, and I am using Prepared Statements to fight against SQL Injections...

 

 

 

That will work. But, I personally hate putting PHP logic inside my markup. My preference would be to simply define $firstName in the "logic" of my code and then simply output it within the HTML markup.

 


Interesting approach.

 

I may not have time to re-do my code like that now, but I would like to learn to separate my "Presentation Layer" from my "Business Layer".

 

Thanks,

 

 

Debbie

 

I'm not sure you understand what SQL injection is. You say this:

If someone entered something like...

a';DROP TABLE users;

 

Then depending on my query, that could cause big problems.

 

And then you say this:

Right, and I am using Prepared Statements to fight against SQL Injections...

 

If you are using prepared statements, then "a';DROP TABLE users;" is going to do absolutely nothing. It can in no way harm your database. Prepared statements process the query internally and make it completely safe for interaction with the database. No form of SQL injection will ever do anything, because it is sanitized automatically.

 

Aside from that, fighting SQL injection with regex is a bad idea. It is terribly inefficient and there are better ways to handle it (such as mysql_real_escape_string or prepared statements).

So to fight SQL Injection Attacks, I'll keep using Prepared Statements.

 

And to keep my data "pretty" - if I decide to keep doing that - then I'll use my Regex.

 

Right?

 

 

Debbie

 

You simply need to tap the brakes a bit and slow down. You are not understanding what has been stated multiple times. Or, you are simply trying to be difficult.

 

If someone entered something like...

a';DROP TABLE users;

 

Then depending on my query, that could cause big problems.

No it would NOT. As long as the data is "Sanitized" using prepared statements or something like mysql_real_escape_string, then that input would do absolutely no harm. It would simply set the name as the literal string "a';DROP TABLE users;". THAT is the whole point of Sanitizing the data. It ensures that the data is safe for use in a query.

 

My point was simply that the requirements for this changed in the thread from implementing a security measure to preventing "obnoxious" names.

 

Fair enough, but to me, they are interrelated.

Related, maybe, but they have very different purposes. You should ALWAYS sanitize the input and the methods for doing so are pretty standard. But, "validation" is a whole other can of worms for which there is no standard. The validations will be based on your particular requirements. As stated before, you may decide to require that phone numbers contain 10 digits, that emails are in a proper format, that a date is of a certain format, etc. etc.
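To show what requirement-driven validation can look like, here is a small sketch of two of the checks mentioned above. The function names isValidPhone() and isValidEmail() are hypothetical, and the phone rule (strip formatting, then require exactly 10 digits) is just one possible reading of "10 digits". filter_var() with FILTER_VALIDATE_EMAIL is PHP's built-in email format check.

```php
<?php
// Hypothetical helper: drop everything that is not a digit,
// then require exactly 10 digits.
function isValidPhone(string $input): bool
{
    $digits = preg_replace('/\D+/', '', $input);
    return strlen($digits) === 10;
}

// Hypothetical helper: rely on PHP's built-in email format filter.
function isValidEmail(string $input): bool
{
    return filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isValidPhone('(555) 123-4567'));
var_dump(isValidPhone('555-1234'));
var_dump(isValidEmail('debbie@example.com'));
var_dump(isValidEmail('not-an-email'));
```

None of these checks replaces escaping: a value that passes validation still goes through a prepared statement on the way in and htmlspecialchars() on the way out.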

 

I've already answered your question multiple times. Any regex you implement is not adding any security. If you want to go to the trouble of excluding certain content for the name field, or any other, go ahead. That is your prerogative.
