user input cleansing


knobby2k

Should be a simple one this,

 

I've seen many different ways of cleansing data that has been input from a user but no definitive answer.

 

I am making a user registration form; I'll keep it simple for now with a username and email input field. Without any cleansing it would look like this:

 


$un = $_POST['un'];
$email = $_POST['email'];

 

so with a little cleansing along the lines of...

 


$un = mysqli_real_escape_string(trim(htmlentities($_POST['un'] ,ENT_QUOTES)));
$email= mysqli_real_escape_string(trim(htmlentities($_POST['email'] ,ENT_QUOTES)));

 

...is that correct, not correct, complete bollox, etc? I know I'd need to re-jig the email one to make sure characters like .-_aA1@ were accepted, but can you think of anything else?

 

Cheers

 

Link to comment
Share on other sites

Usually you want to make sure that your input isn't going to harm your database in any way (SQL injection) but otherwise you store it as you got it. Then format it properly on output. That way if you ever want to change how you display the data, you can do it from its original state.

 

So, basically, just escape it for the database on entry, and then convert HTML to entities and so forth on output to make it XSS-proof.

 

If you are using MySQLi then you should be using prepared statements, which keep the data separate from the SQL, so the bound values cannot cause SQL injection and you don't have to escape them.

 

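// $host, $user, $pass and $dbname are assumed to be defined elsewhere
$mysqli = new mysqli($host,$user,$pass,$dbname);

// The ? placeholders keep the data separate from the SQL
$stmt = $mysqli->prepare("INSERT INTO users (username, email) VALUES (?,?)");

// 'ss' = two string parameters; bind_param() binds by reference,
// so the variables can be assigned after binding
$stmt->bind_param('ss', $username, $email);

$username = $_POST['username'];
$email = $_POST['email'];

$stmt->execute();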

Link to comment
Share on other sites

Adding to that, you may want to validate that an email is actually a valid email. There are many ways to do this, the most common being using a regular expression match.

 

Also, when validating data, you need to consider what the data needs to be. For string information (like a username or first name) you of course want to escape it with whatever xxx_escape_string function you have. For numeric data, you want to make sure that it is a numeric value, or cast it to one. Input validation is a pretty broad topic though (too broad for a forum post anyway), so I would suggest reading a few tutorials about it and why validation is important.
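
As a rough sketch of the numeric case, here is the difference between validating and casting. The function name parse_age and the 0-150 range are made up for illustration; filter_var() and FILTER_VALIDATE_INT are standard PHP.

```php
<?php
// Hypothetical helper: returns the integer, or null when the input
// is not a whole number inside the (assumed) range 0-150.
function parse_age($raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);

    return $value === false ? null : $value;
}

var_dump(parse_age('42'));    // int(42)
var_dump(parse_age('42abc')); // NULL: validation rejects the input
var_dump((int) '42abc');      // int(42): a cast silently drops the junk
```

Validating lets you reject bad input and tell the user; casting quietly turns it into some number, which may or may not be what you want.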

Link to comment
Share on other sites

Right, always assume that your users are malicious. Don't think "well, I'm not going to bother validating that because it will never be used maliciously" - you just don't know for sure.

 

So like mikesta said, if you are expecting a username, make sure it is a string with a max length. You probably don't want funky characters like < > | \ /, etc, in your username so use regex to filter those out.

 

Your input needs to exactly match what you expect it to be.
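
A minimal sketch of that kind of check, assuming you decide usernames are 3 to 20 characters of letters, digits, underscore, dot or hyphen (the function name and those limits are just an example, pick your own rules):

```php
<?php
// Hypothetical whitelist check: 3-20 characters, letters, digits, _ . - only.
// \A and \z anchor the whole string, so a trailing newline is rejected too.
function is_valid_username(string $username): bool
{
    return preg_match('/\A[A-Za-z0-9_.-]{3,20}\z/', $username) === 1;
}

var_dump(is_valid_username('knobby2k')); // bool(true)
var_dump(is_valid_username('<script>')); // bool(false)
var_dump(is_valid_username('ab'));       // bool(false): too short
```

Listing what is allowed tends to be easier to get right than trying to list every character that is not.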

Link to comment
Share on other sites

Here is how I validate and insert an email; it's in a class method. I use PDO and so should you. Here is a link about it:

 

http://net.tutsplus.com/tutorials/php/why-you-should-be-using-phps-pdo-for-database-access/

 

 

 

 

private function validateEmail($email) {

     // Load the PDO database connection class.
     // Assumption: this makes a connected PDO instance available as $db,
     // set to PDO::ERRMODE_EXCEPTION so that failures throw PDOException.
     require_once("../database.class.php");

     // Assume the address is fine until one of the checks fails
     $_status = true;

     // force email address to all lowercase for easier reading
     // variable is local to this method so it starts with underscore
     $_email = strtolower($email);

     // make sure the email address is not empty
     // this may seem redundant since the following regex will
     // catch it however the regex is slow, this is fast, thus first.
     if (empty($_email)) {
          $_SESSION['error'] = "Please provide a valid email address, foo!";
          $_status = false;
     }
     // make sure email address is valid use email validation regex
     elseif (!preg_match('/^[a-z0-9]+([_\\.-][a-z0-9]+)*@([a-z0-9]+([\.-][a-z0-9]+)*)+\\.[a-z]{2,}$/i', $_email)) {
          $_SESSION['error'] = "That is not a valid email address, foo!";
          $_status = false;
     }
     // make sure the email address is not already in use
     else {
          // SQL query string to retrieve the user ID if the email address is in use
          $_sql = "SELECT id FROM users WHERE email = :email LIMIT 1";

          try {
               // Build the query transaction
               $db->beginTransaction();

               // Build the prepared statement
               $_stmt = $db->prepare($_sql);

               // Bind the parameters to their properties
               $_stmt->bindParam(":email", $_email, PDO::PARAM_STR);

               // Execute the query
               $_stmt->execute();

               // Commit to the transaction
               $db->commit();

               // Fetch the results
               $_dBSelectResults = $_stmt->fetchAll(PDO::FETCH_ASSOC);

               if (count($_dBSelectResults) != 0) {
                    $_SESSION['error'] = "That email address is already in use, foo!";
                    $_status = false;
               }
               else {
                    $_cleanEmail = $_email;
               }
          }
          catch(PDOException $e) {
               // Roll back the transaction if we fail
               $db->rollBack();
               $_SESSION['terror'] = 'Error: ' . $e->getMessage();
               $_status = false;
          }
     }

     // Insert the new email address (and other fields) into the database,
     // but only if every check above passed.
     // $_cleanPassword is not defined in this snippet; it is assumed to be
     // validated and hashed elsewhere before this point.
     if ($_status) {
          try {
               $db->beginTransaction();

               $_insertUser = $db->prepare("INSERT INTO `users` (email,password) VALUES (:email,:password)");

               $_insertUser->bindParam(':email', $_cleanEmail, PDO::PARAM_STR);
               $_insertUser->bindParam(':password', $_cleanPassword, PDO::PARAM_STR);

               $_insertUser->execute();
               $db->commit();
          }
          catch(PDOException $e) {
               $db->rollBack();
               $_SESSION['terror'] = 'Error: ' . $e->getMessage();
               $_status = false;
          }
     }

     return $_status;
}

Link to comment
Share on other sites

Oops, it was meant to be mysql_real_escape_string, not mysqli_real_escape_string.

Link to comment
Share on other sites

ok so just to clarify...

 

$email= mysql_real_escape_string(trim(htmlentities($_POST['email'] ,ENT_QUOTES)));

 

only removes... ' or "

 

is that right??

 

How do I cleanse the data of other dangerous characters such as []{}-_:;,.<>$ ...or don't I even need to cleanse those once the ' and " are removed?

 

I'm confused, thought I had got my head round it!

Link to comment
Share on other sites

No, it does not remove anything. mysql_real_escape_string() will "escape" characters (by putting a backslash in front of them) that could cause the value to be interpreted as MySQL code. The backslashes themselves are not stored, so apart from anything trim() strips and the entity conversion done by htmlentities(), the stored value would be what the user entered.
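
One detail worth checking for yourself: the htmlentities() call in that chain does change the value before it ever reaches the database. A small sketch (the sample input is made up):

```php
<?php
// Made-up sample input containing a quote and some markup
$input = "O'Brien <b>";

// With ENT_QUOTES, both single and double quotes are converted to entities
$encoded = htmlentities($input, ENT_QUOTES);

echo $encoded, PHP_EOL;        // O&#039;Brien &lt;b&gt;
var_dump($encoded === $input); // bool(false)
```

So if you encode on input, the entities are what gets stored, which is one reason to keep the raw value and encode on output instead.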

 

You need to take an analytical approach to each field you are storing to determine what the appropriate values are. You asked "...how do i cleanse the data of other dangerous characters such as... []{}-_:;,.<>$"

 

Who says they are dangerous? They are not dangerous to store in the database, and they won't cause any harm if you tried to use them to send an email (it would just fail). But if you want to validate whether the user-entered email is a properly formatted email address, that needs to be done before you even think about storing it.

 

As mikesta707 stated previously, this was typically done using a regular expression to check that the email only used certain characters and was formatted in a certain way. However, as of PHP 5.2 there is a built-in validation function called filter_var() that can validate many types of values. The first example shown in the manual is for email addresses:

 

http://www.php.net/manual/en/function.filter-var.php
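
For reference, a minimal sketch of what that looks like. The wrapper name is_valid_email is invented for this example; filter_var() and FILTER_VALIDATE_EMAIL are the built-ins from the manual page above.

```php
<?php
// Hypothetical wrapper around the built-in email filter.
// filter_var() returns the filtered value on success and false on failure.
function is_valid_email(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('user@example.com')); // bool(true)
var_dump(is_valid_email('not-an-email'));     // bool(false)
```

This only checks the format; it says nothing about whether the mailbox actually exists.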

Link to comment
Share on other sites

I mistakenly used the email input as the example. I do understand what you mean by analysing what data I should expect from a user input. I really just want to prevent malicious code being run, being entered into my database, or carrying out database functions because I have not protected against SQL injection properly.

 

So for example the username is usually a field where a user 'might' enter a special character rather than just a standard number or letter. Therefore, without being too restrictive but also protecting myself at the same time, is escaping ' and " enough to protect against SQL injection?

 

Thanks

Link to comment
Share on other sites

To prevent SQL injection there is no need to use htmlentities; you can use that on output to prevent XSS attacks.

 

So, as suggested above, use either mysql_real_escape_string or prepared statements to prevent SQL injection.
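
One caveat on the escaping route, as a sketch: escaping only helps when the value ends up inside quotes in the query. In the example below addslashes() is used purely as a stand-in so the snippet runs without a database connection (it is not a replacement for the real escape function), and the table and column names are made up.

```php
<?php
// Made-up malicious input for a numeric field
$id = '1 OR 1=1';

// Stand-in for an escape function: there are no quotes or backslashes
// in the input, so nothing gets escaped at all
$escaped = addslashes($id);

// The value is placed in the query WITHOUT surrounding quotes
$sql = "SELECT * FROM users WHERE id = $escaped";

echo $sql, PHP_EOL; // SELECT * FROM users WHERE id = 1 OR 1=1
```

Casting the value to an integer first, or binding it in a prepared statement, avoids this.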

 

Besides that, it's a good habit to always check that the values are as expected. For integers you can use type casting to force them into being an integer: (int)$value;

 

And for emails and such use filter_var. Just make sure values are as expected. If you have, for instance, an ISBN, make sure it's the length you expect, etcetera. Or for a state code, 2 characters a-z.
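
A rough sketch of those two checks. Both function names are invented, and the ISBN check only looks at the shape (13 digits once hyphens are removed, which assumes ISBN-13); it does not verify the check digit.

```php
<?php
// Hypothetical check: exactly two letters a-z, case-insensitive
function is_valid_state_code(string $code): bool
{
    return preg_match('/\A[a-z]{2}\z/i', $code) === 1;
}

// Hypothetical shape check, assuming ISBN-13: 13 digits after removing hyphens
function has_isbn13_shape(string $isbn): bool
{
    $digits = str_replace('-', '', $isbn);

    return strlen($digits) === 13 && ctype_digit($digits);
}

var_dump(is_valid_state_code('CA'));             // bool(true)
var_dump(is_valid_state_code('C4'));             // bool(false)
var_dump(has_isbn13_shape('978-3-16-148410-0')); // bool(true)
var_dump(has_isbn13_shape('12345'));             // bool(false)
```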

 

On output there are other things you can do, not to prevent SQL injection but, for instance, XSS attacks. That could be htmlspecialchars or htmlentities.
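
A minimal sketch of escaping on output. The helper name e() and the sample value are made up; the point is that the value is stored raw and only encoded at the moment it is written into HTML.

```php
<?php
// Hypothetical output helper: encode a stored value for use in HTML
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Pretend this came back from the database exactly as the user typed it
$storedUsername = '<script>alert(1)</script>';

echo '<p>Hello, ' . e($storedUsername) . '</p>', PHP_EOL;
// <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```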

 

Here is a bite-size tutorial that will help you understand it even better: http://www.phpfreaks.com/tutorial/php-security

Link to comment
Share on other sites
