Sanitizing: what's the best way of doing it?

Matt Ridge · December 1, 2011

I have been reading this:

http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html#11

I am curious, how does one actually sanitize a php script?

I know the site shows how to do it, but it doesn't really show how to do it in the real world.

Let me give you an example:


<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>PDI NCMR Admin Panel</title>
<!--[if IE]><link rel="stylesheet" type="text/css" href="../CSS/ie.css" /><![endif]-->
<!--[if !IE]><!--><link rel="stylesheet" type="text/css" href="../CSS/pdi.css" /><!--<![endif]--></head>
<body>
<?php
echo '<div id="admin">';
//Show the navigation menu
require_once('../hf/nav.php');
echo '<hr id="line">';
echo '<h2 id="title">Latest NCMRs </h2>';

  // Connect to the database 
    require_once('../connectvars.php');
  $dbc = mysqli_connect(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME);

  // Retrieve the data from MySQL
  $query = "SELECT * FROM ncmr";
  $data = mysqli_query($dbc, $query);


  echo '<table>';
  echo '<tr><th>NCMR ID &nbsp;</th><th>Part &nbsp;</th><th>Date &nbsp;</th><th>Actions &nbsp;</th></tr>';
  while ($row = mysqli_fetch_array($data)) {
    // Display the data
    echo '<tr class="ncmrdata">';
    echo '<td>' . $row['NCMR_ID'] . ' &nbsp; &nbsp;</td>';
    echo '<td>' . $row['Nexx_Part_Description'] . ' &nbsp;</td>';
    echo '<td>' . date("M d,Y", strtotime($row['Added_By_Date'])) . ' &nbsp;</td>';
    echo '<td><a href="viewncmr.php?id=' . $row['id'] . '">Comment</a> &nbsp;<a href="editncmr.php?id=' . $row['id'] . '">Edit</a> &nbsp;<a href="printncmr.php?id=' . $row['id'] . '">Print</a>';
    echo '</td></tr>';
  }
  echo '</table>';
  
     mysqli_close($dbc);
 require_once('../hf/footer.php')

?>
</body>
</html>

How do I sanitize this? Or is it for inputs only?

KevinM1 · December 1, 2011

Sanitizing is generally inputs-only because one would assume you'd only want to store and/or display clean/valid/verified data. There's nothing stopping you from using sanitizing functions in other instances, it's just an ass backwards way of doing things.

Psycho · December 1, 2011

I'm not sure what response you are wanting with regard to the code you posted. But, sanitizing could mean different things to different people. I would call it the process of escaping or transforming ANY data to prevent errors. Those errors can be significant security issues or relatively benign display issues.

You should ALWAYS sanitize user input before storing/using it. The methods you use will depend upon how you are using/storing the data. If you are storing the data in a MySQL database, for example, you would want to run mysql_real_escape_string() on string data, you would need to validate and format dates appropriately, and the same goes for numeric input: validate that the data is numeric and of the right type (int, float), length, etc.

The same goes for outputting data either to the browser or as a URL. For the former you might want to use htmlspecialchars() or htmlentities(); for the latter you would likely use urlencode(). There could be any number of other scenarios, such as if you are outputting data to an XML or CSV file.

The bottom line is that you should really think about how your data is used. Where is it coming from, what do you *expect* it to be, what values could cause problems and how you can protect against those problems. Never assume that user data is safe. Even if you have a select list of known values, don't assume that the value submitted by the user is one of those values - it can easily be spoofed. Validate that the value is in the list of values you expect.
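
The select-list advice can be sketched in a few lines. This is only an illustration: the list of statuses and the function name pickAllowed are made up for the example, and the literal strings stand in for a value that would normally come from $_POST.

```php
<?php
// Hypothetical list of values offered in a <select>; not from the original post.
$allowedStatuses = array('open', 'closed', 'pending');

/**
 * Return the submitted value only if it is one of the allowed options,
 * otherwise return null so the caller can reject the request.
 */
function pickAllowed($submitted, array $allowed)
{
    // The third argument makes in_array() compare strictly (no type juggling).
    return in_array($submitted, $allowed, true) ? $submitted : null;
}

// In a real script the first argument would come from $_POST.
var_dump(pickAllowed('closed', $allowedStatuses));
var_dump(pickAllowed("closed' OR 'x'='x", $allowedStatuses));
```

Anything that is not literally in the list is rejected, so a spoofed value never reaches the query or the page.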

KevinM1 · December 1, 2011

Exactly. Never, ever, ever trust anything coming in from $_GET or $_POST. Always run it through whatever processes you need in order to verify that the data coming in is legit and useful. This is especially important if that incoming data is going to be used as part of a database query or is going to be displayed on a user's screen.

But, aside from general advice, we can't say much more because it all depends on what your individual needs are for whatever project you're working on. There's only a handful of "Always do this/never do this" rules:

Always escape string (text) data that will be used in a database query. Never use addslashes for this. Instead, always use the escape method that comes with the database you're using (like mysql_real_escape_string) OR, even better, use prepared statements with MySQLi or PDO, as prepared statements automatically escape strings. That will help stop injection attacks.

For output, always run user-supplied data through a function that will swap HTML special characters (such as < and >) for their entity counterparts. htmlentities with ENT_QUOTES and UTF-8 should do the trick. It's easiest to do this right before echoing the data. That will help stop XSS attacks.

Always validate incoming data. Expecting a number? Form values always arrive as strings, so is_int will reject them; use ctype_digit, is_numeric, or filter_var with FILTER_VALIDATE_INT instead. Expecting an email address? Run it through filter_var with FILTER_VALIDATE_EMAIL. Want data to conform to a certain unique format? Use regular expressions and the preg functions.
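
As a rough sketch of those three checks: the field names and the part-number format below are assumptions made up for the example, and the array stands in for $_POST so the snippet runs on its own.

```php
<?php
// Sample values standing in for $_POST fields (hypothetical field names).
$input = array(
    'quantity' => '42',
    'email'    => 'matt@example.com',
    'part_no'  => 'AB-1234',
);

// filter_var() returns the converted value on success and false on failure.
$quantity = filter_var($input['quantity'], FILTER_VALIDATE_INT);
$email    = filter_var($input['email'], FILTER_VALIDATE_EMAIL);

// Assumed format for illustration: two capital letters, a dash, four digits.
$partOk = preg_match('/^[A-Z]{2}-\d{4}$/', $input['part_no']) === 1;

var_dump($quantity, $email, $partOk);

// A value that is not a whole number fails validation.
var_dump(filter_var('42abc', FILTER_VALIDATE_INT));
```

Compare the result of FILTER_VALIDATE_INT with === false rather than a plain if, because a valid 0 is also "falsy".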

Matt Ridge · December 2, 2011

Ok, sorry for the late reply.

Basically, what I am attempting to do is make sure that a form I'm putting online, although password protected, can't be attacked in other ways.

I'm asking about sanitizing because it seems everyone who knows how to do it has a different way of going about it.

Let's just put it this way...

I have simple script:

<?php
require_once('connectvars.php');


// Defaults so neither variable is undefined further down
$Added_By = '';
$output_form = 'no';

if (isset($_POST['submit'])) {
	$Added_By = $_POST['Added_By'];

	if (empty($Added_By)) {
		echo 'Please fill out all of the required NCMR information.<br />';
		$output_form = 'yes';
	}
}
else {
	$output_form = 'yes';
}

//Access the Database
if (!empty($Added_By)) {
	$dbc = mysqli_connect(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME)
	or die('Error connecting to MySQL server.');

$query = "INSERT INTO database (Added_By) VALUES ('$Added_By')";

mysqli_query($dbc, $query)
      or die ('Data not inserted.');


// Confirm success with the user
  echo '<tr><td class="thank">';
      echo '<p>Thank you for entering data into the database.</p>';
      echo '<p><a href="index.php"><< Back to the form</a></p>';
  echo '</td></tr>';

mysqli_close($dbc);
  }
if ($output_form == 'yes') {

	echo '<form method="post">';
		echo '<fieldset>';
				echo '<div id="ab"><span class="b">Added By:  </span><input type="text" name="Added_By" value="" /></div>';
echo '</fieldset>';
echo '</form>';

}
?>

Now I know this is unsanitized, and unprotected from hacks. Reading what I posted in the OP, I'm asking how do I make this work so I can use said information in a situation like posted above.

Thanks.

KevinM1 · December 2, 2011

Well, for starters, you'd need to validate $Added_By. Is it only supposed to contain letters? Letters and numbers? Any special characters? See, that's what we mean when we say that form validation and security is dependent on context. What are you requiring the value to be?

The one sure thing I can say is that you want to pass $Added_By into mysqli_real_escape_string (along with your $dbc connection) before using it in your db query. Like I said before, all text that's being used by any query needs to be properly escaped. Since you're not using prepared statements in your example, you need to do it manually with that function.

Matt Ridge · December 2, 2011

Can you explain what you mean by properly escaped?

The site I posted originally explains how to do it, but not really...

hence:

<?php

/*** our string ***/
$string = "кириллица";

/*** echo the sanitized string ***/
echo filter_var($string, FILTER_SANITIZE_SPECIAL_CHARS, FILTER_FLAG_STRIP_LOW);

?>

Matt Ridge · December 2, 2011

You know, the more I read the more I get confused at times...

<?php
$stmt = $dbh->prepare("INSERT INTO REGISTRY (name, value) VALUES (:name, :value)");
?>

What does "$dbh->" stand for? I'm reading this:

http://php.net/manual/en/pdo.prepared-statements.php

and it goes straight into using "$dbh->" without explaining what it means.

KevinM1 · December 2, 2011

That's not escaping.

http://en.wikipedia.org/wiki/Escape_character#Programming_and_data_formats

Essentially, escaping modifies the behavior of certain characters. Why is that important? Well, in SQL, certain characters do certain things, and if they were just blindly inserted into a query, they could have disastrous results. Classic example:

anything' OR 'x'='x

Looks pretty innocent, but let's say that was entered into a form, and you have the following query:

SELECT * FROM table WHERE email = '$input'

The actual executed query would look like:

SELECT * FROM table WHERE email ='anything' OR 'x'='x';

Since 'x'='x' always results in true, every record will be retrieved. It seems like an innocent mistake, but these kinds of attacks - SQL Injection attacks - are very dangerous. They can grant an attacker access to the entire database, and all the sensitive information within. SQL injection was reportedly behind some of the Sony breaches earlier this year.

So, how does escaping thwart injection? It turns

anything' OR 'x'='x

into

anything\' OR \'x\'=\'x

which turns the example query into

SELECT * FROM table WHERE email = 'anything\' OR \'x\'=\'x'

which is a value not found in the database.

Now, why can't you simply use addslashes to escape string data? addslashes() does not escape every character that is special to the database, and it knows nothing about the character set of your connection. Therefore, it's always best to use the escape method relevant to the kind of database you're using.
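
To see the problem without touching a database, here is a small sketch that only builds the query text. The users table and the function name buildUnsafeQuery are placeholders invented for the illustration.

```php
<?php
// Build the query the unsafe way, by dropping raw input into the SQL text.
// No database is contacted; this only shows the string that would be sent.
function buildUnsafeQuery($input)
{
    return "SELECT * FROM users WHERE email = '$input'";
}

// Ordinary input produces the query you had in mind.
echo buildUnsafeQuery('bob@example.com'), "\n";

// The injection string closes the quote early and adds its own condition.
echo buildUnsafeQuery("anything' OR 'x'='x"), "\n";
```

The second line printed is a different query from the one the programmer wrote. Running the value through the database's escape function, or using a prepared statement, keeps the quote as part of the data instead.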

---

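$dbh is an object: in the manual's examples it holds the PDO database connection (PDO stands for PHP Data Objects), and -> calls a method on that object. If you're not familiar with object oriented programming, use MySQLi's procedural functions instead.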

Matt Ridge · December 2, 2011

Ok, I hate sounding like an idiot, because I know I'm going to... but how do you know where to put the "\"?

I have a script I am working on...

<?php
//This is a script that will identify the OS and browser you are using. This also has a fix in where Chrome shows up as Chrome, and not show up as Safari by accident. 



//Booleans to set OS and Browser to False.
$os = false;
$browser = false;
//Booleans for Web Browser & OS Functions.
$info = $_SERVER['HTTP_USER_AGENT'];
$xp = '/Windows NT 5.1/';
$vista = '/Windows NT 6.0/';
$win7 = '/Windows NT 6.1/';
$ubuntu = '/Ubuntu/';
$ie9 = '/ie9/';
$ie8 = '/ie8/';
$chrome = '/Chrome/';
$safari = '/Safari/';
$firefox = '/Firefox/';


//Operating Systems
if (stristr($info, "Windows NT 5.1")) {echo 'You are using a Windows XP Operating System ';}
if (stristr($info, "Windows NT 6.0")) {echo 'You are using a Windows Vista Operating System ';}
if (stristr($info, "Windows NT 6.1")) {echo 'You are using a Windows 7 Operating System ';}
if (stristr($info, "Ubuntu")) {echo 'You are using an Ubuntu Operating System ';}
if (stristr($info, "Mac OS")) {echo 'You are using a Macintosh Operating System ';}


//Web Browsers
if (stristr($info, "Chrome") !== FALSE) {stristr($info,"Safari");
	$chrome = 'Chrome';
		echo 'with a Chrome Web Browser ';}
elseif (stristr($info, "Safari")) {echo 'with a Safari Web Browser ';}
if (stristr($info, "Firefox")) {echo 'with a Firefox Web Browser ';}




//If OS or Browser not found in list.
if ($ubuntu || $xp || $vista || $win7)
$os = true;

if($firefox || $chrome || $safari || $ie9 || $ie8)
$browser = true;

if(!$browser || !$os){

echo'<strong>';
echo '<br />' . $_SERVER['HTTP_USER_AGENT'] . '<br /><br />Administrator someone in your work force is using an unsupported browser or OS, please email this information to the developer of the NCMR software you are using. It will allow your browser/OS combination  to be used correctly. Sorry for the inconvenience.</strong> <br /><br />Please copy and paste the text above and send it to your web administrator. It will explain everything he/she needs to do.<br />';}

?>

Is this doing what you suggested? Sorry if I am sounding like I'm not getting it, but I really don't understand how you know how to escape properly. You are explaining things to me that don't make sense. I understand 'x'='x', but how does that help me with real world scripting?

I hate to say this but I learn by example, not theory... it is one of the reasons I posted the script, because if I see what you mean being used in action, I have a greater chance of knowing what you are talking about.

I don't mean to sound like I don't appreciate your help, or not understand what you are saying, or leaning on a disability to make excuses, but I am dyslexic, and I do have a style of learning that works for me... and examples without really understanding why they are the way they are confuses the dickens out of me.

Sorry...

KevinM1 · December 2, 2011

Okay, here's a canned example:

// database connection and table selection code goes here

if (isset($_POST['submit']))
{
   if (!empty($_POST['email']))
   {
      $email = filter_var($_POST['email'], FILTER_VALIDATE_EMAIL);
   }
   else
   {
      $email = false;
   }

   if ($email)
   {
      // mysql_* functions were removed in PHP 7; with mysqli, use
      // mysqli_real_escape_string($dbc, $email) and the matching mysqli_* calls.
      $email = mysql_real_escape_string($email);
      $query = "SELECT * FROM table WHERE email = '$email'";
      $result = mysql_query($query);

      if (mysql_num_rows($result) == 1)
      {
         $row = mysql_fetch_assoc($result);
         echo "User name: " . $row['username'];
      }
      else
      {
         echo "ERROR: no account, or more than one account, matches that email address.";
      }
   }
   else
   {
      echo "ERROR: invalid email.";
   }
}

El Chupacodra · December 2, 2011

Matt, you don't have to insert the backslashes yourself.

The escape function does that for you, and that is what we mean by escaping it.

It disarms the potentially dangerous SQL Code that can be abused to enter your DB.

SQL injection is how hackers get into a lot of sites so it's important to protect yourself.

Matt Ridge · December 2, 2011

So what this does is dump the email address after it has been inputted?

Or is it because you put a filter_var in front of the post, it will now use the FILTER_VALIDATE_EMAIL command to allow only what is inputted at the email address field, and nothing before and after it?

Isn't that what real_escape_string is for though?

Just curious...

Matt Ridge · December 2, 2011

But you need to add the slashes into the code somewhere to make PHP utilize them, correct?

scootstah · December 2, 2011

Sanitizing isn't always inputs-only, as was suggested earlier in the thread. You should sanitize input so that it is in the expected format and safe to store. So, knock off SQL injections and validate UTF-8, and that's all you really need at that stage.

Then sanitize HTML and such at output. In this way, your data is more flexible. You can use it in different ways. If, in 6 months, you decide you actually did want to allow HTML then all you need to do is modify your outgoing sanitation filters.

Of course this isn't necessarily always the best way to go about it. For example, if you have a ton of content being outputted then sanitation on every page load may slow things down a bit (of course caching is usually an option too).

At any rate, there is no end-all, be-all solution for sanitation. It is absolutely situational.

EDIT:

The slashes are to prevent SQL injection. Let's say your SELECT query looks like this:

SELECT * FROM users WHERE username='$username' AND password='$password'

Now, let's say $username and $password just came directly from a POST form. So, they may contain ANYTHING that a user might have put in them. Having not sanitized them, we really have no idea what they hold, so we are trusting the user blindly.

Now let's say that $username actually contains this value: "bob';-- " (in MySQL, -- followed by a space starts a comment, so anything after it is dropped from the query in this example).

So now when we execute the query, this is what it actually executes:

SELECT * FROM users WHERE username='bob';-- (the rest of the query is ignored)

Now anyone can login as "bob" without a password (substitute "bob" with an admin and you should be able to see why this is problematic).

So to combat this we can escape the input. When the input is escaped, special characters (like quotes) will act as literally that character instead of part of the code. So if we escape our $username and $password with mysql_real_escape_string(), then our actual executing query would then look like this:

SELECT * FROM users WHERE username='bob\';--' AND password='hispassword'

Now, the username is just "bob\';--" and is completely harmless, since it wasn't allowed to alter the actual query.

It should be noted that if you want to avoid this problem altogether, you should be using prepared statements. You can do this using either mysqli or PDO. I won't confuse you with the details, but basically you don't even need to sanitize against SQL injection because queries are made safe internally. I would recommend you read up on the mysqli functions; they are far better than the standard mysql ones.

Hope this helped.

Matt Ridge · December 2, 2011

Ok, so what I'm asking is if sanitation is situational, is there a way to define what type of sanitation works where? As I've said before at the beginning everyone seems to sanitize differently for the same type of code.

Posting, $='1', etc...

There has to be some place beyond http://php.net/, that shows in real world situations how to sanitize... in plain English.

Or is this just stuff that people take for granted and say ok, I put that there, and hope it does what it's meant to... to me this stuff is like salt... I really don't understand why the code works, but being told to put it in there and know that it works really isn't a great way of knowing how things work...

Matt Ridge · December 2, 2011

I guess what I am getting stuck at is what are prepared statements, and how do I use them to sanitize correctly.

People show me php.net which is fine, but it really doesn't explain to me how it works because I don't understand what they are really doing...

scootstah · December 2, 2011

Read my edit on that post.

Basically, you sanitize for expected input. When you take input from someone, you always assume it is going to be used maliciously. You never think "oh, someone won't think to do that!". You always think "well that guy is shady, he's going to hack my database" and then you take paranoid steps to prevent it.

You always check that your input is the way you expected it to be. Generally sanitation refers to making input safe. So if you were to sanitize for SQL injections, you would escape the data. If you were sanitizing for XSS attacks, you would remove all HTML. There are lots of solutions for specific sanitation methods such as these, but you just have to know where to use what, based on what you want the input to be.
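
One way to picture "where to use what" is to take a single piece of untrusted text and prepare it for different destinations. This is a sketch only; the sample text and the search.php URL are invented for the example.

```php
<?php
// One piece of untrusted text, prepared for three different destinations.
$comment = 'Tom & "Jerry" <b>rule</b>';

// HTML body or attribute: convert &, <, >, " and ' to entities.
$forHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// URL query string value: percent-encode everything that is not URL-safe.
// search.php is a hypothetical page name.
$forUrl = 'search.php?q=' . urlencode($comment);

// JSON payload: json_encode() adds the quoting and escaping JSON needs.
$forJson = json_encode(array('comment' => $comment));

echo $forHtml, "\n", $forUrl, "\n", $forJson, "\n";
```

The stored value never changes; only the encoding applied at the point of use does. For a database query the matching step is the escape function or a prepared statement.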

Matt Ridge · December 2, 2011

How do you remove the HTML? That seems counterproductive.... or are you talking about strip_tags()?

scootstah · December 2, 2011

Okay, this is an example of inserting data with a prepared statement.

<?php

$mysqli = new mysqli('localhost', 'root', 'root', 'database');

$username = $_POST['username'];
$password = $_POST['password'];

$query = "INSERT INTO users (username, password) VALUES (?, ?)";

$stmt = $mysqli->prepare($query);

$stmt->bind_param('ss', $username, $password);

$stmt->execute();

$stmt->close();

So any variable data that needs to be put in the query is replaced by ?'s. Then we bind values to these placeholders. The values are passed to the database separately from the SQL text, so they can't change the structure of the query and you don't even need to worry about SQL injection.

If you aren't comfortable using OOP code, the entire mysqli library is also available in procedural form (although I find the syntax a little uglier).
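
The same placeholder idea can be tried without a MySQL server. The sketch below swaps in PDO with an in-memory SQLite database purely so it runs anywhere; it assumes the pdo_sqlite extension is installed, and the ncmr_log table is a made-up stand-in for the real one.

```php
<?php
// Assumption: the pdo_sqlite extension is available. An in-memory SQLite
// database stands in for MySQL so the sketch needs no server.
// $dbh is a PDO object; -> calls one of that object's methods.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE ncmr_log (added_by TEXT)');

// Hostile-looking input, standing in for $_POST['Added_By'].
$addedBy = "Robert'); DROP TABLE ncmr_log;--";

// The SQL text contains only a placeholder; the value travels separately.
$stmt = $dbh->prepare('INSERT INTO ncmr_log (added_by) VALUES (:added_by)');
$stmt->execute(array(':added_by' => $addedBy));

// The table still exists and holds the text exactly as it was typed.
$count  = (int) $dbh->query('SELECT COUNT(*) FROM ncmr_log')->fetchColumn();
$stored = $dbh->query('SELECT added_by FROM ncmr_log')->fetchColumn();

var_dump($count, $stored === $addedBy);
```

With MySQL the only change would be the connection string passed to new PDO; the prepare and execute calls stay the same.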

Matt Ridge · December 2, 2011

I am using mysqli... sorry...

I do have a question though. I see people use -> in their code all the time; what does that actually do or mean?

scootstah · December 2, 2011

By remove the HTML, I mean the actual tags. So that if users are allowed to insert data that is then read by other users, they can't use HTML maliciously (or annoyingly). For example if they post a comment and in the comment they write "<script type="text/javascript">alert('muahaha im malicious');</script>" then anyone who reads this comment is going to get an alert box. While this example is relatively harmless, there are many worse things that can happen.

So to sanitize for HTML you can convert it to entities (so things like < become &lt;, which don't actually parse as HTML). You can do that with either PHP's htmlentities or htmlspecialchars functions. The input would then be this:

&lt;script type=&quot;text/javascript&quot;&gt;alert('muahaha im malicious');&lt;/script&gt;

and thus anyone reading it would literally see "<script type="text/javascript">alert('muahaha im malicious');</script>" and it wouldn't execute as code.

Now, generally I prefer to sanitize HTML on output. This way if I ever change my mind about the data, or if there are multiple output formats, I don't have to worry. To do this you just use the above method but on output instead of input.
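Here is a minimal sketch of what sanitizing on output could look like. The comment is kept exactly as it was typed, and it only gets escaped at the point where it is printed, in whatever way that output format needs. The helper name html_out() is made up for this example, and the JSON case is just one possible second format.

```php
<?php
// Raw comment, stored exactly as the user typed it (nothing stripped on input).
$comment = '<script type="text/javascript">alert(\'muahaha im malicious\');</script>';

// Hypothetical helper: escape a value for display inside an HTML page.
function html_out($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// HTML output: the tags are turned into harmless entities
echo '<p>' . html_out($comment) . '</p>' . "\n";

// JSON output (say, for an API): same stored data, different escaping
echo json_encode(array('comment' => $comment)) . "\n";
```

If you had run htmlspecialchars before saving instead, the JSON consumer would get entities it never asked for, which is the flexibility argument in a nutshell.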

Matt Ridge · December 2, 2011

So in other words you are talking about sanitizing upon posting?

scootstah · December 2, 2011

I am using mysqli... sorry...

Ahh, my mistake. Good then, you are already halfway there.

I do have a question though: I see people use -> in their code all the time. What does that actually do or mean?

The "->" is called the object operator. It is used in OOP (object-oriented programming). I'm probably not going to explain this very well, so I apologize for that. However, I'm sure you could Google better explanations of OOP.

Basically, this is how it is used.

class MyClass
{
    public $a_variable = 'im a variable';

    public function doSomething()
    {
        echo 'im doing something';
    }
}

$MyClass = new MyClass; // instantiate the class
// now $MyClass is an object. You can see this by doing:
// var_dump($MyClass);

// So to use the object we use the object operator, ->

// If we want to use the doSomething() method, we do this
$MyClass->doSomething();

// basically this says "use the doSomething method (function) inside the MyClass object"

// We can also access (public) variables inside the class
echo $MyClass->a_variable;

So in other words you are talking about sanitizing upon posting?

Yes, by output I mean sanitize while posting. HOWEVER, while that may be the case for sanitizing HTML - which is only an issue on output (when people view it) - sanitizing after input may not always be the right thing to do. It depends entirely on what you are doing with your data, and what the data is.

EDIT: Wait, no. Posting would refer to input. I mean sanitize on OUTPUT, which would be when the user views the page and the data is retrieved from the database.
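Applied to the admin table from the first post, it might look roughly like this. This is only a sketch: the e() and render_rows() names are made up, and the $rows array stands in for what mysqli_fetch_array() would hand back, so that the example runs without a database.

```php
<?php
// Hypothetical helper: escape a value for HTML output.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Builds the table rows; every value from the database is escaped as it is printed.
function render_rows(array $rows)
{
    $html = '';
    foreach ($rows as $row) {
        $html .= '<tr class="ncmrdata">';
        $html .= '<td>' . e($row['NCMR_ID']) . '</td>';
        $html .= '<td>' . e($row['Nexx_Part_Description']) . '</td>';
        // values placed in a URL get urlencode() first, then e() for the attribute
        $html .= '<td><a href="editncmr.php?id=' . e(urlencode($row['id'])) . '">Edit</a></td>';
        $html .= '</tr>' . "\n";
    }
    return $html;
}

// Stand-in data (assumption): one row with HTML in the description
$rows = array(
    array('id' => 1, 'NCMR_ID' => 'N-100', 'Nexx_Part_Description' => 'Bracket <b>bold</b>'),
);

echo render_rows($rows);
```

The data in the table stays untouched; only the copy that gets echoed into the page is escaped.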

Matt Ridge · December 2, 2011

Ok, I've seen things like this before:

<?php
$_GET = sanitize($_GET);
$_POST = sanitize($_POST);
?>

Now I know these have been "sanitized" sort of... but shouldn't these also have mysql_real_escape_string in there as well?
