
htmlspecialchars vs FILTER_SANITIZE_SPECIAL_CHARS


MFA
Solved by AyKay47


I have a form where external input by users will be fed into a MySQL database and I obviously need to sanitize this input.

 

I don't quite understand the differences between the htmlspecialchars function and the FILTER_SANITIZE_SPECIAL_CHARS filter. Which is better to use in this scenario? For FILTER_SANITIZE_SPECIAL_CHARS, I have also used FILTER_FLAG_STRIP_HIGH.

Thanks.


The idea of htmlspecialchars is that if you store something in your database that will be displayed in another user's browser (let's say user A puts in a job description and user B reads it), you don't want the data to contain something like <script src="http://malware.com/infect.php" />. Sanitizing with htmlspecialchars will convert it to &lt;script....

htmlspecialchars won't protect against SQL injection attacks. As AyKay47 said, use mysqli_real_escape_string for that.
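
As a rough illustration of that conversion, here is a minimal sketch. The variable names are made up for this example, and ENT_QUOTES and the charset are passed explicitly so the result does not depend on the PHP version's defaults:

```php
<?php
// Hypothetical job description submitted by user A.
$jobDescription = '<script src="http://malware.com/infect.php"></script>';

// Escape at output time, right before the value is placed into HTML.
$safe = htmlspecialchars($jobDescription, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// Prints: &lt;script src=&quot;http://malware.com/infect.php&quot;&gt;&lt;/script&gt;
```

User B's browser then shows the tag as literal text instead of running it.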


First off, standard notice: You're not stating what library you're using to connect to your MySQL database, but since AyKay mentioned the old (and no longer maintained) mysql library...

You should be using either mysqli or PDO to connect to your database, as both of them are actively developed and contain all of the new(ish) features that the old mysql library is missing. Not to mention that, since it's no longer developed, it is insecure, and thus deprecated as of PHP 5.5.

 

Then, to your question.

The difference between FILTER_SANITIZE_SPECIAL_CHARS and htmlspecialchars is listed in the PHP manual, so I recommend reading up on it there for yourself.

 

That said, as the two above touched upon: you don't want to be using either prior to inserting the data into the database. Escaping output should only be done immediately before sending the content to a third-party system, and then only using the escaping method that is proper for that system.

Which means that when you add the data to the SQL query, you need to use either mysqli::real_escape_string (or PDO's equivalent) or prepared statements. The latter is recommended, as the database then takes care of handling the values properly for you. The HTML escaping methods, however, should only be used when adding data to HTML strings, or when you're echoing out content to the browser.

 

Escaping for the wrong system, or prematurely, will corrupt the original data and cause usability issues (at the very least). If you're really unlucky, it may make the data or the whole system unusable.
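
A minimal sketch of the prepared-statement route with mysqli might look like the following. It assumes $db is an already open mysqli connection; the function name, the `test` table and its `string` column are hypothetical and only serve the illustration:

```php
<?php
// Sketch only: $db is assumed to be an open mysqli connection, and the
// `test` table with a `string` column is a hypothetical schema.
function insertString(mysqli $db, string $value): bool
{
    // The placeholder keeps the value out of the SQL text entirely.
    $stmt = $db->prepare('INSERT INTO `test` (`string`) VALUES (?)');
    if ($stmt === false) {
        return false;
    }

    // "s" marks the parameter as a string; no manual escaping is needed.
    $stmt->bind_param('s', $value);
    $ok = $stmt->execute();
    $stmt->close();

    return $ok;
}
```

The original, unescaped value is what ends up in the table, so it can later be escaped for HTML (or anything else) at output time.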


Thanks for the replies, security issues and sql injections have always confused me and I really need to understand them. So, I've still got a few questions.
 


 

Okay, so you're saying if I have something like <Hello> in my database and I echo this out to my webpage, it wouldn't appear as I wanted it to as the < and > signs will be interpreted as HTML language. However, if I used htmlspecialchars the < and > signs will appear as I intended them to?

 

 

Hopefully you are not inserting html into the database, as this would be bad practice and a waste of storage space.

mysql_real_escape_string will make any data passed to it safe to use in an SQL statement by prepending any potentially harmful characters with a backslash.

 

Okay, I've been doing some reading. What happens if the hacker does something similar to the example posted under the heading "Just Escaping Strings Does Not Prevent SQL Injection" on this page (http://www.programmerinterview.com/index.php/database-sql/sql-injection-prevention/)?

 



I'm using mysqli (unfortunately, PDO is not supported by my web host). I have looked at both htmlspecialchars and FILTER_SANITIZE_SPECIAL_CHARS and they are both very similar in that they convert symbols such as < and > to html entities so they are displayed correctly and not mistaken for HTML, or am I mistaken?

Also, why won't a combination of htmlspecialchars and FILTER_SANITIZE_SPECIAL_CHARS work to protect against SQL injection? It would convert quotation marks that a hacker might use into a string of characters and prevent the hacker's code from functioning as intended.


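They are quite similar, yes, but not identical. According to the PHP manual, FILTER_SANITIZE_SPECIAL_CHARS HTML-encodes '"<>& and characters with an ASCII value below 32, which is not quite what htmlspecialchars does. For a 1 to 1 comparison, you should use FILTER_SANITIZE_FULL_SPECIAL_CHARS, which the manual describes as equivalent to calling htmlspecialchars with ENT_QUOTES set. However, htmlspecialchars is both shorter to write and more common, so I recommend using it.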
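
To see the difference for yourself, a small comparison along these lines can help. This is only a sketch; the test string and variable names are made up:

```php
<?php
// Made-up test string containing HTML meta-characters and a single quote.
$input = "<b>Tom & Jerry's</b>";

// The filter the original question asked about.
$special = filter_var($input, FILTER_SANITIZE_SPECIAL_CHARS);

// The filter the manual describes as matching htmlspecialchars + ENT_QUOTES.
$full = filter_var($input, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

// The plain function call, with quote handling made explicit.
$html = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Print all three side by side and compare the entities each one produces.
echo $special, "\n", $full, "\n", $html, "\n";
```

Passing ENT_QUOTES explicitly matters on older PHP versions, where htmlspecialchars leaves single quotes alone by default.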

 

That brings us to the next point, SQL injection prevention.

As stated, htmlspecialchars is for escaping output to an HTML parser, not a database engine. The DB engine doesn't understand HTML, and doesn't care about it either. What it does understand is SQL queries. SQL queries and HTML use quite different meta-characters, with only a few in common: quotes being the most obvious, and even that is somewhat conditional for HTML.

However, due to the other meta-characters (which HTML does not share) using HTML escaping methods for SQL queries will not protect you. Those meta-characters will go through htmlspecialchars unscathed, and thus be able to cause SQL injections.
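
A small sketch of how that plays out, assuming a hypothetical `users` table and a query built by string concatenation (both invented for this example):

```php
<?php
// Hypothetical attacker-controlled value for a numeric "id" parameter.
$id = '1 OR 1=1';

// None of these characters are HTML meta-characters, so nothing changes.
$escaped = htmlspecialchars($id, ENT_QUOTES, 'UTF-8');

// Building the query by concatenation still yields injected SQL.
$query = 'SELECT * FROM `users` WHERE `id` = ' . $escaped;

echo $query, "\n";
// Prints: SELECT * FROM `users` WHERE `id` = 1 OR 1=1
```

The condition now matches every row, even though the input went through htmlspecialchars first.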

 

The same applies the other way around, if you use SQL escaping methods to escape output going to a browser. They will not escape the < and > signs, meaning an attacker can easily perform HTML injection attacks (XSS etc).

Not only that, but you'll suddenly have a lot of slashes in places where there shouldn't be any. Which is quite annoying, at best.

 

This is why it's so important to know, and use, the proper method for the third-party system you're sending the data to. If you don't, you are still vulnerable. Not only that, but you've just mangled your data to boot.

 

So, I'll repeat: htmlspecialchars only immediately before adding content to the HTML output, and SQL escaping only immediately before adding stuff to SQL queries. Without overwriting the original values.

 

Examples:

// Test data.
$string = "Tim <3 Jenny O'Toole";

// SQL protection ($db is assumed to be an open mysqli connection):
$query = "INSERT INTO `test` (`string`) VALUES ('%s')";
$query = sprintf($query, $db->real_escape_string($string));

// HTML protection:
$outputMessage = "<p>" . htmlspecialchars($string) . "</p>\n";

Also, to add to CF's response: there should not be a need to escape HTML characters in data that is to be sent to a database, because HTML should never be inserted into a database, period.

Not only is it a waste of resources and storage space, but it makes filtering/sanitizing for both inserting and reading data from a database more complicated and leaves room for more human error, which typically equates to more security holes.

Business and presentation logic should always be separate from each other.


Okay, I now have a much better understanding of SQL injection attacks and what measures I can employ to try and prevent them. One final question: if I were to use prepared statements, should I be using bound-parameter prepared statements, bound-result prepared statements, or both? I would think just bound parameters; however, since I'm new to all this, I'm not sure whether using both would confer better protection. Thank you both for your help.


  • Solution

This question relies on the behavior of the application you are designing. Bound parameters are used when a dynamic SQL statement is used that relies on a certain set of data that needs to be sanitized. Bound results are typically used when your application requires a bound variable(s) of the result set to be passed to another part of the application, or when you simply want a more logical separation of the result set. Otherwise a simple static SQL statement with a call to fetch() is typically used.

 

To directly answer your question, using both bind functions will not confer better protection.
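
To make the distinction concrete, here is a rough sketch that uses both with mysqli. It assumes $db is an open mysqli connection; the function name and the `jobs` table with its `title` and `city` columns are hypothetical:

```php
<?php
// Sketch only: $db is assumed to be an open mysqli connection; the `jobs`
// table and its columns are hypothetical.
function findJobTitles(mysqli $db, string $city): array
{
    $stmt = $db->prepare('SELECT `title` FROM `jobs` WHERE `city` = ?');

    // Bound parameter: this is the part that guards against SQL injection.
    $stmt->bind_param('s', $city);
    $stmt->execute();

    // Bound result: purely a convenience for reading each row into $title.
    $stmt->bind_result($title);

    $titles = [];
    while ($stmt->fetch()) {
        $titles[] = $title;
    }
    $stmt->close();

    return $titles;
}
```

Dropping bind_result and reading the rows another way would not change how the user input is protected; only bind_param touches the incoming data.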

