One suggestion that kept recurring was to simply run the value through json_encode() and then inject the result into a script element. The JSON-encoding is supposed to (magically?) prevent cross-site scripting vulnerabilities. And indeed it seemingly works, because naïve attacks like trying to inject a double quote will fail.
Unfortunately, this approach doesn't work at all and is fundamentally wrong for several reasons:
- json_encode() was never intended to be a security function. It simply builds a JSON object from a value. And the JSON specification doesn't make any security promises either. So even if the function happens to prevent some attack, this is implementation-specific and may change at any time.
- The json_encode() function is not encoding-aware, which makes it extremely fragile and unsuitable for any security purposes. Some of you may know this problem from SQL-escaping: There used to be a function called mysql_escape_string() which was based on a fixed character encoding instead of the actual encoding of the database connection. This quickly turned out to be a very bad idea, because a mismatch could render the function useless (e. g. the infamous GBK vulnerability). So back in 2002(!), the function was abandoned in favor of mysql_real_escape_string(). Well, json_encode() is like the old mysql_escape_string() and suffers from the exact same issues.
Any of those issues can be fatal and enable attackers to perform cross-site scripting, as demonstrated below.
The entire “security” of json_encode() is based on side-effects. For example, the current implementation happens to escape forward slashes. But the JSON standard doesn't mandate this in any way, so this feature could be removed at any time (it can also be disabled at runtime). If it does get disabled, then your application is suddenly wide open to even the most trivial cross-site scripting attacks:
<?php header('Content-Type: text/html; charset=UTF-8'); $input = '</script><script>alert(String.fromCharCode(88, 83, 83));</script><script>'; ?> <!DOCTYPE HTML> <html> <head> <meta charset="utf-8"> <title>XSS</title> </head> <body> <script> var x = <?= json_encode($input, JSON_UNESCAPED_SLASHES) ?>; </script> </body> </html>
In XHTML, a script element works like any other element, so HTML entities like " are replaced with their actual characters (in this case a double quote). But JSON does not recognize HTML entities, so an attacker can use them to bypass json_encode() and inject arbitrary characters:
json_encode() blindly assumes that the input and the output should always be UTF-8. If you happen to use a different encoding, or if an attacker manages to trigger a specific encoding, you're again left with no protection at all:
<?php header('Content-Type: text/html; charset=UTF-7'); $input = '+ACIAOw-alert(+ACI-XSS+ACI)+ADsAIg-'; ?> <!DOCTYPE HTML> <html> <head> <meta charset="utf-7"> <title>XSS</title> </head> <body> <script> var x = <?= json_encode($input) ?>; </script> </body> </html>
(This particular example only works in Internet Explorer.)
I hope this makes it very clear that json_encode() is not a security feature in any way. Relying on it is conceptually wrong and simply a very bad idea.
It's generally not recommended to inject code directly into a script element, because any mistake or bug will immediately lead to a cross-site scripting vulnerability. It's also very difficult to do it correctly, because there are special parsing rules and differences between the various flavors of HTML. If you try it, you're asking for trouble.
If you're into micro-optimization and cannot live with the fact that Ajax may need an extra request, there's an alternative approach by the OWASP: You can JSON-encode the data, HTML-escape the result, put the escaped content into a hidden div element and then parse it with JSON.parse():