chrisrulez001, posted April 4, 2016 (edited April 4, 2016):

Hi there,

It's been a few months since I've touched PHP. I've read that you should only use htmlspecialchars() when outputting data (for example, from a database). Is that the correct way of doing it?

But to prevent XSS from getting into the database from the form, could you not use preg_match() to whitelist what can actually be entered into the field?

Thanks
Jacques1, posted April 4, 2016 (marked as solution):

XSS has nothing to do with databases or input validation. It's an output problem caused by programmers who naïvely insert data (from any source) into HTML contexts.

This cannot be solved with validation, because

- formal validity doesn't mean that the data is safe in every possible context. For example, I could give you a perfectly valid e-mail address which is an XSS vector at the same time. Why? Because the format of e-mail addresses was never meant to protect web applications from XSS attacks. Why should it?
- it's impossible to predict the context in which the data will be used. There's not just HTML. There are thousands of different languages and data formats with distinct syntax rules, and the data may be a threat to every single one of them.
- a lot of data cannot be validated at all. For example, how would you “validate” the posts on this forum? We obviously have to write down HTML markup and JavaScript code all the time. That's the whole point of this site.

XSS must be prevented during the HTML rendering process. The best solution is to use a proper template engine like Twig which automatically applies HTML-escaping to all outbound data. The second-best solution is to write a wrapper for the htmlspecialchars() function. Using htmlspecialchars() directly is not recommended, because it's extremely error-prone. In my experience, almost nobody understands how to use it correctly.

In addition to HTML-escaping, you should use Content Security Policy. This allows you to define strict rules for JavaScript execution and block many attacks.
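To make the "second-best solution" concrete, here is a minimal sketch of what such a wrapper might look like, together with a basic Content Security Policy header. The function name html_escape(), the UTF-8 default and the policy string are assumptions made for illustration; none of them come from the thread, and a real policy has to be tailored to the site.

```php
<?php
// Hypothetical wrapper: the name html_escape() and the UTF-8 default are
// assumptions for this sketch, not something defined in the thread.
function html_escape(string $raw, string $encoding = 'UTF-8'): string
{
    // ENT_QUOTES escapes single and double quotes; ENT_SUBSTITUTE replaces
    // invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, $encoding);
}

// Example policy only (an assumption): load resources from the same origin
// and disallow plugins. Inline scripts are blocked by this policy.
$examplePolicy = "default-src 'self'; object-src 'none'";

// Headers must be sent before any output, so the guard skips the call
// when output has already started.
if (!headers_sent()) {
    header('Content-Security-Policy: ' . $examplePolicy);
}

// A syntactically valid e-mail address (quoted local part) that is also
// an XSS vector when inserted into HTML without escaping.
$email = '"<script>alert(1)</script>"@example.com';

$html = '<p>Contact: ' . html_escape($email) . '</p>';
echo $html, PHP_EOL;
```

With the wrapper, the address is rendered as harmless text (&quot;&lt;script&gt;...), while the flags and the encoding are fixed in one place instead of being repeated at every call site. A template engine with automatic escaping, as recommended above, remains the more robust choice.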
chrisrulez001 (author), posted May 2, 2016:

OK, thank you for your informative post, Jacques1. I'll have a look at Twig and at implementing a Content Security Policy.

With regard to htmlspecialchars(), I see from your other post that you use ENT_QUOTES | ENT_SUBSTITUTE. Are these the best flags to use?
Jacques1, posted May 4, 2016 (edited May 4, 2016):

The ENT_QUOTES flag is crucial for security. If you leave it out, only double quotes are escaped, so single-quoted attributes aren't safe at all.

ENT_SUBSTITUTE isn't security-related, but it's still important for Unicode encodings (like UTF-8). By default, htmlspecialchars() returns an empty string if the input contains an invalid byte sequence. That's usually not what you want. A more reasonable approach is to substitute the invalid bytes with the Unicode replacement character while leaving the rest of the input intact. And that's what ENT_SUBSTITUTE does.
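The effect of both flags can be checked with a short script. This is only an illustrative sketch: the sample strings and variable names are made up for the demonstration. The flags are passed explicitly on every call because the defaults described above are those of the PHP versions current at the time of the thread; PHP 8.1 changed the default flags to include ENT_QUOTES and ENT_SUBSTITUTE.

```php
<?php
// Attacker-controlled value meant for a single-quoted attribute.
$attr = "' onmouseover='alert(1)";

// ENT_COMPAT escapes double quotes only, so the single quotes survive.
$compat = htmlspecialchars($attr, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both quote styles.
$quotes = htmlspecialchars($attr, ENT_QUOTES, 'UTF-8');

// "\xE9" is a Latin-1 "é", which is not a valid byte sequence in UTF-8.
$broken = "caf\xE9 menu";

// Without ENT_SUBSTITUTE the whole result is an empty string.
$strict = htmlspecialchars($broken, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE the invalid byte becomes U+FFFD and the rest is kept.
$lenient = htmlspecialchars($broken, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($compat, $quotes, $strict, $lenient);
```

In the first case the value comes back unchanged and would still break out of a single-quoted attribute; in the second, each single quote is turned into &#039;.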
chrisrulez001 (author), posted May 4, 2016:

OK, thanks for your help.