Hi

 

This is more of a general question - obviously when we have user input in our applications, the traditional defence against XSS and Injections and all that kind of thing is to validate user input (and add slashes) and then encode its output each time. This is as opposed to sanitisation. So the age-old question:

What's really wrong with encoding the input, rather than the output? Then you don't have to encode data each time you call it!

Say this is what my clean function would look like

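<?php
// Encode a user-supplied value for HTML before it is stored.
function clean($dirty) {
    // Treat a FALSE value as an empty string.
    if ($dirty === false) {
        return '';
    }
    // Convert HTML-significant characters (including both quote styles) to entities.
    $dirty = htmlentities($dirty, ENT_QUOTES, 'UTF-8');
    // Strip leading and trailing whitespace.
    return trim($dirty);
}
?>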

 

and I run that on all user input before entering it into the DB. Why is this considered bad practice?

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/

By doing that you are potentially limiting what you can do with the data, because you have mangled it. It's usually preferred to maintain data integrity and only adjust the data when it's needed. For example, there might be a format where you want a literal < instead of &lt;. Sure, you could just reverse the process, but you can't be sure that the data will be exactly as it was.
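
To make the "can't be sure" part concrete, here is a rough sketch that runs a few values through the clean() function from the first post and then through an inverse. The inverse, unclean(), is a hypothetical name used only for illustration, and the sample strings are made up. It assumes the flags shown (ENT_QUOTES only, no ENT_SUBSTITUTE or ENT_IGNORE).

```php
<?php
// clean() as posted above, same behaviour.
function clean($dirty) {
    if ($dirty === false) {
        return '';
    }
    return trim(htmlentities($dirty, ENT_QUOTES, 'UTF-8'));
}

// Hypothetical inverse, for illustration only.
function unclean($stored) {
    return html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
}

$samples = [
    'plain'   => 'a < b',                // survives the round trip
    'padded'  => "  indented code\n",    // trim() removes the outer whitespace
    'invalid' => "caf\xE9",              // Latin-1 byte, not valid UTF-8
];

foreach ($samples as $name => $original) {
    $restored = unclean(clean($original));
    printf("%-8s round-trips exactly: %s\n", $name, var_export($restored === $original, true));
}
```

With these flags, htmlentities() returns an empty string when the input is not valid in the given encoding, so the third value is lost entirely rather than just altered. Whether that matters depends on where your input comes from.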

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1355785

Hi; thanks for the responses! Just curious though...

 

I agree there could be these situations, but then surely the inverse of the clean function would return the data to its original form (minus any extraneous whitespace)? htmlentities is a one-to-one mapping, as is html_entity_decode, so any 'mangled' data can be returned to its original form and then used in such situations. To my mind, it's just as easy (if not easier) to encode the data on input, rather than having to encode it each time on output, and simply run an unclean() function on the odd occasion you need to create an RSS feed or whatever.

 

This is all, of course, as matters appear to my naïve mind... I could perhaps be completely wrong, haha. Essentially, all I'm trying to say is that for every argument made for NOT encoding data on input, the same can be said for the converse.

 

Thanks!

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1355982

Hmm, I'm not entirely convinced that's a brilliant solution though - and what about those of us who don't template?

idk, it seems there's not really any strong YOU MUST NOT DO THIS OR APACHE WILL EXPLODE kind of reasoning behind not sanitising. As long as you remember to do it for every input, and have a method for getting the original data, you can't go wrong!

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1355989

Yeah. As long as you remember to convert back from HTML every time you want to use something like strlen() or string replacing or RSS feeds or someone's API or... Because it's easier that way!
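
A quick sketch of what that looks like in practice, assuming the value was stored after htmlentities() as in the clean() function above. The sample string is made up.

```php
<?php
$raw    = 'Tom & Jerry';
$stored = htmlentities($raw, ENT_QUOTES, 'UTF-8');   // "Tom &amp; Jerry"

// Length checks now count entity characters, not what the user typed.
echo strlen($raw), "\n";      // 11
echo strlen($stored), "\n";   // 15

// Truncating the stored value can cut an entity in half.
echo substr($stored, 0, 6), "\n";   // Tom &a

// Searching for text the user actually typed no longer matches.
var_dump(strpos($stored, '& J'));   // bool(false)

// Encoding again by accident (e.g. on an edit form) double-encodes.
echo htmlentities($stored, ENT_QUOTES, 'UTF-8'), "\n";   // Tom &amp;amp; Jerry
```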

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1355994

I earnestly believe it to be, yes. I find you actually "need" the original data in those sorts of situations far less often than you just output it. In that scenario, the same argument can be made: you need to remember to encode on every. single. output. each. time.

In any case, if it's just a question of convenience, then imo both methods are valid, and it's just down to preference and what your application calls for.

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1355999

Most people like to keep user data as close to its original form as possible. Store it the way the user enters it, and then modify it as needed only when you need to. The only time you need to sanitize for XSS is when outputting the data as HTML. There are tons more ways to use data that don't require it to be sanitized for XSS.

 

Ultimately, it is your decision whether you sanitize on input or output. It will work either way, but if you sanitize on input then you are only creating more work for yourself.
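
As a minimal sketch of the store-raw, encode-on-output approach: the helper name e() and the sample comment are made up for illustration, and it uses htmlspecialchars() rather than the htmlentities() call from the first post, since escaping the HTML-significant characters is enough when the page itself is served as UTF-8.

```php
<?php
// Hypothetical helper; htmlspecialchars() is the built-in doing the work.
function e($value) {
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Stored exactly as the user typed it.
$comment = '<script>alert("hi")</script> & more';

// HTML output: encode for HTML at the point of output.
$html = '<p>' . e($comment) . '</p>';

// JSON output (an API, say): json_encode() does its own escaping.
$json = json_encode(['comment' => $comment]);

// URL output: a different escaping scheme again.
$url = '/search?q=' . rawurlencode($comment);

// String operations see the real text, no decoding step needed.
$length = strlen($comment);

echo $html, "\n", $json, "\n", $url, "\n", $length, "\n";
```

Each output context gets the escaping it needs, and none of them has to undo an earlier HTML encoding first.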

 

 

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1356002

If the original data has been manipulated before being stored, you can never be certain that anything you do will produce the same data as it was in its original form.

 

Why? If html_entity_decode is simply the exact reverse of htmlentities, the only things lost from the original data would be whitespace or unsuitable characters... which we wouldn't want to store anyway.

 

 

Of course if your application has to perform lots of operations on the stored unencoded data, then yes it would make sense to store it as such. But I think for most applications (forums, blogs etc), storing it encoded seems to be a much more hassle-free way of doing it.

 

Anyway, I appreciate the arguments you've made, thanks. For large scale applications, I will store it unencoded, but as long as I know all I'm doing with the data is displaying it, I can't see any reason not to store it encoded.

 

Thanks!

Link to comment
https://forums.phpfreaks.com/topic/264541-handling-xss/#findComment-1356005
