
Class to clean HTTP data


TomTees


I'm new to OOP and trying to write tiny classes to get practice.

 

The next class I want to create will be used to clean HTTP data from when a form gets submitted.

 

I'm embarrassed to say, but I'm scratching my head trying to figure out what types of things I should do as far as "sterilizing" POST and GET data?!  :shrug:

 

Can someone get me started here?

 

Thanks,

 

 

TomTees

 

 

 

 


It depends how that data will be used, really.

 

Well, that's sorta my question...

 

Are there some common "best practices" that you would always want to do regardless of the data, data-types, or how the data is used and put into a "utility class"?

 

I'm just not very well versed in the kinds of nefarious things people can do with the GET and POST arrays...

 

 

 

TomTees

 

 


Not a complete list, but I would start with the following.

 

 

Stop directory traversal.
Stop MySQL comments.
Stop Base64-encoded payloads.
Remove null characters (to stop sandwiching between ASCII characters).
Validate standard ASCII/UTF-16 character entities (make sure each one ends with a semicolon).
Decode URLs.
Make sure there are no tabs and/or spaces between words like j a v a s c r i p t, vb     script, etc.
Make xml/php tags safe, by converting to html entities.
Remove any disallowed javascript (esp. if they are in links), as well as javascript event handlers.
Remove naughty HTML elements (or change to html entities).
Remove naughty PHP function calls (like eval).
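
As a rough illustration of two items from the list above (directory traversal and null characters), here is a minimal sketch. The function name resolve_inside_base() is hypothetical, and the sketch assumes the requested file must already exist somewhere under a known base directory. It rejects bad input instead of trying to repair it.

```php
<?php
// Hypothetical helper: resolve a user-supplied relative path and make sure
// the result still lives inside $baseDir. Returns the real path, or null.
function resolve_inside_base(string $baseDir, string $userPath): ?string
{
    // Reject null bytes outright instead of trying to strip them.
    if (strpos($userPath, "\0") !== false) {
        return null;
    }

    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }

    // realpath() collapses "..", "." and symlinks; it returns false when
    // the target does not exist.
    $real = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($real === false) {
        return null;
    }

    // The resolved path must start with the base directory plus a separator.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($real, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $real;
}
```

A request such as "../" resolves to a directory outside the base and comes back as null, so the caller can refuse it.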

Of course you could put some checks in there for the proper data.

 

Such as strlen, numeric only, alpha numeric, or alpha only.
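
The checks mentioned above (strlen, numeric only, alphanumeric, alpha only) could be sketched roughly like this. The helper names are made up for illustration. Note that the ctype_* functions only look at single bytes, so these checks are effectively ASCII-only.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from any library.

// True when $value is a string whose byte length is within [$min, $max].
function has_length_between($value, int $min, int $max): bool
{
    if (!is_string($value)) {
        return false;
    }
    $length = strlen($value);
    return $length >= $min && $length <= $max;
}

// True when $value is a non-empty string made only of the digits 0-9.
function is_digits_only($value): bool
{
    return is_string($value) && ctype_digit($value);
}

// True when $value is a non-empty string of letters and digits only.
function is_alnum_only($value): bool
{
    return is_string($value) && ctype_alnum($value);
}

// True when $value is a non-empty string of letters only.
function is_alpha_only($value): bool
{
    return is_string($value) && ctype_alpha($value);
}
```

The is_string() guard matters because a form field can also arrive as an array (for example name[]=x), and the ctype_* functions are only meaningful for strings.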


Data should be escaped properly just prior to being used; the escape method depends entirely on the way in which the data is being used.

 

Data, however malicious the user intends it to be, is absolutely harmless while it's just sitting there in your variable.

 

<?php
$danger_var = "'; drop table users; --";
?>

 

$danger_var poses no threat if you echo it to a text file, echo it to a PDF, export it to an Excel sheet, or dump it to the user's browser.  The only time that variable's content is dangerous is if you use it in a database query unescaped.
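
To make the database case concrete, here is a small sketch that stores the same string safely. It uses a bound parameter in a PDO prepared statement, which is a different mechanism from escaping but serves the same purpose of keeping the value out of the SQL text. It assumes the pdo_sqlite extension is available; the in-memory SQLite database and the users table are stand-ins for illustration only.

```php
<?php
// Assumption: the pdo_sqlite extension is available. An in-memory SQLite
// database stands in for whatever database the application really uses.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');

$danger_var = "'; drop table users; --";

// The value is bound as a parameter, so it is never parsed as SQL.
$stmt = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$stmt->execute(array(':name' => $danger_var));

// The table still exists and holds the string exactly as submitted.
$stored = $pdo->query('SELECT name FROM users')->fetchColumn();
```

Nothing was stripped or rewritten on the way in, which fits the point above: the data itself is untouched, and only the way it is handed to the database changes.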

 

There really is no catch-all sanitization you can perform on POST data that makes it safe for all uses, except possibly undoing the effects of magic quotes if they're enabled.

 

On a similar note I've noticed some folks are of the mindset of "I've sanitized before inserting into my database so therefore I do not need to sanitize before sending to the user's browser."  Not true!

 

For example, before sending data to a user's browser you should always run it through htmlentities(), and you should not use htmlentities() before storing data in the database.  Drawbacks of escaping before storage include:

1) You've altered the data entered by the user.

2) You're operating under the assumption that your database is safe and contains safe data.  However if an attacker compromises your database and inserts unescaped and dangerous JavaScript then your users are in big trouble because you'll blindly send that malicious code to them.
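
A minimal sketch of escaping at output time, assuming the stored value comes back from the database unmodified. The wrapper name h() is made up. It uses htmlspecialchars() rather than htmlentities(); for the characters in this example both produce the same result.

```php
<?php
// Hypothetical wrapper name; htmlspecialchars() is the real PHP function.
function h(string $value): string
{
    // ENT_QUOTES escapes both quote styles, so the result is also safe
    // inside a quoted HTML attribute.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Pretend this came straight out of the database, exactly as submitted.
$stored = '<script>alert("xss")</script>';

// Escape only at the moment the value is placed into HTML.
$html = '<p>' . h($stored) . '</p>';
```

The stored value stays exactly as the user entered it, and the browser receives text instead of a script element.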

 

 


Not a complete list, but I would start with the following.

 

 

 

Stop directory traversal.
Stop MySQL comments.
Stop Base64-encoded payloads.
Remove null characters (to stop sandwiching between ASCII characters).
Validate standard ASCII/UTF-16 character entities (make sure each one ends with a semicolon).
Decode URLs.
Make sure there are no tabs and/or spaces between words like j a v a s c r i p t, vb     script, etc.
Make xml/php tags safe, by converting to html entities.
Remove any disallowed javascript (esp. if they are in links), as well as javascript event handlers.
Remove naughty HTML elements (or change to html entities).
Remove naughty PHP function calls (like eval).

 

 

Got any code to go along with that long list?!  :)

 

Or maybe some links or tutorials of how to code for those things?

 

 

 

Of course you could put some checks in there for the proper data.

 

Such as strlen, numeric only, alpha numeric, or alpha only.

 

 

Maybe I missed this before, but I was under the impression that data-types were stripped off by HTTP?

 

So, how effective is magic quotes at handling the things you mentioned above?
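
On the data-type question, a quick sketch shows what actually arrives. It uses parse_str(), which parses a string as if it were a URL query string, so the result has the same shape as $_GET. The query string here is invented for the example.

```php
<?php
// parse_str() parses a string as if it were the query string of a URL.
parse_str('id=42&price=9.99&tags[]=a&tags[]=b', $params);

// Scalar fields arrive as strings, whatever they look like.
$idType    = gettype($params['id']);
$priceType = gettype($params['price']);

// A field written with [] arrives as an array of strings.
$tagsType = gettype($params['tags']);

// Validation is what turns the string into a typed value.
// filter_var() returns the converted value, or false on failure.
$id    = filter_var($params['id'], FILTER_VALIDATE_INT);
$price = filter_var($params['price'], FILTER_VALIDATE_FLOAT);
```

So the types are indeed gone by the time PHP sees the data: every field is a string or an array of strings, and the length and character checks are how you confirm a string is what you expect before converting it.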

 

 

 

TomTees

 

 


Have a look at this page:

http://php.net/manual/en/function.mysql-real-escape-string.php...

 

mysql_real_escape_string() is a good function to start with.

 

The problem is that a MySQL connection is required before using mysql_real_escape_string(); otherwise an error of level E_WARNING is generated and FALSE is returned.

 

I just want a way to clean data in general.

 

No one said I was going to use GET or POST data in a database or specifically in a MySQL database.

 

 

TomTees

 

 

 


You could do something like this:

<?php
/**
* Sanitizer exception.
*/
class Sanitizer_Exception extends Exception {
    /**
    * Construct the exception
    * 
    * @param string $msg
    * @param mixed $value
    * @return Sanitizer_Exception
    */
    public function __construct( $msg, $value ) {
        parent::__construct( $msg . ': ' . var_export( $value, true ) );
    }
}

/**
* Sanitizer base class.
*/
class Sanitizer_Base {
    /**
    * The value.
    * 
    * @access protected
    * @var mixed
    */
    protected $value = null;
    
    /**
    * Flag if value has been cleaned.
    * 
    * @access protected
    * @var bool
    */
    protected $cleaned = false;
    
    /**
    * Construct the class
    * 
    * @param mixed $value The value to sanitize.
    * @return Sanitizer_Base
    */
    public function __construct( $value ) {
        // Store in the declared $value property (the original assigned to an undeclared $_value).
        $this->value = $value;
    }
    
    /**
    * Return the sanitize value.
    * 
    * @throws Sanitizer_Exception
    * @return mixed
    */
    public function clean() {
        $this->cleaned = true;
    }
}

/**
* Sanitize as positive integer.
*/
class PositiveInteger_Sanitizer extends Sanitizer_Base {
    const EXCEPTION = 'Not a positive integer';
    
    /**
    * Create object.
    * 
    * @param mixed $value
    * @return PositiveInteger_Sanitizer
    */
    public function __construct( $value ) {
        parent::__construct( $value );
    }
    
    /**
    * Clean the value.
    * 
    * @throws Sanitizer_Exception
    * @return mixed
    */
    public function clean() {
        if( $this->cleaned === false ) {
            do {
                //
                // Initially invalid
                $valid = false;
                //
                // Must be all digits
                if( is_string( $this->value ) === false || ctype_digit( $this->value ) === false ) {
                    break;
                }
                //
                // Convert to true int.
                $this->value = (int)$this->value;
                //
                // Must be positive.
                if( $this->value <= 0 ) {
                    break;
                }
                //
                // Flag as valid
                $valid = true;
            } while( false );
            //
            // If valid, cascade to parent.
            if( $valid === true ) {
                parent::clean();
            }
            //
            // If invalid, throw exception
            else {
                throw new Sanitizer_Exception( self::EXCEPTION, $this->value );
            }
        }
        //
        // Return the value.
        return $this->value;
    }
}
?>

 

Basically you'd have to make many different sanitization classes, each suited to how you intend to use the data.
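
For comparison, here is a sketch of the same "one checker per kind of data" idea using PHP's built-in filter_var() instead of a class hierarchy. This is a swapped-in technique, not the approach from the classes above, and the function names are hypothetical. Its behaviour may differ from the ctype_digit() version in edge cases such as surrounding whitespace.

```php
<?php
// Hypothetical helper: returns the value as an int when it is a positive
// integer, and throws otherwise.
function clean_positive_int($value): int
{
    $result = filter_var($value, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => 1),
    ));
    if ($result === false) {
        throw new InvalidArgumentException(
            'Not a positive integer: ' . var_export($value, true)
        );
    }
    return $result;
}

// Hypothetical helper: returns the address when it is syntactically valid,
// and throws otherwise.
function clean_email($value): string
{
    $result = filter_var($value, FILTER_VALIDATE_EMAIL);
    if ($result === false) {
        throw new InvalidArgumentException(
            'Not an email address: ' . var_export($value, true)
        );
    }
    return $result;
}
```

Each function covers one kind of data, just as each sanitizer class does, so you still end up writing one per intended use.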


Or, use a class for what it is designed for: to group aggregate functions that serve a similar purpose and that all relate to one another.  Writing a sanitization class with functions that cover queries, output, file uploads, file traversals, etc. is a perfectly legit project.


To group aggregate functions that serve a similar purpose

That's not necessarily what classes and object-oriented programming are about.  Objects and classes are not primarily intended for grouping similar features into one area, although they can be used that way in languages such as PHP that don't (or didn't) support namespaces.
