I have a few questions about sanitizing my data.


mike316

Hello everyone, my name is Mike and I'm new here to the forum and new to PHP. I've been practicing with PHP: making forms, receiving data from forms, and printing that data to a page. In my book, PHP for the Web, the author states that using PHP's sanitize features is a good way to make sure you receive the data you want and to protect against unwanted HTML tags, and that sanitizing data will be discussed later in the book. Well, I decided to look up sanitizing my data on YouTube and Google.

I made a form and used some of the sanitizing filters from the PHP.net manual. I'm unsure if I declared the filters correctly. The first thing I did was declare my variables. Next, I used nl2br and strip_tags for my comments, because I wanted the user to be able to make line breaks by hitting Enter, and I wanted to remove unwanted HTML tags from the comments. Then I used the sanitize filters to strip unwanted HTML tags from the names and to sanitize the email.

 

My question is: is it okay to declare the filters after my variables have been declared, or should I use an else statement that checks whether the user filled out, for instance, their first name, and run the filter there? Again, I'm new to PHP, and most of the time I will be using includes and requires to keep this "stuff" in separate files, but until I learn about includes and requires, practicing like the example below will have to do for now. I'll paste my code below, and thank you for any help.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title>Filter Var Practice</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
    <form action="practice.php" method="POST">
        <p>
            <label for="first_name">First Name:</label>
            <input type="text" id="first_name" name="first_name">
        </p>
        <p>
            <label for="last_name">Last Name:</label>
            <input type="text" id="last_name" name="last_name">
        </p>
        <p>
            <label for="email">Email:</label>
            <input type="text" id="email" name="email">
         </p>
         <p>
             <label for="comments">Comments:</label>
             <textarea name="comments" id="comments"></textarea>
         </p>
        <p>
            <input type="submit" name="submit" value="Submit">
        </p>
    </form>
</body>
</html>

My PHP:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title>Page Title</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <style>
        .error {
            color: red;
            font-weight: bold;
        }
    </style>
</head>
<body>
    <h1>Filter Var Practice</h1>
</body>
</html>

<?php
    // Setting error management.
    ini_set('display_errors', 1);
    error_reporting(E_ALL);
    
    // Declaring the variables
    $first_name = $_POST['first_name'];
    $last_name = $_POST['last_name'];
    $email = $_POST['email'];

    /* Allowing users to enter their own line breaks
       in the comments if they chose to, and using strip_tags
       to remove unwanted tags like <i></i><b></b> and <script>
       </script> tags.
    */

    $comments = nl2br(strip_tags($_POST['comments']));


    // Sanitizing the data

    $first_name = filter_var($_POST['first_name'], FILTER_SANITIZE_STRING);
    
    $last_name = filter_var($_POST['last_name'], FILTER_SANITIZE_STRING);
    
    $email = filter_var($_POST['email'], FILTER_SANITIZE_EMAIL);
    

    /* Start with $okay set to true. If an input field was left empty,
       the matching check below prints an error and sets $okay to false,
       so the success message at the end is not printed.
    */
    $okay = true;

   if(empty($_POST['first_name'])) {
       print '<p class="error">Please enter your first name.</p>';
       $okay = false;
   }

   if(empty($_POST['last_name'])) {
       print '<p class="error">Please enter your last name.</p>';
       $okay = false;
   }

   if(empty($_POST['email'])) {
       print '<p class="error">Please enter your email address</p>';
       $okay;
   }

   if(empty($_POST['comments'])) {
       print '<p class="error">Please leave us a few comments</p>';
       $okay = false;
   }
   
   // If the first name, last name, email, and comments were all filled out, print out the results.
   if($okay) {
       print "<p>Thank you $first_name $last_name<br>
                 We will be contacting you at: $email<br>
                 About the comments you left us below: <br>
                 $comments</p>";
   }

?>    

You should try to avoid changing user data as much as you can. Consider this forum: it would not work nearly as well if the software stripped anything that looked like an HTML tag just in case it might be malicious. Escaping, where you keep the data as it is and instead make sure it can't be interpreted unintentionally, is almost always better.

Escaping is one form of "sanitization". The other form is what PHP's sanitize filters do, which is modify the data. Its validate filters, on the other hand, are perfectly fine because they only report whether the data appears to be safe.
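
To make the difference between stripping and escaping concrete, here is a small sketch that runs both on the same input. The sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical user input containing markup.
$input = 'Use <b>bold</b> & <script>alert(1)</script>';

// Stripping modifies the data: the tags are removed for good.
$stripped = strip_tags($input);

// Escaping keeps every character but makes it inert in an HTML context.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // Use bold & alert(1)
echo $escaped, "\n";  // Use &lt;b&gt;bold&lt;/b&gt; &amp; &lt;script&gt;alert(1)&lt;/script&gt;

// Escaping is reversible, so nothing the user typed was lost.
$restored = htmlspecialchars_decode($escaped, ENT_QUOTES);
echo $restored === $input ? "same\n" : "different\n"; // same
```

Note that strip_tags() left the script's text behind and threw away the user's formatting intent, while the escaped version still contains everything that was typed.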

 

On the other hand, there are functions like nl2br which do technically change the data, but they do so more in the sense of converting it from one form (plain text) to another form (HTML). That's not a problem.

 

Given that, in my opinion:

- Don't strip_tags() on the comment. Instead, htmlspecialchars it so that HTML will not be interpreted, then nl2br() it so the line breaks work.

- FILTER_SANITIZE_STRING is basically a more aggressive form of strip_tags() so I would remove it. Use htmlspecialchars() there too.

- On the subject of names, they're tricky. For the most part you should not validate them, because there are so many ways a name can seem invalid but be correct: non-Latin characters, symbols, and so on. Even expecting a first and last name can be an issue. For names it's best to give one entry field to cover whatever the user wants to put there (full name or not) and then make sure they did, in fact, enter some value.

- For something like an email address, sanitizing is actually bad: if the email address is flawed (perhaps a typo) and you sanitize it, you'll accidentally create a valid email address and then try to use it. Instead validate the address, and if invalid present an error to the user so they can correct their mistake.
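
As a rough sketch of the points above: the two helper functions below are hypothetical names chosen for illustration, and the sample addresses are made up. The first escapes a comment and then converts line breaks; the second validates an email address instead of sanitizing it. (FILTER_SANITIZE_STRING is also deprecated as of PHP 8.1, which is one more reason to drop it.)

```php
<?php
// Hypothetical helper: escape first, then turn newlines into <br /> tags.
function render_comment(string $raw): string
{
    return nl2br(htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'));
}

// Hypothetical helper: returns the address if it validates, otherwise null.
function validate_email(string $raw): ?string
{
    $result = filter_var(trim($raw), FILTER_VALIDATE_EMAIL);
    return $result === false ? null : $result;
}

echo render_comment("Hi <b>there</b>\nSecond line"), "\n";
// Hi &lt;b&gt;there&lt;/b&gt;<br />
// Second line

$typo = 'mike smith@example.com'; // stray space, probably a typo

// Sanitizing silently turns the typo into an address the user never typed.
$sanitized = filter_var($typo, FILTER_SANITIZE_EMAIL);
echo $sanitized, "\n"; // mikesmith@example.com

// Validating reports the problem so the user can correct it.
$checked = validate_email($typo); // null
echo $checked === null ? "Please check your email address.\n" : "OK\n";
```

The order in render_comment() matters: escaping after nl2br() would also escape the `<br />` tags that were just added.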

 

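There's also a mistake with one of the $okay lines: in the email check, the statement `$okay;` on its own does nothing. It should be `$okay = false;` like the others.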


Beyond Req's excellent and thorough feedback, you might want to consider the flow of your current code:

 

Display Form -> submit to processing -> Show error || Show success

 

This is not user friendly when there is a problem. I'm sure you are aware that most systems will let you know what you need to correct within the form you originally submitted. These days people are using Ajax to examine input and try to catch errors as the form is being filled out, but I'm not going to push you in that direction.

 

What would be an improvement is combining your form and your processing into one script. The basic pattern is:

 

 

 
$form_ok = false;
$errors = array();
 
// The form was submitted if the request method is POST.
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    // Pass $errors by reference. process_form() (a function you write) validates
    // the submission and adds any validation errors to the $errors array.
    $form_ok = process_form($errors);
}
 
if ($form_ok) {
    // handle_form_data() is also a function you write.
    handle_form_data();
} else {
    // Output the form here.
    // If $errors is not empty, decorate the form with error messages and
    // optionally pre-fill it with the prior submission.
}
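
A more filled-in version of that skeleton might look like the sketch below. It is only one way to do it: unlike the skeleton, this process_form() takes the submitted array as its first argument so it can be run without a web request, and the field names and error messages are assumptions made for the example.

```php
<?php
// Hypothetical validator: fills $errors (passed by reference) and returns
// true only when every required field has a usable value.
function process_form(array $post, array &$errors): bool
{
    $required = [
        'name'     => 'Please enter your name.',
        'email'    => 'Please enter your email address.',
        'comments' => 'Please leave us a few comments.',
    ];
    foreach ($required as $field => $message) {
        if (trim($post[$field] ?? '') === '') {
            $errors[$field] = $message;
        }
    }
    // Only check the format if an email was actually entered.
    if (!isset($errors['email'])
        && filter_var(trim($post['email']), FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'That email address does not look valid.';
    }
    return $errors === [];
}

// Simulated submission. In a real page this would be $_POST, checked only
// when $_SERVER['REQUEST_METHOD'] === 'POST'.
$submission = ['name' => 'Mike', 'email' => 'not-an-email', 'comments' => ''];

$errors  = [];
$form_ok = process_form($submission, $errors);

if ($form_ok) {
    echo "Thanks!\n";
} else {
    // Here the form would be shown again with these messages next to the fields.
    foreach ($errors as $field => $message) {
        echo "$field: $message\n";
    }
}
```

Keying $errors by field name makes it easy to print each message beside the input it belongs to when the form is re-displayed.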
 
 

I would also recommend looking at some well known components that provide validation to see the types of things they do.  I frequently recommend that people start with the Symfony components.  

 

Hopefully your project already has started with a composer.json file.   Here's a nice cheatsheet I found the other day:  http://composer.json.jolicode.com/  

 

Here are some components you might want to look at for ideas:

 

 

Form component: https://symfony.com/doc/current/components/form.html

This is a powerful and complex component that orchestrates the generation and handling of forms. You might not want to use it, but at least you can see what type of code you might use in a full-stack framework.

 

HTTP Foundation:  https://symfony.com/doc/current/components/http_foundation.html

This wraps the underlying HTTP request/response data. It is used by scores of well-known PHP projects (phpBB, Joomla, Drupal, Ratchet, Laravel).

 

Validator component:  https://symfony.com/doc/current/components/validator.html

This is the really interesting component for you.  It was modeled after a Java specification, and is typically integrated with the ORM model classes and the form component, but can easily be used standalone.

 

The basic idea is that you will create a class to store the data required by your form. You then add constraints (and they have a large number available) to the individual fields you get from your form. This could be as simple as using setters and getters.
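
As a minimal sketch of such a data class in plain PHP (no Symfony involved yet): the class name ContactFormData, its methods, and the fromArray() helper are all hypothetical names picked for this example, and typed properties assume PHP 7.4 or later.

```php
<?php
// Hypothetical data holder for the contact form. The class and method names
// are illustrative and are not part of Symfony.
class ContactFormData
{
    private string $firstName = '';
    private string $lastName = '';
    private string $email = '';

    public function setFirstName(string $value): void { $this->firstName = trim($value); }
    public function getFirstName(): string { return $this->firstName; }

    public function setLastName(string $value): void { $this->lastName = trim($value); }
    public function getLastName(): string { return $this->lastName; }

    public function setEmail(string $value): void { $this->email = trim($value); }
    public function getEmail(): string { return $this->email; }

    // Build the object from a submitted array such as $_POST.
    public static function fromArray(array $input): self
    {
        $data = new self();
        $data->setFirstName((string) ($input['first_name'] ?? ''));
        $data->setLastName((string) ($input['last_name'] ?? ''));
        $data->setEmail((string) ($input['email'] ?? ''));
        return $data;
    }
}

$formData = ContactFormData::fromArray([
    'first_name' => '  Mike ',
    'last_name'  => 'Smith',
    'email'      => 'mike@example.com',
]);

echo $formData->getFirstName(), "\n"; // Mike
```

The setters only trim whitespace; they do not reject anything, which leaves the actual rules to the validation step.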

 

The validator will validate individual form elements and you can also have a validator that handles the entire form, where you might have to code for complex relationships between fields.

 

An example might be that you have country, city, and state fields, but you don't care about state if the country is not the US. You can't just have a not-blank validation on state; the form needs to evaluate the country to decide whether or not state is required.
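
Without any library, that kind of cross-field rule could be sketched as follows. The function name validate_address(), the array keys, and the messages are assumptions made for the example, not something from Symfony.

```php
<?php
// Hypothetical whole-form rule: state is required only for US addresses.
// Returns an array of error messages keyed by field name.
function validate_address(array $data): array
{
    $errors  = [];
    $country = strtoupper(trim($data['country'] ?? ''));
    $city    = trim($data['city'] ?? '');
    $state   = trim($data['state'] ?? '');

    if ($country === '') {
        $errors['country'] = 'Country is required.';
    }
    if ($city === '') {
        $errors['city'] = 'City is required.';
    }
    // The state check depends on another field, so it cannot be a simple
    // per-field "not blank" rule.
    if ($country === 'US' && $state === '') {
        $errors['state'] = 'State is required for US addresses.';
    }
    return $errors;
}

$usErrors = validate_address(['country' => 'US', 'city' => 'Austin', 'state' => '']);
$frErrors = validate_address(['country' => 'FR', 'city' => 'Lyon', 'state' => '']);

echo count($usErrors), "\n"; // 1
echo count($frErrors), "\n"; // 0
```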

 

 

Here's an example straight out of the manual on setting up a data class with validation.  

 

// ...
use Symfony\Component\Validator\Mapping\ClassMetadata;
use Symfony\Component\Validator\Constraints as Assert;
use Symfony\Component\Validator\Validation;



class Author
{
    private $firstName;

    public static function loadValidatorMetadata(ClassMetadata $metadata)
    {
        $metadata->addPropertyConstraint('firstName', new Assert\NotBlank());
        $metadata->addPropertyConstraint(
            'firstName',
            new Assert\Length(array("min" => 3))
        );
    }
}
The manual's example models a person (an Author), but you can have something simple that just handles the data you need, and ignore the data model implications.

 

When processing the form you would need to transfer the data from the HTTP request to your class. That could be as simple as:

 

 

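// Assumes you have added setFirstName(), setLastName(), etc. to the class;
// the Author example above does not define them.
$formData = new Author();

$formData->setFirstName($_POST['first_name']);
$formData->setLastName($_POST['last_name']);
//etc
At that point, you would be ready to validate.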

 

 

// This just loads the validation rules from your class. There are other ways to specify rules (YAML, XML, annotations) which might make sense on larger projects, but they are optional.

$validator = Validation::createValidatorBuilder()
    ->addMethodMapping('loadValidatorMetadata')
    ->getValidator();

Now the validation. After pulling this from the manual, I noticed how this is similar to the skeleton I provided earlier.

 

 

    

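    // $formData is the object populated from the form above.
    $errors = $validator->validate($formData);

    if (count($errors) > 0) {
        /*
         * Uses a __toString method on the $errors variable which is a
         * ConstraintViolationList object. This gives us a nice string
         * for debugging.
         */
        $errorsString = (string) $errors;
        // Show the errors to the user here, e.g. next to the form fields.
    }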

Hope this helps you along on your path.

Archived

This topic is now archived and is closed to further replies.
