doubledee Posted May 30, 2012

Security is very important to me, so I have been making sure that all PHP variables that are output by my scripts are run through the escape function like this...

    htmlentities($someVariable, ENT_QUOTES)

The problem that I am having, however, is that I find that to be a real PITA because it takes already complex code and makes it even harder to read.

So, I am wondering if I could do something like this to streamline things...

1.) Create a function

    function escapeOutput($someVariable){
        $safeOutput = htmlentities($someVariable, ENT_QUOTES);
        return $safeOutput;
    }

2.) And then as I create PHP variables, run them through my new function like this...

    $username = escapeOutput($username);
    $address = escapeOutput($address);
    $subject = escapeOutput($subject);

3.) Then instead of this mess...

    echo "<dl>
            <dt>FROM:</dt>
            <dd>" . htmlentities($fromData, ENT_QUOTES) . "</dd>
            <dt>TO:</dt>
            <dd>" . htmlentities($toData, ENT_QUOTES) . "</dd>
            <dt>DATE:</dt>
            <dd>" . htmlentities($msgDate, ENT_QUOTES) . "</dd>
            <dt>SUBJECT:</dt>
            <dd><b>" . htmlentities($subject, ENT_QUOTES) . "</b></dd>\n\n
            <dt></dt>";

I could have the streamlined...

    echo "<dl>
            <dt>FROM:</dt>
            <dd>$fromData</dd>
            <dt>TO:</dt>
            <dd>$toData</dd>
            <dt>DATE:</dt>
            <dd>$msgDate</dd>
            <dt>SUBJECT:</dt>
            <dd>$subject</dd>
            <dt></dt>";

What do you gurus think?

Debbie
scootstah Posted May 30, 2012

Looks fine to me. You might want to also specify an encoding as a third parameter for the htmlentities() function (probably UTF-8).

If it were me, I would add support for arrays, so that you can pass an associative array to the function and get back a sanitized associative array. This way you don't have to run each key manually.

Also, if I may be nitpicky, your function name doesn't accurately describe what the function does, since you're technically not escaping anything but rather converting it.
doubledee (Author) Posted May 30, 2012

Scootstah,

You can be nitpicky if I can follow up with more questions!

Quote:
    Looks fine to me. You might want to also specify an encoding as a third parameter for the htmlentities() function (probably UTF8).

Encoding is a scary topic, and one I started a thread on here before with no good recommendations. I'm not sure what I am using, to be honest, and I know from what I have read that supporting international character sets between PHP and MySQL can be a real b*tch... Think I'll pass on that one until I understand the topic better.

Quote:
    If it were me, I would add support for arrays - so that you can pass an associative array to the function and get back a sanitized associative array. This way you don't have to run each key manually.

That is an excellent idea!! Care to share how you'd do that?

Quote:
    Also, if I may be nitpicky, your function name doesn't accurately describe what the function does - since you're technically not escaping anything but rather converting it.

Okay, so what would be a better name? I'm open to more accurately describing what I am doing.

To be honest, maybe I don't totally get what HTMLENTITIES is really doing for me...

Debbie
doubledee (Author) Posted May 30, 2012

Okay, don't say I don't try things myself!!! What do you think about this?!

    function str2htmlentities($var){
        /**
         * Convert all applicable characters to HTML entities.
         *
         * To display reserved characters (e.g. < >) we need to use
         * the function htmlentities to convert text to the appropriate HTML Entity.
         *
         * This will also help prevent Cross-Site Scripting (XSS) attacks.
         *
         * Returns either a scalar variable or an array
         *
         * @param {String, Array} $var
         * @return {String, Array}
         */

        // Check Data-Type.
        if (is_scalar($var)){
            // Variable is Scalar.
            $converted = htmlentities($var, ENT_QUOTES);

        }elseif (is_array($var)){
            // Variable is Array.
            $converted = array_map('htmlentities', $var);

        }else{
            // Invalid Data-Type.
            $_SESSION['resultsCode'] = 'FUNCTION_HTMLENTITIES_INVALID_TYPE_5004';

            // Set Error Source.
            $_SESSION['errorPage'] = $_SERVER['SCRIPT_NAME'];

            // Redirect to Display Outcome.
            header("Location: " . BASE_URL . "/account/results.php");

            // End script.
            exit();
        }//End of CHECK DATA-TYPE

        return $converted;
    }//End of str2htmlentities

And the test script:

    <?php
        // Access Constants.
        require_once('config/config.inc.php');

        // Access Functions.
        require_once('utilities/functions.php');

        // Set Variables.
        $username = 'DoubleDee';

        $ages['John'] = 32;
        $ages['Mary'] = 25;
        $ages['Sally'] = 41;

        $favoriteTags=array("<b>", "<p>", "<html>");

        // Convert Variables.
        $username = str2htmlentities($username);
        $ages = str2htmlentities($ages);
        $favoriteTags = str2htmlentities($favoriteTags);

        // Output Variables.
        echo '<p>$username = ' . $username . '</p><br />';

        foreach($ages as $key => $value){
            echo $key . ' is ' . $value . ' years old.<br />';
        }

        echo "<br /><p>My favorite HTML tags include:</p>";
        foreach($favoriteTags as $value){
            echo "$value<br/>";
        }
    ?>

Debbie
scootstah Posted May 30, 2012

Quote:
    Encoding is a scary topic, and one I started a thread on here before with no good recommendations. I'm not sure what I am using, to be honest, and I know from what I have read, that supporting International Character Sets between PHP and MySQL can be a real b*tch... Think I'll pass on that one until I understand the topic better.

Well, you still ought to pick an encoding (UTF-8 is pretty standard) and make sure everything uses the one you choose. It's not particularly difficult and is beneficial in the long run.

Quote:
    That is an excellent idea!! Care to share how you'd do that?

Probably with recursion. Check if your input is an array, then loop through it and re-call the function from within itself. This way you can easily traverse multi-dimensional arrays without any extra effort.

Quote:
    Okay, so what would be a better name?

Other developers have used some variation of the "html entities" name to solve the same problem. Ultimately you are still using the htmlentities() function, just wrapping it up first.

Quote:
    To be honest, maybe I don't totally get what HTMLENTITIES is really doing for me...

It is converting symbols and such to their HTML entities. You can see a list of them here (excuse the w3schools link, their HTML entities charts happen to be good references). Essentially you are preventing XSS attacks by removing your users' ability to add markup to any dynamic content. The < and > characters will be changed to their entities, &lt; and &gt; respectively.
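To make the conversion described above concrete, here is a minimal sketch. The sample string is made up for illustration; it only shows what htmlentities() does to markup characters when called with ENT_QUOTES and an explicit UTF-8 encoding.

```php
<?php
// Hypothetical user input containing markup, an ampersand and both quote styles.
$input = '<a href="x" title=\'y\'>Tom & Jerry</a>';

// ENT_QUOTES converts both double and single quotes; the third
// argument names the encoding the input is assumed to be in.
$safe = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &lt;a href=&quot;x&quot; title=&#039;y&#039;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Viewed in a browser, the converted string shows up as the literal text of the tag rather than as a clickable link.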
scootstah Posted May 30, 2012

Quote:
    Okay, don't say I don't try things myself!!! What do you think about this?!

    function str2htmlentities($var){
        /**
         * Convert all applicable characters to HTML entities.
         *
         * To display reserved characters (e.g. < >) we need to use
         * the function htmlentities to convert text to the appropriate HTML Entity.
         *
         * This will also help prevent Cross-Site Scripting (XSS) attacks.
         *
         * Returns either a scalar variable or an array
         *
         * @param {String, Array} $var
         * @return {String, Array}
         */

        // Check Data-Type.
        if (is_scalar($var)){
            // Variable is Scalar.
            $converted = htmlentities($var, ENT_QUOTES);

        }elseif (is_array($var)){
            // Variable is Array.
            $converted = array_map('htmlentities', $var);

        }else{
            // Invalid Data-Type.
            $_SESSION['resultsCode'] = 'FUNCTION_HTMLENTITIES_INVALID_TYPE_5004';

            // Set Error Source.
            $_SESSION['errorPage'] = $_SERVER['SCRIPT_NAME'];

            // Redirect to Display Outcome.
            header("Location: " . BASE_URL . "/account/results.php");

            // End script.
            exit();
        }//End of CHECK DATA-TYPE

        return $converted;
    }//End of str2htmlentities

Hmm. The first problem I see is that array_map isn't going to work for multi-dimensional arrays. Also, the error handling is unnecessary here: if the input doesn't match the expected data type, just ignore it.

Here is my take:

    function entities($input)
    {
        if (is_array($input)) {
            $clean = array();

            foreach($input as $key => $val) {
                $clean[$key] = entities($val);
            }

            return $clean;
        }

        return htmlentities($input, ENT_QUOTES);
    }
doubledee (Author) Posted May 30, 2012

Quote:
    Hmm. The first problem I see is that array_map isn't going to work for multi-dimensional arrays. Also, the error handling is unnecessary here - if it doesn't match expected data type just ignore it.

Way to burst my bubble! (Just when I thought I figured something out on my own...)

Quote:
    Here is my take:

    function entities($input)
    {
        if (is_array($input)) {
            $clean = array();

            foreach($input as $key => $val) {
                $clean[$key] = entities($val);
            }

            return $clean;
        }

        return htmlentities($input, ENT_QUOTES);
    }

So that is all I need? And that will handle all multi-dimensional arrays?

BTW, how do I do what you are saying here...

Quote:
    Well, you still ought to pick an encoding (UTF8 is pretty standard)

...in your above function?

Thanks,

Debbie
Psycho Posted May 30, 2012

I would still use array_map(), but in such a way that it will work for multidimensional arrays. For the encoding you can just define it inside the function.

    function entities($input)
    {
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }

EDIT: Fixed a typo in code
kicken Posted May 30, 2012

I just created a wrapper for htmlentities with my defaults and a shorter name:

    function hent($str, $type=ENT_QUOTES, $char='UTF-8'){
        return htmlentities($str, $type, $char);
    }

To add your array support one could do:

    function hent($str, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($str)){
            foreach ($str as &$v){
                $v=hent($v, $type, $char);
            }
            return $str;
        }
        else {
            return htmlentities($str, $type, $char);
        }
    }

Or, if you're on PHP 5.3 or better:

    function hent($str, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($str)){
            return array_map(function($s) use ($type,$char){
                return hent($s, $type, $char);
            }, $str);
        }
        else {
            return htmlentities($str, $type, $char);
        }
    }
doubledee (Author) Posted May 30, 2012

Quote:
    I would still use array_map(), but in such a way that it will work for multidimensional arrays. For the encoding you can just define it inside the function.

    function entities($input)
    {
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }

    EDIT: Fixed a typo in code

1.) Shouldn't it be...

    return array_map('htmlentities', $input);

2.) How is your code different from mine?

3.) How does your code handle a multi-dimensional array? I don't see how it is recursive like scootstah's code.

Debbie
Andy-H Posted May 30, 2012

Quote:
    1.) Shouldn't it be...

    return array_map('htmlentities', $input);

    2.) How is your code different from mine?

    3.) How does your code handle a multi-dimensional array? I don't see how it is recursive like scootstah's code.

1. Nope, that's how it's recursive: it "calls itself". Although this can be achieved another way:

    function entities($input, $quotes = ENT_QUOTES, $charset = 'UTF-8')
    {
        if ( is_array($input) )
        {
            array_walk_recursive($input, function(&$v, $k, $params) {
                $v = htmlentities($v, $params[0], $params[1]);
            }, array($quotes, $charset));
            return $input;
        }
        return htmlentities($input, $quotes, $charset);
    }

2. I can't work out whose code is whose anymore lol, probably that it uses a built-in PHP function rather than foreach.

3. It's recursive, so:

    $arr = array('on>', array('tw>'), '"thr');

    // first iteration
    function entities($input)
    {
        // [ 'on>', [ 'tw>' ], '"thr' ] is array, map it with entities function (run entities on each value)
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }
    // $arr = array();

    // second iteration
    function entities($input)
    {
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        // 'on>' is not array, return htmlentities
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }
    // $arr = array('on>');

    // third iteration
    function entities($input)
    {
        // [ 'tw>' ] is array, map it with entities function (run entities on each value)
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }
    // $arr = array('on>', array());

    // fourth iteration
    function entities($input)
    {
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        // 'tw>' is not array, return htmlentities
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }
    // $arr = array('on>', array('tw>'));

    // fifth iteration
    function entities($input)
    {
        if (is_array($input))
        {
            return array_map('entities', $input);
        }
        // '"thr' is not array, return htmlentities
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }
    // $arr = array('on>', array('tw>'), '"thr');

Hope that helps. For the record, I wouldn't do it this way at all; I'd use the way you were already doing, so that you only need to store one variable and can escape it as needed for MySQL, display etc. If you wish to make it more bearable, the "ENT_QUOTES" constant resolves to (int)3, so you could use:

    htmlentities($in, 3, 'UTF-8');

As far as character encoding goes, I always use UTF-8. It's the go-to encoding if you don't know anything about character encoding (like myself). Just make sure everything's in sync, i.e.:

HTML pages have:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >

PHP (untrusted variables) are output using:

    htmlentities($var, 3, 'UTF-8');

MySQL collation (default is 'latin1_swedish_ci') is set to utf8_general_ci or utf8_unicode_ci (see attachment), and you run:

    mysql_query("SET NAMES 'UTF8'");

    // or PDO (the init command has to be passed to the constructor,
    // setting it afterwards with setAttribute() has no effect)
    $dbh = new \PDO('mysql:dbname='. DB .';host='. DB_HOST, DB_USER, DB_PASS,
        array(\PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'UTF8'"));

And you should be OK.
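As a rough illustration of why the encodings need to be in sync, here is a small sketch. The byte string is a made-up example of ISO-8859-1 data; the behaviour shown assumes no ENT_IGNORE or ENT_SUBSTITUTE flag is passed, as in the calls above.

```php
<?php
// "café" encoded as ISO-8859-1: the é is the single byte 0xE9,
// which is not a valid UTF-8 sequence on its own.
$latin1 = "caf\xE9";

// Declared encoding matches the data: é becomes a named entity.
$matched = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');

// Declared encoding does not match the data: the invalid byte
// sequence makes htmlentities() return an empty string.
$mismatched = htmlentities($latin1, ENT_QUOTES, 'UTF-8');

var_dump($matched);    // "caf&eacute;"
var_dump($mismatched); // ""
```

So if text silently disappears from the page after adding the 'UTF-8' argument, a mismatch between the data and the declared encoding is a likely suspect.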
Andy-H Posted May 30, 2012

Come to think of it, wrapping it in a function and encapsulating how you escape output can't be a bad thing anyway, even if you are escaping as-and-when necessary.
Adam Posted May 30, 2012

I much prefer Psycho's version. It's simple really; if an array is passed to entities() it will array_map() through it, and pass back the escaped array. If any item in that array happens to also be an array, the same happens but a level deeper; the escaped array is passed back, which is then passed back as part of its parent. That can happen to any depth, essentially.

Quote:
    If you wish to make it more bearable the "ENT_QUOTES" constant resolves to (int)3 so you could use: htmlentities($in, 3, 'UTF-8');

Eek! There's a reason constants are used -- to prevent future change breaking things! Also for readability, for anyone else who doesn't happen to know that ENT_QUOTES == 3. Given the manual doesn't generally document constant values either, it can be a pain in the arse to work out these kinds of things.
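For what it's worth, a quick sketch of what the named flags actually change (the sample string is made up for illustration). With the names spelled out, the intent of each call is readable without knowing the numeric values.

```php
<?php
// Sample text containing a single quote and double quotes.
$text = 'It\'s "quoted"';

// ENT_NOQUOTES leaves both kinds of quotes alone.
$none = htmlentities($text, ENT_NOQUOTES, 'UTF-8');

// ENT_COMPAT converts double quotes only.
$compat = htmlentities($text, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES converts both double and single quotes.
$both = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $none, "\n";   // It's "quoted"
echo $compat, "\n"; // It's &quot;quoted&quot;
echo $both, "\n";   // It&#039;s &quot;quoted&quot;
```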
doubledee (Author) Posted May 31, 2012

Kicken,

Can you please help me understand your code? (I'm still pretty shaky on arrays...)

Quote:
    I just created a wrapper for htmlentities with my defaults and a shorter name:

    function hent($str, $type=ENT_QUOTES, $char='UTF-8'){
        return htmlentities($str, $type, $char);
    }

    To add your array support one could do:

    function hent($str, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($str)){
            foreach ($str as &$v){
                $v=hent($v, $type, $char);
            }
            return $str;
        }
        else {
            return htmlentities($str, $type, $char);
        }
    }

What is going on here...

    foreach ($str as &$v){

Debbie
Adam Posted May 31, 2012

Which bit of foreach($str as &$v) do you not understand exactly? I would suggest reading the manual for foreach loops.
doubledee (Author) Posted May 31, 2012

Quote:
    Which bit of foreach($str as &$v) do you not understand exactly? I would suggest reading the manual for foreach loops.

I'm used to seeing...

    foreach($input as $key => $value){

    }

I don't know why the $key was left out, or what &$v means...

Debbie
Adam Posted May 31, 2012

The key was left out because it isn't needed. "&" means loop through by reference. Instead of putting a copy of each array item into the variable, the variable is just a reference to that item. That means you can modify the original array by just changing $value. The following would produce the same result:

    foreach ($array as $key => $value) {
        $array[$key] = htmlentities($value);
    }

    foreach ($array as &$value) {
        $value = htmlentities($value);
    }

Worth noting though that after the second method, $value would still exist as a reference to the last array item. In this situation it's not a problem because nothing happens afterwards (except the array is returned to the parent caller), but if anything else did happen afterwards within that function, you should always unset the reference.
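A minimal sketch of why the reference should be unset, using made-up variable names. The second half reuses the loop variable without unsetting it first, which quietly overwrites the last array item.

```php
<?php
// Case 1: reference is unset after the by-reference loop.
$withUnset = array(1, 2, 3);
foreach ($withUnset as &$item) {
    $item = $item * 2;
}
unset($item); // break the reference to the last element

foreach ($withUnset as $item) {
    // plain by-value loop, the array is left alone
}
print_r($withUnset); // 2, 4, 6

// Case 2: same code, but the reference is left dangling.
$withoutUnset = array(1, 2, 3);
foreach ($withoutUnset as &$ref) {
    $ref = $ref * 2;
}

// $ref still points at the last element, so each assignment made
// by this loop writes into $withoutUnset[2].
foreach ($withoutUnset as $ref) {
}
print_r($withoutUnset); // 2, 4, 4
```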
doubledee (Author) Posted May 31, 2012

I tweaked the function like this...

    function str2htmlentities($input, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($input)){
            foreach ($input as $key => $value){
                $v = str2htmlentities($value, $type, $char);
            }
            return $input;
        }else{
            return htmlentities($input, $type, $char);
        }
    }

Is that way okay??

It seems to work, but I am still a little shaky on what is going on even as I step through the code in NetBeans. (NetBeans takes a few strange hops as you cycle through everything?!)

Debbie
Andy-H Posted May 31, 2012

Quote:
    I tweaked the function like this...

    function str2htmlentities($input, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($input)){
            foreach ($input as $key => $value){
                $v = str2htmlentities($value, $type, $char);
            }
            return $input;
        }else{
            return htmlentities($input, $type, $char);
        }
    }

    Is that way okay??

$v is assigned there but never used, so the array comes back unchanged. It should be:

    function str2htmlentities($input, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($input)){
            foreach ($input as $key => $value){
                $input[$key] = str2htmlentities($value, $type, $char);
            }
            return $input;
        }else{
            return htmlentities($input, $type, $char);
        }
    }
doubledee (Author) Posted May 31, 2012

Quote:
    $v doesn't exist there, should be:

    $input[$key] = str2htmlentities($value, $type, $char);

Wow! Good catch!!

Okay, so to be sure it is...

    function str2htmlentities($input, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($input)){
            foreach ($input as $key => $value){
                $input[$key] = str2htmlentities($value, $type, $char);
            }
            return $input;
        }else{
            return htmlentities($input, $type, $char);
        }
    }

Right?

(BTW, how do I know this function is actually working?! I mean had it not been for you and that last catch, I would have never known that things weren't coded/working properly, because neither stepping through NetBeans, nor looking at the output gave anything away...)

--------------

And how does that compare to Psycho's...

    function entities($input){
        if (is_array($input)){
            return array_map('entities', $input);
        }
        return htmlentities($input, ENT_QUOTES, 'UTF-8');
    }

Is there any reason why I would want to use one versus the other?

My main goal of this exercise, in addition to obviously streamlining how I use htmlentities(), was to be able to handle multi-dimensional arrays should they come up.

Thanks,

Debbie
Adam Posted May 31, 2012

Both can handle multi-dimensional arrays, but I don't see much reason to manually loop through the array, when you can use a native function to do it for you.
Andy-H Posted May 31, 2012

Quote:
    Both can handle multi-dimensional arrays, but I don't see much reason to manually loop through the array, when you can use a native function to do it for you.

Although her code allows you to set the ENT_* quote flag and the character encoding.

@Debbie array_map will be faster than foreach as it's a native PHP function, so the looping is executed in C and doesn't need to be interpreted. Although the difference in execution time will not be significant in this case.

If you are running PHP >= 5.3 and are pretty sure you always will be, my array_walk_recursive code will allow you to specify parameters and use a native function. But to be honest, if you understand how your function works, just go with that.

As for knowing whether the function has worked:

    function str2htmlentities($input, $type=ENT_QUOTES, $char='UTF-8'){
        if (is_array($input)){
            foreach ($input as $key => $value){
                $input[$key] = str2htmlentities($value, $type, $char);
            }
            return $input;
        }else{
            return htmlentities($input, $type, $char);
        }
    }

    $array = array('<>', array('"<>"'), '">');
    $array = str2htmlentities($array);
    echo '<pre>'. print_r($array, 1);

View page source should show < as &lt;, > as &gt; and " as &quot;
scootstah Posted May 31, 2012

Quote:
    (BTW, how do I know this function is actually working?! I mean had it not been for you and that last catch, I would have never know that things weren't coded/working properly, because neither stepping through NetBeans, nor looking at the output gave anything away...

Run the string:

    <b>this is bold</b>

through the function and echo the result.

If the page shows the literal text <b>this is bold</b>, tags and all, then it works.

If the page shows "this is bold" rendered in bold, then it doesn't work.
Psycho Posted May 31, 2012

Quote:
    Run the string: <b>this is bold</b> through the function. If you get: <b>this is bold</b> then it works. If you get: this is bold then it doesn't work.

@scootstah: But, you'd also want to throw it a multi-dimensional array with values that would need to be escaped as well.

@Debbie: You know what the function is supposed to do, so you should know how to test it to see if it works. Writing some code and posting it here asking if it will work or not can be perceived as arrogant. The function is supposed to encode string values or strings within arrays (including multidimensional arrays). So, pass it both types of values and verify the results.

    $testString = "<b>This is a bold string value</b>";
    $testArray = array(array("<b>This is a bold multidimensional array value</b>"));

    echo "<br>String before encoding: " . $testString;
    echo "<br>String after encoding: " . entities($testString);
    echo "<br>Array before encoding: <pre>" . print_r($testArray, true) . "</pre>";
    echo "<br>Array after encoding: <pre>" . print_r(entities($testArray), true) . "</pre>";

Expected output using my function, as rendered in the browser (the "before" values appear in bold, the "after" values appear as literal tags):

    String before encoding: This is a bold string value
    String after encoding: <b>This is a bold string value</b>
    Array before encoding:
    Array
    (
        [0] => Array
            (
                [0] => This is a bold multidimensional array value
            )

    )
    Array after encoding:
    Array
    (
        [0] => Array
            (
                [0] => <b>This is a bold multidimensional array value</b>
            )

    )
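If eyeballing the browser output feels unreliable, the same check can be sketched as a direct comparison against the expected entities. The test data below is made up; the function is the array_map() version from earlier in the thread.

```php
<?php
function entities($input)
{
    if (is_array($input)) {
        // array_map() with a single array keeps the original keys.
        return array_map('entities', $input);
    }
    return htmlentities($input, ENT_QUOTES, 'UTF-8');
}

// Nested test data, three levels deep, with a string key mixed in.
$nested = array(
    '<b>one</b>',
    array('"two"', array("it's")),
    'k' => 'a & b',
);

// What the encoded structure should look like.
$expected = array(
    '&lt;b&gt;one&lt;/b&gt;',
    array('&quot;two&quot;', array('it&#039;s')),
    'k' => 'a &amp; b',
);

$result = entities($nested);

// Strict comparison checks keys, order, types and values at every level.
var_dump($result === $expected); // bool(true)
```

Run from the command line, this avoids the browser rendering question entirely, since the comparison is done on the raw strings.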
doubledee (Author) Posted May 31, 2012

Quote:
    View page source should show < as &lt;, > as &gt; and " as &quot;

THAT was the key thing I was missing and that a lot of people fail to point out!!!

It doesn't matter if you see < or > or <> on your screen; it is how it appears in View Source that matters...

Thanks!!!

Debbie