secweb Posted September 15, 2015

More hardening... Should you always use urlencode / rawurlencode for links and GET requests? Is one better than the other? It appears you don't have to decode, is that true?

Test Play:

    if(isset($_GET['cmd'])){
        if($_GET['cmd']=="harp's"){
        //if(rawurldecode($_GET['cmd'])=="harp's"){
            echo "true<br />";
        }
        else{
            echo "false<br />";
        }
    }

    $s1="harp's";
    echo "<br /><br />";
    echo "s1: ".$s1."<br />";
    echo "<br /><br />";
    echo "<a href='?cmd=".$s1."'>Test 1</a><br />\n";
    echo "<a href='?cmd=".rawurlencode($s1)."'>Test 2</a><br />\n";
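
On the "is one better than the other" part, a minimal sketch of where the two functions differ may help. The sample string is made up for illustration; the visible difference is how a space is encoded, while the apostrophe comes out the same either way.

```php
<?php
// Illustrative comparison only; $value is a made-up sample string.
$value = "harp's tune";

// urlencode() follows the form-encoding convention: a space becomes "+".
$formStyle = urlencode($value);      // harp%27s+tune

// rawurlencode() follows RFC 3986: a space becomes "%20".
$rfcStyle = rawurlencode($value);    // harp%27s%20tune

// Each form decodes back to the original with its matching decoder.
$roundTrip = [urldecode($formStyle), rawurldecode($rfcStyle)];

echo $formStyle, "\n", $rfcStyle, "\n";
```

Both are safe inside a query string; the "+" form is only understood as a space there, which is one reason rawurlencode() is the usual choice for path segments.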
hansford Posted September 15, 2015

http://php.net/manual/en/function.urlencode.php

"Cleaning the URL", totally and in a nutshell:

1. You must use rawurlencode() for parts that come before "?".
2. Use urlencode() for all GET parameters (values that come after each "="). POST parameters are automatically encoded.
3. Use htmlspecialchars() for HTML tag parameters and HTML text content.

    <?php
    $url_page = 'example/page/url.php'; // page the link will request
    $text = 'this is a simple string';
    $id = '4334%3434';
    $linktext = "<Clickit> & you will see it"; // text of the link, with HTML-unfriendly characters
    ?>

    <?php
    // this gives you a clean link to use
    $url = "http://localhost/";
    $url .= rawurlencode($url_page);
    $url .= "?text=" . urlencode($text);
    $url .= "&id=" . urlencode($id);

    // htmlspecialchars escapes any html that
    // might do bad things to your html page
    ?>
    <a href="<?php echo htmlspecialchars($url); ?>">
    <?php echo htmlspecialchars($linktext); ?>
    </a>
Jacques1 Posted September 16, 2015

It's very important to specify the character encoding of the HTML document when using htmlspecialchars(). Otherwise the function can become useless. You should also set the ENT_QUOTES flag so that both single and double quotes are escaped, e.g.:

    htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

The ENT_SUBSTITUTE and ENT_HTML5 flags aren't critical.

To avoid encoding the URL parameters by hand (which is somewhat messy and error-prone), you can also use http_build_query():

    <?php
    $link = 'https://example.com?' . http_build_query(['x' => 'foo', 'y' => 'bar']);
    ?>
    <a href="<?= htmlspecialchars($link, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8'); ?>">Link</a>
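
As a rough illustration of the http_build_query() suggestion, the sketch below shows its two encoding modes and the HTML-escaping step on top. The parameter names and values are invented; the separator is passed explicitly so the result does not depend on the arg_separator.output setting.

```php
<?php
// Sample parameters; the keys and values are invented for illustration.
$params = ['q' => "harp's tune", 'page' => 2];

// PHP_QUERY_RFC1738 (the default): spaces become "+", as with urlencode().
$rfc1738 = http_build_query($params, '', '&', PHP_QUERY_RFC1738);
// q=harp%27s+tune&page=2

// PHP_QUERY_RFC3986: spaces become "%20", as with rawurlencode().
$rfc3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
// q=harp%27s%20tune&page=2

// URL-encoding and HTML-escaping are separate steps: the "&" between
// parameters still has to become "&amp;" inside an href attribute.
$href = htmlspecialchars('?' . $rfc3986, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
// ?q=harp%27s%20tune&amp;page=2

echo $rfc1738, "\n", $rfc3986, "\n", $href, "\n";
```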
secweb (Author) Posted September 16, 2015

Thank you. So here's my new wrapper function and tests:

    if(count($_GET)>0){
        foreach($_GET as $k=>$v){
            echo "GET['".$k."']: ".$v."<br />";
            //echo "GET['".$k."']: ".urldecode($v)."<br />";
        }
    }

    $link="test3.php";
    echo "<br /><br />";
    echo "<a href='".rx_url("")."'>test 1</a><br />";
    echo "<a href='".rx_url($link)."'>test 2</a><br />";
    echo "<a href='".rx_url($link,array("one","two"))."'>test 3</a><br />";
    echo "<a href='".rx_url($link,array("one"=>"a's","two"=>"b"))."'>test 4</a><br />";
    echo "<a href='".rx_url("",array("one"=>"a's","two"=>"b"))."'>test 5</a><br />";
    echo "<a href='".rx_url("",array("one"=>"a's","two"=>null))."'>test 6</a><br />";
    echo "<a href='".rx_url("",array("one"=>"a's","two"=>array(1,2,3)))."'>test 7</a><br />";
    echo "<br /><br />";

    function rx_url($link,$args=array()){
        return htmlspecialchars(rawurlencode($link)."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
    }

Not that it matters to me really, but test 7 fails even though there is an example of a similar nature on the manual page. I also tried it with a prefix.

Oh, and again, it appears you don't have to use urldecode() and that it's handled automatically. Is that true in all instances or just on my install?
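
On the decoding question: the PHP manual states that the values in $_GET are already decoded, so this is not specific to one install. The sketch below uses parse_str(), which parses a string as if it were the query string of a request, to stand in for $_GET. It also shows a likely reason test 7 looks broken: the nested value arrives as an array, and concatenating an array into a string just prints "Array". The describe_param() helper is hypothetical, not from the thread.

```php
<?php
// parse_str() parses a string as if it were a request's query string,
// so $get here stands in for $_GET.
$query = http_build_query(['one' => "a's", 'two' => [1, 2, 3]]);
parse_str($query, $get);

// $get['one'] is already decoded: a's
// $get['two'] is an array of strings: ['1', '2', '3']

// Hypothetical helper (not from the thread): renders scalars and arrays alike.
function describe_param($value): string
{
    return is_array($value)
        ? '[' . implode(', ', array_map('describe_param', $value)) . ']'
        : (string) $value;
}

// Plain-text output for the command line; escape with htmlspecialchars() in a page.
foreach ($get as $k => $v) {
    echo "GET['" . $k . "']: " . describe_param($v) . "\n";
}

// Decoding a second time can corrupt values that legitimately contain "%":
parse_str('id=' . urlencode('a%2Bb'), $once);   // $once['id'] is "a%2Bb"
$twice = urldecode($once['id']);                // "a+b", no longer the original
```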
secweb (Author) Posted September 16, 2015

I just put this into practice and ran into issues, mainly because I don't use a scheme (http / https) on my links. After rawurlencode() did its thing, the link was treated as relative rather than absolute. The return line is now:

    return htmlspecialchars($link."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
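
A small sketch of what goes wrong when rawurlencode() is applied to a whole link, and one possible alternative that encodes each path segment separately. The encode_path() helper and the sample paths are assumptions for illustration, not code from the thread.

```php
<?php
// Hypothetical helper, not part of the thread's code: encodes each path
// segment on its own so the "/" separators keep their meaning.
function encode_path(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

// Encoding the whole link also encodes the slashes, so nothing is left
// for the browser to recognise as a host or a path separator.
$whole = rawurlencode('//example.com/my files/test3.php');
// %2F%2Fexample.com%2Fmy%20files%2Ftest3.php

// Encoding per segment keeps the structure and only escapes the contents.
$perSegment = encode_path('/my files/test3.php');
// /my%20files/test3.php

echo $whole, "\n", $perSegment, "\n";
```

This only makes sense for the path portion; a scheme and host would be prepended unencoded.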
Jacques1 Posted September 16, 2015 (edited)

You mustn't apply URL-encoding to the entire URL, only to the parts which actually need to be encoded. Encoding means that certain characters which have a special meaning in a URL (like slashes, question marks etc.) are turned into literal characters.

For example, if you wish to insert a dynamic segment into the path, you'd encode that to prevent it from changing the structure of the URL:

    $link = 'https://mysite.com/users/' . rawurlencode($username);

Without encoding, the $username variable could manipulate the path by injecting slashes, or it could inject an entire query string.
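
To make the injection risk concrete, here is a minimal sketch with a made-up hostile $username. parse_url() is used only to show how the structure of the two URLs differs.

```php
<?php
// Hypothetical hostile input; the base URL is the one from the post above.
$username = '../admin?role=root';

// Without encoding, the input rewrites the path and adds a query string.
$unsafe = 'https://mysite.com/users/' . $username;
// https://mysite.com/users/../admin?role=root

// With encoding, the input stays a single literal path segment.
$safe = 'https://mysite.com/users/' . rawurlencode($username);
// https://mysite.com/users/..%2Fadmin%3Frole%3Droot

// parse_url() shows the structural difference between the two.
$unsafeParts = parse_url($unsafe);   // path: /users/../admin, query: role=root
$safeParts   = parse_url($safe);     // path: /users/..%2Fadmin%3Frole%3Droot, no query

echo $unsafe, "\n", $safe, "\n";
```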