secweb

urlencode

More hardening...

 

 

Should you always use urlencode / rawurlencode for links and GET requests?

Is one better than the other?

 

It appears you don't have to decode the value yourself; is that true?

 

 

 

Test Play:

 

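// compare the received value as-is, without decoding it first
if(isset($_GET['cmd'])){
    if($_GET['cmd']=="harp's"){
    //if(rawurldecode($_GET['cmd'])=="harp's"){
        echo "true<br />";
    }
    else{
        echo "false<br />";
    }
}

$s1="harp's";

echo "<br /><br />";
echo "s1: ".$s1."<br />";

echo "<br /><br />";
// Test 1: unencoded, so the apostrophe ends the single-quoted href early
echo "<a href='?cmd=".$s1."'>Test 1</a><br />\n";
// Test 2: rawurlencode() turns the apostrophe into %27
echo "<a href='?cmd=".rawurlencode($s1)."'>Test 2</a><br />\n";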


 

"Cleaning the URL", in a nutshell:

1. Use rawurlencode() for the parts that come before the "?" (each path segment, not the whole path at once).

2. Use urlencode() for all GET parameter values (whatever comes after each "="). POST parameters are encoded automatically.

3. Use htmlspecialchars() for HTML attribute values and HTML text content.


<?php
$url_page = 'example/page/url.php';
//page the link will request
$text = 'this is a simple string';
$id = '4334%3434';
$linktext = "<Clickit> & you will see it";
//text of the link, with HTML unfriendly characters
?>
<?php
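// this gives you a clean link to use
$url = "http://localhost/";
// encode each path segment separately so the "/" separators survive
$url .= implode('/', array_map('rawurlencode', explode('/', $url_page)));
$url .= "?text=" . urlencode($text);
$url .= "&id=" . urlencode($id);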

// htmlspecialchars escapes any html that
// might do bad things to your html page
?>
<a href="<?php echo htmlspecialchars($url); ?>">
<?php echo htmlspecialchars($linktext); ?>
</a>
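
On the "is one better than the other" question, a small sketch of how the two functions differ may help. The sample value is made up for illustration. As far as encoding goes, the visible difference is the space: urlencode() writes it as "+" (form style), rawurlencode() as "%20" (RFC 3986 style). PHP decodes both forms when it fills $_GET.

```php
<?php
// Illustrative value only; not taken from the posts above.
$value = "harp's & friends/1+1";

$query = urlencode($value);     // form-style encoding: space becomes "+"
$path  = rawurlencode($value);  // RFC 3986 encoding: space becomes "%20"

echo $query, "\n"; // harp%27s+%26+friends%2F1%2B1
echo $path, "\n";  // harp%27s%20%26%20friends%2F1%2B1
```

A "+" is only read as a space inside a query string, which is why rawurlencode() is the safer pick for path segments.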


It's very important to specify the character encoding of the HTML document when using htmlspecialchars(). Otherwise the function can become useless. You should also set the ENT_QUOTES flag so that both single and double quotes are escaped:

 

E. g.

htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

The ENT_SUBSTITUTE and ENT_HTML5 flags aren't critical.
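
To see what ENT_QUOTES changes, here is a minimal sketch. The helper name e() is an assumption made for this example (it is not a PHP built-in), and the sample value is made up.

```php
<?php
// Hypothetical helper; the name e() is an assumption, not a PHP built-in.
function e(string $input): string
{
    return htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

$value = "harp's <b>";

// ENT_COMPAT leaves the apostrophe alone, which is unsafe inside a
// single-quoted attribute such as href='...'.
$compat = htmlspecialchars($value, ENT_COMPAT | ENT_HTML5, 'UTF-8');
$quotes = e($value);

echo $compat, "\n"; // harp's &lt;b&gt;
echo $quotes, "\n"; // harp&apos;s &lt;b&gt;
```

Wrapping the call in one helper keeps the flags and the charset consistent across every template.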

 

 

 

To avoid encoding the URL parameters by hand (which is somewhat messy and error-prone), you can also use http_build_query():

<?php

$link = 'https://example.com?' . http_build_query(['x' => 'foo', 'y' => 'bar']);

?>

<a href="<?= htmlspecialchars($link, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8'); ?>">Link</a>
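
For a feel of what http_build_query() actually produces, a short sketch with made-up parameter values follows. The separator is passed explicitly so the result does not depend on the arg_separator.output ini setting.

```php
<?php
$params = ['x' => 'foo bar', 'y' => "harp's"]; // illustrative values

// Default encoding (PHP_QUERY_RFC1738) behaves like urlencode(): space becomes "+".
$default = http_build_query($params, '', '&');

// PHP_QUERY_RFC3986 behaves like rawurlencode(): space becomes "%20".
$rfc3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo $default, "\n"; // x=foo+bar&y=harp%27s
echo $rfc3986, "\n"; // x=foo%20bar&y=harp%27s
```

Either form is decoded correctly on the receiving side; the result still has to go through htmlspecialchars() before it is written into an attribute.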


Thank you

 

 

So here's my new wrapper function and tests:

 

if(count($_GET)>0){
    foreach($_GET as $k=>$v){
        echo "GET['".$k."']: ".$v."<br />";
        //echo "GET['".$k."']: ".urldecode($v)."<br />";
    }
}

$link="test3.php";

echo "<br /><br />";
echo "<a href='".rx_url("")."'>test 1</a><br />";
echo "<a href='".rx_url($link)."'>test 2</a><br />";
echo "<a href='".rx_url($link,array("one","two"))."'>test 3</a><br />";
echo "<a href='".rx_url($link,array("one"=>"a's","two"=>"b"))."'>test 4</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>"b"))."'>test 5</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>null))."'>test 6</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>array(1,2,3)))."'>test 7</a><br />";

echo "<br /><br />";


function rx_url($link,$args=array()){
    return htmlspecialchars(rawurlencode($link)."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

 

Not that it really matters to me, but test 7 fails even though the manual page has a similar example. I also tried it with a prefix.
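
One possible explanation, offered as a guess since the exact failure isn't shown: the query string for test 7 is probably fine, and the problem is in the display loop. http_build_query() encodes a nested array with bracket syntax, PHP turns that back into an array in $_GET, and concatenating an array into a string prints "Array" with a warning. The sketch below uses parse_str() as a stand-in for $_GET and only handles one level of nesting.

```php
<?php
$query = http_build_query(['one' => "a's", 'two' => [1, 2, 3]], '', '&');
echo $query, "\n"; // one=a%27s&two%5B0%5D=1&two%5B1%5D=2&two%5B2%5D=3

// parse_str() applies the same decoding PHP uses to fill $_GET,
// so $get stands in for $_GET here.
parse_str($query, $get);

foreach ($get as $k => $v) {
    // Flatten array values before printing instead of concatenating them directly.
    $shown = is_array($v) ? implode(', ', $v) : $v;
    echo "GET['" . $k . "']: " . htmlspecialchars($shown, ENT_QUOTES, 'UTF-8') . "\n";
}
```

Escaping the value on output also keeps the test page itself from echoing raw request data.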

 

 

 

 

Oh, again, it appears you don't have to use urldecode() and that it's handled automatically. Is that true for all instances or just my install?
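
For reference, the PHP manual states that the values in $_GET are already decoded, so this is not specific to one install. The sketch below uses parse_str() as a stand-in for $_GET (the sample value is made up) and shows why a second urldecode() can corrupt data.

```php
<?php
$original = 'a+b';                          // illustrative value
$query    = 'cmd=' . urlencode($original);  // cmd=a%2Bb

// parse_str() stands in for the decoding PHP performs when it fills $_GET.
parse_str($query, $get);

var_dump($get['cmd'] === $original);            // bool(true): already decoded
var_dump(urldecode($get['cmd']) === $original); // bool(false): the second decode turns "+" into a space
```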


I just put this into practice and ran into issues, mainly because I don't use a scheme (http / https) on my links. After rawurlencode() did its thing, the browser treated the link as relative rather than absolute, so I dropped rawurlencode() from the wrapper:

 

 

return htmlspecialchars($link."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
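
A possible middle ground, sketched under the assumption that $link only ever holds a path such as test3.php or dir/page.php (no scheme, host, port, or query string of its own). The function name rx_url_path() is hypothetical. It encodes each path segment separately and leaves out the "?" when there are no arguments.

```php
<?php
// Hypothetical variant of rx_url(); assumes $link is a path only
// (no scheme, host, port, or query string).
function rx_url_path(string $link, array $args = []): string
{
    // Encode each segment on its own so the "/" separators are preserved.
    $path  = implode('/', array_map('rawurlencode', explode('/', $link)));
    $query = http_build_query($args, '', '&');

    // Only append "?" when there is something to put after it.
    $url = $query === '' ? $path : $path . '?' . $query;

    return htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

echo rx_url_path('my dir/test3.php', ['one' => "a's", 'two' => 'b']), "\n";
// my%20dir/test3.php?one=a%27s&amp;two=b
```

If a link can carry a host or port, those parts would need to be kept out of the encoding step, since rawurlencode() would also encode the ":".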


You mustn't apply URL-encoding to the entire URL, only the parts which actually need to be encoded.

 

Encoding means that certain characters which have a special meaning in a URL (like slashes, question marks etc.) are turned into literal characters. For example, if you wish to insert a dynamic segment into the path, you'd encode that to prevent it from changing the structure of the URL:

$link = 'https://mysite.com/users/' . rawurlencode($username);

Without encoding, the $username variable could manipulate the path by injecting slashes, or it could inject an entire query string.
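
A minimal sketch of that difference, using a made-up hostile username:

```php
<?php
// Illustrative hostile input; the value is made up for this example.
$username = 'bob/../admin?role=root';

$unsafe = 'https://mysite.com/users/' . $username;
$safe   = 'https://mysite.com/users/' . rawurlencode($username);

echo $unsafe, "\n"; // https://mysite.com/users/bob/../admin?role=root
echo $safe, "\n";   // https://mysite.com/users/bob%2F..%2Fadmin%3Frole%3Droot
```

In the encoded version the slashes and the question mark are plain data inside a single path segment, so the URL keeps the structure you intended.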

