
urlencode


secweb


More hardening...

 

 

Should you always use urlencode / rawurlencode for links and GET requests?

Is one better than the other?

 

It appears you don't have to decode the value on the receiving end. Is that true?

 

 

 

Test Play:

 

if(isset($_GET['cmd'])){
    if($_GET['cmd']=="harp's"){
    //if(rawurldecode($_GET['cmd'])=="harp's"){
        echo "true<br />";
    }
    else{
        echo "false<br />";
    }
}

$s1="harp's";

echo "<br /><br />";
echo "s1: ".$s1."<br />";

echo "<br /><br />";
echo "<a href='?cmd=".$s1."'>Test 1</a><br />\n";
echo "<a href='?cmd=".rawurlencode($s1)."'>Test 2</a><br />\n";


 

Cleaning the URL, in a nutshell:

1. Use rawurlencode() for dynamic parts that come before the "?" (the path segments).

2. Use urlencode() for all GET parameter values (whatever comes after each "="). POST parameters are encoded automatically by the browser when a form is submitted.

3. Use htmlspecialchars() for HTML attribute values and HTML text content.
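
To make the difference between the two encoders concrete, here is a minimal sketch. The sample value is made up for illustration; only the two built-in functions come from the points above.

```php
<?php
// Hypothetical sample value containing a quote, a plus sign and spaces.
$value = "harp's a+b c";

// rawurlencode() encodes a space as %20 (suits path segments).
$raw = rawurlencode($value);   // harp%27s%20a%2Bb%20c

// urlencode() encodes a space as "+" (form-style query strings).
$form = urlencode($value);     // harp%27s+a%2Bb+c

echo $raw, "\n", $form, "\n";
```

Both forms are decoded back to the same string when PHP parses them as part of a query string, so the choice mostly matters for the path, where "+" is not read as a space.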


<?php
$url_page = 'example/page/url.php';
//page the link will request
$text = 'this is a simple string';
$id = '4334%3434';
$linktext = "<Clickit> & you will see it";
//text of the link, with HTML unfriendly characters
?>
<?php
// this gives you a clean link to use
$url = "http://localhost/";
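// encode each path segment separately so the slashes stay intact
$url .= implode('/', array_map('rawurlencode', explode('/', $url_page)));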
$url .= "?text=" . urlencode($text);
$url .= "&id=" . urlencode($id);

// htmlspecialchars escapes any html that
// might do bad things to your html page
?>
<a href="<?php echo htmlspecialchars($url); ?>">
<?php echo htmlspecialchars($linktext); ?>
</a>

It's very important to specify the character encoding of the HTML document when using htmlspecialchars(). Otherwise the function can become useless. You should also set the ENT_QUOTES flag so that both single and double quotes are escaped:

 

For example:

htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

The ENT_SUBSTITUTE and ENT_HTML5 flags aren't critical.
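
A quick sketch of why ENT_QUOTES matters for single-quoted attributes. The input string is invented for the demonstration; ENT_COMPAT is passed explicitly in the second call only to show what happens when single quotes are left alone.

```php
<?php
// Hypothetical input with both kinds of quotes and some markup.
$input = "harp's <b>\"quoted\"</b>";

// Flags as recommended above: both quote types are escaped.
$safe = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
// harp&apos;s &lt;b&gt;&quot;quoted&quot;&lt;/b&gt;

// ENT_COMPAT escapes double quotes only, so the apostrophe survives.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');
// harp's &lt;b&gt;&quot;quoted&quot;&lt;/b&gt;

// Safe inside a single-quoted attribute only in the first case.
echo "<span title='" . $safe . "'>ok</span>\n";
```

With the ENT_COMPAT result, the apostrophe would end the title attribute early, and whatever follows it would be parsed as further attributes.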

 

 

 

To avoid encoding the URL parameters by hand (which is somewhat messy and error-prone), you can also use http_build_query():

<?php

$link = 'https://example.com?' . http_build_query(['x' => 'foo', 'y' => 'bar']);

?>

<a href="<?= htmlspecialchars($link, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8'); ?>">Link</a>
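
For values that actually need encoding, a small sketch of what http_build_query() produces. The parameter names and values are invented for illustration. The separator is passed explicitly so the result doesn't depend on the arg_separator.output ini setting.

```php
<?php
// Hypothetical parameters with characters that are special in a URL.
$params = ['cmd' => "harp's", 'q' => 'a b&c'];

// Default encoding (PHP_QUERY_RFC1738): spaces become "+", like urlencode().
$form = http_build_query($params, '', '&');
// cmd=harp%27s&q=a+b%26c

// PHP_QUERY_RFC3986: spaces become %20, like rawurlencode().
$rfc = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
// cmd=harp%27s&q=a%20b%26c

echo $form, "\n", $rfc, "\n";
```

Either string still has to go through htmlspecialchars() before it is written into an href, because the "&" separators are special in HTML.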


Thank you.

 

 

So here's my new wrapper function and tests:

 

if(count($_GET)>0){
    foreach($_GET as $k=>$v){
        echo "GET['".$k."']: ".$v."<br />";
        //echo "GET['".$k."']: ".urldecode($v)."<br />";
    }
}

$link="test3.php";

echo "<br /><br />";
echo "<a href='".rx_url("")."'>test 1</a><br />";
echo "<a href='".rx_url($link)."'>test 2</a><br />";
echo "<a href='".rx_url($link,array("one","two"))."'>test 3</a><br />";
echo "<a href='".rx_url($link,array("one"=>"a's","two"=>"b"))."'>test 4</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>"b"))."'>test 5</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>null))."'>test 6</a><br />";
echo "<a href='".rx_url("",array("one"=>"a's","two"=>array(1,2,3)))."'>test 7</a><br />";

echo "<br /><br />";


function rx_url($link,$args=array()){
    return htmlspecialchars(rawurlencode($link)."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

 

Not that it matters much to me, but test 7 fails even though the manual page has a similar example. I also tried it with a prefix.
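
The post doesn't show what the failure looked like, so the following is only a guess: http_build_query() handles the nested array fine, but the receiving loop echoes each value as a string, and an array value prints as "Array" (with a notice). A sketch that round-trips the test 7 arguments through parse_str(), standing in for $_GET, and prints array values explicitly:

```php
<?php
// Same arguments as test 7.
$args = ['one' => "a's", 'two' => [1, 2, 3]];

$query = http_build_query($args, '', '&');
// one=a%27s&two%5B0%5D=1&two%5B1%5D=2&two%5B2%5D=3

// parse_str() decodes the string the way PHP fills $_GET.
parse_str($query, $decoded);

foreach ($decoded as $k => $v) {
    // Array values need joining (or recursion) before they can be echoed.
    $shown = is_array($v) ? implode(', ', $v) : $v;
    echo $k, ': ', htmlspecialchars($shown, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8'), "\n";
}

// As in test 6: null values are left out of the query string entirely.
$withNull = http_build_query(['one' => "a's", 'two' => null], '', '&');
// one=a%27s
```

Note that the decoded numbers come back as strings ('1', '2', '3'), as everything in $_GET does.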

 

 

 

 

Oh, and again, it appears you don't have to use urldecode() because it's handled automatically. Is that true in all cases or just on my install?
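
For what it's worth, the PHP manual states that the $_GET and $_REQUEST superglobals are already decoded, so this is not specific to one install. A small sketch of why decoding a second time is harmful. parse_str() stands in for the way PHP fills $_GET, and the "sum" parameter is made up for the example.

```php
<?php
// Build a query string the way a properly encoded link would carry it.
$query = 'cmd=' . rawurlencode("harp's") . '&sum=' . urlencode('1+1');
// cmd=harp%27s&sum=1%2B1

// parse_str() parses it as PHP would for $_GET: values arrive decoded.
parse_str($query, $get);
// $get['cmd'] is "harp's", $get['sum'] is "1+1"

// Decoding again corrupts the data: the literal "+" turns into a space.
$twice = urldecode($get['sum']);
// "1 1"

echo $get['cmd'], "\n", $get['sum'], "\n", $twice, "\n";
```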


I just put this into practice and ran into issues, mainly because I don't use a scheme (http / https) on my links. After rawurlencode() did its thing, the link was treated as relative rather than absolute, so I've dropped it from the wrapper:

 

 

return htmlspecialchars($link."?".http_build_query($args), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

You mustn't apply URL-encoding to the entire URL, only the parts which actually need to be encoded.

 

Encoding means that certain characters which have a special meaning in a URL (like slashes, question marks etc.) are turned into literal characters. For example, if you wish to insert a dynamic segment into the path, you'd encode that to prevent it from changing the structure of the URL:

$link = 'https://mysite.com/users/' . rawurlencode($username);

Without encoding, the $username variable could manipulate the path by injecting slashes, or it could inject an entire query string.
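
A minimal sketch of that manipulation, using an invented hostile username. The encode_path_segments() helper is hypothetical (not a PHP built-in) and only illustrates one way to encode a multi-segment path you control without losing its slashes.

```php
<?php
// Hypothetical hostile input: tries to climb the path and add a query string.
$username = '../admin?debug=1';

// Unencoded, the input reshapes the URL.
$unsafe = 'https://mysite.com/users/' . $username;
// https://mysite.com/users/../admin?debug=1

// Encoded, it stays a single literal path segment.
$safe = 'https://mysite.com/users/' . rawurlencode($username);
// https://mysite.com/users/..%2Fadmin%3Fdebug%3D1

// Hypothetical helper: encode each segment, keep the "/" separators.
// Only meant for paths whose structure you trust.
function encode_path_segments(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$path = encode_path_segments('my files/report 1.php');
// my%20files/report%201.php

echo $unsafe, "\n", $safe, "\n", $path, "\n";
```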

Edited by Jacques1
