
rawurldecode() not decoding %20 as white space


jbonnett


Hey all,

 

My URL is being sent as http://localhost/videos/On%20the%20go, but if I use rawurldecode("On%20the%20go") I get "Onthego" rather than "On the go"...

 

Any ideas? This should work. I've also tried urldecode(), since it's a similar function, but had no luck.

How are you retrieving that URI? Can we see some actual code?

 

Running echo rawurldecode("On%20the%20go"); produces "On the go" as expected, for me anyway.
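For reference, a small sketch of how the two decoding functions behave on their own, outside any framework code. The variable names are just for illustration.

```php
<?php
// rawurldecode() only decodes %XX sequences; urldecode() also turns "+" into a space.
$raw     = rawurldecode("On%20the%20go"); // "On the go"
$plus    = urldecode("On+the+go");        // "On the go"
$literal = rawurldecode("On+the+go");     // "On+the+go" (plus signs are left alone)

var_dump($raw, $plus, $literal);
```

Either function handles %20 correctly, so if the spaces vanish, something else in the request pipeline is probably stripping them.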

 

If I don't encode or decode the URL, i.e. I keep the white space in the URL, it works fine.

I have my own MVC. I'm retrieving the URI from $_GET and forwarding it to a function using call_user_func_array(), so I then do something like this:

function some_function($get = "") {
    if($get != "") {
        echo rawurldecode($get);
    }
}

Never mind, I found the problem:

private function parseURL() {
    if(isset($_GET["url"])) {
        return explode("/", filter_var(rtrim($_GET["url"], "/"), FILTER_SANITIZE_URL));
    }
}
I fixed it by:

private function parseURL() {
    if(isset($_GET["url"])) {
        return explode("/", str_replace('\/', ' ', filter_var(rtrim(str_replace(' ', '\/', $_GET["url"]), "/"), FILTER_SANITIZE_URL)));
    }
}
Do you have any ideas to make this better? Maybe a different character to stand in for the spaces?

By the time you read $_GET['url'], PHP has already URL-decoded it. So when you run it through filter_var() with FILTER_SANITIZE_URL, the string contains a literal space, which is not an allowed character, and it gets removed.
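A minimal sketch of that effect, assuming the request was /videos/On%20the%20go. The literal string below stands in for what $_GET['url'] would already contain at that point.

```php
<?php
// Stand-in for $_GET['url'] after PHP has already URL-decoded the request.
$url = "videos/On the go";

// FILTER_SANITIZE_URL drops every character outside its allowed set,
// and a literal space is not in that set.
$sanitized = filter_var($url, FILTER_SANITIZE_URL);

echo $sanitized, PHP_EOL; // videos/Onthego
```

So the later rawurldecode() call never sees a %20 or a space; both are gone before it runs.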

 

I've played with it for a bit and I can't see a really clean way to do what you want. I came up with this: (warning - I'm very tired right now. This could be terrible)

$parts = explode("/", trim($_GET['url'], "/"));
$parts = array_map(function($part){
	return urldecode(filter_var(urlencode($part), FILTER_SANITIZE_URL));
}, $parts);
This splits your parameter first, without sanitizing it, and then runs each part through three steps: urlencode() (to convert the decoded spaces back to +), filter_var() (to remove illegal characters), and urldecode() (to convert any encoded characters back again).
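For illustration, here is the same snippet wrapped in a hypothetical helper (split_path is a made-up name, and the sample string stands in for $_GET['url']). One thing worth checking before relying on it: urlencode() percent-encodes everything except alphanumerics and -_. so, as far as I can tell, there is nothing left for filter_var() to strip by the time it runs.

```php
<?php
// Hypothetical wrapper around the snippet above; not part of the original code.
function split_path(string $url): array
{
    $parts = explode("/", trim($url, "/"));

    return array_map(function ($part) {
        // Re-encode (space becomes "+"), sanitize, then decode again.
        return urldecode(filter_var(urlencode($part), FILTER_SANITIZE_URL));
    }, $parts);
}

$parts = split_path("videos/On the go/");
print_r($parts); // the second element keeps its spaces: "On the go"
```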

 

Pretty hacky.

 

The way I would do it though, is to just not use filter_var. I would just use regex to replace illegal characters, since there is a whole bunch of characters allowed in FILTER_SANITIZE_URL that really don't need to be in a URL.
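A rough sketch of the regex approach, under stated assumptions: the whitelist (ASCII letters, digits, space, hyphen, underscore, dot) and the function name parse_url_segments are my own choices for illustration, not something from the original code. Adjust the character class to whatever your routes actually need.

```php
<?php
// Hypothetical helper: splits the already-decoded path and strips anything
// outside an assumed whitelist, keeping spaces intact.
function parse_url_segments(string $url): array
{
    $segments = explode("/", trim($url, "/"));

    return array_map(function (string $segment): string {
        // Assumed whitelist: ASCII letters, digits, space, underscore, dot, hyphen.
        return (string) preg_replace('/[^A-Za-z0-9 _.\-]/', '', $segment);
    }, $segments);
}

print_r(parse_url_segments("videos/On the go/"));
```

Because the input from $_GET is already decoded, there is no need to call rawurldecode() on the segments afterwards.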

