larson22 Posted September 19, 2010
Can someone explain to me what this does exactly? Also, what do reserved characters in a URI do anyway?
.josh Posted September 19, 2010
It takes certain characters, such as space and +, and converts them to escape codes; a space, for example, becomes %20. The reason is that those characters mean something special in different systems, so to ensure that the URL gets read properly, the special characters are converted into something else and then decoded later.

For example, say you set up a function that accepts a string of values separated by a pipe delimiter:
"foo|bar|blah"
You would just split at the pipe and do whatever from there. But what if one of the values contains a pipe as part of the value?
"foo|bar|bl|ah"
The intention is to have 3 values: "foo", "bar" and "bl|ah". But your script doesn't know that; as far as it is concerned, you actually have 4 values. To get around that, you have to make the script not recognize that pipe as a delimiter. Sometimes it is simply escaped with a backslash, which is itself a form of encoding:
"foo|bar|bl\|ah"
Then the system knows that a pipe preceded by a backslash should be treated as a literal pipe and not a special character. But not every system follows this convention. Some systems use other codes to stand for the special symbols. The encodeURIComponent code for a pipe is %7C, so encoding the value before joining would make the string look like
"foo|bar|bl%7Cah"
Your script would then see only 3 values, and you would decode each value to get back your original "bl|ah" in a context where the pipe is not a special character.

A more practical example in this context would be a URL string:
http://www.somesite.com/somepage.html?foo=bar&question=how are you?&answer=I am fine!&question2=what is 2 + 2?&answer2=2+2=4
Now imagine yourself trying to parse this example URL. The task is to break it up into its various components: protocol, host, domain, page, path, query string, query string values, etc.
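Before picking that URL apart, here is the pipe example from above as a small JavaScript sketch. The helper names joinValues and splitValues are made up for illustration and are not part of any library; the only built-ins used are encodeURIComponent and decodeURIComponent. It assumes each value is encoded on its own before the values are joined.

```javascript
// Hypothetical helpers for illustration only.

// Encode each value so a literal "|" inside a value becomes "%7C",
// then join the encoded values with the real delimiter.
function joinValues(values) {
  return values.map((v) => encodeURIComponent(v)).join("|");
}

// Split on the delimiter first, then decode each piece back to its original text.
function splitValues(packed) {
  return packed.split("|").map((v) => decodeURIComponent(v));
}

// Without encoding, the embedded pipe is misread as a delimiter.
const naive = ["foo", "bar", "bl|ah"].join("|");
console.log(naive.split("|").length); // 4

// With encoding, only the real delimiters remain as raw pipes.
const packed = joinValues(["foo", "bar", "bl|ah"]);
console.log(packed); // foo|bar|bl%7Cah
console.log(splitValues(packed)); // [ 'foo', 'bar', 'bl|ah' ]
```

The order matters on the way back: split on the delimiter first and decode afterwards, otherwise the %7C turns back into a pipe before the split and you are at 4 values again.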
Can you spot some red flags in this example? For instance, how do you know where the query string starts? Normally it is the ?, but some of the parameter values also have a ? in them, so how can you programmatically know which ? is the query string delimiter and which is just part of a parameter value? Or with answer2=2+2=4, how do you know that the whole value is "2+2=4", rather than trying to parse the second = sign as a different key=value pair?

Or look at how the forum rendered the URL itself. Notice how the link broke at the first space? It failed to recognize the full string because, in the context of this post, the URL stopped at the first space it encountered. Browsers are generally smarter about spaces in a URL if it's typed directly into the address bar or placed within href="...." because there are other overall delimiters they can look at. But not all browsers are smart like that.

So overall, the point of URL encoding is to make sure that URLs don't break because of characters that mean something special to the system(s) processing the URL at different levels. Depending on the system(s), there are different conventions for encoding the special characters, but the principle is the same.
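As a rough sketch of how the example URL could be put together so those red flags go away, assuming every key and value is run through encodeURIComponent before being joined. The parameter names and values are the ones from the example URL; buildQuery is a hypothetical helper name, not a built-in.

```javascript
// Keys and values taken from the example URL in this post.
const params = [
  ["foo", "bar"],
  ["question", "how are you?"],
  ["answer", "I am fine!"],
  ["question2", "what is 2 + 2?"],
  ["answer2", "2+2=4"],
];

// Hypothetical helper: encode each key and value, then join with the
// real delimiters ("=" between key and value, "&" between pairs).
function buildQuery(pairs) {
  return pairs
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
}

const url = "http://www.somesite.com/somepage.html?" + buildQuery(params);
console.log(url);
// http://www.somesite.com/somepage.html?foo=bar&question=how%20are%20you%3F&answer=I%20am%20fine!&question2=what%20is%202%20%2B%202%3F&answer2=2%2B2%3D4

// Parsing is now unambiguous: the only raw "?" starts the query string,
// every raw "&" separates pairs, and each pair contains exactly one raw "=".
const queryString = url.slice(url.indexOf("?") + 1);
const decoded = queryString.split("&").map((pair) => {
  const [k, v] = pair.split("=");
  return [decodeURIComponent(k), decodeURIComponent(v)];
});
console.log(decoded[4]); // [ 'answer2', '2+2=4' ]
```

Note that encodeURIComponent is meant for the individual pieces, not the whole URL. Running it over the full string would also encode the :, /, ? and & that are supposed to act as delimiters.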
larson22 Posted September 20, 2010
Wow, perfect explanation, thanks.