
encodeURIComponent


larson22


encodeURIComponent takes certain characters, like space, +, etc., and replaces them with special percent-encoded values; a space becomes %20, for example.  The reason is that those characters mean something special in different systems, so to ensure that the url gets read properly, they are converted into something else and then decoded later.
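To make that concrete, here is a quick sketch you can paste into a browser console or Node. The sample characters and the variable names are just picked for illustration.

```javascript
// A few characters that mean something special inside a URL,
// and what encodeURIComponent turns each of them into.
const samples = [" ", "+", "&", "=", "?"];
const encoded = samples.map((ch) => encodeURIComponent(ch));
console.log(encoded); // [ '%20', '%2B', '%26', '%3D', '%3F' ]

// decodeURIComponent reverses the process, so nothing is lost.
const roundTrip = decodeURIComponent(encodeURIComponent("how are you?"));
console.log(roundTrip); // 'how are you?'
```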

 

For example, say you set up a function that accepts a string containing several values, and you use a pipe as the delimiter when you parse it:

 

"foo|bar|blah"

 

Then you would just split at the pipe delimiter and do whatever from there.

 

But what if one of the values contains a pipe as part of the value?

 

"foo|bar|bl|ah"

 

The intention is to have 3 values: "foo", "bar" and "bl|ah". But your script doesn't know that; as far as it is concerned, you actually have 4 values.  To get around that, you have to make the script not recognize that last pipe as a delimiter.  Sometimes it is simply escaped with a backslash, which is itself a form of encoding:

 

"foo|bar|bl\|ah"

 

Then the system will know that if the pipe is preceded by a backslash, it should treat it as a literal pipe and not a special character.  But not every system follows this convention.  Some systems use other codes to stand for the special symbols. The encodeURIComponent code for a pipe is %7C, so it would make the string look like

 

"foo|bar|bl%7Cah"

 

and then your script would know that there are only 3 values. You would then decode the values and get your original "bl|ah" back, to use in a context where the pipe is not a special character.
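Here is a minimal sketch of that pipe example in JavaScript. The function names packValues and unpackValues are made up for this post, not part of any library; the idea is just to encode each value before joining and decode each value after splitting.

```javascript
// Hypothetical helper: encode every value first, then join with the delimiter.
// Any pipe inside a value becomes %7C, so it can't be mistaken for a delimiter.
function packValues(values) {
  return values.map((v) => encodeURIComponent(v)).join("|");
}

// Hypothetical helper: split on the delimiter first, then decode each piece.
function unpackValues(packed) {
  return packed.split("|").map((v) => decodeURIComponent(v));
}

const packed = packValues(["foo", "bar", "bl|ah"]);
console.log(packed); // foo|bar|bl%7Cah

const unpacked = unpackValues(packed);
console.log(unpacked); // [ 'foo', 'bar', 'bl|ah' ]
```

The order matters: if you decoded the whole string before splitting, the %7C would turn back into a pipe and you would be back to 4 values.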

 

 

A more practical example in this context would be with a URL string...

 

http://www.somesite.com/somepage.html?foo=bar&question=how are you?&answer=I am fine!&question2=what is 2 + 2?&answer2=2+2=4

 

Now imagine yourself trying to parse this example url.  The task is to break it up into its various components: protocol, host, domain, path, page, query string, query string values, etc...

 

Do you notice some red flags in this example?  For instance, how do you know where the query string starts? Normally it is the ?, but some of the parameter values also have a ? as part of the value, so how can you programmatically know which one is the query string delimiter and which is just part of a parameter value? Or with answer2=2+2=4, how do you know that the whole value is "2+2=4", as opposed to trying to somehow parse that second = sign as a different key=value?  Or look at how the forum rendered the url itself. Notice how the link broke at the first space?  It failed to recognize the full string because, as far as it is concerned, in the context of this post the url string stopped at the first space it encountered. Browsers are generally smarter about spaces in the url string if it's being put directly in the address bar or within href="...." tags, because there are other overall delimiters they can look at.  But not all browsers are smart like that...
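Here is a rough sketch of how the same url could be built safely by running each key and value through encodeURIComponent first. The parseQuery function is a deliberately naive parser written for this post (it is not a built-in), just to show that simple splitting works once the values are encoded.

```javascript
const base = "http://www.somesite.com/somepage.html";
const params = [
  ["foo", "bar"],
  ["question", "how are you?"],
  ["answer", "I am fine!"],
  ["question2", "what is 2 + 2?"],
  ["answer2", "2+2=4"],
];

// Encode each key and value separately, then glue them together with = and &.
const query = params
  .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
  .join("&");
const url = base + "?" + query;
console.log(url);

// Hypothetical naive parser: everything after the first ? is the query string,
// pairs are split on &, and each pair is split on its first =.
function parseQuery(fullUrl) {
  const queryString = fullUrl.slice(fullUrl.indexOf("?") + 1);
  const result = {};
  for (const pair of queryString.split("&")) {
    const eq = pair.indexOf("=");
    const key = decodeURIComponent(pair.slice(0, eq));
    result[key] = decodeURIComponent(pair.slice(eq + 1));
  }
  return result;
}

const parsed = parseQuery(url);
console.log(parsed.question); // how are you?
console.log(parsed.answer2);  // 2+2=4
```

After encoding, the only ? left in the url is the real query string delimiter, the only & and = left are the real separators, and there are no spaces for anything to break on.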

 

 

 

So yeah, overall, the point of url encoding is to make sure that urls don't end up breaking because of certain characters that mean something special to the system(s) that process the url at different levels.  Depending on the system(s), there are different conventions or ways of encoding the special characters, but the principle is the same.
