UTF encoding, accents and such


I saw another thread here about utf8_encode, but that doesn't seem to help me, even though the issues seem about the same.

 

My application has a configuration area where you create products, and the ordering area where customers order them. I'm confused about accent handling, and character sets in general.

 

Some notes:

* When é is entered in a configuration-area field that is rendered as a textbox and the user saves, the character is displayed as nonsense.

* When é is entered in a configuration-area field that is rendered as a textarea and the user saves, the character is displayed properly.

* In either case, the é character does not appear correctly in HTML in the customer ordering area.

 

Thoughts?


Make sure all your pages use the same character encoding. UTF-8 is one of the best choices.

 

When you mix-and-match them, a character will be encoded with method A on one page, then decoded with method B on another page. Many times that will leave you with mojibake.
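As a small illustration of that mismatch, here is a sketch that takes é encoded as UTF-8 and reads it back as if it were ISO-8859-1. The variable names are made up for the example, and it assumes the mbstring extension is loaded.

```php
<?php
// "é" as the two UTF-8 bytes 0xC3 0xA9 (written as escapes so the
// example does not depend on how this file itself is saved).
$utf8 = "\xC3\xA9";

// Simulate a reader that wrongly assumes ISO-8859-1: each byte is taken
// as one Latin-1 character and re-encoded as UTF-8 for display.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// One character went in; two come out ("Ã" followed by "©").
$before = mb_strlen($utf8, 'UTF-8');
$after  = mb_strlen($garbled, 'UTF-8');
```

The two-character "Ã©" in place of "é" is the typical sign that UTF-8 bytes were decoded as a single-byte Latin encoding somewhere along the way.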

utf8_encode() and utf8_decode() are one solution, but they only convert between ISO-8859-1 and UTF-8, and they tend to require a fair bit more work than just fixing the pages.

 


Alternatively there's header():

header("Content-Type: text/html; charset=utf-8");


There are always issues when it comes to different characters, encodings and languages. You have to work your way through them one by one, as there is no single best solution to this yet.

 

Some pointers: save all data to the database as UTF-8.

Adding this line before the insert should ensure it:

mysql_query("SET NAMES 'utf8'");
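Note that the mysql_* functions were removed in PHP 7.0. On current PHP the connection charset is usually set when connecting, either with mysqli's set_charset() or through the charset option in a PDO DSN. Below is a minimal sketch of the PDO variant; the helper name utf8_dsn, the host and the database name are made up for illustration, and utf8mb4 is used because MySQL's plain utf8 only covers characters of up to three bytes.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that asks for a UTF-8
// (utf8mb4) connection, so no separate SET NAMES query is needed.
function utf8_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Placeholder host and database name.
$dsn = utf8_dsn('localhost', 'shop');

// With real credentials the connection would then be opened like this:
// $pdo = new PDO($dsn, $user, $password);
```

The tables and columns themselves also need a UTF-8 character set, otherwise the server converts on the way in and can still lose characters.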

 

You need to find a way of detecting the encoding of incoming text and then convert it as needed.

http://php.net/manual/en/function.mb-detect-encoding.php

 

Then convert it

http://www.php.net/manual/en/function.mb-convert-encoding.php

 

iconv is another option for converting:

http://php.net/manual/en/book.iconv.php
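The detect-then-convert idea can be sketched roughly as follows. This version deliberately swaps mb_detect_encoding() for mb_check_encoding(), because detection is only a guess while a validity check is deterministic. The function name to_utf8 and the ISO-8859-1 fallback are assumptions for the example, not something taken from the application in question.

```php
<?php
// Hypothetical helper: return $input as UTF-8.
// If the bytes are already valid UTF-8 they are returned unchanged;
// otherwise they are assumed (!) to be in $fallback and converted.
function to_utf8(string $input, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

// "café" with é as the single Latin-1 byte 0xE9: gets converted.
$fromLatin1 = to_utf8("caf\xE9");

// "café" already in UTF-8 (0xC3 0xA9): passes through untouched.
$alreadyUtf8 = to_utf8("caf\xC3\xA9");
```

Checking first avoids converting text that is already UTF-8 a second time, which is a common way of producing mojibake. The fallback is still a guess, so this is a repair tool for legacy data rather than a substitute for using one encoding everywhere.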

 

On display:

header("Content-Type:text/html; charset=UTF-8");

<meta charset="utf-8">
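Put together, the display side might look something like the sketch below. The function name render_page is hypothetical, and the markup is only the minimum needed to show the header and the meta tag agreeing on UTF-8.

```php
<?php
// Hypothetical helper: declares UTF-8 in the HTTP header and in the
// markup, then returns a minimal page containing the given text.
function render_page(string $text): string
{
    // header() only works before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }

    // Escape the text for HTML, telling PHP the bytes are UTF-8.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n</head>\n"
         . "<body>\n<p>{$safe}</p>\n</body>\n</html>\n";
}

// "Café" with é given as its UTF-8 bytes.
$html = render_page("Caf\xC3\xA9");
```

When both are present the HTTP header takes precedence over the meta tag, so the two should always name the same encoding.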

 

If there are any characters that you need handled differently, you can replace them yourself, perhaps with a custom function.


QuickOldCar's solution didn't work for me (I had already tried that) but I will look into the idea from requinix.

 

I noticed that when my string was passed through htmlentities( $string ) it showed gibberish, but htmlentities( $string, ENT_COMPAT, 'UTF-8' ) worked as expected. Clearly I'd rather set the page to be UTF-8 than pass these extra htmlentities arguments every time.

 

I hope what you're saying will work. I'll get back to you.


<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

This works for simply displaying text. However, in the few places where my text is used as an attribute value, I still need to call htmlentities( $str ), and I need to pass all three parameters as explained above:

htmlentities( $str, ENT_COMPAT, 'UTF-8' )

 

So it looks like my solution is a combination of both. I guess that's the best PHP can do for me right now.
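For background: before PHP 5.4, htmlentities() and htmlspecialchars() assumed ISO-8859-1 unless told otherwise, which is why the one-argument call garbled UTF-8 text. From PHP 5.4 the default is UTF-8, but passing the charset explicitly still does no harm. A small wrapper saves repeating the three arguments; the name attr and the sample markup below are made up for illustration.

```php
<?php
// Hypothetical helper: escape a UTF-8 string for use inside an HTML
// attribute value. ENT_QUOTES escapes both double and single quotes.
function attr(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Café "Deluxe", with é given as its UTF-8 bytes.
$name = "Caf\xC3\xA9 \"Deluxe\"";

// htmlspecialchars leaves é alone and only escapes the quotes.
$input = '<input type="text" name="product" value="' . attr($name) . '">';

// htmlentities with the same charset also turns é into &eacute;.
$entities = htmlentities($name, ENT_COMPAT, 'UTF-8');
```

On a page that is served as UTF-8, htmlspecialchars() is normally enough, since only the characters that are special in HTML need escaping; é can be sent as-is.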
