Converting ASCII to UTF-8


lazarantal

Hi,

 

I have the following text: Utaz&AOE-s

 

If I check the encoding of this text with mb_detect_encoding, it says that the text is pure ASCII. My page's encoding is UTF-8, so on the webpage I should see Utazás (&AOE- is the UTF-8 code for á).
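A minimal sketch of why that detection result is plausible, assuming the mbstring extension is loaded (the variable names are only illustrative): every byte of the string is in the 7-bit range, so ASCII is a valid answer for a detector.

```php
<?php
$text = 'Utaz&AOE-s';

// Strict detection against an explicit candidate list.
$detected = mb_detect_encoding($text, ['ASCII', 'UTF-8'], true);

// Count bytes outside the 7-bit range (byte-based match, no /u flag).
$highBytes = preg_match_all('/[\x80-\xFF]/', $text);

echo $detected, ' / bytes above 0x7F: ', $highBytes, PHP_EOL;
```

The detector only sees bytes; it cannot tell that those ASCII bytes might themselves be an encoded form of something else.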

 

No matter what I do, the text is displayed as above. I've tried many things, such as converting the text to UTF-8, or converting it to UTF-7 and then to UTF-8.

 

Nothing worked.

 

Any idea why I can't display the above text correctly?

 

I also have this line in my code: header('Content-Type: text/html; charset=utf-8');

 

Any idea is appreciated!

 

Thanks,

Tony

&AOE- is not the code for anything. Are you trying to use an HTML entity? Those have the pattern &...; For a specific character you would use a numeric entity such as &#9999; (✏).

 

If you have a utf8 string, then you should just be able to echo it out.
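To illustrate the entity pattern mentioned in this reply, here is a small sketch using PHP's html_entity_decode. The sample strings and variable names are made up for the example. It shows that a real numeric entity decodes to UTF-8 bytes, while &AOE- is left untouched because it is not an entity.

```php
<?php
// A numeric character reference for U+00E1 (á).
$decoded = html_entity_decode('Utaz&#225;s', ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The string from the question has no valid entity in it.
$unchanged = html_entity_decode('Utaz&AOE-s', ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo bin2hex($decoded), PHP_EOL; // hex dump, independent of terminal encoding
echo $unchanged, PHP_EOL;
```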

According to this website, it is a US-ASCII code:

 

http://www.string-functions.com/encodingtable.aspx?encoding=65000&decoding=20127

 

What I would like to achieve is to echo the corresponding utf8 character. In this case -> á.


No, it is not.

 

I think the single-byte code (ISO-8859-1, often loosely called extended ASCII) for the "á" symbol is:

$str = "\xe1"; // á

echo utf8_encode($str);
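For reference, utf8_encode converts ISO-8859-1 to UTF-8 and is deprecated as of PHP 8.2. A hedged sketch of the same conversion with mbstring, assuming that extension is available (variable names are illustrative):

```php
<?php
$latin1 = "\xE1"; // "á" as a single ISO-8859-1 byte

// Same conversion as utf8_encode(), but with the source encoding spelled out.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), PHP_EOL; // c3a1, the two-byte UTF-8 form of U+00E1
```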

Can you try to get the ASCII code for the "á" symbol?

 

echo ord('á');

 

This gives me 225, which is correct according to the extended ASCII table.

 

Dec 225, Oct 341, Hex E1, Bin 11100001, Symbol á, HTML number &#225;, HTML name &aacute;, Description: Latin small letter a with acute
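One caveat about ord(): it returns the value of the first byte only, so the result depends on how the script file itself is encoded. A small sketch, assuming PHP 7.2 or later with mbstring (the \u{...} escape is used so the example does not depend on the file's encoding):

```php
<?php
$utf8 = "\u{00E1}"; // "á" encoded as UTF-8 (two bytes: C3 A1)
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8'); // one byte: E1

$firstByteUtf8 = ord($utf8);            // first byte of the UTF-8 sequence
$firstByteLatin1 = ord($latin1);        // the single Latin-1 byte
$codePoint = mb_ord($utf8, 'UTF-8');    // the Unicode code point

echo $firstByteUtf8, ' ', $firstByteLatin1, ' ', $codePoint, PHP_EOL; // 195 225 225
```

So a result of 225 suggests the script was saved in a single-byte encoding such as ISO-8859-1 rather than UTF-8.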

 

However, the text Utaz&AOE-s is returned by Gmail IMAP (via a socket connection), so I assume it is correct. I just can't figure out the encoding. The real question is why Google returns &AOE- instead of á.


  • Solution

In the meantime, I found the answer: the text is UTF-7 encoded. For some reason (that I am not aware of), the IMAP protocol still uses this old encoding; more precisely, IMAP uses a modified UTF-7 for mailbox names, with & as the shift character instead of +. This is the site that helped me figure out the solution:

 

http://fetchmail.berlios.de/Mailbox-Names-UTF7.html

 

In order to echo the text correctly, I just had to use the imap_utf7_decode function.
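For anyone landing here later, a minimal sketch of the decoding step. Note that, according to the PHP manual, imap_utf7_decode returns ISO-8859-1, so on a UTF-8 page a further conversion may be needed. As an alternative that goes straight to UTF-8, and assuming the mbstring extension is loaded, the "UTF7-IMAP" encoding name can be used (variable names are illustrative):

```php
<?php
// Mailbox name as returned by the IMAP server (modified UTF-7).
$raw = 'Utaz&AOE-s';

// "UTF7-IMAP" is mbstring's name for the IMAP variant of UTF-7.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF7-IMAP');

echo $utf8, PHP_EOL;          // Utazás when the output is shown as UTF-8
echo bin2hex($utf8), PHP_EOL; // 5574617ac3a173
```

The part between & and - is base64: AOE decodes to the 16-bit value 0x00E1, which is the code point for á.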
