Converting ASCII to UTF-8


lazarantal

Hi,

 

I have the following text: Utaz&AOE-s

 

If I check the encoding of this text with mb_detect_encoding, it says that the text is pure ASCII. My page's encoding is UTF-8, so on the webpage I should see Utazás (&AOE- is a UTF-8 code for á).
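A quick way to see why the detection result is ASCII is to look at the bytes: every character in Utaz&AOE-s is below 0x80, so nothing distinguishes it from plain 7-bit text. A minimal sketch, where isSevenBitAscii is a hypothetical helper name used only for illustration:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true when every byte of $s is in the 7-bit range 0x00-0x7F.
function isSevenBitAscii(string $s): bool
{
    // No /u modifier, so the pattern is matched byte by byte.
    return preg_match('/\A[\x00-\x7F]*\z/', $s) === 1;
}

$raw     = 'Utaz&AOE-s';      // the text as received
$decoded = "Utaz\xC3\xA1s";   // "Utazás" written out as explicit UTF-8 bytes

var_dump(isSevenBitAscii($raw));     // bool(true)
var_dump(isSevenBitAscii($decoded)); // bool(false)
```

So the ASCII result says only that the string contains no high bytes. It says nothing about whether those ASCII characters are themselves an encoded form of something else.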

 

It doesn't matter what I do, the text is displayed as above. I've tried many things, like converting the text to UTF-8, or converting it to UTF-7 and then to UTF-8.

 

Nothing worked.

 

Any idea why I can't display the above text correctly?

 

I also have this line in my code: header('Content-Type: text/html; charset=utf-8');

 

Any idea is appreciated!

 

Thanks,

Tony


&AOE- is not the code for anything. Are you trying to use an HTML entity? Those have the pattern &...; For a specific Unicode character you would use a numeric reference such as &#9999; (which renders as ✏).

 

If you have a UTF-8 string, then you should just be able to echo it out.
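For comparison, here is a small sketch of what real HTML character references for á look like and how PHP decodes them. The sample strings are made up for illustration; html_entity_decode is the standard PHP function.

```php
<?php
declare(strict_types=1);

$flags = ENT_QUOTES | ENT_HTML5;

// Named, decimal, and hexadecimal references for U+00E1 (á).
$named   = html_entity_decode('Utaz&aacute;s', $flags, 'UTF-8');
$decimal = html_entity_decode('Utaz&#225;s', $flags, 'UTF-8');
$hex     = html_entity_decode('Utaz&#xE1;s', $flags, 'UTF-8');

// "&AOE-" is not an HTML entity, so it passes through unchanged.
$untouched = html_entity_decode('Utaz&AOE-s', $flags, 'UTF-8');

echo $named, "\n", $decimal, "\n", $hex, "\n", $untouched, "\n";
```

The first three lines print Utazás, while the last one still prints Utaz&AOE-s. That is consistent with the text not being HTML-encoded at all.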



According to this website, the code is a US-ASCII code:

 

http://www.string-functions.com/encodingtable.aspx?encoding=65000&decoding=20127

 

What I would like to achieve is to echo the corresponding UTF-8 character, in this case á.



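No, it is not.

 

I think the single-byte code for an "á" symbol is 0xE1 (strictly speaking that is ISO-8859-1, since US-ASCII stops at 0x7F):

$str = "\xe1"; // á as a single ISO-8859-1 byte

echo utf8_encode($str); // converts ISO-8859-1 to UTF-8 (deprecated as of PHP 8.2)

Can you try to get the ASCII code for an "á" symbol?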
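Since utf8_encode is deprecated in recent PHP versions, the same single-byte to UTF-8 conversion can be sketched with mbstring by naming the source encoding explicitly. This assumes the mbstring extension is available, which the thread's use of mb_detect_encoding suggests.

```php
<?php
declare(strict_types=1);

$latin1 = "\xE1"; // á as one ISO-8859-1 byte

// Name the source encoding explicitly instead of relying on utf8_encode().
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), ' -> ', bin2hex($utf8), "\n"; // e1 -> c3a1
```

Note that this only helps when the input really is ISO-8859-1. It does nothing useful for a string such as Utaz&AOE-s, which contains no high bytes to convert.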



echo ord('á');

 

This gives me 225, which is correct according to the extended ASCII table:

 

DEC 225 | OCT 341 | HEX E1 | BIN 11100001 | Symbol á | HTML number &#225; | HTML name &aacute; | Latin small letter a with acute
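One caveat about ord(): it returns the value of the first byte only, so the result depends on how the string (or the source file) is encoded. A short sketch using explicit byte escapes so that the outcome does not depend on the editor's encoding:

```php
<?php
declare(strict_types=1);

$utf8   = "\xC3\xA1"; // á encoded as UTF-8 (two bytes)
$latin1 = "\xE1";     // á encoded as ISO-8859-1 (one byte)

echo ord($utf8), "\n";             // 195: only the first byte, 0xC3
echo ord($latin1), "\n";           // 225: the single byte, 0xE1
echo mb_ord($utf8, 'UTF-8'), "\n"; // 225: the code point U+00E1
```

Getting 225 from ord('á') therefore suggests the script file was saved in a single-byte encoding. In a UTF-8 file the same call would return 195.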

 

However, the above text is returned by Gmail IMAP (via a socket connection), so I assume it is correct. I just can't figure out the encoding. The real question is why Google returns &AOE- instead of á.


In the meantime, I found the answer: it is UTF-7 encoded text (the modified UTF-7 variant that IMAP uses for mailbox names). For some reason that I am not aware of, the IMAP protocol still uses this old encoding. This is the site that helped me figure out the solution:

 

http://fetchmail.berlios.de/Mailbox-Names-UTF7.html
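To make the encoding less mysterious, here is a rough pure-PHP sketch of how a sequence such as &AOE- unpacks. In modified UTF-7, "&" starts a run of base64 (with "," in place of "/" and no padding) that holds UTF-16BE code units, and "-" ends the run. The function names below are hypothetical, and the sketch skips validation of malformed input.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
function codePointToUtf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: decode the text found between "&" and "-".
function decodeShiftedRun(string $run): string
{
    if ($run === '') {
        return '&'; // "&-" stands for a literal ampersand
    }
    // Map "," back to "/" so the standard base64 decoder can be used.
    $utf16 = (string) base64_decode(strtr($run, ',', '/'));
    $units = array_values(unpack('n*', $utf16) ?: []); // big-endian 16-bit units
    $out = '';
    for ($i = 0, $n = count($units); $i < $n; $i++) {
        $cp = $units[$i];
        // Combine a surrogate pair into one code point above U+FFFF.
        if ($cp >= 0xD800 && $cp <= 0xDBFF && $i + 1 < $n
            && $units[$i + 1] >= 0xDC00 && $units[$i + 1] <= 0xDFFF) {
            $cp = 0x10000 + (($cp - 0xD800) << 10) + ($units[$i + 1] - 0xDC00);
            $i++;
        }
        $out .= codePointToUtf8($cp);
    }
    return $out;
}

// Hypothetical helper: decode every "&...-" run in a modified UTF-7 string.
function decodeModifiedUtf7(string $input): string
{
    return (string) preg_replace_callback(
        '/&([A-Za-z0-9+,]*)-/',
        static fn (array $m): string => decodeShiftedRun($m[1]),
        $input
    );
}

echo decodeModifiedUtf7('Utaz&AOE-s'), "\n"; // Utazás
```

Here "AOE" decodes to the two bytes 0x00 0xE1, which is the UTF-16BE code unit for U+00E1 (á). In practice a library decoder is the safer choice; this is only meant to show the mechanics.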

 

In order to echo the text correctly, I just had to use the imap_utf7_decode function.
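One thing to watch for: according to the PHP manual, imap_utf7_decode returns an ISO-8859-1 string, so on a UTF-8 page the result may still need a further conversion. A possible alternative, assuming the mbstring extension is available, is to convert straight to UTF-8 using mbstring's UTF7-IMAP encoding name:

```php
<?php
declare(strict_types=1);

$raw = 'Utaz&AOE-s'; // the text as returned by the IMAP server

// 'UTF7-IMAP' is mbstring's name for the modified UTF-7 used by IMAP.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF7-IMAP');

// Escape for HTML output; the page itself is served as charset=utf-8.
echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n"; // Utazás
```

This keeps the whole pipeline in UTF-8 and also works for characters outside ISO-8859-1.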
