
Special characters


rgarrot


Hi People!

 

I'm from Brazil and I'm having some trouble with special characters when I get an object from a REST web service.

 

After I send the GET request, I receive the response, and in the message body...

 

I should get this (in Portuguese):

"União da Ilha"

"Braços abertos"

 

But I'm getting:

"Un]e0o da ilha"

"Bra{f3os abertos"

 

The problem is with the special Latin characters.

I've tried almost all the encoding functions, such as utf8_encode, htmlentities, etc.

 

Could someone help me?

Thanks a lot!! (And sorry for my poor English.)

https://forums.phpfreaks.com/topic/259094-special-characters/

Normally mojibake shows as unusual characters, not things you'd find on a keyboard.

 

Can you post what you get when you run

echo bin2hex($string);

(if $string is the response)? Without any modifications to it.

 

Also try using mb_detect_encoding on that response. Sometimes it can tell you what encoding the string is, other times it doesn't know.
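To make both checks concrete, here is a minimal sketch that reports the raw bytes alongside a strict-mode encoding guess. The helper name describeBytes and the candidate encoding list are my own assumptions, and it needs the mbstring extension. Treat the guess as a hint only; the hex dump is the reliable part.

```php
<?php
// Hypothetical helper (not from the thread): dump the raw bytes of a string
// and ask mbstring for a strict-mode encoding guess.
function describeBytes(string $s): array
{
    // Assumed candidate list; adjust it to the encodings you actually expect.
    $guess = mb_detect_encoding($s, ['ASCII', 'UTF-8', 'ISO-8859-1'], true);

    return [
        'hex'        => bin2hex($s),                     // exact bytes received
        'encoding'   => $guess === false ? 'unknown' : $guess,
        'valid_utf8' => mb_check_encoding($s, 'UTF-8'),  // true if well-formed UTF-8
    ];
}

// Usage: pass the untouched response body.
print_r(describeBytes("Portela"));
```

If 'valid_utf8' is true but the text still looks wrong, the damage probably happened before the bytes reached you, and no decoding function on your side will undo it.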


With mb_detect_encoding, sometimes I get ASCII and sometimes UTF-8.

 

ASCII -  Uni;e0o da Ilha

ASCII -  Unidos da Tijuca

ASCII -  Portela

ASCII -  Unidos de Vila Isabel

ASCII -  Uni;e0o do Parque Curicica

UTF-8 - Uni;e0o de Jacarepagu80

ASCII - Infantes do Lins

 

And with bin2hex():

 

556e693b65306f20646120496c6861 - G.R.E.S Uni;e0o da Ilha do Governador

556e69646f732064612054696a756361 - G.R.E.S Unidos da Tijuca

506f7274656c61 - G.R.E.S Portela

556e69646f732064652056696c612049736162656c - G.R.E.S Unidos de Vila Isabel

556e693b65306f20646f20506172717565204375726963696361 - G.R.E.S Uni;e0o do Parque Curicica

556e693b65306f206465204a616361726570616775103263 - G.R.E.S Uni;e0o de Jacarepagu80

472e522e432e452e532e4d2e20496e66616e74657320646f204c696e73 - Infantes do Lins

 

Any ideas?


I can't tell what it is. Accented characters are represented by some kind of marker character (such as a semicolon, 0x3B, or the control character 0x10) followed by two hex digits, but I don't see a correlation between those digits and the original character.

0x3B 0x65 0x30 = ã
0x10 0x32 0x63 = á
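To list every such sequence in a response before deciding what to do with it, something like the following sketch could help. The function name findSuspectSequences and the set of marker bytes in the pattern are assumptions based only on the samples posted in this thread, so expect to adjust the pattern. The hex dumps are the ones posted above.

```php
<?php
// Assumption: a suspect sequence is a control byte or one of ; ] { }
// followed by two lowercase hex digits. This can give false positives
// (for example a newline followed by "ab"), so review the output by hand.
$pattern = '/[\x00-\x1F;\]{}][0-9a-f]{2}/';

// Hypothetical helper: return the offset and hex dump of each suspect sequence.
function findSuspectSequences(string $raw, string $pattern): array
{
    preg_match_all($pattern, $raw, $m, PREG_OFFSET_CAPTURE);

    $found = [];
    foreach ($m[0] as [$seq, $offset]) {
        $found[] = ['offset' => $offset, 'hex' => bin2hex($seq)];
    }
    return $found;
}

// Hex dumps copied from the earlier post.
$dumps = [
    '556e693b65306f20646120496c6861',
    '556e693b65306f206465204a616361726570616775103263',
];

foreach ($dumps as $dump) {
    print_r(findSuspectSequences(hex2bin($dump), $pattern));
}
```

Collecting these across many responses gives you a list of sequences to ask the service owners about.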

I suggest you contact the people who own the web service and ask them about this.

 

Short of that, you can manually substitute those sequences with, for example, strtr():

$string = strtr($string, array(
    "\x3B\x65\x30" => "ã",
    "\x10\x32\x63" => "á"
));

If you do this, be sure to save the file with this code in whatever encoding you want the characters to be. So save the file as UTF-8 if you want that, or ISO 8859-1 if you want that.
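If you would rather not depend on how the file is saved, one option is to write the replacements as explicit byte escapes. This is only a sketch that assumes you want UTF-8 output and covers just the two sequences identified so far. The variable names are made up for illustration, and the input is one of the hex dumps posted earlier.

```php
<?php
// Same mapping as above, but with the replacements written as UTF-8 byte
// escapes, so the result does not depend on the encoding of this source file.
$map = [
    "\x3B\x65\x30" => "\xC3\xA3", // ";e0"       -> ã (U+00E3) encoded as UTF-8
    "\x10\x32\x63" => "\xC3\xA1", // 0x10 + "2c" -> á (U+00E1) encoded as UTF-8
];

// Hex dump posted earlier for the Jacarepaguá entry.
$raw   = hex2bin('556e693b65306f206465204a616361726570616775103263');
$fixed = strtr($raw, $map);

echo $fixed, PHP_EOL;           // shows correctly on a UTF-8 terminal or page
echo bin2hex($fixed), PHP_EOL;  // inspect the resulting bytes
```

strtr() with an array replaces the longest matching keys first and never rescans text it has already replaced, so the entries cannot interfere with each other.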


OMG!

Sometimes the same character has a different code...

 

I replaced:

Acadêmicos da Abolição - 41636164c382c2ab36396d69636f732064612041626f6c697865663b63306f -

 

But:

Cora}65?es Unidos do Amarelinho - 436f72617d36353f657320556e69646f7320646f20416d6172656c696e686f

 

"Corações Unidos do Amarelinho"

 

The "ç" has a different encoding...

 

 

 


