asmith Posted December 17, 2009

Hey guys, I'm trying to get some text from various parts of a file. I have converted the file to hex (bin2hex) and I've got this: 64 72 e6 62 65 72 (without spaces). I'm converting that back with:

    <?php
    function hex2bin($h) {
        if (!is_string($h)) return null;
        $r = '';
        // Square brackets replace the old $h{$a} offset syntax, which PHP 8 no longer accepts.
        for ($a = 0; $a < strlen($h); $a += 2)
            $r .= chr(hexdec($h[$a] . $h[$a + 1]));
        return $r;
    }
    ?>

It works fine, except for unicode characters. For example, the above hex is giving me: dr�ber
While it should be: dræber
How can I get the correct word? Thanks a lot.

Link to comment https://forums.phpfreaks.com/topic/185471-cant-get-the-unicode-character/
oni-kun Posted December 17, 2009

Try this. It skips whitespace in the input and turns each pair of hex digits back into one byte:

    <?php
    // Note: PHP 5.4+ ships a built-in hex2bin(); rename this function on those versions.
    define('HEX2BIN_WS', " \t\n\r");

    function hex2bin($hex_string) {
        $pos = 0;
        $result = '';
        while ($pos < strlen($hex_string)) {
            if (strpos(HEX2BIN_WS, $hex_string[$pos]) !== FALSE) {
                // Skip whitespace between hex pairs.
                $pos++;
            } else {
                // Two hex digits make one byte.
                $code = hexdec(substr($hex_string, $pos, 2));
                $pos = $pos + 2;
                $result .= chr($code);
            }
        }
        return $result;
    }

    echo hex2bin('6472e6626572'); // Outputs the bytes 64 72 E6 62 65 72, which read 'dræber' in ISO-8859-1
    ?>
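For readers on a newer PHP, the same byte-for-byte decoding can be sketched without a hand-written loop. This is only an illustration, not the code from the thread: `decode_hex()` is a hypothetical helper name, and the type declarations assume PHP 7 or later. It relies on `pack('H*', ...)`, which reads hex digits two per byte.

```php
<?php
// Hypothetical helper (name not from the thread): hex string -> raw bytes.
function decode_hex(string $hex): string
{
    // Drop spaces, tabs and newlines, like the HEX2BIN_WS check in the post above.
    $hex = str_replace([' ', "\t", "\n", "\r"], '', $hex);

    // 'H*' means: hex digits, high nibble first, as many as there are.
    return pack('H*', $hex);
}

$bytes = decode_hex('64 72 e6 62 65 72');

echo strlen($bytes), "\n";   // 6
echo bin2hex($bytes), "\n";  // 6472e6626572
```

Like the loop version, this returns raw bytes and makes no statement about which character set they are in; that part is still up to the page that prints them.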
asmith (Author) Posted December 17, 2009

Thanks for your reply. It seems to work fine. But when I have this on my pages:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

it kind of messes it up again. Any idea?
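A quick way to see why the UTF-8 declaration breaks the output is to check whether the decoded bytes are valid UTF-8 at all. The sketch below assumes the mbstring extension is enabled (an assumption, the thread does not mention it) and hard-codes the six bytes from the hex string.

```php
<?php
// Assumes the mbstring extension is available.
$bytes = "dr\xE6ber";  // the six bytes behind the hex string 6472e6626572

// In UTF-8, 0xE6 is a lead byte that must be followed by two continuation
// bytes (0x80-0xBF). Here it is followed by "b" (0x62), so the string is not
// valid UTF-8 and a browser told to expect UTF-8 shows a replacement character.
var_dump(mb_check_encoding($bytes, 'UTF-8'));           // bool(false)

// The UTF-8 encoding of "æ" is the two bytes C3 A6.
var_dump(mb_check_encoding("dr\xC3\xA6ber", 'UTF-8'));  // bool(true)
```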
oni-kun Posted December 17, 2009

chr() just gives you single bytes, and the byte E6 only means æ in a single-byte character set such as ISO-8859-1. Your page is declared as UTF-8, so you simply need to encode the result as UTF-8. Try this example:

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    </head>
    <body>
    <?php
    // Note: PHP 5.4+ ships a built-in hex2bin(); rename this function on those versions.
    define('HEX2BIN_WS', " \t\n\r");

    function hex2bin($hex_string) {
        $pos = 0;
        $result = '';
        while ($pos < strlen($hex_string)) {
            if (strpos(HEX2BIN_WS, $hex_string[$pos]) !== FALSE) {
                $pos++;
            } else {
                $code = hexdec(substr($hex_string, $pos, 2));
                $pos = $pos + 2;
                $result .= chr($code);
            }
        }
        // utf8_encode() converts ISO-8859-1 to UTF-8 (deprecated as of PHP 8.2).
        return utf8_encode($result);
    }

    echo hex2bin('6472e6626572');
    ?>
    </body>
    </html>

And it should display correctly. If it doesn't, then also send the Content-Type header from the server side, e.g. header('Content-Type: text/html; charset=UTF-8'); before any output.
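On current PHP versions, where `utf8_encode()` is deprecated, the same fix might look roughly like the sketch below. It is not the code from the thread: `hex_to_utf8()` is a hypothetical name, it assumes the mbstring extension is enabled, and it assumes the file's text is ISO-8859-1, which is what `utf8_encode()` assumes as well. The `header()` call is the server-side header mentioned above.

```php
<?php
// Hypothetical helper (not from the thread); assumes mbstring is enabled.
function hex_to_utf8(string $hex, string $sourceCharset = 'ISO-8859-1'): string
{
    // Strip whitespace, then turn each pair of hex digits into one byte.
    $bytes = pack('H*', preg_replace('/\s+/', '', $hex));

    // Re-encode from the assumed source charset to UTF-8.
    return mb_convert_encoding($bytes, 'UTF-8', $sourceCharset);
}

// Send the charset from the server side too; this must happen before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo hex_to_utf8('6472e6626572');  // dræber, as the UTF-8 bytes 64 72 c3 a6 62 65 72
```

If the file actually uses a different single-byte charset (Windows-1252, for example), the second argument would need to change; the hex dump alone cannot tell you which one it is.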
asmith (Author) Posted December 17, 2009

Thank you mate!! Thanks a TON!!