asmith Posted December 17, 2009

Hey guys, I'm trying to get some text from various parts of a file. I have converted the file to hex (bin2hex) and I've got this: 64 72 e6 62 65 72 (without spaces). I'm converting that back with:

<?php
// Note: PHP 5.4+ ships a built-in hex2bin(); rename this function on those versions.
function hex2bin($h) {
    if (!is_string($h)) return null;
    $r = '';
    // Turn each pair of hex digits back into one byte.
    for ($a = 0; $a < strlen($h); $a += 2)
        $r .= chr(hexdec($h[$a] . $h[$a + 1]));
    return $r;
}
?>

It works fine, except for unicode characters. For example, the above hex is giving me: dr�ber when it should be: dræber. How can I get the correct word? Thanks a lot
oni-kun Posted December 17, 2009

Try this. It skips whitespace and turns each hex pair back into the original byte:

<?php
// Note: PHP 5.4+ ships a built-in hex2bin(); rename this function on those versions.
define('HEX2BIN_WS', " \t\n\r");

function hex2bin($hex_string) {
    $pos = 0;
    $result = '';
    while ($pos < strlen($hex_string)) {
        if (strpos(HEX2BIN_WS, $hex_string[$pos]) !== FALSE) {
            // Skip whitespace between hex pairs.
            $pos++;
        } else {
            // Decode two hex digits into one byte.
            $code = hexdec(substr($hex_string, $pos, 2));
            $pos = $pos + 2;
            $result .= chr($code);
        }
    }
    return $result;
}

echo hex2bin('6472e6626572'); // Shows 'dræber' when the page is read as ISO-8859-1
?>
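Looking at the raw bytes helps explain why the same string can look right on one page and broken on another. The sketch below is only an illustration and not part of the original posts: `hex_to_bytes` and `is_valid_utf8` are hypothetical helper names, while `pack('H*', ...)` and `preg_match` with the `u` modifier are standard PHP. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: decode a hex string (whitespace allowed) into raw bytes.
function hex_to_bytes(string $hex): string
{
    // Drop whitespace so "64 72 e6" and "6472e6" behave the same.
    $hex = preg_replace('/\s+/', '', $hex);
    return pack('H*', $hex);
}

// Hypothetical helper: the "u" modifier makes PCRE reject invalid UTF-8 subjects.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$bytes = hex_to_bytes('64 72 e6 62 65 72');

var_dump(strlen($bytes));        // int(6): one byte per character
var_dump(bin2hex($bytes));       // string(12) "6472e6626572"
var_dump(is_valid_utf8($bytes)); // bool(false): a lone E6 byte is not valid UTF-8
```

The decoding itself is fine in both versions posted so far. The byte E6 is æ in ISO-8859-1, but a browser told to read the page as UTF-8 cannot interpret it and shows � instead.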
asmith Posted December 17, 2009

Thanks for your reply. It is working fine, I guess. But when I have this on my pages:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

it kind of messes it up again. Any idea?
oni-kun Posted December 17, 2009

The hex holds single-byte ISO-8859-1 (Latin-1) text, where æ is the byte E6. A UTF-8 page expects æ as the two bytes C3 A6, so you simply need to convert the result to UTF-8. Try this example:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<?php
// Note: PHP 5.4+ ships a built-in hex2bin(); rename this function on those versions.
define('HEX2BIN_WS', " \t\n\r");

function hex2bin($hex_string) {
    $pos = 0;
    $result = '';
    while ($pos < strlen($hex_string)) {
        if (strpos(HEX2BIN_WS, $hex_string[$pos]) !== FALSE) {
            // Skip whitespace between hex pairs.
            $pos++;
        } else {
            // Decode two hex digits into one byte.
            $code = hexdec(substr($hex_string, $pos, 2));
            $pos = $pos + 2;
            $result .= chr($code);
        }
    }
    // utf8_encode() converts ISO-8859-1 bytes to UTF-8.
    return utf8_encode($result);
}

echo hex2bin('6472e6626572');
?>
</body>
</html>

And it should display correctly. If it doesn't, then send the charset header from the server side as well.
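For reference, here is roughly what that conversion does at the byte level. This is a minimal sketch, assuming the input really is ISO-8859-1; `latin1_to_utf8` is a hypothetical name used only for illustration. Note that `utf8_encode()` is deprecated as of PHP 8.2, and `mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1')` is the usual replacement when the mbstring extension is available.

```php
<?php
// Hypothetical helper: converts an ISO-8859-1 byte string to UTF-8 by hand.
function latin1_to_utf8(string $latin1): string
{
    $utf8 = '';
    $length = strlen($latin1);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($latin1[$i]);
        if ($byte < 0x80) {
            // ASCII bytes are identical in both encodings.
            $utf8 .= chr($byte);
        } else {
            // Code points 0x80-0xFF become a two-byte UTF-8 sequence.
            $utf8 .= chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
        }
    }
    return $utf8;
}

$latin1 = pack('H*', '6472e6626572'); // the bytes from the thread
$utf8 = latin1_to_utf8($latin1);

echo bin2hex($utf8), "\n"; // 6472c3a6626572
```

The single byte E6 becomes the pair C3 A6, which is how UTF-8 encodes æ, so the output matches what a page declared as charset=UTF-8 expects.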
asmith Posted December 17, 2009

Thank you mate!! Thanks a TON!!