
[SOLVED] Decode Unicode String


everisk


The string can be anything, depending on what the user enters. For now I'm testing with Thai characters, but I assume there is a function that decodes %uXXXX Unicode escapes into the actual characters, whatever the language (just like http://www.crypo.com/eng_urld.php does, which is exactly the behaviour I expect). The page is encoded as UTF-8. The result should be the character that corresponds to each Unicode code point. For Thai, the code chart is http://www.unicode.org/charts/PDF/U0E00.pdf.

 

My test string is %u0E19%u0E32, which should output นา

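<?php
header('Content-type: text/html; charset=utf-8');

// Encode a single Unicode code point (an integer) as a UTF-8 byte string.
function unicode2utf8($c)
{
    if($c < 0x80)
    {
        // 1 byte: plain ASCII
        return chr($c);
    }
    else if($c < 0x800)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr( 0xc0 | ($c >> 6) ).chr( 0x80 | ($c & 0x3f) );
    }
    else if($c < 0x10000)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (Thai falls in this range)
        return chr( 0xe0 | ($c >> 12) ).chr( 0x80 | (($c >> 6) & 0x3f) ).chr( 0x80 | ($c & 0x3f) );
    }
    else if($c <= 0x10ffff)
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10FFFF is the highest code point)
        return chr(0xf0 | ($c >> 18)).chr(0x80 | (($c >> 12) & 0x3f)).chr(0x80 | (($c >> 6) & 0x3f)).chr(0x80 | ($c & 0x3f));
    }
    // Not a valid Unicode code point
    return false;
}

$string = '%u0E19%u0E32';

// The /e modifier was removed in PHP 7, so use a callback instead.
// A %u escape is always exactly four hex digits, so match {4} rather than +.
$string = preg_replace_callback('#%u([0-9a-f]{4})#i', function ($m) {
    return unicode2utf8(hexdec($m[1]));
}, $string);

echo $string;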
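
To see what the three-byte branch does, it may help to work through the first code point of the test string by hand. The snippet below is only an illustrative sketch; the variable names $byte1, $byte2 and $byte3 are made up for this example and are not part of the original function.

```php
<?php
// Worked example for the three-byte branch, using %u0E19 from the test string.
$c = 0x0E19; // น

$byte1 = 0xE0 | ($c >> 12);          // 1110xxxx: top 4 bits of the code point
$byte2 = 0x80 | (($c >> 6) & 0x3F);  // 10xxxxxx: middle 6 bits
$byte3 = 0x80 | ($c & 0x3F);         // 10xxxxxx: low 6 bits

printf("%02X %02X %02X\n", $byte1, $byte2, $byte3); // prints "E0 B8 99"
```

E0 B8 99 is the UTF-8 encoding of น, so the browser shows the Thai character as long as the page is served as UTF-8.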

 

The unicode2utf8() function came from: http://php.net/manual/en/function.unicode-encode.php#79829
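
If the mbstring extension is available, a shorter route might be to treat each run of %uXXXX escapes as big-endian UTF-16 and let mb_convert_encoding() do the conversion. This is only a sketch under some assumptions: the function name decode_percent_u() is hypothetical, and it assumes the input uses the same %uXXXX form as JavaScript's escape(), where characters above U+FFFF arrive as two escapes (a surrogate pair).

```php
<?php
// Hypothetical helper, not from the original post. Requires the mbstring extension.
function decode_percent_u($input)
{
    // Match runs of consecutive %uXXXX escapes so surrogate pairs stay together.
    $result = preg_replace_callback(
        '/(?:%u[0-9a-fA-F]{4})+/',
        function ($m) {
            // Drop the "%u" prefixes, leaving the hex digits of UTF-16BE code units.
            $hex = str_replace('%u', '', $m[0]);
            // Turn the hex digits into raw bytes, then convert UTF-16BE to UTF-8.
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $input
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result === null ? $input : $result;
}

echo decode_percent_u('%u0E19%u0E32'); // the test string from the post
```

Text outside the escapes is left untouched, so a mixed string such as 'a%u0E19b' keeps its ASCII letters as they are.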

