How to convert unknown string format to UTF-8


Sardach

I have a problem with converting strings to UTF-8.

iconv and the encoding-detection functions are not working properly.

I am currently using a 'home made' function (listed below), but I still get errors like this:

query: INSERT INTO `product_import` (`title`, `id_source`, `id_product_type`) VALUES (?, ?, ?)
exception: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xA9 FunC...' for column 'title' at row 1
record: Array ( [0] => Dreamfall: The Longest Journey PL (PC) � FunCom [1] => 10 [2] => 30 ) 
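
The rejected byte \xA9 is not valid UTF-8 on its own; in ISO-8859-1, Windows-1252 and Windows-1250 it is the copyright sign. One way to avoid the 1366 error might be to check each value before the INSERT and convert only the values that are not already valid UTF-8. A minimal sketch, assuming the legacy data is Windows-1250 (CP1250); the function name toUtf8 and that default are assumptions, not something confirmed in this thread:

```php
<?php
// Hypothetical helper; the name and the CP1250 default are assumptions.
function toUtf8(string $s, string $legacy = 'CP1250'): string
{
    // The /u modifier makes PCRE reject invalid UTF-8, so an empty
    // pattern doubles as a validity check.
    if (preg_match('//u', $s) === 1) {
        return $s; // already valid UTF-8 (pure ASCII included)
    }
    // Reinterpret the raw bytes as the assumed legacy encoding.
    $converted = @iconv($legacy, 'UTF-8', $s);
    if ($converted === false) {
        throw new RuntimeException('Cannot convert from ' . $legacy);
    }
    return $converted;
}
```

Calling toUtf8() on the title before binding it to the query would turn the lone \xA9 into the two-byte UTF-8 copyright sign. A legacy string can, rarely, happen to be valid UTF-8 by accident, so this check is a heuristic rather than real detection.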

 

If someone knows how to build a conversion table that covers all characters, I would be very glad.
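
Rather than typing such a table by hand, it could be generated by asking iconv to convert every byte from 0x80 to 0xFF. A rough sketch, assuming a single legacy encoding (CP1250 here, which is a guess) and using the hypothetical name buildByteTable; it should only be applied to strings that are not already UTF-8:

```php
<?php
// Hypothetical helper: builds a byte => UTF-8 map for one single-byte encoding.
function buildByteTable(string $legacy = 'CP1250'): array
{
    $table = [];
    for ($byte = 0x80; $byte <= 0xFF; $byte++) {
        // Bytes the encoding leaves undefined make iconv fail; skip those.
        $utf8 = @iconv($legacy, 'UTF-8', chr($byte));
        if ($utf8 !== false && $utf8 !== '') {
            $table[chr($byte)] = $utf8;
        }
    }
    return $table;
}

$table = buildByteTable();
// strtr() scans the input once, so bytes produced by one replacement
// are never replaced again.
$fixed = strtr("\xA9 FunCom", $table);
```

The same call with 'ISO-8859-2' would give a different table, so the source encoding still has to be known or guessed for each data source.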

 

Home made function:

function fixMixedEncoding($string){
$fixArray=array(
//ó
pack("C",0xF3)=>'ó',
pack("CC",0xC3,0xB3)=>'ó',

pack("C",0xBF)=>'ż',
pack("CC",0xC5,0xBC)=>'ż',
);

$encodingArray=array(
//Ą
'Ą' => pack("CC",0xC4,0x84),
'Ą' => pack("CC",0xC4,0x84),
pack("C",0xA5)=> pack("CC",0xC4,0x84),
pack("C",0xA1)=> pack("CC",0xC4,0x84),

//ą
'±'=>pack("CC",0xC4,0x85),
'ą'=>pack("CC",0xC4,0x85),
'±'=>pack("CC",0xC4,0x85),
'ą'=>pack("CC",0xC4,0x85),
pack("C",0xB9)=>pack("CC",0xC4,0x85),
pack("C",0xB1)=>pack("CC",0xC4,0x85),

//Ć
'Ć'=>pack("CC",0xC4,0x86),
'Ć'=>pack("CC",0xC4,0x86),
pack("C",0xC6)=>pack("CC",0xC4,0x86),

//ć
'æ'=>pack("CC",0xC4,0x87),
'ć'=>pack("CC",0xC4,0x87),
'æ'=>pack("CC",0xC4,0x87),
'ć'=>pack("CC",0xC4,0x87),
pack("C",0xE6)=>pack("CC",0xC4,0x87),

//Ę
'Ę'=>pack("CC",0xC4,0x98),
'Ę'=>pack("CC",0xC4,0x98),
pack("C",0xCA)=>pack("CC",0xC4,0x98),

//ę
'ê'=>pack("CC",0xC4,0x99),
'ę'=>pack("CC",0xC4,0x99),
'ê'=>pack("CC",0xC4,0x99),
'ę'=>pack("CC",0xC4,0x99),
pack("C",0xEA)=>pack("CC",0xC4,0x99),

//Ł
'Ł'=>pack("CC",0xC5,0x81),
'Ł'=>pack("CC",0xC5,0x81),
pack("C",0xA3)=>pack("CC",0xC5,0x81),

//ł
'³'=>pack("CC",0xC5,0x82),
'ł'=>pack("CC",0xC5,0x82),
'³'=>pack("CC",0xC5,0x82),
'ł'=>pack("CC",0xC5,0x82),
pack("C",0xB3)=>pack("CC",0xC5,0x82),

//Ń
'Ń'=>pack("CC",0xC5,0x83),
'Ń'=>pack("CC",0xC5,0x83),
pack("C",0xD1)=>pack("CC",0xC5,0x83),

//ń
'ñ'=>pack("CC",0xC5,0x84),
'ń'=>pack("CC",0xC5,0x84),
'ñ'=>pack("CC",0xC5,0x84),
'ń'=>pack("CC",0xC5,0x84),
pack("C",0xF1)=>pack("CC",0xC5,0x84),

//Ó
'Ó'=>pack("CC",0xC3,0x93),
'Ó'=>pack("CC",0xC3,0x93),
'Ó'=>pack("CC",0xC3,0x93),
'Ó'=>pack("CC",0xC3,0x93),
pack("C",0xD3)=>pack("CC",0xC3,0x93),

//ó
'ó'=>pack("CC",0xC3,0xB3),
'ó'=>pack("CC",0xC3,0xB3),
'ó'=>pack("CC",0xC3,0xB3),
'ó'=>pack("CC",0xC3,0xB3),
pack("C",0xF3)=>pack("CC",0xC3,0xB3),

//Ś
'Ś'=>pack("CC",0xC5,0x9A),
'Ś'=>pack("CC",0xC5,0x9A),
pack("C",0x8C)=>pack("CC",0xC5,0x9A),
pack("C",0xA6)=>pack("CC",0xC5,0x9A),

//ś
'¶'=>pack("CC",0xC5,0x9B),
'¦'=>pack("CC",0xC5,0x9B),
'ś'=>pack("CC",0xC5,0x9B),
'¶'=>pack("CC",0xC5,0x9B),
'¦'=>pack("CC",0xC5,0x9B),
'ś'=>pack("CC",0xC5,0x9B),
pack("C",0x9C)=>pack("CC",0xC5,0x9B),
pack("C",0xB6)=>pack("CC",0xC5,0x9B),

//Ź
'Ź'=>pack("CC",0xC5,0xB9),
'Ź'=>pack("CC",0xC5,0xB9),
pack("C",0x8F)=>pack("CC",0xC5,0xB9),
pack("C",0xAC)=>pack("CC",0xC5,0xB9),

//ź
'¼'=>pack("CC",0xC5,0xBA),
'ź'=>pack("CC",0xC5,0xBA),
'¼'=>pack("CC",0xC5,0xBA),
'ź'=>pack("CC",0xC5,0xBA),
pack("C",0x9F)=>pack("CC",0xC5,0xBA),
pack("C",0xBC)=>pack("CC",0xC5,0xBA),

//Ż
'Ż'=>pack("CC",0xC5,0xBB),
'Ż'=>pack("CC",0xC5,0xBB),
pack("C",0xAF)=>pack("CC",0xC5,0xBB),

//ż
'¿'=>pack("CC",0xC5,0xBC),
'ż'=>pack("CC",0xC5,0xBC),
'¿'=>pack("CC",0xC5,0xBC),
'ż'=>pack("CC",0xC5,0xBC),
pack("C",0xBF)=>pack("CC",0xC5,0xBC),

);
$string=str_replace(array_keys($fixArray),array_values($fixArray),$string);
return str_replace(array_keys($encodingArray),array_values($encodingArray),$string);
}
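
One thing that may explain some of the remaining errors: str_replace() with arrays applies the pairs one after another, each on the result of the previous one, so a byte written by an earlier pair can be rewritten by a later pair. A small demonstration with two byte pairs like those in the function above (0xF3 for ó and 0xB3 for ł), compared with strtr(), which works in a single pass:

```php
<?php
$pairs = [
    "\xF3" => "\xC3\xB3", // legacy ó to UTF-8 ó
    "\xB3" => "\xC5\x82", // legacy ł to UTF-8 ł
];
$input = "\xF3"; // a lone legacy ó

// Sequential: 0xF3 becomes C3 B3, then the new B3 is turned into C5 82.
$sequential = str_replace(array_keys($pairs), array_values($pairs), $input);

// Single pass: each input byte is translated exactly once.
$singlePass = strtr($input, $pairs);

echo bin2hex($sequential), "\n"; // c3c582 (no longer valid UTF-8)
echo bin2hex($singlePass), "\n"; // c3b3 (valid ó)
```

If that is what happens here, swapping the two str_replace() calls for a single strtr() over one merged table might be enough to stop already-converted characters from being damaged.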
