Yorick
Posted December 21, 2010

Hello,

I've got a huge database that is filled with text. It is encoded in UTF-8, and some of the symbols used (like emoticons) are code points in the Unicode Private Use Area (http://www.fileformat.info/info/unicode/block/private_use_area/utf8test.htm). Now I want to replace those private use code points with the corresponding smilies and so on.

So my actual question is: how do I replace specific UTF-8 sequences with something else in PHP?

Thanks in advance!

Link to comment https://forums.phpfreaks.com/topic/222331-replacing-utf8-codes-in-private-use-area/
requinix
Posted December 21, 2010

UTF-8 is just an encoding. Behind it are actual bytes of data.

Don't rely on utf8_encode() for this: it converts ISO-8859-1 text to UTF-8 one byte at a time, so it can't turn a code point like U+E8B9 into its UTF-8 sequence. Can't test where I am, but U+E8B9 should be the three bytes 0xEE 0xA2 0xB9.

Get the byte encoding of each character, if you don't have that already, and do a binary-safe search-and-replace for each emoticon. If you want to do it in PHP:

    $text = str_replace("\xEE\xA2\xB9", ":)", $text);
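For anyone with more than one emoticon to replace, here is a minimal sketch of the same idea: work out the UTF-8 bytes for each private use code point by hand, then replace them all in one binary-safe pass. The helper names codepoint_to_utf8() and replace_private_use() are made up for illustration, the U+E8BA entry is a hypothetical second emoticon, and the scalar type hints assume PHP 7 or later. It does not validate code points (surrogates, values above U+10FFFF), so treat it as a starting point rather than a finished routine.

```php
<?php
// Sketch only: function names and the second map entry are hypothetical.

/** Encode one Unicode code point as a UTF-8 byte string (no validation). */
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        // The whole Private Use Area (U+E000 to U+F8FF) lands in this 3-byte branch.
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

/** Replace every mapped code point in $text with its replacement string. */
function replace_private_use(string $text, array $emoticons): string
{
    $map = [];
    foreach ($emoticons as $cp => $replacement) {
        $map[codepoint_to_utf8($cp)] = $replacement;
    }
    // strtr() with an array compares raw bytes and does not rescan its own output.
    return strtr($text, $map);
}

$emoticons = [
    0xE8B9 => ':)',  // the code point discussed in this thread
    0xE8BA => ':(',  // hypothetical second emoticon
];

echo bin2hex(codepoint_to_utf8(0xE8B9)), "\n";                        // eea2b9
echo replace_private_use("Hello \xEE\xA2\xB9 world", $emoticons), "\n"; // Hello :) world
```

Matching on raw bytes is safe here because a UTF-8 lead byte such as 0xEE can never appear in the middle of another character, so a three-byte match is always aligned to a character boundary as long as the stored text is valid UTF-8.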
Yorick (Author)
Posted December 21, 2010

Great! That worked, thank you very much for the quick reply!