filoaman Posted September 12, 2013

I have a UTF-8 string and I'm trying to replace some of the UTF-8 characters with equivalent "plain Latin characters" at certain positions in the string. In this test I try to replace the first "î" with "i". I found that the position of the character I want to replace is 2, so I execute substr_replace, but I get a strange result. Here is the code:

$str = "Thîs îs ã ütf8 strîng";
// try to replace the first "î" (position #2)
$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"

Any ideas? Thanks in advance.
cataiin Posted September 12, 2013

http://ca3.php.net/manual/en/function.iconv.php
filoaman Posted September 12, 2013 (Author)

Quote: http://ca3.php.net/manual/en/function.iconv.php

Thank you for your answer. I read the material, but I really can't see how it solves my problem. The material is about the iconv function, which will "Convert string to requested character encoding". In my case I don't want to convert the encoding, I just want to replace a certain character of a string. Do I need to convert the encoding?
cataiin Posted September 12, 2013 (edited)

<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
    'i' => array('î'),
    'a' => array('ã'),
    'u' => array('ü')
);
foreach ($replace as $changed => $initial) {
    $text = str_replace($initial, $changed, $text);
}
echo $text;
?>

iconv example:

<?php
$text = "Thîs îs ã ütf8 strîng";
echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text);
?>

But this will remove diacritics, and is not what you want. Use the first code, sorry.

Edited September 12, 2013 by cataiin
filoaman Posted September 12, 2013 (Author)

Quote:

<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
    'i' => array('î'),
    'a' => array('ã'),
    'u' => array('ü')
);
foreach ($replace as $changed => $initial) {
    $text = str_replace($initial, $changed, $text);
}
echo $text;
?>

This will replace ALL the characters. I only want to replace CERTAIN characters at certain positions in the string, not all of them.
kicken Posted September 12, 2013

Quote:

$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"
Any ideas?

$str = substr_replace($str, "i", 2, 2);

If you read about the UTF-8 encoding, you'll notice that a single character can be stored as anywhere from 1 to 4 bytes. In the case of î, it uses 2 bytes, so you need to use a length of 2 in your substr_replace. With a length of 1, only the first byte of î is replaced and the orphaned second byte shows up as �.
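Editorial aside: the byte-versus-character point in the reply above can be checked directly in PHP. The following is only a minimal sketch. It assumes the script file is saved as UTF-8, that the accented letters are stored in precomposed form (one code point each), and that the mbstring extension is loaded.

```php
<?php
// Illustrative sketch: compare byte length and character length in UTF-8.
// Assumes this file is saved as UTF-8 and mbstring is available.
$str = "Thîs îs ã ütf8 strîng";

echo strlen($str), "\n";             // number of bytes
echo mb_strlen($str, "UTF-8"), "\n"; // number of characters

// strlen() on a single character gives the width needed by substr_replace().
foreach (["i", "î", "ã", "ü"] as $ch) {
    printf("%s: %d byte(s), hex %s\n", $ch, strlen($ch), bin2hex($ch));
}

// Replace the whole "î" at byte offset 2, whatever its byte width is.
$fixed = substr_replace($str, "i", 2, strlen("î"));
echo $fixed, "\n";
```

Using strlen() on the character being replaced avoids hard-coding the length of 2, which would only be right for 2-byte characters. Note that byte offsets after the first replacement shift, because the string becomes one byte shorter.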
filoaman Posted September 13, 2013 (Author, Solution)

Yes, this does the job! I searched for an online source to check the byte length of all the UTF-8 characters I want to replace (thank God all of them are 2 bytes, so I don't have to write a different routine for every character), and the problem is solved. Thanks, kicken.
requinix Posted September 13, 2013

mb_substr can worry about the byte encoding for you. There is no mb_substr_replace(), though.

$str = mb_substr($str, 0, 2, "UTF-8") . "i" . mb_substr($str, 3, null, "UTF-8");
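Editorial aside: the mb_substr approach above could be wrapped in a small helper so that positions are given in characters rather than bytes. This is a sketch, not a built-in: the function name utf8_substr_replace and its parameters are hypothetical, it assumes non-negative $start and $length, and it needs the mbstring extension. Passing null as the length to mb_substr relies on a reasonably recent PHP version.

```php
<?php
// Hypothetical helper (not part of PHP): works like substr_replace(),
// but $start and $length count characters instead of bytes.
// Simplification: only non-negative $start and $length are handled.
function utf8_substr_replace($string, $replacement, $start, $length = 1)
{
    // Everything before the replaced range.
    $head = mb_substr($string, 0, $start, "UTF-8");
    // Everything after the replaced range, up to the end of the string.
    $tail = mb_substr($string, $start + $length, null, "UTF-8");
    return $head . $replacement . $tail;
}

$str = "Thîs îs ã ütf8 strîng";

// The first "î" is character index 2; "ã" is character index 8.
$first  = utf8_substr_replace($str, "i", 2);
$second = utf8_substr_replace($str, "a", 8);

echo $first, "\n";
echo $second, "\n";
```

Because the helper counts characters, it keeps working when the characters to replace have different byte widths, and character positions do not shift after a one-for-one replacement.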