
Replace utf8 character in a certain position of a string


filoaman


I have a UTF-8 string and I'm trying to replace some of the UTF-8 characters with equivalent "plain Latin characters" at certain positions in the string.

In this test I try to replace the first "î" character with "i".
I found that the position in the string of the UTF-8 character I'd like to replace is 2. So I execute substr_replace, but I get a strange result.

Here is the code:
$str="Thîs îs ã ütf8 strîng";
// try to replace the first " î " (position #2)

$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"

Any ideas?

 

Thanks in advance.

 

  On 9/12/2013 at 12:09 PM, cataiin said:

 

Thank you for your answer. I read the material, but I really can't see how it solves my problem.

The material is about the "iconv" function, which will "Convert string to requested character encoding".

In my case I don't want to convert the encoding; I just want to replace a certain character in a string.

Do I need to convert the encoding?

<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
	'i' => array('î'),
	'a' => array('ã'),
	'u' => array('ü')
	);
foreach ($replace as $changed => $initial)
{
	$text = str_replace($initial, $changed, $text);
}
echo $text;
?>
iconv example:

<?php
$text = "Thîs îs ã ütf8 strîng";
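// ASCII//TRANSLIT asks iconv to approximate non-ASCII characters;
// the exact output depends on the iconv implementation and the locale.
echo iconv("UTF-8", "ASCII//TRANSLIT", $text);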
?>
But this will remove the diacritics from the whole string, which is not what you want. Use the first code instead, sorry.
  On 9/12/2013 at 1:08 PM, cataiin said:
<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
	'i' => array('î'),
	'a' => array('ã'),
	'u' => array('ü')
	);
foreach ($replace as $changed => $initial)
{
	$text = str_replace($initial, $changed, $text);
}
echo $text;
?>

 

This will replace ALL the characters. I only want to replace CERTAIN characters at certain positions in the string, not all of them.

  On 9/12/2013 at 11:43 AM, filoaman said:

 

$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"
Any ideas?

 

$str = substr_replace($str, "i", 2, 2);
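If you read about the UTF-8 encoding, you'll notice that a single character can be stored as anywhere from 1 to 4 bytes. substr_replace counts bytes, not characters, so a length of 1 replaces only the first byte of î and leaves its second byte behind as an invalid sequence (the � in your output). In the case of î, it is using 2 bytes, so you need to use a length of 2 in your substr_replace.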
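
For anyone who would rather count in characters than in bytes, here is a minimal sketch of a character-based replacement. The helper name utf8_replace_at is hypothetical (it is not a PHP built-in), and the sketch assumes the mbstring extension is loaded, the source file is saved as UTF-8, and PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper, not part of PHP: replaces $length characters
// starting at character position $start in a UTF-8 string.
// Assumes the mbstring extension is available.
function utf8_replace_at(string $str, string $replacement, int $start, int $length = 1): string
{
    // mb_substr() counts characters, so multibyte characters stay intact.
    $before = mb_substr($str, 0, $start, 'UTF-8');
    $after  = mb_substr($str, $start + $length, null, 'UTF-8');
    return $before . $replacement . $after;
}

$str = "Thîs îs ã ütf8 strîng";
echo utf8_replace_at($str, "i", 2), "\n"; // This îs ã ütf8 strîng
echo utf8_replace_at($str, "a", 8), "\n"; // Thîs îs a ütf8 strîng
```

Because the positions are character positions, the same call works no matter how many bytes the replaced character occupies.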

Yes, this does the job! I searched for an online source to check the byte length of all the UTF-8 characters I'd like to replace (thank God all of them are 2 bytes, so I don't have to write a different routine for every character), and the problem is solved.

Thanks, kicken.
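
As a rough sketch, the byte lengths can also be checked directly in PHP instead of looking them up, since strlen() returns the number of bytes. The helper utf8_byte_offset below is a hypothetical name used for illustration; the sketch assumes the mbstring extension, a source file saved as UTF-8, and PHP 7 or later.

```php
<?php
// strlen() counts bytes, so it reports how wide each character is in UTF-8.
foreach (['î', 'ã', 'ü', '€'] as $char) {
    printf("%s: %d byte(s)\n", $char, strlen($char)); // 2, 2, 2 and 3 bytes
}

// Hypothetical helper: byte offset of the character at position $charPos.
function utf8_byte_offset(string $str, int $charPos): int
{
    // Bytes taken up by everything before the character.
    return strlen(mb_substr($str, 0, $charPos, 'UTF-8'));
}

$str    = "Thîs îs ã ütf8 strîng";
$pos    = 5;                                          // second "î", counted in characters
$offset = utf8_byte_offset($str, $pos);               // 6, because the first "î" takes 2 bytes
$width  = strlen(mb_substr($str, $pos, 1, 'UTF-8'));  // 2

echo substr_replace($str, "i", $offset, $width), "\n"; // Thîs is ã ütf8 strîng
```

Note that byte offsets shift after each replacement that changes the byte count, so when replacing several characters it may be simpler to work from the end of the string backwards.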

