Replace utf8 character in a certain position of a string

filoaman · September 12, 2013

I have a UTF-8 string and I'm trying to replace some of the UTF-8 characters with equivalent "plain Latin characters" at certain positions of the string.

In this test I try to replace the first "î" character with "i".

I found that the position in the string of the UTF-8 character I'd like to replace is 2, so I execute substr_replace, but I get a strange result.

Here is the code:

$str="Thîs îs ã ütf8 strîng";
// try to replace the first " î " (position #2)

$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"

Any ideas?

Thanks in advance.

cataiin · September 12, 2013

http://ca3.php.net/manual/en/function.iconv.php

filoaman · September 12, 2013

Thank you for your answer. I read the material, but I really can't see how it solves my problem.

The material is about the iconv function, which will "Convert string to requested character encoding".

In my case I don't want to convert the encoding, I just want to replace a certain character of a string.

Do I need to convert the encoding?

cataiin · September 12, 2013

<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
	'i' => array('î'),
	'a' => array('ã'),
	'u' => array('ü')
	);
foreach ($replace as $changed => $initial)
{
	$text = str_replace($initial, $changed, $text);
}
echo $text;
?>

iconv example:

<?php
$text = "Thîs îs ã ütf8 strîng";
echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text);
?>

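But this converts the whole string, which is not what you want. (With ISO-8859-1 as the target these characters keep their diacritics, since that charset contains î, ã and ü; a target such as "ASCII//TRANSLIT" is the one that tries to strip them, with results that depend on the platform and locale.) Use the first code, sorry.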

filoaman · September 12, 2013

This will replace ALL characters. I only want to replace CERTAIN characters at certain positions of the string, not all the characters.

kicken · September 12, 2013

$str = substr_replace($str, "i", 2, 2);

If you read about the UTF-8 encoding, you'll notice that a single character can be stored as anywhere from 1 to 4 bytes (the original design allowed up to 6, but the current standard stops at 4). In the case of î, it uses 2 bytes, so you need to use a length of 2 in your substr_replace.
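To make the byte-length point concrete, here is a minimal sketch that inspects the bytes of "î" and compares the two substr_replace calls. It assumes the script file itself is saved as UTF-8 and that the mbstring extension is available for the validity check; the variable names are only illustrative.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$str = "Thîs îs ã ütf8 strîng";

// strlen() counts bytes, not characters.
// "î" is U+00EE, which UTF-8 encodes as the two bytes 0xC3 0xAE.
$bytes = strlen("î");
$hex   = bin2hex("î");

// A length of 1 removes only 0xC3 and leaves a stray 0xAE behind,
// which is why a "�" shows up in the output.
$broken = substr_replace($str, "i", 2, 1);

// Using the character's real byte length removes both bytes.
$fixed = substr_replace($str, "i", 2, $bytes);

echo $bytes, " ", $hex, "\n";
echo $fixed, "\n";
var_dump(mb_check_encoding($broken, "UTF-8")); // the broken string is no longer valid UTF-8
```

Note that the offset 2 is correct here only because "Th" is plain ASCII, so the byte offset and the character position happen to coincide. After the first multi-byte character, the two numbers drift apart.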

filoaman · September 13, 2013

Yes, this does the job! I searched for an online source to check the byte length of all the UTF-8 characters I'd like to replace (thank God all of them are 2 bytes, so I don't have to write a different routine for every character), and the problem is solved.

Thanks, kicken.
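As an alternative to looking up byte lengths in a table, strlen() can report the length of each character at run time, and strpos() can supply the byte offset. The sketch below assumes valid UTF-8 input and a UTF-8 source file; replace_first is a hypothetical helper name, not a built-in PHP function.

```php
<?php
// Hypothetical helper: replace only the first occurrence of $search.
// It works on bytes throughout, so no multi-byte character is ever split.
function replace_first($haystack, $search, $replacement)
{
    if ($search === '') {
        return $haystack;                    // nothing to look for
    }
    $offset = strpos($haystack, $search);    // byte offset, or false if absent
    if ($offset === false) {
        return $haystack;
    }
    // strlen($search) is the byte length of the character being replaced.
    return substr_replace($haystack, $replacement, $offset, strlen($search));
}

$str    = "Thîs îs ã ütf8 strîng";
$result = replace_first($str, "î", "i");
echo $result, "\n";
```

Because the offset and the length both come from byte-oriented functions, the same routine should work whether the character occupies 2, 3 or 4 bytes.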

requinix · September 13, 2013

mb_substr can worry about the byte encoding for you. There is no mb_substr_replace(), though.

$str = mb_substr($str, 0, 2, "UTF-8") . "i" . mb_substr($str, 3, null, "UTF-8");
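The one-liner above can be wrapped in a small function so that positions are given in characters rather than bytes. This is only a sketch: utf8_substr_replace is a hypothetical name (mbstring does not ship such a function), it handles only non-negative $start and $length, and it assumes a PHP version where passing null as the length to mb_substr means "to the end of the string".

```php
<?php
// Hypothetical helper; not part of the mbstring extension.
// $start and $length are counted in characters, not bytes.
function utf8_substr_replace($str, $replacement, $start, $length = 1)
{
    $before = mb_substr($str, 0, $start, "UTF-8");
    $after  = mb_substr($str, $start + $length, null, "UTF-8");
    return $before . $replacement . $after;
}

$str = "Thîs îs ã ütf8 strîng";
$str = utf8_substr_replace($str, "i", 2);   // first "î" is character 2
$str = utf8_substr_replace($str, "u", 10);  // "ü" is character 10
echo $str, "\n";
```

Since each replacement here swaps one character for one character, the character positions of the remaining text do not shift between calls, which would not be true of byte offsets.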
