
Replace utf8 character in a certain position of a string



I have a UTF-8 string and I'm trying to replace some of the UTF-8 characters with equivalent "plain Latin characters" at certain positions in the string.

In this test I try to replace the first "î" character with "i".
I found that the position in the string of the UTF-8 character I'd like to replace is 2. So I execute substr_replace, but I get a strange result.

Here is the code:
$str="Thîs îs ã ütf8 strîng";
// try to replace the first " î " (position #2)

$str = substr_replace($str, "i", 2, 1);
// I get this: "Thi�s îs ã ütf8 strîng"

Any ideas?

 

Thanks in advance.

 

 

Thank you for your answer. I read the material, but I really can't see how it solves my problem.

The material is about the "iconv" function, which will "Convert string to requested character encoding".

In my case I don't want to convert the encoding; I just want to replace a certain character of a string.

Do I need to convert the encoding?

<?php
$text = "Thîs îs ã ütf8 strîng";
$replace = array(
	'i' => array('î'),
	'a' => array('ã'),
	'u' => array('ü')
	);
foreach ($replace as $changed => $initial)
{
	$text = str_replace($initial, $changed, $text);
}
echo $text;
?>
iconv example:

<?php
$text = "Thîs îs ã ütf8 strîng";
echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text);
?>
But this will remove diacritics, which is not what you want. Use the first code, sorry.
Edited by cataiin

 

This will replace ALL characters. I only want to replace CERTAIN characters at certain positions in the string, not all of them.

 


 

$str = substr_replace($str, "i", 2, 2);
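If you read about the UTF-8 encoding, you'll notice that a single character can be stored as anywhere from 1 to 4 bytes (the original design allowed up to 6, but the current standard stops at 4). substr_replace counts bytes, not characters. In the case of î, the character uses 2 bytes, so you need to use a length of 2 in your substr_replace call; with a length of 1 only the first byte is replaced and the leftover byte shows up as �.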
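If you would rather count characters than bytes, something like the following might work. This is only a sketch: it assumes the mbstring extension is loaded and that the script file itself is saved as UTF-8, and `mb_substr_replace_sketch` is a made-up helper name, not a built-in PHP function.

```php
<?php
// Hypothetical helper (not part of PHP): replace $length characters starting
// at character position $start, counting characters rather than bytes.
// Assumes the mbstring extension is available.
function mb_substr_replace_sketch(string $str, string $replacement, int $start, int $length = 1): string
{
    // Everything before the span to replace, measured in characters.
    $before = mb_substr($str, 0, $start, 'UTF-8');
    // Everything after the span; a null length means "to the end of the string".
    $after = mb_substr($str, $start + $length, null, 'UTF-8');

    return $before . $replacement . $after;
}

$str = "Thîs îs ã ütf8 strîng";

// strlen() counts bytes, mb_strlen() counts characters, so the two differ here.
echo strlen($str), " bytes, ", mb_strlen($str, 'UTF-8'), " characters\n";

// Replace the first "î" (character position 2).
echo mb_substr_replace_sketch($str, "i", 2), "\n";

// Replace the second "î" (character position 5; its byte offset would be 6).
echo mb_substr_replace_sketch($str, "i", 5), "\n";
```

The point of counting characters is that byte offsets drift: every 2-byte character before the target pushes its byte offset one further than its character position, so a byte-based call needs a different offset for each later replacement.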

Yes, this does the job! I searched for an online source to check the byte length of all the UTF-8 characters I'd like to replace (thank God all of them are 2 bytes, so I don't have to write a different routine for every character), and the problem is solved.

Thanks, kicken.
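For anyone who wants to check the byte lengths in PHP itself instead of looking them up online, a small sketch along these lines should do it. It assumes the script file is saved as UTF-8, and `utf8_byte_lengths_sketch` is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): map each character to the number of
// bytes it occupies. Assumes the script file is saved as UTF-8.
function utf8_byte_lengths_sketch(array $chars): array
{
    $lengths = [];
    foreach ($chars as $char) {
        // strlen() counts bytes, so for one UTF-8 character it gives its encoded size.
        $lengths[$char] = strlen($char);
    }
    return $lengths;
}

// The accented characters used in this thread, plus a plain ASCII letter for comparison.
foreach (utf8_byte_lengths_sketch(['î', 'ã', 'ü', 'i']) as $char => $bytes) {
    echo $char, ' => ', $bytes, " byte(s)\n";
}
```

The returned number is what you would pass as the length argument of substr_replace for that character.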
