
Hi folks, I would be thankful if someone could clarify the following points. I have been having a hell of a time dealing with Unicode in PHP; in fact, I want to dump PHP and move elsewhere!

1. I am using the str_replace function to replace a string, say: $x = str_replace("a", "è£", $x);. It is supposed to replace every "a" with "è£", but it replaces it with "裠" (probably a Japanese character) instead. The byte values of the characters in the string "è£", as given by the ord function, are 232 and 163. But if I use echo chr(232).chr(163), I get the same "裠". What is wrong here? I am using the utf-8 charset. Experimenting with other charsets did not produce much different results.

 

2. A generic question: how does one handle Unicode in PHP? Is Python or something else better than PHP for handling Unicode extensively?

 

thanks

 

-msr

 


In ISO-8859-1, è is 0xE8 and £ is 0xA3. The ideogram 裠 (which means a short skirt, by the way) has a UTF-8 encoding of 0xE8 0xA3 0xA0. Do you see the connection? è and £ are two thirds of the encoding for 裠.
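The byte relationship described above can be checked directly. This is a minimal sketch, assuming the mbstring extension is available; the variable names are made up for illustration. Note that 0xA0 happens to be the non-breaking space in ISO-8859-1, so a stray one after the pair would complete the sequence, but that is a guess and not something the thread confirms.

```php
<?php
// The three bytes quoted above, written as hex escapes.
$skirt = "\xE8\xA3\xA0";

// The characters è and £ encoded properly as UTF-8 (two bytes each).
$e_grave_pound = "\xC3\xA8\xC2\xA3";

// chr(232) . chr(163) yields the raw single-byte values, not UTF-8.
$raw = chr(232) . chr(163);

echo bin2hex($raw), "\n";            // e8a3
echo bin2hex($e_grave_pound), "\n";  // c3a8c2a3

// The raw pair is an incomplete UTF-8 sequence; adding 0xA0 completes it.
var_dump(mb_check_encoding($raw, 'UTF-8'));           // bool(false)
var_dump(mb_check_encoding($raw . "\xA0", 'UTF-8'));  // bool(true)
var_dump($raw . "\xA0" === $skirt);                   // bool(true)
```

In a page served as UTF-8, the four-byte form is the one that displays as "è£".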

 

Are you sure your replace is working? Are the a's being removed?

What are the surrounding characters?

Is the string you're processing encoded or decoded?

hi frost: yes, I did declare charset utf-8 in the HTML.

 

hi effigy: thanks for your observation. Yes, the replacement is working; the sad part is that it is the only thing that is working, and not the way I intended. (And btw, how did you find the lookup table for the skirt symbol? Probably a Chinese girl's!)

 

I tried to print:

$ss1=str_replace("2*3",chr(232).chr(163),$ss1);

echo $ss1;   // => it outputs the same short skirt!

 

Also, in a series of replacements, PHP is ignoring valid replacements. The case here is:

...

$st=str_replace("$2995;$3016;","¬÷", $st);

$st=str_replace("$2979;$3016;","’", $st);

$st=str_replace("$2970;$3016;","", $st);

$st=str_replace("$2965;$3006;","è£", $st);     // (x)

$ss="2*3";  //some dummy expression to test

$ss1=str_replace("2*3",chr(232).chr(163),$ss);

echo $ss1;

...

 

In this case the statement marked (x) has been ignored, even though there is a valid replacement to make.

 

thanks!

U+88E0

 

What is in your string and how is the string encoded? Keep in mind that some replaces may have an effect on others.
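To illustrate how one replace can affect another, here is a small sketch with a made-up two-rule mapping (not the poster's real data). It compares chained str_replace calls with a single strtr call, which only looks at the original text and never re-scans what it has already replaced.

```php
<?php
// Hypothetical mapping in which one replacement produces another rule's input.
$input = "ab";

// Sequential str_replace calls: the second call also rewrites the "b"
// that the first call just produced.
$step = str_replace("a", "b", $input);    // "bb"
$chained = str_replace("b", "c", $step);  // "cc"

// strtr with an array works on the original text only, trying the longest
// keys first, and does not touch text it has already replaced.
$single_pass = strtr($input, ["a" => "b", "b" => "c"]);  // "bc"

echo $chained, "\n";      // cc
echo $single_pass, "\n";  // bc
```

For a long conversion table, putting every pair into one array for strtr may avoid this kind of interference.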

 

What are you trying to achieve with all of these replacements? Are you cleaning up incorrect data, attempting a character set conversion...?

 

Is "¬÷" the actual replacement you want, or an encoding for another character? If you want the literal characters, you should be passing them through utf8_encode.
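As a sketch of that conversion: utf8_encode turns ISO-8859-1 bytes into UTF-8, but it is deprecated as of PHP 8.2, so the example below uses mb_convert_encoding instead (a swapped-in equivalent, assuming the mbstring extension is loaded). The variable names are illustrative only.

```php
<?php
// Raw ISO-8859-1 bytes for "¬÷" (0xAC 0xF7), as a Latin-1 source file would store them.
$latin1 = "\xAC\xF7";

// Convert to UTF-8 before using the text in a UTF-8 page.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n";  // acf7
echo bin2hex($utf8), "\n";    // c2acc3b7
```

If the PHP source file itself is saved as UTF-8, the literal "¬÷" is already in this four-byte form and should not be converted a second time.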

hi effigy! As you guessed correctly, I am attempting a font conversion. The program is meant to convert Unicode to a given font and vice versa. I looked up the help on utf8_encode, but I felt it is not what I wanted.

 

The long strings of characters (as mentioned, e.g. $2995;$3016;) are the Unicode characters (ளை) that are to be replaced by those extended ASCII symbols. The JavaScript version is working perfectly.
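One way this replacement might be written so that both sides are real UTF-8 strings is sketched below. It assumes that 2995 and 3016 are decimal code points (U+0BB3 and U+0BC8, which together form "ளை"), and it needs mb_chr, available from PHP 7.2 with mbstring. The surrounding "x" and "y" are dummy test text.

```php
<?php
// Build the UTF-8 search string from decimal code points (assumed meaning
// of 2995 and 3016: U+0BB3 and U+0BC8, i.e. "ளை").
$search = mb_chr(2995, 'UTF-8') . mb_chr(3016, 'UTF-8');

// Replacement taken from the thread: the characters "¬÷", written here
// as UTF-8 bytes so the whole string stays in one encoding.
$replace = "\xC2\xAC\xC3\xB7";

// Dummy input: the Tamil pair between two ASCII letters.
$text = "x" . $search . "y";
$converted = str_replace($search, $replace, $text);

echo bin2hex($search), "\n";     // e0aeb3e0af88
echo bin2hex($converted), "\n";  // 78c2acc3b779
```

str_replace compares bytes, so it only finds a match when the search string and the text use the same encoding.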

 

thanks!!

Here's an example:

 

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<pre>
<?php
### Make a UTF-8 string from raw bytes ('C*' packs unsigned bytes).
$left_dbl_quote = pack('C*', 0xE2, 0x80, 0x9C);
$right_dbl_quote = pack('C*', 0xE2, 0x80, 0x9D);
print $utf8_string = $left_dbl_quote . 'quote' . $right_dbl_quote;
print '<br>';
### Replace the curly quotes (U+201C, U+201D) with a plain ASCII quote,
### which ISO-8859-1 can represent.
$iso_8859_1_string = preg_replace('/[\x{201C}-\x{201D}]/u', '"', $utf8_string);
print $iso_8859_1_string;
?>
</pre>

 

You may want to look at this; I found it in the User Notes on php.net.

Did you run the code I posted by itself? What version of PHP are you using?

 

I get the following:

 

“quote”

"quote"

 

If you still have trouble, could you provide a specific example, like mine, which contains the original string and the desired string?
