
A VERY serious problem with functions' compatibility with UTF-8 encoding


dorm


Hello,

 

I'm a new user in this forum from Israel.

First, I would like to apologize for my bad English.

 

And now for my problem, whose solution I couldn't find anywhere, so you're kind of my last hope.

 

I'm writing a system in PHP that uses UTF-8 encoding. Everything is encoded in UTF-8.

 

In order to work with UTF-8 encoded strings, I need to use special functions: the mbstring functions (mbstring stands for Multibyte String), which are designed for UTF-8 and other multibyte encodings.

 

The problem is that there aren't enough mbstring functions for me to work well with UTF-8 encoded strings. Many important functions have no mbstring equivalent.

 

I wrote a list of regular functions, and I need to know whether they work correctly with UTF-8 encoded strings.

 

Here is the list (links to the functions are included):

 

mysql_real_escape_string() http://il2.php.net/manual/en/function.mysql-real-escape-string.php

stripslashes() http://il2.php.net/manual/en/function.stripslashes.php

addslashes() http://il2.php.net/manual/en/function.addslashes.php

strstr() http://il2.php.net/manual/en/function.strstr.php

trim() http://il2.php.net/manual/en/function.trim.php

wordwrap() http://il2.php.net/manual/en/function.wordwrap.php

vsprintf() http://il2.php.net/manual/en/function.vsprintf.php

nl2br() http://il.php.net/manual/en/function.nl2br.php

 

The list above contains only some of the functions that I need to know about.

 

Does anyone know whether the above functions are compatible with UTF-8 encoded strings?

How can I tell which functions are suitable for UTF-8 encoded strings?

If the above functions aren't compatible with UTF-8 encoded strings, what should I use to replace them?

What is the solution?

 

THANK YOU VERY MUCH !!!

Dor.


Do you know what UTF-8 is?

The functions you listed work with regular characters, but I haven't tried them with Greek.

I don't use special characters, so I don't know. Maybe try even one of those functions with your own characters; if it works, I guess everything will work fine.


The functions you listed are fine for UTF-8 in MOST cases.  UTF-8 extends ASCII: every non-ASCII character is encoded as a sequence of bytes that all have the high bit set (0x80 and above), so an ASCII byte never appears inside a multi-byte character.  As long as a function only looks for or modifies standard ASCII characters, you are OK.  Basically, most functions will treat your UTF-8 extended characters as binary data and pass them through untouched.
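
To make that concrete, here is a small sketch, assuming the mbstring extension is loaded. The Hebrew word is written as raw bytes so the example does not depend on how the file itself is saved; all variable names are made up for illustration.

```php
<?php
// "shalom" in Hebrew, written as raw UTF-8 bytes so this file's own encoding does not matter.
$shalom = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";

// Byte-oriented functions count bytes; mbstring counts characters.
$byteLength = strlen($shalom);             // 8 bytes
$charLength = mb_strlen($shalom, 'UTF-8'); // 4 characters

// Check that every byte of the Hebrew letters has the high bit set.
$allHighBit = true;
for ($i = 0; $i < strlen($shalom); $i++) {
    if (ord($shalom[$i]) < 0x80) {
        $allHighBit = false;
    }
}

// stripslashes() only removes the ASCII backslash (0x5C), so the Hebrew bytes pass through.
$slashed   = $shalom . " \\'world\\'";
$unslashed = stripslashes($slashed);

// strstr() compares bytes, which still finds a whole UTF-8 needle correctly.
$haystack = "abc " . $shalom . " def";
$found    = strstr($haystack, $shalom);
```

Here `$unslashed` ends in `'world'` with the Hebrew prefix intact, and `$found` starts at the Hebrew word and runs to the end of the string.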

 

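For example, stripslashes() deals with the character '\', which is standard ASCII, so it is safe.  But calling trim() with a custom character list that contains a character above 0x7f may corrupt your UTF-8, because trim() works on single bytes.  Standard trim() with the default character list is fine.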
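
A rough sketch of that trim() problem, plus one possible workaround using preg_replace() with the /u modifier. The helper name utf8_trim_chars() is made up for this example, it is not a PHP built-in, and the sketch assumes mbstring is loaded for the validity check.

```php
<?php
// "à la carte": the leading "à" is the two bytes C3 A0.
$text = "\xC3\xA0 la carte";
// "é" is the two bytes C3 A9.
$eAcute = "\xC3\xA9";

// trim() treats the character list as single bytes (C3 and A9),
// so it strips the C3 from "à" and leaves a stray A0 behind.
$broken        = trim($text, $eAcute);
$brokenIsValid = mb_check_encoding($broken, 'UTF-8'); // false

// Hypothetical helper: trims whole UTF-8 characters from both ends.
function utf8_trim_chars($text, $chars)
{
    if ($chars === '') {
        return $text;
    }
    // preg_quote() escapes regex metacharacters; /u makes PCRE read pattern and subject as UTF-8.
    $class = '[' . preg_quote($chars, '/') . ']';
    return preg_replace('/^' . $class . '+|' . $class . '+$/u', '', $text);
}

$untouched = utf8_trim_chars($text, $eAcute);              // no "é" at the ends, so unchanged
$trimmed   = utf8_trim_chars("\xC3\xA9t\xC3\xA9", $eAcute); // "été" becomes "t"
```

Note that preg_replace() with /u returns null if the input is not valid UTF-8, so the helper is only meant for strings you already know are valid.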

 

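Functions like mysql_real_escape_string() are binary-safe and escape only ASCII characters, so your UTF-8 passes through untouched.  Just make sure the connection character set is UTF-8 as well (for example with mysql_set_charset()), because the escaping takes the connection's character set into account.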

 

binary safe - mysql_real_escape_string()      http://il2.php.net/manual/en/function.mysql-real-escape-string.php

safe - stripslashes()      http://il2.php.net/manual/en/function.stripslashes.php

? - addslashes()      http://il2.php.net/manual/en/function.addslashes.php

safe - strstr()      http://il2.php.net/manual/en/function.strstr.php

safe when the characters to trim are all ASCII (<= 0x7f) - trim()      http://il2.php.net/manual/en/function.trim.php

safe without $cut, but widths are counted in bytes - wordwrap()      http://il2.php.net/manual/en/function.wordwrap.php

safe - vsprintf()      http://il2.php.net/manual/en/function.vsprintf.php

safe - nl2br()      http://il.php.net/manual/en/function.nl2br.php
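
For wordwrap() in particular, the thing to watch is the fourth argument ($cut). A quick sketch, again assuming mbstring is available for the validity check; the variable names are only for illustration:

```php
<?php
// "shalom" in Hebrew as raw UTF-8 bytes: 4 characters, 8 bytes, no spaces.
$shalom = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";

// With $cut = true, wordwrap() cuts every 3 BYTES, which lands inside a 2-byte character.
$wrapped         = wordwrap($shalom, 3, "\n", true);
$isValidAfterCut = mb_check_encoding($wrapped, 'UTF-8'); // false

// With $cut = false, long words are never split, so the bytes stay intact.
$uncut             = wordwrap($shalom, 3, "\n", false);
$isValidWithoutCut = mb_check_encoding($uncut, 'UTF-8'); // true
```

Without $cut the output stays valid UTF-8, but lines will still wrap earlier than you expect, because the width is measured in bytes and not in characters.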

 

I don't think addslashes makes much sense on UTF8.  I would avoid it if possible.  Whatever you use addslashes for can be replaced with more specific escaping.

 

Note that trim() will only trim ASCII whitespace, and will not trim any UTF-8 characters that are "whitespace" (the no-break space U+00A0, for example).  You'll have to do that yourself if you happen to have some of those.
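
One way to "do that yourself" is a regular expression with the /u modifier. This is only a sketch: utf8_trim() is a made-up name, and the choice of \p{Z} (Unicode separator characters) as the definition of "whitespace" is an assumption you may want to adjust.

```php
<?php
// Hypothetical helper: trims ASCII whitespace and Unicode separators from both ends.
function utf8_trim($text)
{
    // \s matches ASCII whitespace; \p{Z} matches Unicode separators such as U+00A0 and U+3000.
    // With /u, preg_replace() returns null if $text is not valid UTF-8.
    return preg_replace('/^[\s\p{Z}]+|[\s\p{Z}]+$/u', '', $text);
}

// No-break space (C2 A0) + space + "hello" + space + ideographic space (E3 80 80).
$padded = "\xC2\xA0 hello \xE3\x80\x80";

$plainTrim = trim($padded);      // unchanged: the outermost bytes are not ASCII whitespace
$utf8Trim  = utf8_trim($padded); // "hello"
```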

