
[SOLVED] Unicode Problems


Spaceman-Spiff


I'm working on a new website that uses three pieces of software: SMF 1.1, Drupal 6, and formFIELDS (a form generation/submission tool, referred to as FF from here on). All of the pages use the UTF-8 charset.

 

When handling Unicode text, in this case Japanese, SMF and Drupal (and phpMyAdmin) behave the same way. When I post 日本語 on SMF or Drupal, it appears identical in phpMyAdmin. However, when I post it through FF, it shows up in phpMyAdmin as garbled characters instead of "日本語" (though it always appears fine when viewed in FF). And when I edit it manually in phpMyAdmin, FF shows it as ??????.
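One possible explanation for both symptoms, assuming FF's database connection defaults to latin1 (which MySQL treats as Windows-1252) while the page submits UTF-8 bytes. That assumption is not confirmed by FF's code. The sketch below uses mb_convert_encoding from PHP's mbstring extension to imitate the two conversions.

```php
<?php
// Assumption: the connection treats text as latin1 (Windows-1252 in MySQL's
// terms) while the page submits UTF-8 bytes.
$original = '日本語';

// Read each UTF-8 byte as a Windows-1252 character, then re-encode as UTF-8.
// This imitates what a UTF-8 tool such as phpMyAdmin would display.
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

// Going the other way, characters with no latin1 equivalent become "?".
$lossy = mb_convert_encoding($original, 'ISO-8859-1', 'UTF-8');

// Undoing the first step recovers the text, which would explain why FF
// itself still displays its own rows correctly.
$restored = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');

echo $garbled, "\n", $lossy, "\n", $restored, "\n";
```

The first line printed is a nine-character garbled form, the second is ???, and the third is the original text again.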

 

The problem comes when trying to display the fields stored using FF in a page that includes SMF's contents, or the other way around. Example:

- http://opdb.info/chara/nossi.php (plain db query without SMF's SSI.php)

- http://opdb.info/chara/ssi.php (with SMF's SSI.php)

 

The first entry was edited directly in phpMyAdmin, so its behavior is reversed.

 

Things I have tried that didn't work:

- changed the MySQL table collation from latin1 to utf8, among others

- tried the PHP functions utf8_encode and utf8_decode

- tried sending the header with header('content-type: text/html; charset: utf-8');
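
A side note on the last item: HTTP expects the charset parameter in the form charset=utf-8, with an equals sign, so a header written with "charset:" is unlikely to be honored by browsers. A minimal sketch of the corrected call follows. The helper name content_type_header is made up for illustration and is not part of PHP, SMF or FF.

```php
<?php
// Hypothetical helper (not a PHP built-in): formats a Content-Type header line.
function content_type_header(string $charset): string
{
    // The charset parameter is joined with "=", not ":".
    return 'Content-Type: text/html; charset=' . $charset;
}

// Headers can only be sent before any page output.
if (!headers_sent()) {
    header(content_type_header('utf-8'));
}
```

On its own this only tells the browser how to decode the page. It does not change how the text is stored in the database.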

 

I've been trying to solve this problem for the past two weeks. If anyone can shed some light or offer a solution, I would really appreciate it.

 

My last resort will be to dive into FF's code, but I don't want to go there yet. If there's some way to convert this Unicode text, that would be preferable...


If I use those, any unicode input will turn into ???.

 

There's clearly something wrong (or different) in FF. The funny thing is that when there's nothing else in the script, or it isn't integrated with phpMyAdmin/SMF/vBulletin, the Unicode text prints fine. But when integrated with another system, it prints unreadable text.

 

 

I'm at work, so I can't post too much right now, but working with Japanese is a HUGE pain in the ass (I am programming Japanese sites in my current job).

 

Here are a few things you should check. Your problem could be any, all or none of them:

 

1) Your database should be set to utf-8

2) The internal encoding for PHP should be set to utf-8 (use phpinfo() to check the current settings). If it's not utf-8, you need to change your php.ini

3) You should send the following queries to the database before running any others:

 

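// "ujis" is MySQL's name for EUC-JP; on an all-UTF-8 site the matching value would be utf8.
$sql = "SET NAMES ujis; ";
$result = mysql_query($sql);
// NULL tells MySQL to return results without converting their character set.
$sql = "SET character_set_results = NULL;";
$result2 = mysql_query($sql);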

 

4) Make sure that you have the following meta tag in the head of your document:

 

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

 

Those are some things you can start with. As a final note, the above is only applicable if you are creating English websites that have Japanese in them. If you are creating Japanese sites, then you shouldn't use utf-8 but rather a combination of EUC-JP and Shift-JIS. In that case, the above information will be both incorrect and incomplete.
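
For item 2 in the checklist above, the encoding can also be inspected and set at runtime instead of through php.ini. This is a small sketch that assumes the mbstring extension is loaded. It only affects the mb_* string functions for the current request and does not touch the database connection.

```php
<?php
// Report the encoding mbstring currently assumes for multibyte functions.
echo 'Before: ', mb_internal_encoding(), "\n";

// Set it for this request.
mb_internal_encoding('UTF-8');

// With UTF-8 in effect, mb_strlen counts characters while strlen counts bytes.
$text = '日本語';
echo mb_strlen($text), ' characters, ', strlen($text), " bytes\n";
```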

Thank you guys, for all your suggestions.

 

I have now used the following queries after mysql_select_db for formfields:

 

mysql_query("SET NAMES 'utf8'");

mysql_query("SET CHARACTER SET 'utf8'");

 

I have also converted the whole database (not just the tables) to utf8_general_ci collation. I'm not sure which one fixed the problem, but now formfields stores unicode characters correctly.

 

The only problem now is rewriting all the data already stored in formfields, which is still less of a headache than before. I'm going to attempt to write a conversion script of some sort...
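
A rough sketch of what such a conversion could look like, under a big assumption: the old rows now sit in utf8 columns as double-encoded text (UTF-8 bytes that were read as latin1 and converted again). If the columns are still latin1, a different approach is needed. The function name build_repair_sql and the table and column names are placeholders, not anything taken from FF.

```php
<?php
// Hypothetical helper: builds an UPDATE statement that reverses one round of
// latin1 -> utf8 double encoding. Table and column names are placeholders.
function build_repair_sql(string $table, string $column): string
{
    // Allow only plain identifiers so the names can be embedded safely.
    foreach ([$table, $column] as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Unsafe identifier: $name");
        }
    }

    // Convert the text back to its latin1 bytes, drop the character set
    // label with BINARY, then read those bytes as utf8.
    return "UPDATE `$table` SET `$column` = "
         . "CONVERT(BINARY CONVERT(`$column` USING latin1) USING utf8)";
}

echo build_repair_sql('ff_entries', 'value'), "\n";
```

The statement should be tried on a backup copy first and run only once per column. Rows saved after the SET NAMES fix are already correct and would need to be excluded, for example with a WHERE clause on the row id or date.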

 

PS: I can't use EUC-JP or Shift-JIS because the website is a mix of English and Japanese.

EUC-JP and Shift_JIS both display English with no problems. You could do an entirely English website in those encodings if you wanted, although it wouldn't make much sense.

 

Basically the rule of thumb is: if it's a Japanese site, use Japanese encodings. If it's an English site with Japanese in it, use utf-8.

