
How to convert signs and exotic letters in UTF-8 to NCRs?


BagoZonde


Hello freaks ;), I have an issue with converting special characters. This is an example of the string I'm working with:

 

„Quảng Trị”

 

So here we have bdquo and rdquo quotes, plus some exotic letters (from the Vietnamese alphabet).

 

Of course, I'm working with the UTF-8 character set.

 

This is what I need to do with it:

 

1. I'm using a WYSIWYG script for the textarea, so the exotic letters should be displayed as decimal NCRs. I'm saving that field to a MySQL database (5.5):

 

[Engine] => InnoDB

[Version] => 10

[table_collation] => latin2_general_ci

 

2. Then I want to retrieve that string from the database and display it as it is.

 

I tried htmlentities(), html_entity_decode() and mb_encode_numericentity(), but I'm still confused. I can get the bytes of these exotic characters with ord() (for example, the first quote consists of the bytes 226, 128, 158) and then replace them with a decimal NCR or a named entity, but I have neither the time nor the strength to build a whole conversion table. Besides, re-inventing the wheel is not what I need here ;).

 

I found a great site for converting strings; its converter for decimal NCRs works like a charm: http://rishida.net/tools/conversion/

 

I hope my problem is clearly explained and that there's some solution for such things.

Thanks for your interest!
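
For what it's worth, the conversion asked about above does not need a hand-built table. Here is a minimal sketch, assuming the mbstring extension is loaded and PHP 7 or later (for the `\u{...}` escapes). The helper names `toDecimalNcr` and `fromNcr` are made up for illustration and are not from the original post:

```php
<?php
// Hypothetical helpers, not from the original post.

// Replace every non-ASCII code point with a decimal NCR, e.g. "&#8222;".
function toDecimalNcr(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// Turn NCRs (and named entities) back into UTF-8 characters.
function fromNcr(string $html): string
{
    return html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// The sample string from the post, written with escapes so that the
// encoding of this file cannot interfere.
$input = "\u{201E}Qu\u{1EA3}ng Tr\u{1ECB}\u{201D}";
$ncr = toDecimalNcr($input);

echo $ncr, PHP_EOL; // &#8222;Qu&#7843;ng Tr&#7883;&#8221;
echo fromNcr($ncr) === $input ? 'round trip ok' : 'round trip failed', PHP_EOL;
```

Note that `html_entity_decode()` also decodes entities such as `&amp;` and `&lt;`, so `fromNcr` is only a safe inverse for text that contains no other entities. Because the NCR form is pure ASCII, it survives a latin2 column unchanged, which is probably why the online converter appeared to "just work".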


Thanks Christian, your point about collation is well taken. I changed some collations and charsets, but it's still not working properly.

Here are the details about my database after the few operations you suggested:

 

 

CHARACTER SET:
Array
(
   [0] => Array
       (
           [Variable_name] => character_set_client
           [Value] => utf8
       )

   [1] => Array
       (
           [Variable_name] => character_set_connection
           [Value] => utf8
       )

   [2] => Array
       (
           [Variable_name] => character_set_database
           [Value] => utf8
       )

   [3] => Array
       (
           [Variable_name] => character_set_filesystem
           [Value] => binary
       )

   [4] => Array
       (
           [Variable_name] => character_set_results
           [Value] => utf8
       )

   [5] => Array
       (
           [Variable_name] => character_set_server
           [Value] => latin2
       )

   [6] => Array
       (
           [Variable_name] => character_set_system
           [Value] => utf8
       )

   [7] => Array
       (
           [Variable_name] => character_sets_dir
           [Value] => /home/mysql55/share/charsets/
       )

)

COLLATION:
Array
(
   [0] => Array
       (
           [Variable_name] => collation_connection
           [Value] => utf8_general_ci
       )

   [1] => Array
       (
           [Variable_name] => collation_database
           [Value] => utf8_general_ci
       )

   [2] => Array
       (
           [Variable_name] => collation_server
           [Value] => latin2_general_ci
       )

)


TABLE:
[table_collation] => utf8_unicode_ci
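
Reading a dump like this by eye is error-prone, so here is a small sketch that flags every variable whose value is not a UTF-8 charset or collation. The function name `findNonUtf8Variables` is hypothetical, and the input is assumed to have the same row shape as the dumps above:

```php
<?php
// Hypothetical helper: list the charset/collation variables that are not UTF-8.
// $rows is assumed to have the same shape as the SHOW VARIABLES dumps above.
function findNonUtf8Variables(array $rows): array
{
    // These two do not describe the encoding of stored or transferred text.
    $ignored = ['character_set_filesystem', 'character_sets_dir'];
    $mismatches = [];
    foreach ($rows as $row) {
        $name = $row['Variable_name'];
        $value = $row['Value'];
        if (in_array($name, $ignored, true)) {
            continue;
        }
        // Accept "utf8", "utf8mb4", "utf8_general_ci", and so on.
        if (strpos($value, 'utf8') !== 0) {
            $mismatches[$name] = $value;
        }
    }
    return $mismatches;
}

// A few rows copied from the dumps above.
$rows = [
    ['Variable_name' => 'character_set_client', 'Value' => 'utf8'],
    ['Variable_name' => 'character_set_filesystem', 'Value' => 'binary'],
    ['Variable_name' => 'character_set_server', 'Value' => 'latin2'],
    ['Variable_name' => 'collation_database', 'Value' => 'utf8_general_ci'],
    ['Variable_name' => 'collation_server', 'Value' => 'latin2_general_ci'],
];
print_r(findNonUtf8Variables($rows));
```

For the values shown, only the two server-level settings are flagged. Those act as defaults for databases created without an explicit charset, so they are unlikely to be the cause here, given that the connection, database, and table already report utf8.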

 

What I'm using is just a simple submitter, for quick-test purposes only of course:

   // Hypothetical row id: the original value was lost from the post.
   $id=1;

   // !empty() avoids a notice when the form has not been submitted yet.
   if (!empty($_POST['string'])){
       //GET & SAVE
       print 'Input string: ' . $_POST['string'] . '<br />';
       $sql='UPDATE sites SET content=? WHERE id=?';
       $params=array($_POST['string'], $id);
       $result=Database::query($sql, $params);
   }

   //CHECK
   $sql='SELECT * FROM sites WHERE id=?';
   $params=array($id);
   $result=Database::query($sql, $params);
   print 'Output string: ' . $result[0]['content'];

   //WHAT I WANT
   $string='„Quảng Trị”';
   print <<<FORM
   <form action='/iterator' method="POST">
   <textarea name="string">{$string}</textarea>
   <input type="submit" value="Submit">
   </form>
FORM;

 

I hope there's something simple I'm missing. I'm not good at MySQL at all.

Thanks!


Was your data saved to the DB before you set the correct collation? If so, then that data is already corrupted, and you need to fix it. This might be as simple as running a query to update it, but it may also require manual intervention.

I'm afraid I can't tell you exactly how to fix it, as it depends upon the data itself. All new content should be OK, though.

 

PS: I'm assuming you're sending the correct charset in the HTTP headers as well.
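
As a reminder of what that looks like in PHP, here is a minimal sketch. The helper name `htmlContentType` is made up for illustration; `header()` and `headers_sent()` are the standard PHP functions:

```php
<?php
// Hypothetical helper: build the Content-Type header line for an HTML page.
function htmlContentType(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Headers must go out before any output; headers_sent() guards a late call.
if (!headers_sent()) {
    header(htmlContentType());
}
```

If the header cannot be changed, a `<meta charset="utf-8">` tag in the page head is the usual fallback, although the HTTP header takes precedence when both are present.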


Thank you so much!

 

I tried with a new table and it works like a charm!

 

So I found that existing tables need to be converted to the other charset, like this:

 

   $table='sites';
   $charset='utf8';
   $sql='ALTER TABLE ' . $table . ' CONVERT TO CHARACTER SET ' . $charset;

 

Thank you so much for your help!
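
Since table and charset names cannot be bound as `?` parameters, a concatenated statement like the one above is worth guarding. Here is a minimal sketch, assuming only utf8 and utf8mb4 are acceptable targets; the function name `buildConvertSql` is hypothetical:

```php
<?php
// Hypothetical helper: build the ALTER TABLE statement from checked inputs.
function buildConvertSql(string $table, string $charset): string
{
    // Identifiers cannot be bound as ? parameters, so validate them instead.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException('Unexpected table name: ' . $table);
    }
    // Assumption: these are the only target charsets we want to allow.
    $allowed = ['utf8', 'utf8mb4'];
    if (!in_array($charset, $allowed, true)) {
        throw new InvalidArgumentException('Unexpected charset: ' . $charset);
    }
    return 'ALTER TABLE `' . $table . '` CONVERT TO CHARACTER SET ' . $charset;
}

echo buildConvertSql('sites', 'utf8'), PHP_EOL;
// ALTER TABLE `sites` CONVERT TO CHARACTER SET utf8
```

Keep in mind that `CONVERT TO CHARACTER SET` re-encodes the existing column data, so it is worth taking a backup first, and rows that were already stored with broken characters may still need to be repaired separately.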

