Russian chars part II: Moving to UTF8


ctiberg

Hello!

I posted earlier about having problems with Russian characters. I have now decided to move to UTF8, but can't seem to get it to work. My test system contains 3 scripts: an editor (a form), a storer, and a viewer.

I seem to be able to get the stuff into the database in UTF8, but then I can't show it on screen - all I get is garbage. So I'm hoping for some help here, preferably hands-on :)

The editor is just a form, with the following "specials":

[code]
<?php header("Content-type: text/html; charset=utf-8"); ?>
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">
<form name="inputfrm" method="POST" action="lagra_txt.php" accept-charset="utf-8">
[/code]

Despite specifying utf-8 in the accept-charset, I seem to get windows-1252. Why?

On to the storer. Here I've got this:

[code]
// Connect to the DB using mysql_connect and mysql_select_db

  $sql = "SET NAMES 'utf8'";
  mysql_query($sql);

  // The lines below were copied from an article on mysql.com - they check if I got UTF-8
  $test  = $_POST["charset_check"];
  if (bin2hex($test) == "c3a4e284a2c2ae")
    $OK = true;
  elseif (bin2hex($test) == "e499ae")
    $OK = false;
  else
    die("Sorry, I didn't understand the character set of the data you sent!");

  foreach ($_POST as $key => $val)
    {
      if ($key == "charset_check") continue;
      if ($val != "")
        {
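          if (!$OK) $val = iconv("windows-1252", "utf-8", $val);
          // Escape both values so a quote in the text cannot break the query
          $val = mysql_real_escape_string($val);
          $key = mysql_real_escape_string($key);
          $sql = "UPDATE luka_texter SET `Text`='".$val."' WHERE ID='".$key."' AND Sprak='ru'";
          mysql_query($sql) or die(mysql_error());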
        }
    }
[/code]

As I said, this seems to get the stuff into the DB all right, and I think it's in UTF8 in there (at least it looks like junk, which is what UTF8 looks like to me).
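
For reference, the check in the storer can be pulled out into two small functions. This is only a sketch: it assumes the form carries a hidden `charset_check` field whose value is the three characters ä™®, and the names `detect_post_charset` and `to_utf8` are made up for illustration.

```php
<?php
// Hypothetical helper: classify the bytes of the hidden charset_check field.
// The field is assumed to contain the three characters "ä™®".
function detect_post_charset($probe)
{
    $hex = bin2hex($probe);
    if ($hex === 'c3a4e284a2c2ae') {
        return 'utf-8';          // c3a4 = ä, e284a2 = ™, c2ae = ® in UTF-8
    }
    if ($hex === 'e499ae') {
        return 'windows-1252';   // e4 = ä, 99 = ™, ae = ® as single bytes
    }
    return null;                 // unknown encoding
}

// Hypothetical helper: convert a submitted value to UTF-8 only when needed.
function to_utf8($value, $charset)
{
    if ($charset === 'utf-8') {
        return $value;
    }
    return iconv($charset, 'utf-8', $value);
}
```

With these, the loop would call `to_utf8($val, $charset)` once per field and stop early when `detect_post_charset()` returns null.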

The viewer is very simple, like this:

[code]
<?php header("Content-type: text/html; charset=utf-8"); ?>
<html>
<head>
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">
<title>DB-edit</title>
</head>
<body>
<?php
// Connect to the DB using mysql_connect and mysql_select_db

  $sql = "SET NAMES 'utf8'";
  mysql_query($sql);

  $sql = "SELECT ID, Text FROM luka_texter WHERE Sprak='ru'";
  $res = mysql_query($sql) or die(mysql_error());
  while ($rad = mysql_fetch_assoc($res))
    print $rad["ID"]." ".iconv("utf-8", "windows-1252", $rad["Text"])."<br>";
  mysql_free_result($res);
?>
</body></html>
[/code]

The trouble is that some texts are truncated, some characters are replaced by question marks, and so on. So, can anyone point out what I'm doing wrong?
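
One thing that stands out in the viewer: the page is declared as UTF-8, but each row is converted to windows-1252 before it is printed. Windows-1252 has no Cyrillic letters, so that conversion cannot succeed for Russian text; iconv() fails on the first such character, and older PHP versions reportedly returned the string cut off at that point. A minimal sketch of printing the text without the conversion, assuming it already comes back from MySQL as UTF-8 (`render_row` is a made-up name):

```php
<?php
// Hypothetical helper: format one row for a page served as UTF-8.
// The text is assumed to arrive from MySQL already in UTF-8
// (after SET NAMES 'utf8'), so it is escaped for HTML but not re-encoded.
function render_row($id, $text)
{
    return htmlspecialchars($id, ENT_QUOTES, 'UTF-8') . ' '
         . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '<br>';
}

// Example call; this file itself must be saved as UTF-8.
echo render_row('greeting', 'Привет'), "\n";
```

In the viewer loop this would replace the `print` line, for example `print render_row($rad["ID"], $rad["Text"]);`.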

Everything is set to latin1 when I do the above in MyDB Studio, except for character_set_system, which is set to utf8.

The Text column has had its character set changed to UTF8, though, using:

[code]
DROP TABLE IF EXISTS `luka_texter`;
CREATE TABLE `luka_texter` (
  `ID` varchar(50) NOT NULL default '',
  `Sprak` char(2) NOT NULL default '',
  `TEXT` text CHARACTER SET utf8,
  PRIMARY KEY  (`ID`,`Sprak`)
) ENGINE=MyISAM DEFAULT CHARACTER SET=latin1;
[/code]
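
Of the character_set_* variables, the three that matter for a client are character_set_client, character_set_connection and character_set_results; those are the ones SET NAMES changes for the current connection. A small sketch that flags which of them are still not utf8, given the rows of `SHOW VARIABLES LIKE 'character_set%'` as a name => value array (`non_utf8_settings` is a made-up helper, and the mysqli lines in the comment are an assumption about how the rows would be fetched):

```php
<?php
// Hypothetical helper: return the connection-related character set
// variables whose value does not start with "utf8".
function non_utf8_settings(array $vars)
{
    $relevant = array(
        'character_set_client',
        'character_set_connection',
        'character_set_results',
    );
    $bad = array();
    foreach ($relevant as $name) {
        if (isset($vars[$name]) && strpos($vars[$name], 'utf8') !== 0) {
            $bad[$name] = $vars[$name];
        }
    }
    return $bad;
}

// Fetching the rows with mysqli (not run here; $db is an open connection):
//   $db->set_charset('utf8');   // mysqli counterpart of SET NAMES 'utf8'
//   $res = $db->query("SHOW VARIABLES LIKE 'character_set%'");
//   while ($row = $res->fetch_row()) { $vars[$row[0]] = $row[1]; }
```

Run from the same script right after SET NAMES, an empty result would suggest the connection really is utf8; a tool like MyDB Studio opens its own connection, so its values say little about what the PHP script gets.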

The input is from a form (I gave you the form element syntax above), containing some 40-50 text strings that have been translated into Russian from English. I copy them from an Excel sheet one at a time, and then paste them into each form field. Each form field is given a name that is then used as the ID in the MySQL table.

This is of course a very simple example, but I need this to work before I go on to the rest of the site.

This is working for me. Note that I changed the table a little.

[code]
<?php header("Content-type: text/html; charset=utf-8"); ?>
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">
<pre>
<?php
if ($_POST) {
### Show what we received and proceed with database interaction.
print_r($_POST);
### Connect, select, drop/create if needed.
mysql_connect('localhost', 'user', 'password') or die;
mysql_select_db('test') or die (mysql_error());
$table_check = mysql_query('DESC `luka_texter`');
if (mysql_error()) {
mysql_query('
CREATE TABLE `luka_texter` (
`ID` INT NOT NULL AUTO_INCREMENT,
`Sprak` char(2) NOT NULL,
`TEXT` text CHARACTER SET utf8,
PRIMARY KEY  (`ID`,`Sprak`)
) ENGINE=MyISAM DEFAULT CHARACTER SET=latin1;
') or die (mysql_error());
}
### Insert.
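### Escape the submitted text so a quote in it cannot break the query.
$text = mysql_real_escape_string($_POST['utf8_textarea']);
mysql_query("INSERT INTO `luka_texter` (`Sprak`, `TEXT`) VALUES ('ru', '$text')") or die (mysql_error());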
$query = mysql_query('SELECT TEXT FROM `luka_texter`') or die (mysql_error());
while ($row = mysql_fetch_array($query)) {
echo $row['TEXT'], '<br/>';
}
}

### Create some characters from the Cyrillic block...
$characters  = pack('c*', 0xD0, 0x89);
$characters .= pack('c*', 0xD0, 0x8A);
$characters .= pack('c*', 0xD0, 0x8B);
$characters .= pack('c*', 0xD0, 0x8C);
$characters .= pack('c*', 0xD0, 0x8D);
$characters .= pack('c*', 0xD0, 0x8E);
$characters .= pack('c*', 0xD0, 0x8F);
### ...and put them in the form...
?>

<form name="utf8_test" method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" accept-charset="utf-8">
<textarea name="utf8_textarea"><?php echo $characters; ?></textarea>
<input type="submit"/>
</form>

</pre>
[/code]

I had a rave reply in this textbox, until I tried it out on the production server. There, it has the same problems as my own attempts. That is, it gets most of the text right, but some of it is replaced by ?'s. So I'll try to get a response out of our provider, which I guess will prove futile. Sigh.
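
One way to narrow this down before contacting the provider might be to look at the raw bytes at each step: what arrives in $_POST, and what comes back from the SELECT. A literal ? is byte 3f, so if it is already in the data, a lossy conversion happened earlier in the chain; Cyrillic letters in UTF-8 show up as two bytes, the first being d0 or d1. A rough sketch, with `describe_bytes` being a made-up helper:

```php
<?php
// Hypothetical helper: summarize a string as hex bytes, UTF-8 validity,
// and the number of literal question marks it contains.
function describe_bytes($label, $value)
{
    // preg_match() with the /u modifier returns 1 only for valid UTF-8 input.
    $valid = preg_match('//u', $value) === 1;
    return sprintf(
        "%s: %s (%s, %d literal '?')",
        $label,
        bin2hex($value),
        $valid ? 'valid UTF-8' : 'NOT valid UTF-8',
        substr_count($value, '?')
    );
}

// Example: the two bytes d0 9f are the Cyrillic capital letter Pe.
echo describe_bytes('sample', "\xd0\x9f"), "\n";
```

Calling it on the same value in the storer (before the UPDATE) and in the viewer (right after mysql_fetch_assoc) should show at which step the bytes change.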
