utf-8 woes, tried many options


richrock

Hi,

 

I've got an issue with one section of a script which is supposed to return people's names.  The names come from all over the world, so I've opted to use UTF-8, as this seems to be the universal method.

 

Here's what I've done -

 

Set the MySQL database to utf8_bin, as recommended on this forum.

Set the meta tag to a range of options, from utf-8 to ISO-8859-x (can't remember the exact number), latin-1 etc.  No change there.

Set the collation of the MySQL tables to utf8_bin.

Re-imported the CSV file after each MySQL change.

Set 'mysql_set_charset' to utf-8.

 

This has affected everything apart from one result.  The names.

 

I've got a dropdown which displays the correct name, for instance, 'DAHLÉN' with the accented E.

 

The results table (using standard echo of the string) displays 'Dahl�n'. 

 

Using mb_convert_encoding (found in the PHP manual), I get this: 'Dahlãn'

 

Now, putting it succinctly, what the hell is going on here?  I've set pretty much everything I can to UTF-8, yet this 1 name entry decides to show something completely different....

 

Here's the code from that point:

 

<table cellpadding="2" cellspacing="0" border="1" id="resultstable" width="894">
<?php

//echo "<form action=\"".$_SERVER['PHP_SELF']."\" method=\"post\">";

echo "<thead>";
    echo "<tr>";
    echo "<th>Surname</th>";
    echo "<th width=\"75\">Year</th>";
    echo "<th width=\"100\">Instrument Type</th>";
    echo "<th width=\"75\">Country</th>";
    echo "<th width=\"80\">City</th>";
    echo "<th width=\"85\">Auction House</th>";
    echo "<th width=\"60\">Sale Date</th>";
    echo "<th width=\"50\">Lot No.</th>";
    echo "<th width=\"60\">Sale Price</th>";
    echo "</tr>";
    echo "</thead>";
    echo "<tbody>";
//echo "</form>";

while ($rowYBRes = mysql_fetch_array($YBResult)) {
    $YBlname = strtolower($rowYBRes['surname']);
    $YBfstname = strtolower($rowYBRes['firstname']);
    $YBsecname = strtolower($rowYBRes['secondname']);
    $YBthiname = strtolower($rowYBRes['thirdname']);
    $YBfouname = strtolower($rowYBRes['fourthname']);
    $YBfifname = strtolower($rowYBRes['fifthname']);
    $YByear = $rowYBRes['year'];
    $YBinstrument = $rowYBRes['instr_type'];
    $YBcountry = $rowYBRes['country'];
    $YBcity = $rowYBRes['city'];
    $YBlotnum = $rowYBRes['lot_number'];
    $YBhouse = $rowYBRes['auction_house'];
    $YBmonth = $rowYBRes['month'];
    $YBarchive_year = $rowYBRes['archive_year'];
    $YBArchiveDate = $YBmonth." ".$YBarchive_year;
    $YBprice_gbp = $rowYBRes['value_gbp'];
    $YBItemID = $rowYBRes['id'];

    //$YBlname = mb_convert_encoding($YBlname, "utf-8");
    //$YBfstname = mb_convert_encoding($YBfstname, "utf-8");
    //$YBsecname = mb_convert_encoding($YBsecname, "utf-8");
    //$YBthiname = mb_convert_encoding($YBthiname, "utf-8");
    //$YBfouname = mb_convert_encoding($YBfouname, "utf-8");
    //$YBfifname = mb_convert_encoding($YBfifname, "utf-8");


    $fullname = ucfirst($YBlname);
    if ($YBfstname != NULL){
        $fullname .= ", ".ucfirst($YBfstname);
    }
    if ($YBsecname != NULL){
        $fullname .= " ".ucfirst($YBsecname);
    }
    if ($YBthiname != NULL){
        $fullname .= " ".ucfirst($YBthiname);
    }
    if ($YBfouname != NULL){
        $fullname .= " ".ucfirst($YBfouname);
    }
    if ($YBfifname != NULL){
        $fullname .= " ".ucfirst($YBfifname);
    }


    //$YBPrice = $YBprice_gbp;
    //$YBPrice =preg_replace("/[^0-9]/","", $YBPrice);
    //print $string."<br />";
    // echo "<tbody>";
    echo "<form action='yearbook.php' method='get' name='test'>";
        echo "<tr>";
            echo "<td><a href='yearbook.php?optionID=".$YBItemID."&request=detailed' onclick='this.form.submit();'>".$fullname."</a></td>";
            echo "</form>";

 

As you can see I've commented out the mb_convert_encoding, as explained above.  Any thoughts on this would be worth a beer or two, plus less grey hair for me!!!

 

TIA
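
One thing worth checking in the code above, as a guess rather than a confirmed diagnosis: strtolower() and ucfirst() work on single bytes, and before PHP 8 strtolower() followed the current locale. In UTF-8, 'É' is the two bytes C3 89; under a Latin-1 locale the first byte can be lowercased to E3, which leaves an invalid sequence (shown as '�') and would also account for the 'ã' after mb_convert_encoding. The dropdown presumably skips strtolower(), which would explain why it looks fine. A minimal sketch using the mbstring functions instead, where formatFullName() is a made-up helper and not part of the original script:

```php
<?php
// Hypothetical helper (not from the original script): builds
// "Surname, First Second ..." with multibyte-safe case conversion.
// Assumes the input strings are already valid UTF-8 and mbstring is loaded.
function formatFullName(string $surname, array $givenNames = []): string
{
    // MB_CASE_TITLE lowercases the string and capitalises the first
    // letter of each word, treating "É" as one character, not two bytes.
    $full = mb_convert_case($surname, MB_CASE_TITLE, 'UTF-8');

    $given = [];
    foreach ($givenNames as $name) {
        // Skip empty or NULL columns, like the != NULL checks in the post.
        if ($name !== null && $name !== '') {
            $given[] = mb_convert_case($name, MB_CASE_TITLE, 'UTF-8');
        }
    }

    if ($given) {
        $full .= ', ' . implode(' ', $given);
    }
    return $full;
}

echo formatFullName("DAHLÉN", ["ERIK", ""]); // prints "Dahlén, Erik"
```

Note that MB_CASE_TITLE capitalises every word in a field, whereas ucfirst() only touches the first character, so multi-word surnames will come out slightly differently from the original loop.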


I found this little function very useful for fixing strings that are not in UTF-8 but need to be converted:

<?php
// Returns the string as UTF-8. Input that is not already valid UTF-8 is
// assumed to be ISO-8859-1, because that is what utf8_encode() expects.
function fixEncoding($in_str)
{
  $cur_encoding = mb_detect_encoding($in_str);
  if ($cur_encoding == "UTF-8" && mb_check_encoding($in_str, "UTF-8")) {
    return $in_str;
  }
  return utf8_encode($in_str);
} // fixEncoding
?>

Source:

http://uk2.php.net/utf8_encode

 

It might help, I don't know.

Ah yes, getting UTF-8 working is notoriously hard... This is what I remember off the top of my head:

 

Set database to utf-8

Add accept-charset="utf-8" attribute to the submitting form.

Send out header via PHP for every browser that makes sense: header('Content-Type: text/html; charset=utf-8');

Add metatag for IE (this awesome browser apparently decides on what encoding is used AFTER reading the page..) <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

 

And I think you have to send a query that sets the charset before doing any other query on the page.

I just googled it and found what's below, but it doesn't really ring a bell.

mysql_query("SET CHARACTER SET 'utf8'");

mysql_query("SET NAMES 'utf8'");
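
For the connection side, here is a rough sketch of what the two queries above are aiming at. It assumes the newer mysqli extension rather than the mysql_* functions used in this thread, and useUtf8Connection() is a hypothetical helper name. MySQL spells the charset without a hyphen ('utf8', or 'utf8mb4' on MySQL 5.5.3 and later), so a call such as mysql_set_charset('utf-8') would be expected to fail and leave the connection on its default charset.

```php
<?php
// Hypothetical helper: $db is assumed to be an already-open mysqli connection.
function useUtf8Connection(mysqli $db, string $charset = 'utf8mb4'): void
{
    // set_charset() changes the charset for everything sent and received
    // on this connection, so call it before running any other query.
    if (!$db->set_charset($charset)) {
        throw new RuntimeException(
            'Could not set connection charset: ' . $db->error
        );
    }
}
```

On an older server that lacks utf8mb4, passing 'utf8' as the second argument would be the fallback. The header('Content-Type: text/html; charset=utf-8') call from the list above still has to be sent separately, before any output.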

 
