grejon04 Posted November 12, 2007

Hello; I know this problem is so irritating, but it's sadly become my problem too. I did do the required reading before I posted this, though...

It's the damn curly text. I tried the convert_smart_quotes function from the article, and htmlentities (which actually solved the problem for the people who ran the site with the same code before). So, since the problem was solved on another server, could the problem be with the server's default character encoding? (Do servers have this?) Mac OS X server, MySQL database with tables in Latin1 encoding.

There is a generic header file that is used with every page:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en-AU">
    <head>
    <link rel="stylesheet" type="text/css" href="css/new.css" media="screen" title="New CSS" />
    <link rel="stylesheet" type="text/css" href="css/new-printer.css" media="print" title="New Print CSS" />
    <script src="includes/scripts.js" type="text/javascript"></script>
    <title><?php echo $title ?></title>

Notice there is no meta tag specifying

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

as with the other fix. I feel like this may have something to do with it, but I don't want to add it if it isn't necessary. I looked at the site in Firefox, and View -> Character Encoding was set to ISO-8859-1. But I don't know if that means the site is set to that or, as I suspect, just my browser.

Any help with this would be awesome. I know you all have heard this too many times.

j
grejon04 (Author) Posted November 12, 2007

bump
darkfreaks Posted November 12, 2007

Try changing it to UTF-8.
grejon04 (Author) Posted November 12, 2007

Do you mean changing the HTML meta tag content to UTF-8 or the MySQL table character set (will that cause problems)?
effigy Posted November 12, 2007

Smart quotes are found in Windows-1252. What character set do you want or need to run your site in: ISO-8859-1?
grejon04 (Author) Posted November 12, 2007

The priorities are to prevent any translational errors from the database, and to get rid of the curly text. MySQL tables default to latin1, but part of my question is that I don't know whether I can change that, or if I even need to, or if the problem lies in the fact that the HTML meta tags need to be specific. Would just inserting that type into the HTML header meta tags solve the problem?
darkfreaks Posted November 12, 2007

Changing the header to UTF-8 will get rid of the special characters that persist in Latin-1.
effigy Posted November 12, 2007

Smart quotes are not in Latin-1. You need to decide what your site should be running, then handle it properly. If you expect to handle international characters at some point, convert everything to UTF-8 now. If you only work in ISO-8859-1 (Latin-1), then you need an up-front process that will catch the smart quotes and translate them to normal quotes before they are entered into the database.
darkfreaks Posted November 12, 2007

You could do something like

    <?php
    function convert_smart_quotes($string)
    {
        // Windows-1252 bytes for curly single quotes, curly double quotes, and the em dash
        $search = array(chr(145), chr(146), chr(147), chr(148), chr(151));
        $replace = array("'", "'", '"', '"', '-');
        return str_replace($search, $replace, $string);
    }
    ?>
grejon04 (Author) Posted November 12, 2007

Right, I'm familiar with Shiflett's article and such. OK, so this isn't particularly your domain, but where is that character set changed (Apache server or HTML header)? And what about all the database entries that already contain the smart quotes?
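On where the character set is declared: it can be stated in the HTTP Content-Type response header (sent by Apache or by PHP) and in the HTML meta tag, and when both are present browsers follow the HTTP header. The following is only a minimal sketch of sending it from PHP; the helper name content_type_header is hypothetical and not from the thread, and it assumes the shared header file runs before any output is sent.

```php
<?php
// Hypothetical helper (not from the thread): builds the Content-Type
// header line for an HTML page in the given character set.
function content_type_header($charset)
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() only works before any output has been sent, so this belongs
// at the very top of the shared header file.
if (!headers_sent()) {
    header(content_type_header('UTF-8'));
}
```

Keeping the meta tag in agreement with this header avoids the page changing encoding when it is saved to disk and opened without a server.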
grejon04 (Author) Posted November 12, 2007

How would I convert everything to UTF-8?
darkfreaks Posted November 12, 2007

Try this function to convert smart quotes:

    <?php
    function convert_smart_quotes($string)
    {
        // Windows-1252 bytes for curly single quotes, curly double quotes, and the em dash
        $search = array(chr(145), chr(146), chr(147), chr(148), chr(151));
        $replace = array("'", "'", '"', '"', '-');
        return str_replace($search, $replace, $string);
    }
    ?>

Also, the header needs to be UTF-8:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

To be more specific, if you want to update the entries already in the database, you could do something like

    <?php
    // Keep the return value; the function does not modify its argument.
    $message = convert_smart_quotes($_GET['message']);
    ?>
grejon04 (Author) Posted November 12, 2007

OK, the curly text is gone. Awesome (but now there are ?s in black diamonds). I recall seeing previous forum posts about the question marks in the black diamonds. Will the convert function remove those? I found the section of code that pulls the database entries...

    $short_title = htmlspecialchars(substr($row['title'], 0, 55), ENT_QUOTES);
    $title = htmlspecialchars($row['title'], ENT_QUOTES);

Will this do anything? Any other tips on the black question mark thingies before I go hunting?
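The black diamond with a question mark is the replacement character (U+FFFD) that a browser draws for bytes that are not valid in the declared encoding. Once the page is declared as UTF-8, single-byte Windows-1252 quotes such as 0x93 and 0x94 are no longer valid, so they show up this way. A rough way to check what a stored title really contains is to dump its bytes; the function name describe_bytes below is hypothetical, and the validity test relies on preg_match() failing on invalid UTF-8 when the /u modifier is used.

```php
<?php
// Hypothetical diagnostic (not from the thread): reports whether a string
// is valid UTF-8 and shows its bytes in hex.
function describe_bytes($string)
{
    // With the /u modifier, preg_match() returns false for invalid UTF-8.
    if (preg_match('//u', $string) === 1) {
        return 'valid UTF-8: ' . bin2hex($string);
    }
    return 'not UTF-8 (probably single-byte): ' . bin2hex($string);
}

// A title wrapped in Windows-1252 curly double quotes (bytes 0x93 and 0x94).
echo describe_bytes(chr(147) . 'A' . chr(148)), "\n";
```

If the hex shows e2 80 9c and e2 80 9d around the text instead of 93 and 94, the quotes are already stored as UTF-8 and a single-byte replacement will not match them.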
darkfreaks Posted November 12, 2007

No, it won't. If you want to remove the question marks, do this:

    <?php
    $title = str_replace("?", "", $title);
    ?>
grejon04 (Author) Posted November 12, 2007

Wait a second... those aren't regular question marks. They're kind of like the curly text: bad representations of something that's supposed to be there. And what I don't get is, if I leave the page in iso-8859-1, then call convert_smart_quotes($title) on the title as I pull it from the db, which has the curly text in it, why doesn't that pull the quotes off?

    function convert_smart_quotes($string)
    {
        $search = array(chr(145), chr(146), chr(147), chr(148), chr(151));
        $replace = array("'", "'", '"', '"', '-');
        return str_replace($search, $replace, $string);
    }
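One possible reason the function finds nothing: chr(145) through chr(151) are single Windows-1252 bytes, but if the stored text is actually UTF-8, each curly quote is a three-byte sequence and those single bytes never match. The sketch below handles both cases. The name convert_smart_quotes_any and the UTF-8 validity check are assumptions for illustration and are not from the thread or the article.

```php
<?php
// Hypothetical variant of convert_smart_quotes (not from the thread):
// straightens curly quotes whether the input is UTF-8 or Windows-1252.
function convert_smart_quotes_any($string)
{
    // Curly quotes and the em dash as UTF-8 byte sequences.
    $utf8 = array(
        "\xE2\x80\x98" => "'",  // U+2018 left single quote
        "\xE2\x80\x99" => "'",  // U+2019 right single quote
        "\xE2\x80\x9C" => '"',  // U+201C left double quote
        "\xE2\x80\x9D" => '"',  // U+201D right double quote
        "\xE2\x80\x94" => '-',  // U+2014 em dash
    );

    // preg_match() with /u returns false on invalid UTF-8, so a result of 1
    // means the string can safely be treated as UTF-8.
    if (preg_match('//u', $string) === 1) {
        return strtr($string, $utf8);
    }

    // Otherwise assume single-byte Windows-1252, as the original function does.
    $cp1252 = array(
        chr(145) => "'",
        chr(146) => "'",
        chr(147) => '"',
        chr(148) => '"',
        chr(151) => '-',
    );
    return strtr($string, $cp1252);
}

echo convert_smart_quotes_any("\xE2\x80\x9CHi\xE2\x80\x9D"), "\n";
```

The single-byte replacement is applied only when the input is not valid UTF-8, because bytes 0x91 to 0x97 also occur inside legitimate multi-byte UTF-8 characters and replacing them blindly would corrupt that text.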
darkfreaks Posted November 12, 2007

That function just replaces smart quotes with regular quotes.
effigy Posted November 12, 2007

iconv.
darkfreaks Posted November 12, 2007

    <?php
    // assuming '†' is actually UTF8, htmlentities will assume it's iso-8859
    // since we did not specify in the 3rd argument of htmlentities.
    // This generates "&acirc;[bad utf-8 character]"
    // If passed to any libxml, it will generate a fatal error.
    $badUTF8 = htmlentities('†');

    // iconv() can ignore characters which cannot be encoded in the target character set
    $goodUTF8 = iconv("utf-8", "utf-8//IGNORE", $badUTF8);
    ?>
grejon04 (Author) Posted November 12, 2007

Will that work with htmlspecialchars() as well?
grejon04 (Author) Posted November 12, 2007

That function just removes the black question marks. There are characters that are supposed to be there. It just removes the whole character, and still leaves things like â, #039, and &.
darkfreaks Posted November 12, 2007

You need htmlentities instead of htmlspecialchars, as htmlspecialchars does not translate everything into HTML entities, just certain characters.

    <?php
    $short_title = htmlentities(substr($row['title'], 0, 55), ENT_QUOTES);
    $title = htmlentities($row['title'], ENT_QUOTES);
    ?>
grejon04 (Author) Posted November 13, 2007

Actually, it seems like those two functions are what's causing those symbols. What about

    htmlentities($string, ENT_QUOTES, "UTF-8")
    iconv("ISO-8859-1", "UTF-8", $string)

Looks like I'll have to perform these conversions wherever db data comes out. What a pain in the ass.

By the way, thank you for your help. I appreciate it, and I think I've almost got it. Also, what do you think about converting the database tables to UTF-8? Would that cause conversion problems?
darkfreaks Posted November 13, 2007

    <?php
    // outputs all html entities in html form in UTF-8 charset
    htmlentities($string, ENT_QUOTES, "UTF-8");

    // outputs from ISO to UTF-8
    iconv("ISO-8859-1", "UTF-8", $string);
    ?>
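The order of the two calls matters: re-encode the stored bytes to UTF-8 first, then escape for HTML with the same charset named explicitly. The sketch below assumes the latin1 table really holds Windows-1252 bytes (which is where the curly quotes live) and that the page is served as UTF-8. The helper name db_text_to_html is hypothetical, and it uses htmlspecialchars rather than htmlentities because on a UTF-8 page only the markup characters need escaping.

```php
<?php
// Hypothetical helper (not from the thread): prepares a database value
// for output on a UTF-8 page. Assumes $raw holds Windows-1252 bytes.
function db_text_to_html($raw)
{
    // Step 1: re-encode the stored bytes as UTF-8.
    $utf8 = iconv('CP1252', 'UTF-8', $raw);
    if ($utf8 === false) {
        // iconv() returns false on bytes that are undefined in CP1252;
        // fall back to an empty string rather than emit unknown bytes.
        $utf8 = '';
    }

    // Step 2: escape markup characters, naming the charset explicitly.
    return htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');
}

echo db_text_to_html('Tom' . chr(146) . 's <b>title</b>'), "\n";
```

If the stored text turns out to be UTF-8 already, step 1 would double-encode it and produce the stray "â" sequences, so it is worth checking the actual bytes before applying this everywhere.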
grejon04 (Author) Posted November 13, 2007

Getting there. I still have a problem with those a's with umlauts (â ?)
darkfreaks Posted November 13, 2007

    <?php
    /*
     * Function htmlentities which support iso-8859-2
     *
     * @param string
     * @return string
     * @author FanFataL
     */
    function htmlentities_iso88592($string = '')
    {
        $pl_iso = array('ê', 'ó', '±', '¶', '³', '¿', '¼', 'æ', 'ñ', 'Ê', 'Ó', '¡', '¦', '£', '¬', '¯', 'Æ', 'Ñ');
        $entitles = get_html_translation_table(HTML_ENTITIES);
        $entitles = array_diff($entitles, $pl_iso);
        return strtr($string, $entitles);
    }
    ?>