svivian Posted October 31, 2007
I have a text file in UTF-8 that I wish to add to my database. All the tables and fields, and the database as a whole, have been set to utf8. But when I read the text file and add it to the database, accented characters come out as gibberish. Have I missed a step here? Do I need to tell PHP explicitly to read the file as UTF-8 or something?
toplay Posted October 31, 2007
Be careful which editor you open this text file in, since it may automatically convert the encoding (for example, to latin1). Make sure the data in the file is in UTF-8, then write a program to read it and insert it into the database directly. Use "SET NAMES utf8" after opening the MySQL connection and before doing any queries/inserts/updates. When showing UTF-8 data on a web page, you must have the following in the head tags (on every page): <meta http-equiv="content-type" content="text/html;charset=utf-8" /> hth.
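A minimal sketch of the "make sure the data in the file is in UTF-8" step, assuming PHP 7.1 or later with the mbstring extension loaded. The helper name `readUtf8File` is hypothetical and not part of PHP or of anything posted in this thread.

```php
<?php
// Hypothetical helper: returns the file contents if they are valid UTF-8
// (with a leading byte-order mark removed), or null otherwise.
function readUtf8File(string $path): ?string
{
    $data = file_get_contents($path);
    if ($data === false) {
        return null; // file missing or unreadable
    }
    // Some editors prepend a UTF-8 BOM (bytes EF BB BF); drop it so it
    // does not end up in the first inserted value.
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        $data = substr($data, 3);
    }
    // mb_check_encoding() comes from the mbstring extension.
    return mb_check_encoding($data, 'UTF-8') ? $data : null;
}
```

If this returns null for a file you believed was UTF-8, an editor has probably re-saved it in another encoding such as latin1.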
effigy Posted October 31, 2007
Make sure the database knows you're sending UTF-8 as well: Example
svivian Posted October 31, 2007
Quote: "Make sure the data in the file is in UTF-8 and write a program to read it and insert into the database directly. Use "SET NAMES utf8" after opening MySQL connection and before doing queries/inserts/updates."
Thanks, that worked. Does this mean I have to do a "SET NAMES utf8" query in every script that needs to insert/update data? That might get a bit annoying...
fenway Posted November 1, 2007
Quote: "Does this mean I have to do a "SET NAMES utf8" query in every script that needs to insert/update data? That might get a bit annoying..."
Shouldn't be annoying... don't you have a connection "include" file?
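One way such a connection include might look, sketched with the mysqli extension. This is an assumption: the thread does not say which MySQL extension is in use. The file name, host, credentials, and database name are placeholders.

```php
<?php
// db_connect.php (hypothetical include file; all connection values are placeholders)

// Make mysqli throw exceptions instead of failing silently.
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

$db = new mysqli('localhost', 'db_user', 'db_password', 'my_database');

// Declare that this connection sends and expects UTF-8.
// This takes the place of issuing "SET NAMES utf8" by hand in each script.
$db->set_charset('utf8');
```

Every script that includes this file gets a UTF-8 connection, and the charset is set once per connection, not once per query. On MySQL 5.5.3 and later, `utf8mb4` can be passed instead to cover 4-byte characters.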
svivian Posted November 1, 2007
Well yes I do, good point. I always try to minimize queries, and since the majority of pages use selects and no inserts, it's not necessary for every script. But it won't hurt. It does seem a little pointless in any case... why can't PHP/MySQL do this automatically when it detects the db is UTF-8?
fenway Posted November 2, 2007
Quote: "It does seem a little pointless in any case...why can't PHP/MySQL do this automatically when it detects the db is UTF8?"
Because each database/table/field can have a different collation.
svivian Posted November 3, 2007
Quote: "Because each database/table/field can have a different collation."
Sorry, what I meant was: why can't it decide how to input data based on the collation of each field? I.e. if one field is UTF-8, input in UTF-8, and if another field is ISO-8859-1, input the data in ISO-8859-1? Though actually my original question still sorta makes sense: currently, if I have a database with ISO-8859-1 collation, any utf8 fields are input in ISO-8859-1 anyway. So it wouldn't really be any different if a database with utf8 collation input in utf8 by default...
fenway Posted November 5, 2007
Yes, but ISO-8859-1 (latin1) is the default... it's not "detecting" that.
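To illustrate why that default produces gibberish: assuming the connection is left at latin1, the server treats each incoming byte as one latin1 character, so the two UTF-8 bytes of "é" are taken as two separate characters. The sketch below reproduces that misreading in plain PHP (mbstring extension assumed), without a database.

```php
<?php
// "é" encoded as UTF-8 is the two bytes C3 A9.
$utf8 = "\xC3\xA9";

// Misread those bytes as ISO-8859-1 and convert them to UTF-8,
// which is what a latin1 connection feeding a utf8 column amounts to.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Prints c383c2a9: the UTF-8 bytes for "Ã©", the familiar gibberish.
echo bin2hex($misread), "\n";
```

Declaring the connection charset as utf8 tells the server the bytes are already UTF-8, so no such conversion takes place.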
svivian Posted November 6, 2007
OK fair enough. I'll use the solution posted above. Thanks for the help, everyone.