[SOLVED] Writing a UTF-8 text file to a UTF-8 database still outputs garbage...


I have a UTF-8 text file that I want to add to my database. All the tables and fields, and the database as a whole, are set to utf8. But when I read the text file and insert its contents into the database, accented characters come out as gibberish.

Have I missed a step here? Do I need to tell PHP explicitly to read the file as UTF-8 or something?

Be careful which editor you open the text file in, since some editors silently convert the encoding (to latin1, for example).

Make sure the data in the file really is UTF-8, then write a program that reads it and inserts it into the database directly. Run "SET NAMES utf8" after opening the MySQL connection and before doing any queries/inserts/updates.

When showing UTF-8 data on a web page, declare the charset as well, for example with the following inside the head tags (on every page):

<meta http-equiv="content-type" content="text/html;charset=utf-8" />


hth.
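
For anyone wondering where the gibberish comes from, here is a minimal sketch of the effect. It is in Python rather than PHP only because that makes the raw bytes easy to inspect. It assumes the connection was left at latin1 and uses "café" as a made-up sample string; no database is involved, the decode calls just stand in for how the same bytes get interpreted.

```python
# The file content: the text "café" stored as UTF-8 bytes.
# The "é" takes two bytes (0xC3 0xA9), so the whole string is 5 bytes long.
file_bytes = "café".encode("utf-8")

# Assumption: the connection is still declared as latin1, so every byte
# is read as one separate character. The two bytes of "é" become "Ã" and "©".
misread = file_bytes.decode("latin-1")

# Declaring the connection as UTF-8 (the job "SET NAMES utf8" does)
# makes the same bytes come out as the intended text.
correct = file_bytes.decode("utf-8")

print(repr(misread))  # 'cafÃ©'
print(repr(correct))  # 'café'
```

The bytes are identical in both cases; only the declared encoding changes, which is why the file and the tables can both be UTF-8 and the result can still be wrong.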

Thanks, that worked. Does this mean I have to do a "SET NAMES utf8" query in every script that needs to insert/update data? That might get a bit annoying...

Shouldn't be annoying... don't you have a connection "include" file?

Well yes I do, good point. I always try to minimize queries, so given that the majority of pages use select and no inserts it's not necessary for every script. But it won't hurt. It does seem a little pointless in any case...why can't PHP/MySQL do this automatically when it detects the db is UTF8?

Because each database/table/field can have a different character set and collation, so there is no single encoding the connection could safely assume.

Sorry, what I meant was: why can't it decide how to input data based on the collation of each field? i.e. if one field is UTF8, input in UTF8, and if another field is ISO-8859-1, input the data in ISO-8859-1?

Though actually my original question still sorta makes sense. Currently, if I have a database with ISO-8859-1 collation, any utf8 fields are input in ISO-8859-1 anyway. So it wouldn't really be any different if a database with utf8 collation input in utf8 by default...
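
On that last question, a rough model may help. The server cannot look at incoming bytes and know their encoding, so it trusts whatever the connection declares and converts from that to each column's character set. The sketch below is a simplified stand-in for that conversion, written in Python; `store` is a hypothetical helper for illustration, not a MySQL or PHP API.

```python
def store(client_bytes: bytes, connection_charset: str, column_charset: str) -> bytes:
    """Hypothetical model of the server side: interpret the incoming bytes
    using the charset the connection declared, then re-encode the resulting
    text in the column's charset."""
    text = client_bytes.decode(connection_charset)
    return text.encode(column_charset)


# The client sends "é" as UTF-8: two bytes, 0xC3 0xA9.
payload = "é".encode("utf-8")

# Connection declared as latin1, column is UTF-8: each byte is treated as
# its own character and encoded again, so the value is stored double-encoded.
wrong = store(payload, "latin-1", "utf-8")

# Connection declared as UTF-8, column is UTF-8: bytes are stored unchanged.
right = store(payload, "utf-8", "utf-8")

# Connection declared as UTF-8, column is latin1: converted to the single byte 0xE9.
converted = store(payload, "utf-8", "latin-1")

print(wrong, right, converted)
```

So per-field conversion does already happen in this model. What "SET NAMES utf8" supplies is the one thing the server cannot work out for itself: the encoding of what the client is sending.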
