collations: utf-8 or latin1?

HeaDmiLe · May 13, 2009

I see many people prefer latin1 over utf-8 collations. Do you know why?

My problem is that I live in Croatia and we have some "special" characters like č and ć and because of that I have moved to utf-8.

Thank you

fenway · May 13, 2009

latin1 handles the standard ASCII character set... nothing more.

jackpf · May 13, 2009

Is latin1 faster then?

gffg4574fghsDSGDGKJYM · May 13, 2009

I don't think Latin1 (ISO8859-1) is faster than UTF-8.

ISO8859-1 uses only 1 byte per character. It can save space in the database, and strlen() will work fine. It can represent most languages of the Americas, Western Europe and Africa.

You can see the character list here:

http://en.wikipedia.org/wiki/ISO_8859-1

UTF-8 may use more than 1 byte per character when needed (it is variable length). It may take a lot more space in the database than latin1 if you store a lot of Japanese/Chinese. In exchange, you can store almost any language in the same database. strlen() may give wrong results (if you use any special chars) unless you utf8_decode() the string first or use mb_strlen().
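To make the byte-versus-character difference concrete, here is a minimal sketch in Python rather than PHP (the language and the sample strings are assumptions chosen for illustration only). Counting the encoded bytes is analogous to what strlen() reports, while counting the characters of the string is roughly what mb_strlen() reports when given the right encoding.

```python
# Compare character counts with UTF-8 byte counts.
croatian_text = "čćž"   # hypothetical sample: three Croatian letters
japanese_text = "日本"   # hypothetical sample: two Japanese characters

croatian_chars = len(croatian_text)                   # number of characters
croatian_bytes = len(croatian_text.encode("utf-8"))   # number of bytes in UTF-8
japanese_chars = len(japanese_text)
japanese_bytes = len(japanese_text.encode("utf-8"))

print(croatian_chars, croatian_bytes)   # 3 characters, 6 bytes
print(japanese_chars, japanese_bytes)   # 2 characters, 6 bytes
```

A byte-counting function would report 6 for both strings, which is why a multi-byte-aware length function is needed once non-ASCII text is stored as UTF-8.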

The first 128 characters (codes 0 to 127) of both charsets are ASCII and are identical, so if you use htmlentities() or only use those characters, you don't need any special function to convert from one charset to the other. If you don't use any special characters, the two encodings produce the same bytes (and you can't tell them apart).
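The ASCII overlap can be checked directly. This is a small Python sketch (Python and the sample strings are assumptions for illustration, not something from the thread): pure ASCII text encodes to the same bytes in both charsets, while an accented letter does not.

```python
# ASCII-only text is byte-for-byte identical in latin1 and UTF-8.
plain = "Hello, world!"
ascii_is_identical = plain.encode("latin-1") == plain.encode("utf-8")

# A non-ASCII letter is 1 byte in latin1 but 2 bytes in UTF-8.
accented = "é"
accented_latin1 = accented.encode("latin-1")   # b'\xe9'
accented_utf8 = accented.encode("utf-8")       # b'\xc3\xa9'

print(ascii_is_identical)
print(accented_latin1, accented_utf8)
```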

If you are sure you will never need more than the languages of the Americas, Western Europe and Africa (English, French, Spanish, ...), go for ISO8859-1.

For anything else, use UTF-8.

I don't know Croatian, but from what I have read it doesn't seem to fit in ISO8859-1. (Wikipedia)

Quote from fenway: "latin1 handles the standard ASCII character set... nothing more."

I think you are wrong here: it can handle 256 characters, including most of the special characters you see in Latin-script languages like French.
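A quick way to test whether a given text fits a charset is to try encoding it. The sketch below uses Python (an assumption for illustration); `can_encode` is a hypothetical helper, and the sample strings are made up.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

french = "é à ç ù"       # hypothetical sample of French accented letters
croatian = "č ć š ž đ"   # the Croatian letters outside ASCII

french_in_latin1 = can_encode(french, "latin-1")
croatian_in_latin1 = can_encode(croatian, "latin-1")
croatian_in_latin2 = can_encode(croatian, "iso-8859-2")
croatian_in_utf8 = can_encode(croatian, "utf-8")

print(french_in_latin1, croatian_in_latin1, croatian_in_latin2, croatian_in_utf8)
```

The French sample fits in ISO8859-1, while the Croatian letters only fit in ISO8859-2 (latin2) or UTF-8.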

fenway · May 14, 2009

Well, it supports extended ASCII as well... and you'd need to use CHAR_LENGTH() with multi-byte character sets.

HeaDmiLe · May 14, 2009

latin1_swedish_ci supports Croatian, and latin2_croatian_ci does too, but the problem is that I use phpMyAdmin from time to time and it shows "?" instead of "č", for example. That happens on OS X; on Windows there is no problem. I know it's up to phpMyAdmin, but it makes me mad sometimes.
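One common way a "?" ends up in place of a letter is a lossy conversion into a charset that lacks that letter. Whether that is what happens in phpMyAdmin here cannot be told from the thread, so treat this Python sketch (language and sample word are assumptions) only as an illustration of the general effect.

```python
# Converting text to latin1 with replacement turns unsupported letters into "?".
word = "čaj"  # hypothetical sample word starting with "č"

lossy_bytes = word.encode("latin-1", errors="replace")  # "č" has no latin1 code
shown_text = lossy_bytes.decode("latin-1")

print(shown_text)  # ?aj
```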
