Question about collation

2 replies to this topic

#1 Firestorm ZERO

  • New Members
  • Pip
  • Newbie
  • 5 posts

Posted 24 March 2006 - 06:48 PM

Can anyone give a brief rundown of collation? I kinda understand what it is for, but I dunno how it would affect my PHP scripts. I plan on writing my own scripts and would like them to handle multiple languages, so do I have to use utf8_general_ci? Also, are there any pros/cons to a given collation? Like any security risks or such?

#2 wickning1

  • Members
  • PipPipPip
  • Advanced Member
  • 405 posts

Posted 24 March 2006 - 09:34 PM

Which collation you use depends on the languages you plan to support. If you just want to support Western European languages written in the Latin alphabet (English, French, German, Spanish, etc.), latin1_general_ci should be fine. If you want to support languages like Chinese, which use a completely different character set, then yes, utf8 is a good choice.
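To give a feel for what a "_ci" collation changes, here is a rough Python sketch. It is only an approximation of a case- and accent-insensitive comparison, not MySQL's actual algorithm, and the helper name `ci_ai_key` and the sample names are made up for illustration.

```python
import unicodedata


def ci_ai_key(text: str) -> str:
    """Hypothetical helper: build a case- and accent-insensitive sort key.

    This only approximates what a *_ci collation does; it is not
    how MySQL implements it.
    """
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so "A" and "a" compare equal.
    return stripped.casefold()


# Sample data (assumed): "Zo\u00eb" is Zoe with a diaeresis, "\u00e9mile" starts with e-acute.
names = ["Zo\u00eb", "\u00e9mile", "Adam", "zoe", "Emile"]

# Plain code point order: uppercase first, accented letters last.
binary_order = sorted(names)

# Collation-like order: case and accents are ignored when comparing.
collated_order = sorted(names, key=ci_ai_key)

print(binary_order)
print(collated_order)
```

With a binary comparison the accented and lowercase names end up far from their plain counterparts; with the collation-like key they sort next to each other and compare as equal.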

The collation takes care of comparing strings (including sorting them) inside the database, but you still need the appropriate character set support in PHP (set the collation so comparisons work, and use the multi-byte string functions like mb_substr() instead of the byte-oriented ones) and in the client's browser (you can usually assume users have configured their browser to support their native language, but you have to tell the browser which encoding you're sending it).
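The reason the multi-byte functions matter is that in UTF-8 one character can take several bytes, so cutting a string by byte position can split a character in half. A small Python sketch of the same idea (the sample string is an assumption; in PHP the equivalent contrast would be between the byte-oriented and mb_ functions):

```python
# Three CJK characters followed by three ASCII letters.
text = "\u65e5\u672c\u8a9eabc"

encoded = text.encode("utf-8")

char_count = len(text)      # counts characters
byte_count = len(encoded)   # counts bytes; each CJK character here takes 3 bytes

# Cutting at byte 4 lands in the middle of the second character,
# so decoding leaves a replacement character behind.
broken = encoded[:4].decode("utf-8", errors="replace")

# Cutting by character position keeps both characters intact.
safe = text[:2]

print(char_count, byte_count)
print(repr(broken), repr(safe))
```

Byte-oriented truncation is what produces the stray "?" or box characters at the end of shortened titles and excerpts.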

There are several more considerations, but I am not too familiar with them in practice, since I've never done it myself.

#3 Firestorm ZERO

  • New Members
  • Pip
  • Newbie
  • 5 posts

Posted 24 March 2006 - 10:16 PM

I noticed that the WordPress and phpBB databases are set to latin1. I can type in multibyte characters, and they are saved to the database and displayed properly when viewed, because the charset of the HTML page is set to utf-8. But in the database itself, the text is just scrambled letters.
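A likely explanation for the scrambled letters is that the UTF-8 bytes are stored untouched and then shown one byte per character by anything that reads the column as latin1. Here is a small Python sketch of that effect. It assumes Python's "latin-1" codec as a stand-in for MySQL's latin1, which differs in a few byte values, and the sample text is made up.

```python
# Sample text (assumed): "caf" + e-acute, a space, then two CJK characters.
original = "caf\u00e9 \u65e5\u672c"

# What the browser submits when the page charset is utf-8.
utf8_bytes = original.encode("utf-8")

# What a latin1 view of those same bytes looks like: one character per byte.
as_latin1 = utf8_bytes.decode("latin-1")

# Reading the bytes back as UTF-8 restores the text, which is why the
# page still displays correctly even though the stored data looks scrambled.
recovered = as_latin1.encode("latin-1").decode("utf-8")

print(repr(as_latin1))
print(recovered == original)
```

The data survives the round trip, but the database cannot sort, compare, or measure the length of such strings correctly, because it sees each byte as a separate latin1 character.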

I'm wondering more about this because I am planning to program my own CMS (to learn more about PHP + MySQL). I just don't want to have to go back and mess with it later or start from scratch again.

I also did some thinking, and there could be a problem if I use utf-8 in the database. For usernames, for example, there is the ASCII letter 'a', but there are also other Unicode letters that look just like 'a', which would be a problem.
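The look-alike concern is real: for instance, the Cyrillic letter U+0430 and the fullwidth letter U+FF41 both render much like ASCII 'a'. A minimal Python sketch of how they differ and of one possible mitigation follows; the `is_ascii_username` helper is hypothetical, and restricting usernames to ASCII is just one policy among several.

```python
import unicodedata

ascii_a = "a"
cyrillic_a = "\u0430"   # CYRILLIC SMALL LETTER A
fullwidth_a = "\uff41"  # FULLWIDTH LATIN SMALL LETTER A


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in (ascii_a, cyrillic_a, fullwidth_a):
    print(describe(ch))

# NFKC normalization folds compatibility forms such as fullwidth letters
# into their plain equivalents, but it does not merge different scripts.
normalized_fullwidth = unicodedata.normalize("NFKC", fullwidth_a)
normalized_cyrillic = unicodedata.normalize("NFKC", cyrillic_a)


def is_ascii_username(name: str) -> bool:
    """Hypothetical policy: accept only ASCII letters and digits."""
    return name.isascii() and name.isalnum()


print(is_ascii_username("admin"))
print(is_ascii_username("\u0430dmin"))  # starts with the Cyrillic look-alike
```

Normalizing before comparison handles cases like fullwidth letters, while cross-script look-alikes need an explicit rule about which characters a username may contain.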
