
Encryption and UTF-8


Logical1


I have searched and read the posts on this forum (and many others) about this issue, because encryption gets asked about quite often. But I have not been able to find anything directly related, either here or on other forums, hence the questions:

1.  If your users might choose passwords that mix English characters, characters from a non-English language (Turkish, for example) and numbers, or English plus numbers, or Turkish plus numbers, how would that affect the functionality of encryption functions?  Any experience or thoughts?

(I have seen saved passwords in MySQL tables in which each character turns into &# followed by 4 digits and a ; )

2.  Where do I find the algorithm or example code for reasonably secure encryption/decryption functions, especially if UTF-8 is a factor?

Thanks in advance


At the risk of sounding like I am reinventing the wheel, I want to write a script based on the Vigenère encryption concept.  However, since my experience with UTF-8 characters going back and forth between the browser and MySQL has been very surprising so far, I thought I would ask if anyone had any experience or ideas about it.

Specific problem, for example: I set the length of the password column in the MySQL table to 7 characters.  Then I enter some passwords (UTF-8 characters) through the PHP form to test, and none of the passwords work!  I look at the MySQL table directly and see that each character has been converted to a &#2342; type of entity, which means only about one and a half characters (the first 7 stored characters) are really saved!  Surprises like this.

Have you any experience with UTF-8?  I guess any encryption code will have similar concerns to deal with.


You will need some sort of char <-> int translation table, e.g. a=0, b=1, etc. It wouldn't make sense to say 'a'+7 unless you define a numerical value for 'a'. On a computer and with the English language, one would normally use ASCII, but this is obviously not an option for you. You would also need to handle overflows, so that if you shift 'z' two positions forward in the English alphabet you get 'b'. You can use the modulo operator for that.
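The wrap-around can be sketched roughly as follows. The helper name `shiftIndex` is made up for this illustration, and the table is simply the 26 English letters.

```php
<?php
// Hypothetical helper: move $index by $shift positions inside an alphabet
// of $size letters, wrapping around at both ends.
function shiftIndex(int $index, int $shift, int $size): int
{
    // PHP's % can return a negative value for a negative left operand,
    // so add $size and reduce once more to stay in 0..$size-1.
    return (($index + $shift) % $size + $size) % $size;
}

$english = range('a', 'z');                // a=0, b=1, ... z=25
$z = array_search('z', $english, true);    // 25

echo $english[shiftIndex($z, 2, count($english))]; // b
```

The double modulo only matters when shifting backwards (for decryption), where a plain `%` would give a negative index.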

 

I don't see how you can pull this off easily though. You say "English or a non-English language", which literally means every possible language. As I mentioned earlier, 'z'+2='b' in English, but 'z'+2='ø' in Danish and Norwegian for instance.

 

The classical cryptosystems such as Vigenère and Caesar have fixed ranges and domains. If you want to limit it to only English and Turkish then you can just set up a translation table in the form of an array and use that. E.g.:

$alphabet = array(
    'a', 'b', 'c', // etc.
);

So now 0=a, 1=b, 2=c, etc.
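As a rough sketch of how such a table could drive a Vigenère shift: the alphabet below (Turkish lower-case letters, then q, w, x and the digits) and the function name `vigenere` are assumptions made for this example, not a standard ordering. It also assumes the file is saved as UTF-8, that the key is not empty, and that every character of the text and key appears in the table.

```php
<?php
// Hypothetical combined table: Turkish lower-case letters, then q, w, x, then digits.
// preg_split with the /u flag splits on UTF-8 characters rather than bytes.
$alphabet = preg_split('//u', 'abcçdefgğhıijklmnoöprsştuüvyzqwx0123456789', -1, PREG_SPLIT_NO_EMPTY);

function vigenere(string $text, string $key, array $alphabet, bool $decrypt = false): string
{
    $indexOf   = array_flip($alphabet);   // character => position
    $size      = count($alphabet);
    $textChars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $keyChars  = preg_split('//u', $key, -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($textChars as $i => $ch) {
        // The key repeats for as long as the text lasts.
        $k     = $indexOf[$keyChars[$i % count($keyChars)]];
        $shift = $decrypt ? -$k : $k;
        $out  .= $alphabet[(($indexOf[$ch] + $shift) % $size + $size) % $size];
    }
    return $out;
}

$cipher = vigenere('şifre123', 'anahtar', $alphabet);
$plain  = vigenere($cipher, 'anahtar', $alphabet, true);
```

Because the shift happens on table positions and not on bytes, a multi-byte letter such as ş is treated exactly like a or z.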

 

Upon detecting an invalid or undefined character you could either choose to 1) strip it out, 2) ignore it and don't shift it, or 3) throw an error.
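Those three choices could look something like this in a simple fixed shift. The function name `shiftText` and the policy strings 'strip', 'keep' and 'error' are invented for the illustration.

```php
<?php
// Hypothetical Caesar-style shift with a selectable policy for characters
// that are not in the translation table.
function shiftText(string $text, int $shift, array $alphabet, string $onUnknown = 'error'): string
{
    $indexOf = array_flip($alphabet);   // character => position
    $size    = count($alphabet);
    $out     = '';

    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        if (isset($indexOf[$ch])) {
            $out .= $alphabet[(($indexOf[$ch] + $shift) % $size + $size) % $size];
        } elseif ($onUnknown === 'strip') {
            continue;                   // 1) strip it out
        } elseif ($onUnknown === 'keep') {
            $out .= $ch;                // 2) leave it in place, unshifted
        } else {
            // 3) refuse the input
            throw new InvalidArgumentException("Character not in alphabet: $ch");
        }
    }
    return $out;
}

$alphabet = range('a', 'z');
$stripped = shiftText('ab c', 1, $alphabet, 'strip'); // bcd
$kept     = shiftText('ab c', 1, $alphabet, 'keep');  // bc d
```

Note that stripping changes the length of the text, so a stripped password no longer matches what the user typed unless the same stripping is applied every time it is checked.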

 

As for storing it in the database, that's no problem. The only thing you have to ensure is that you use the same character set throughout the entire application. This means your files must be saved in UTF-8 format, your tables and columns must be set to utf8, and when you open a connection to MySQL you will also need to set the connection character set to utf8 (run the query SET NAMES utf8; after connecting).
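With PDO, the connection character set can also be given in the DSN rather than through a separate query. The sketch below only builds the DSN string; the helper name `buildDsn`, the host and the database name are placeholders. It defaults to `utf8mb4`, which is a substitution on my part: MySQL's `utf8` stores at most three bytes per character, while `utf8mb4` covers all of UTF-8.

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that pins the connection charset.
// The "charset" DSN option is honoured by pdo_mysql from PHP 5.3.6 onwards.
function buildDsn(string $host, string $database, string $charset = 'utf8mb4'): string
{
    return "mysql:host={$host};dbname={$database};charset={$charset}";
}

$dsn = buildDsn('localhost', 'example_db');

// Not executed here, since it needs a running server and real credentials:
// $pdo = new PDO($dsn, $user, $password);
```

Pass 'utf8' as the third argument if the tables really are declared as utf8 and you want the connection to match them exactly.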

 

I suspect the reason you get e.g. &#2342; is that you are running the input through htmlentities(). That is not at all necessary for inserting into a database. For that you should use something like mysql_real_escape_string() or prepared statements with MySQLi or PDO. htmlentities() is for escaping in an HTML context, but you're using it (incorrectly) in an SQL context.
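Whatever produced the entity, a quick check shows why it fills a 7-character column on its own. This is only an illustration using the &#2342; value from the thread; the variable names are made up.

```php
<?php
// What ended up in the table: a numeric entity made of seven ASCII characters.
$stored       = '&#2342;';
$storedLength = strlen($stored);        // 7, so it fills the whole column

// Decoding the entity gives back the single character the user typed.
$decoded      = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
$decodedChars = preg_split('//u', $decoded, -1, PREG_SPLIT_NO_EMPTY);
$decodedCount = count($decodedChars);   // 1 character
$decodedBytes = strlen($decoded);       // 3 bytes in UTF-8
```

Stored as real UTF-8 in a utf8 column, the same character counts as one character against the column length instead of seven.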

 

On an entirely different level, encrypting passwords using Vigenère is not at all a safe option. It would be a much better idea to go with a one-way hashing algorithm such as SHA-256, preferably with a per-user salt. Vigenère is simply too easy to crack and shouldn't be used to secure sensitive information in any way.
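For the one-way route, here is a minimal sketch using PHP's built-in password_hash() and password_verify() (PHP 5.5 and later). This swaps in a salted, purpose-built password hash in place of a bare SHA-256 digest; the sample password is invented.

```php
<?php
// The password as typed by the user; UTF-8 input is hashed as a byte string,
// so Turkish letters need no special treatment.
$password = 'şifre123';

// PASSWORD_DEFAULT currently selects bcrypt and generates a random salt,
// which is embedded in the returned string.
$hash = password_hash($password, PASSWORD_DEFAULT);

// At login, compare the submitted password against the stored hash.
$ok  = password_verify('şifre123', $hash);  // true
$bad = password_verify('sifre123', $hash);  // false: s is not ş
```

The stored value is much longer than 7 characters, so the password column has to be widened; the PHP manual suggests 255 characters to leave room for future algorithms.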

