
Unicode - PHP -> MySQL issues...


shlomikalfa


hey there,

 

I'm in desperate need of understanding some issues with Unicode. I'll start by describing what I need to do.

 

I need to send a TCP/IP POST request to a PHP page, which receives it and saves it into a MySQL database.

A. The required info is grabbed, converted to proper HTML entities, then encrypted into a valid TCP/IP payload and sent to the PHP page. [DONE OK.]

B. The PHP page decrypts it and receives the HTML entities. [DONE OK.]

C. The HTML entities are to be saved on the MySQL server as the normal Unicode string. [FAILS HERE!!!]

 

As you can see, I have trouble getting the HTML entities into the MySQL database as text. When I call "html_entity_decode" before sending the content to the MySQL database, it seems to fail and returns the HTML entities as-is, with no changes...

 

How do I convert HTML entities into a proper Unicode string?!

-->> I want to save the content in the database as it was originally entered, before any changes, so that the database will still be searchable!!!

 

THANKS IN ADVANCE!

 


Yes you can, you just need to tell the database that it's UTF-8 you're saving (and the database collation/charset settings should be set to UTF-8, of course):

 

Use this before your queries:

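<?php
// Tell MySQL that this connection sends and expects UTF-8.
mysql_query("SET NAMES 'utf8'", $connection);
// SET CHARACTER SET sets client/results to utf8 but, per the MySQL manual,
// resets the connection charset to the database default.
mysql_query('SET CHARACTER SET utf8', $connection);
?>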


Okay, that solves one part of the issue...

 

The שלום Unicode string can now be saved in the database.

However, how can I transform the HTML entities [i.e. &#1502;&#1497;] into the normal Unicode string in PHP?

 

It seems that the "html_entity_decode" function doesn't give a Unicode string but the HTML entity...


You're right that

 

<?php
$str = '&#1502;';
echo html_entity_decode($str);
?>

outputs an HTML entity (presumably because the character can't be represented in the default charset, ISO-8859-1), but I found this to work:

<?php
$str = '&#1502;';
echo html_entity_decode($str, ENT_COMPAT, 'UTF-8');
?>
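For reference, a minimal sketch of why the third argument matters. It uses only the numeric entities from this thread and passes the charset explicitly, so the result does not depend on the PHP version (before PHP 5.4 the default target charset was ISO-8859-1, which has no Hebrew letters, so the entities are left as they are). The `bin2hex()` call is only there to show the actual bytes, regardless of how a browser renders them.

```php
<?php
// Numeric entities for the Hebrew word used in this thread.
$encoded = '&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;';

// Target charset cannot represent Hebrew: the entities come back unchanged.
$latin1 = html_entity_decode($encoded, ENT_COMPAT, 'ISO-8859-1');

// Target charset is UTF-8: each entity becomes a two-byte UTF-8 sequence.
$utf8 = html_entity_decode($encoded, ENT_COMPAT, 'UTF-8');

echo $latin1, "\n";         // still the entity text
echo bin2hex($utf8), "\n";  // d7a9d79cd795d79ed799d7a7
```

If the hex dump shows the `d7..` pairs, the decoding step is fine and any garbling happens later (display or database connection).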


Well, bah!!!

 

My "Title" column is set to the "utf8_unicode_ci" collation, and after running the following code with:

&#1513;&#1500;&#1493;&#1502;&#1497;&#1511

I get the same thing [&#1513;&#1500;&#1493;&#1502;&#1497;&#1511] in the MySQL database. WHY?!

 

mysql_query("SET NAMES 'utf8'", $link);
mysql_query('SET CHARACTER SET utf8', $link);

...
$preTitle = mysql_real_escape_string($_POST['pTitle']);
$Title = html_entity_decode($preTitle, ENT_COMPAT, "UTF-8");
echo $Title;
...
/// Inserting to table.

 

All I want is for this info, sent to the PHP page:

&#1513;&#1500;&#1493;&#1502;&#1497;&#1511

 

to show up in the MySQL table as:

שלומיק

[the decoded Unicode string it represents]

 

HOW DO I ACHIEVE THAT?!

PLEASE HELP, ANY HELP WILL BE APPRECIATED!!!

 

 


It is working for me :-\

 

<?php

$str = '&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;';

echo html_entity_decode($str, ENT_NOQUOTES, 'UTF-8');

?>

 

outputs

שלומיק

 

while

<?php

$str = '&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;';

echo html_entity_decode($str, ENT_NOQUOTES);

?>

 

outputs

&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;

 

 

Also, since you're already escaping the string, I would use ENT_NOQUOTES as the second parameter of html_entity_decode, so that already-escaped single and double quotes are left untouched. But that has nothing to do with the Hebrew entities.
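To illustrate the point about quotes, here is a small sketch that assumes a raw value containing both a Hebrew entity and an encoded single quote. `decodeForStorage()` and `storeTitle()` are hypothetical helper names, and the `posts`/`title` table and column are made up. `storeTitle()` swaps in mysqli with a bound parameter in place of the thread's `mysql_*` calls; it is shown for shape only and needs a live connection to run.

```php
<?php
// Hypothetical helper: turn entities into UTF-8 text, quotes included.
function decodeForStorage(string $raw): string
{
    return html_entity_decode($raw, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: insert the decoded text through a bound parameter,
// so decoded quote characters never end up inside the SQL text itself.
function storeTitle(mysqli $db, string $raw): bool
{
    $db->set_charset('utf8mb4');  // connection charset; assumes MySQL 5.5.3+
    $stmt = $db->prepare('INSERT INTO posts (title) VALUES (?)'); // made-up table
    $title = decodeForStorage($raw);
    $stmt->bind_param('s', $title);
    return $stmt->execute();
}

$raw = '&#1513;&#39;';
// ENT_NOQUOTES decodes the Hebrew letter but leaves the quote entity alone.
$noQuotes = html_entity_decode($raw, ENT_NOQUOTES, 'UTF-8');
// ENT_QUOTES decodes both, which is only safe before escaping or binding.
$withQuotes = decodeForStorage($raw);
```

The design choice is simply the order: decode first, then escape or bind. Decoding after escaping can reintroduce quote characters that the escaping step never saw.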


Well, since it didn't work for me online [although it does work OK locally!!!], I'm using a different way to transfer the data to the PHP page...

 

Now I can get the Unicode string [שלומיק] to appear correctly in the PHP page [after the send and the conversions], but after I use the following to send it to the MySQL database, it ends up in there as [ùìåîé÷], which is not quite the same :)

 

$sql = 'INSERT INTO Table (
	`Var1`, ...)
	VALUES
	("'.$Title.'", ...)';

mysql_query("SET NAMES 'utf8'", $link);
mysql_query('SET CHARACTER SET utf8', $link);
$result = mysql_query($sql);

 

 

:'(  :'(  :'( Why is there always something that messes things up?!
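One way to narrow this down is to check which bytes PHP actually holds before the INSERT. A minimal sketch, assuming the garbled `ùìåîé÷` comes from Windows-1255 bytes (one byte per Hebrew letter) being stored and then displayed as Latin-1. `isValidUtf8()` is a hypothetical helper built on PCRE's `u` modifier, which rejects subjects that are not valid UTF-8.

```php
<?php
// Hypothetical helper: true when $s is well-formed UTF-8.
function isValidUtf8(string $s): bool
{
    // With the "u" modifier, preg_match() returns false on malformed UTF-8.
    return preg_match('//u', $s) === 1;
}

// The word from the thread as UTF-8 (two bytes per letter).
$utf8Bytes = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9E\xD7\x99\xD7\xA7";
// The same word as Windows-1255 (one byte per letter).
$cp1255Bytes = "\xF9\xEC\xE5\xEE\xE9\xF7";

var_dump(isValidUtf8($utf8Bytes));    // bool(true)
var_dump(isValidUtf8($cp1255Bytes));  // bool(false)
echo bin2hex($cp1255Bytes), "\n";     // f9ece5eee9f7
```

If the check fails for the value about to be inserted, the string is not UTF-8 yet, and telling MySQL `SET NAMES 'utf8'` will not match the bytes being sent.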


Well... I don't get what you mean, sir.

 

When I open phpMyAdmin, click the Edit button, write something in Hebrew in an entry, e.g. שלומיק, and then click Save, it works perfectly...

 

However, when I insert it through the PHP page, it shows up as I mentioned...


Of course I have, it's just that I wasn't that sure about using PHP5...

 

:'(

 

I never thought it'd be that messy!!!! [I hate PHP!]

 

The thing is, all I want is the same behavior as in phpBB3, where you can use whatever you wish as a post subject, e.g. "ффлыдоврфы ששחלדף does this work ?!". I just tried that as a subject and it works just fine, and when you open the table in phpMyAdmin you still find exactly the same "ффлыдоврфы ששחלדף does this work ?!"

 

I've just tried "aasd שלומי ффыввогйн": it was inserted and all should be okay. However, when I look at the entry through phpMyAdmin, "aasd שלומי ?????????" is what I see :'(

 

--> I've just checked, and it seems phpBB3 is running on "PHP Version 4.4.8", so I don't think the PHP version is the only reason my script failed.


Well, sir, after your last comment I went back to UTF-8...

 

It seems you were right; the call:

html_entity_decode($Title, ENT_NOQUOTES, "UTF-8");

failed because I was using the wrong PHP version [which is now changed to PHP5].

 

However, the other issue, how the data shows up on the other side [phpMyAdmin], is still there...
-> i send: "akunhe שלומיק флгтру"

-> which is then converted [before send] to: "&#97;&#107;&#117;&#110;&#104;&#101;&#32;&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;&#32;&#1092;&#1083;&#1075;&#1090;&#1088;&#1091;"

-> Then, once received by the PHP page, it is converted by the above line to:
"akunhe ׳©׳œ׳•׳ž׳™׳§ ׁ„׀»׀³ׁ‚ׁ€ׁƒ" 

-> which is then sent to the MySQL db by:
$sql = 'INSERT INTO Table (
	`Var1`, ...)
	VALUES
	("'.$Title.'", ...)';

mysql_query("SET NAMES utf8");
mysql_query('SET CHARACTER SET utf8');

$result = mysql_query($sql);

 

-> When I look it up in phpMyAdmin, I get:

"akunhe שלומיק флгтру"

 

It's not yet what I'm trying to achieve :'(


Edit: Disregard what was here..

All I can say is that the code you're using is working for me. When you view these characters in your browser, are you sure that you've set the charset to UTF-8 with a meta tag (between <head> and </head> in your HTML)?
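A minimal sketch of what declaring the charset can look like from PHP, assuming the page is generated by a script. `renderUtf8Page()` is a hypothetical helper name; the meta tag is the one mentioned above, and the `header()` call declares the same thing at the HTTP level.

```php
<?php
// Hypothetical helper: wrap text in a page that declares UTF-8 in the markup.
function renderUtf8Page(string $text): string
{
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return "<html><head>\n"
         . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n"
         . "</head><body>{$safe}</body></html>\n";
}

// Declare the charset in the HTTP header too (only possible before any output).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$decoded = html_entity_decode('&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;', ENT_NOQUOTES, 'UTF-8');
$page = renderUtf8Page($decoded);
echo $page;
```

With both declarations in place, the browser should not have to guess the encoding, which takes one variable out of the debugging.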


I'm not viewing these through a web page of mine, I'm viewing them through phpMyAdmin, and as I was saying, when I open the phpBB3 tables and look for the entry "ффлыдоврфы ששחלדף does this work ?!" [for example], it shows just fine.

 

Also, if I use the Edit button, write something in Hebrew/Russian and then click the Go [save] button, it shows the same... but the data that was INSERTED through the PHP page shows up as I described before: "akunhe שלומיק флгтру"

 

Please, if you have any idea for me to try... please help me...


Something must be going wrong when you convert the entities. Have you tried the scripts I've posted standalone?

 

<?php

$str = '&#1513;&#1500;&#1493;&#1502;&#1497;&#1511;';

echo html_entity_decode($str, ENT_NOQUOTES, 'UTF-8');

?>

 

outputs (when browser is set to UTF-8):

שלומיק

 

What's your output?


׳©׳œ׳•׳ž׳™׳§

 

That's my output for the same script...

 

Okay, I get what you're saying... when I change the page's encoding to UTF-8 it appears okay. But how come it's not the same in phpMyAdmin?! Changing the page's encoding there doesn't help; it stays as I described before...


I'm pretty sure your browser is not set to UTF-8, then, because when I change the character encoding to ISO-8859-1, I get characters similar to yours. If you're using Firefox, you can change the charset to UTF-8 under View > Character Encoding > Unicode (UTF-8), if you haven't already set it using the meta tag. Going to bed now, I will check back tomorrow :)

 

Edit: Maybe an idea: try a utf8_encode() on the string before inserting it into the database. Might solve it!
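A note on that last idea: `utf8_encode()` converts ISO-8859-1 to UTF-8, so it only helps when the input really is Latin-1. Below is a minimal sketch of what happens to text that is already UTF-8. `latin1ToUtf8()` is a hypothetical hand-written stand-in that applies the same byte mapping, used here because `utf8_encode()` is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical stand-in for utf8_encode(): treat every byte as a Latin-1 character.
function latin1ToUtf8(string $s): string
{
    $out = '';
    foreach (str_split($s) as $char) {
        $byte = ord($char);
        $out .= $byte < 0x80
            ? $char                                                   // ASCII stays as-is
            : chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));  // two-byte form
    }
    return $out;
}

$shin  = "\xD7\xA9";            // the letter ש, already UTF-8
$twice = latin1ToUtf8($shin);   // each of its two bytes gets re-encoded

echo bin2hex($shin), "\n";      // d7a9
echo bin2hex($twice), "\n";     // c397c2a9
```

So on a string that is already UTF-8, the extra conversion doubles the damage instead of repairing it.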


Well...

 

-> I'm 100% using Unicode (UTF-8) in my browser.

-> I've tried it with and without that "utf8_encode()".

 

I'll be freakin' crying here soon... why can't it just freakin' work... LET ME BE, damn you PHP!!!!

 

By the way, how does the "unicode_decode()" function work?

It doesn't seem to function at all, even though it does exist in the PHP manual...
