I have experience parsing different languages with php :)

 

my best advice is to keep everything from the php script to the page you are parsing in UTF-8

 

for your php file you can do these things:

1. save the file in encoding "utf-8"

2. send headers or add this html to the top of your php script

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>title of page</title></head>

 

3. and if you are running php 5 with the mbstring extension enabled, you can convert strings and check their encoding with the mb_* functions (for example mb_convert_encoding, mb_check_encoding, and mb_detect_encoding), look them up in the php.net manual
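A minimal sketch of the mbstring check-and-convert idea from point 3. The helper name to_utf8 and the ISO-8859-6 fallback are my own assumptions for illustration, not something from the original post; pick whatever legacy encoding your source really uses, and save this file as UTF-8.

```php
<?php
// Assumes the mbstring extension is loaded.

/**
 * Hypothetical helper: return $text as UTF-8. The bytes are converted
 * from $assumedSource only when they are not already valid UTF-8.
 */
function to_utf8(string $text, string $assumedSource = 'ISO-8859-6'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

// Five Arabic letters: 10 bytes in UTF-8, 5 bytes in ISO-8859-6.
$utf8   = "مرحبا";
$legacy = mb_convert_encoding($utf8, 'ISO-8859-6', 'UTF-8');

// The legacy bytes fail the UTF-8 check, so to_utf8() converts them back.
$fixed = to_utf8($legacy);
```

Checking before converting matters: running a string that is already UTF-8 through a second conversion would garble it.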

 

 

FOR THE PAGE YOU ARE PARSING

------------------------------------

1. serve the page that you are parsing or getting strings from in the same encoding, utf-8

with html:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>title of page</title></head>

 

2. if the data you are parsing is from a mysql database, make sure the connection uses utf-8 as well:

//connect to db first, then put this code

$sql = "SET NAMES utf8 COLLATE utf8_bin";

mysql_query($sql) or die(mysql_error());
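The mysql_* functions used above were removed in PHP 7, so on a current install the same step would have to go through mysqli or PDO. Here is a rough sketch with mysqli; the helper name use_utf8 and the connection details in the usage comment are placeholders I made up.

```php
<?php
/**
 * Hypothetical helper: switch an open mysqli connection to the given
 * character set, throwing if the server refuses it.
 */
function use_utf8(mysqli $db, string $charset = 'utf8'): void
{
    if (!$db->set_charset($charset)) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
}

// Usage (host, credentials and database name are placeholders):
// $db = new mysqli('localhost', 'user', 'password', 'catalogue');
// use_utf8($db);
```

Note that set_charset() only picks the character set and leaves the connection collation at that charset's default, so if you specifically want utf8_bin on the connection you would still send the SET NAMES ... COLLATE statement through $db->query(). Also, MySQL's utf8 stores at most 3 bytes per character, which is enough for Arabic; utf8mb4 is the variant that covers all of Unicode.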

 

 

I gave you specific code and examples, but the solution here is to keep everything in the SAME encoding, whichever encoding you want to use. I recommend UTF-8 for Arabic: it is pretty much universal, and it handles right-to-left languages as well.

  • 1 month later...

Hi,

 

I have a database for storing company data for various print catalogues. Now we have a requirement to get Arabic information from companies as well. Hence, the company data table should now contain Arabic text along with the English.

 

To achieve that I first converted the table from latin1 to UTF8 (utf8_general_ci). Then I changed the character set on my web pages containing the update forms to UTF-8.

 

Now, the Arabic data entered using the webpages is displayed fine in the resultant php pages, but when we want to see the same data in mysql (using MySQL Query Browser/SQLyog/EMS MySQL Manager) the data shows up in binary form or as ????.

 

I fail to understand how I can view Arabic text as it is in a mysql query window, so that when I export data it comes out as it is and not in the form of special characters.

 

Have tried searching the net but no solutions have worked so far.

 

Kindly advise.

 

Best regards

 

Hitendra

 

here's your solution:

 

make sure your table is in charset UTF-8

and collation is utf8_bin

 

it is not enough that your database is in that charset encoding, you also have to send the data in that encoding as well, when you query, select, update, or insert data

 

to do this, after your connect() line write:

$sql = "SET NAMES utf8 COLLATE utf8_bin";

mysql_query($sql) or die(mysql_error());

 

this will make all data sent to and from your mysql database use the correct encoding
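One likely reason the data looks fine in your php pages but garbled in other tools: if the connection is still latin1, the UTF-8 bytes from php get treated as one latin1 character each and re-encoded on the way into the utf8 column. Reading back over the same connection undoes that, so the page looks right, but the stored text is double-encoded. The sketch below only simulates this with mbstring; it is not a trace of what your server actually did, and ISO-8859-1 is a stand-in for MySQL's latin1 (which is really closer to cp1252).

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.

$original = "مرحبا"; // 5 Arabic letters, 10 bytes in UTF-8

// Insert over a latin1 connection into a utf8 column (simulated):
// every byte is read as one latin1 character and re-encoded as UTF-8.
$stored = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Select over the same latin1 connection (simulated): the reverse
// conversion hands php its original bytes back, so the page looks fine.
$roundTrip = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

// A tool connecting with utf8 sees $stored itself: 10 accented-looking
// characters (20 bytes) instead of 5 Arabic letters.
$storedChars = mb_strlen($stored, 'UTF-8');
```

If that is what happened, data inserted before adding SET NAMES stays double-encoded in the table and needs to be repaired or re-entered.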

 

 

also, your php script must be saved in the correct encoding and the html page must be forced to display the data in the right charset as well

1. save your source_code (the php script) in utf-8 encoding (dreamweaver, notepad can do this)

2. set your headers to utf-8 to force the browser(IE, mozilla) to display the output of your script in utf-8

example: <head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> </head>
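The meta tag covers the html side. If you would rather send a real HTTP header from php, a minimal sketch looks like this; the helper name content_type_header is made up for the example.

```php
<?php
/**
 * Hypothetical helper: build the Content-Type header line for an
 * html page in the given charset.
 */
function content_type_header(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() only works before any output has been sent to the browser,
// so guard the call with headers_sent().
if (!headers_sent()) {
    header(content_type_header());
}
```

Put this at the very top of the script, before any echo or html.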

 

 

 

so that's 4 things:

1. mysql database or specific table must be in utf-8 charset, and collation of utf8_bin

2. the connection to your database must be in that charset with the SET NAMES code

3. your php script must be in that encoding

4. force the browser to display it in that encoding

 

 

the pattern, as you see here, is that every piece of data is being displayed and inserted in the same encoding

 

 

I use the utf8 charset with utf8_bin collation; utf8 supports many, many languages, including Hebrew, Arabic, and Persian

it's pretty much universal
