
Site in 5 languages, problem converting special characters to ASCII entities


mindsoul


I have developed a website in 5 languages and I have to build a content management system to control the content.

I wanted to build a function that replaces special characters with their ASCII codes (numeric entities such as &#232;), so that everything in my DB is clean and I don't get a surprise when the text is published. This is part of the list of chars I need to replace. I tried various functions to do this, but none of them worked for all 5 languages I use (en/de/it/hu/ro):

[ è ]   &#232;    e grave
[ é ]   &#233;    e acute
[ Á ]   &#193;    A acute
[ À ]   &#192;    A grave
[ á ]   &#225;    a acute
[ à ]   &#224;    a grave
[ ì ]   &#236;    i grave
[ í ]   &#237;    i acute
[ ò ]   &#242;    o grave
[ ó ]   &#243;    o acute
[ ő ]   &#337;    o double acute (Hungarian)
[ Ó ]   &#211;    O acute
[ ù ]   &#249;    u grave
[ ú ]   &#250;    u acute
[ ű ]   &#369;    u double acute (Hungarian)
[ ü ]   &#252;    u uml
[ ë ]   &#235;    e uml
[ ö ]   &#246;    o uml
[ Ö ]   &#214;    O uml
[ Ü ]   &#220;    U uml
[ ä ]   &#228;    a uml
[ Ä ]   &#196;    A uml
[ ß ]   &#223;    sharp s (eszett)
[ - ]   &#8722;   minus sign
[ ~ ]   &#8764;   tilde sign
[ " ]   &#34;     quot, quotation mark
[ < ]   &#60;     less than
[ > ]   &#62;     greater than
[ ´ ]   &#180;    acute
[ ' ]   &#180;    acute

Does somebody know a method to solve this without problems?

 

Thanks in advance. I've been searching for the answer to this problem for quite a while. It was easy with 3 languages (it/de/en), but when I added the East European languages (hu/ro) it all became a real headache.

I tried these functions:

function htmlnumericentities($str){
  return preg_replace('/[^!-%\x27-;=?-~ ]/e', '"&#".ord("$0").chr(59)', $str);
}

function numericentitieshtml($str){
  return utf8_encode(preg_replace('/&#(\d+);/e', 'chr(str_replace(";","",str_replace("&#","","$0")))', $str));
}

but they still don't recognize the char

&#238;

î
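A note on why the regex version above probably fails: `ord()` returns the value of a single byte, while a UTF-8 char such as ő is stored as two bytes, so the code it produces is not 337. A minimal sketch of an alternative, assuming the text is valid UTF-8 and the mbstring extension is loaded; the function name `to_numeric_entities` is made up for this example:

```php
<?php
// Hypothetical helper: converts every non-ASCII character to a decimal
// numeric entity. Assumes $text is valid UTF-8 and mbstring is available.
function to_numeric_entities(string $text): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // 0x80..0x10FFFF covers everything outside plain ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// "\u{151}" is the letter ő, written as an escape so the example does not
// depend on the encoding of the source file (PHP 7+ syntax).
echo to_numeric_entities("kezd\u{151}dik"), "\n"; // prints kezd&#337;dik
```

Because plain ASCII is left untouched, HTML tags in the text stay as they are and only the letters get converted.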

You could place them into the database as the code, and when retrieving them convert them back. IE:

 

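<?php
// Map each placeholder to the entity code that is stored in the DB.
$charArr = array("char" => "&#238;");

$input = "char and than there was char";

// Encode before saving.
foreach ($charArr as $char => $val) {
    $input = str_replace($char, $val, $input);
}

// place input into DB

// now grab input from DB
$output = "&#238; and than there was &#238;";

// Decode after loading: swap each code back to its original char.
foreach ($charArr as $char => $val) {
    $output = str_replace($val, $char, $output);
}

print $output;
?>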

 

I may be missing the point, but I think it would work?

The problem is that I already have all the chars in the DB converted to ASCII.

So I want something that converts every special character (letter) to ASCII &#...;

I've made this function that replaces only what I need:

function formatare($text) {
    $text = str_replace('è', '&#232;', $text); // e grave
    $text = str_replace('é', '&#233;', $text); // e acute
    $text = str_replace('Á', '&#193;', $text); // A acute
    $text = str_replace('À', '&#192;', $text); // A grave
    $text = str_replace('á', '&#225;', $text); // a acute
    $text = str_replace('à', '&#224;', $text); // a grave
    $text = str_replace('ì', '&#236;', $text); // i grave
    $text = str_replace('í', '&#237;', $text); // i acute
    $text = str_replace('ò', '&#242;', $text); // o grave
    $text = str_replace('ó', '&#243;', $text); // o acute
    $text = str_replace('Ó', '&#211;', $text); // O acute
    $text = str_replace('ù', '&#249;', $text); // u grave
    $text = str_replace('ú', '&#250;', $text); // u acute
    $text = str_replace('ë', '&#235;', $text); // e uml
    $text = str_replace('ö', '&#246;', $text); // o uml
    $text = str_replace('Ö', '&#214;', $text); // O uml
    $text = str_replace('ü', '&#252;', $text); // u uml
    $text = str_replace('Ü', '&#220;', $text); // U uml
    $text = str_replace('ä', '&#228;', $text); // a uml
    $text = str_replace('Ä', '&#196;', $text); // A uml
    $text = str_replace('ß', '&#223;', $text); // ss zed
    //$text = str_replace('-', '&#8722;', $text); // minus sign
    $text = str_replace('~', '&#8764;', $text); // tilde sign
    $text = str_replace("´", "&#180;", $text); // acute
    $text = str_replace("'", "&#180;", $text); // acute
    return $text;
}
			   

But there are still some chars, like &#337; (ő), that are not recognized. If I add this char to my function like this:

$text=str_replace("ő'","&#337;",$text);// acute//

this char is not seen by my function and is not replaced. Instead, all the "a" chars get replaced, which is something I do not want.

I thought that the problem might be my charset declaration:

<meta http-equiv="Content-Type" content="text/xhtml; charset=utf-8"/>

I tried different declarations, like the East European iso-8859-2, but nothing changed. I don't really understand it; maybe PHP doesn't recognize this type of char.

This is the DB text in Hungarian:
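One way to check whether the charset is the culprit is to look at the bytes PHP actually receives. The sketch below is only an illustration and assumes the mbstring extension is loaded; `$raw` is a made-up sample standing in for form input that arrived as iso-8859-2 instead of UTF-8:

```php
<?php
// Assumption: $raw holds bytes as a form served as ISO-8859-2 might send them.
// 0xF5 is the single-byte code of "ő" in ISO-8859-2.
$raw = "kezd\xF5dik";

// A lone 0xF5 byte is never valid UTF-8, so this reports false.
var_dump(mb_check_encoding($raw, 'UTF-8'));

// Convert to UTF-8 explicitly, naming the source charset.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-2');

// Now the string is valid UTF-8 and contains U+0151 (ő).
var_dump(mb_check_encoding($utf8, 'UTF-8'));
var_dump($utf8 === "kezd\u{151}dik");
```

If the page, the PHP source file and the DB connection do not all use the same charset, a literal like "ő" in the script will not match the bytes in the submitted text, which would explain a `str_replace` that never fires.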

Show fesztiv&#225;l szilveszter &#233;jszak&#225;n Jesol&#243;ban
Rendezv&#233;nyek Jesol&#243;ban
Szilveszter &#233;jszak&#225;n zen&#233;s, var&#225;zslatos mix v&#225;rja a l&#225;togat&#243;kat Jesol&#243;ban, a Show&#8722;fesztiv&#225;l alkalm&#225;b&#243;l, a rendezv&#233;nyt Jesol&#243; v&#225;ros polg&#225;rmesteri hivatala szponzor&#225;lja &#233;s a R&#225;di&#243; Birikina, valamint a R&#225;di&#243; Bella&#38;Monella szervez&#233;se. A fesztiv&#225;l december 2006/12/31&#8722;&#233;n  22,00 &#243;rakor kezd&#337;dik a Milan&#243;&#8722;i piac k&#246;rny&#233;k&#233;n Jesol&#243; Lid&#243;ban.A bel&#233;p&#233;s d&#237;jtalan.

And this is the output:

Show fesztivál szilveszter éjszakán Jesolóban Rendezvények Jesolóban Szilveszter éjszakán zenés, varázslatos mix várja a látogatókat Jesolóban, a Show−fesztivál alkalmából, a rendezvényt Jesoló város polgármesteri hivatala szponzorálja és a Rádió Birikina, valamint a Rádió Bella&Monella szervezése. A fesztivál december 2006/12/31−én 22,00 órakor kezdődik a Milanó−i piac környékén Jesoló Lidóban.A belépés díjtalan.

This is the DB text in Italian:

La notte di San Silvestro a Jesolo si festeggia con la musica e la magia del Festival Show, manifestazione promossa dal Comune di Jesolo ed organizzata da Radio Birikina e Radio Bella &#38; Monella. L&#180;evento si terr&#224; dalle ore 22.00 in Piazza Mazzini sabato 31 dicembre. L&#180;ingresso sar&#224; gratuito. 

La notte di San Silvestro a Jesolo si festeggia con la musica e la magia del Festival Show, manifestazione promossa dal Comune di Jesolo ed organizzata da Radio Birikina e Radio Bella & Monella. L´evento si terrà dalle ore 22.00 in Piazza Mazzini sabato 31 dicembre. L´ingresso sarà gratuito.
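A browser turns the stored entities back into letters on its own, which is what the output above shows. If the plain text is ever needed outside HTML (an e-mail or a search index, say), the entities can be decoded in PHP. A small sketch, assuming the output should be UTF-8; `$stored` is a shortened sample, not the real DB row:

```php
<?php
// Assumption: $stored is text as it sits in the DB, with numeric entities.
$stored = 'terr&#224; / kezd&#337;dik';

// Decode named and numeric entities into real UTF-8 characters.
$plain = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $plain, "\n";
```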

 

All these texts are available in 5 languages. To be sure, I convert all the chars to ASCII.

 

As you can see, the form here in the forum does exactly this job of converting the chars, and I want something similar. Some of my pages contain HTML, so I don't need something that converts the HTML markup to ASCII, only the chars (letters). I see that the charset here is iso-8859-1, also called latin1.

 

The text I posted above is shown like this:

<div class="quote">Show fesztivál szilveszter éjszakán Jesolóban Rendezvények Jesolóban Szilveszter éjszakán zenés, varázslatos mix várja a látogatókat Jesolóban, a Show&#8722;fesztivál alkalmából, a rendezvényt Jesoló város polgármesteri hivatala szponzorálja és a Rádió Birikina, valamint a Rádió Bella&Monella szervezése. A fesztivál december 2006/12/31&#8722;én 22,00 órakor kezd&#337;dik a Milanó&#8722;i piac környékén Jesoló Lidóban.A belépés díjtalan.</div>

As you can see, when I submit unusual text they replace only a few chars: only the ones like &#337; that are not in the iso-8859-1 charset but are in iso-8859-2, which is why those get replaced with ASCII codes.
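The behaviour described here, where only the chars outside iso-8859-1 become entities, can be imitated by starting the conversion range at code point 0x100 instead of 0x80. This is a guess at the effect, not the forum's actual code; it assumes UTF-8 input and the mbstring extension, and the function name is invented:

```php
<?php
// Hypothetical helper: leaves ASCII and Latin-1 letters (é, ó, à ...) alone
// and converts everything from U+0100 upwards to decimal entities.
function encode_outside_latin1(string $text): string
{
    $map = [0x100, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// Escapes used: \u{151} = ő, \u{2212} = minus sign, \u{E9} = é.
echo encode_outside_latin1("kezd\u{151}dik 31\u{2212}\u{E9}n"), "\n";
```

Here ő and the minus sign are turned into &#337; and &#8722;, while é is passed through unchanged because it exists in iso-8859-1.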

Archived

This topic is now archived and is closed to further replies.
