fatmikey Posted April 23, 2009

Hi,

I have a website where people sign up using a form, and sometimes their name contains a French accented character. Unfortunately, when the PHP script processes the record, the name gets garbled and doesn't display properly. Is there an easy script or function to strip out French accented characters from a string?

Thanks for your help,
Mikey
jonsjava Posted April 23, 2009

<?php
// Transliterates ISO-8859-1 (Latin-1) bytes to plain ASCII.
// The input must be Latin-1 encoded; UTF-8 input has to be converted first.
function transcribe($string) {
    // One-to-one replacements: the Nth byte of the first string
    // becomes the Nth character of the second string.
    $string = strtr($string,
        "\xA1\xAA\xBA\xBF\xC0\xC1\xC2\xC3\xC5\xC7"
        . "\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF\xD0\xD1"
        . "\xD2\xD3\xD4\xD5\xD8\xD9\xDA\xDB\xDD\xE0"
        . "\xE1\xE2\xE3\xE5\xE7\xE8\xE9\xEA\xEB\xEC"
        . "\xED\xEE\xEF\xF0\xF1\xF2\xF3\xF4\xF5\xF8"
        . "\xF9\xFA\xFB\xFD\xFF",
        "!ao?AAAAAC" . "EEEEIIIIDN" . "OOOOOUUUYa"
        . "aaaaceeeei" . "iiidnooooo" . "uuuyy");
    // One-to-many replacements (a single byte becomes two letters).
    $string = strtr($string, array(
        "\xC4"=>"Ae", "\xC6"=>"AE", "\xD6"=>"Oe", "\xDC"=>"Ue",
        "\xDE"=>"TH", "\xDF"=>"ss", "\xE4"=>"ae", "\xE6"=>"ae",
        "\xF6"=>"oe", "\xFC"=>"ue", "\xFE"=>"th"));
    return $string;
}

// usage ("\xC0" is "À" in ISO-8859-1):
$data = "\xC0orvar";
print transcribe($data); // prints Aorvar
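The function above works byte by byte, so it only behaves as intended on ISO-8859-1 input. If the form data arrives as UTF-8 (an assumption, the thread does not say which encoding the site uses), each accented letter is two or more bytes and a byte-wise `strtr()` would corrupt it. A minimal sketch of a UTF-8 counterpart, using the array form of `strtr()` so whole characters are replaced, might look like this. The helper name `strip_french_accents` and the choice of letters in the map are illustrative, not from the thread, and the map only covers precomposed characters.

```php
<?php
// Hypothetical helper (not from the thread): strips common French
// diacritics from a UTF-8 string. Save this file as UTF-8.
function strip_french_accents(string $text): string
{
    // The array form of strtr() matches whole byte sequences,
    // so multi-byte UTF-8 characters are replaced safely.
    static $map = [
        'À' => 'A', 'Â' => 'A', 'Ç' => 'C',
        'É' => 'E', 'È' => 'E', 'Ê' => 'E', 'Ë' => 'E',
        'Î' => 'I', 'Ï' => 'I', 'Ô' => 'O',
        'Ù' => 'U', 'Û' => 'U', 'Ü' => 'U', 'Ÿ' => 'Y',
        'Œ' => 'OE', 'Æ' => 'AE',
        'à' => 'a', 'â' => 'a', 'ç' => 'c',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'î' => 'i', 'ï' => 'i', 'ô' => 'o',
        'ù' => 'u', 'û' => 'u', 'ü' => 'u', 'ÿ' => 'y',
        'œ' => 'oe', 'æ' => 'ae',
    ];
    return strtr($text, $map);
}

echo strip_french_accents('Hélène Lefèvre-Benoît'), "\n"; // Helene Lefevre-Benoit
```

Characters that are not in the map pass through unchanged, so the list would need extending for other languages.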
JonnoTheDev Posted April 23, 2009

You shouldn't try to strip them out. You should be using the correct character encoding for your DB storage, PHP, and HTML document. UTF-8 should work. Check your HTML headers:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If that doesn't work, send the header from PHP (place it at the top of each script, preferably in a common include, before any output):

header('Content-Type: text/html; charset=UTF-8');
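To illustrate the "keep the encoding consistent" approach, here is a rough sketch that declares UTF-8 in the response header, checks that a submitted name really is valid UTF-8, and escapes it for output with the same charset. The helper name `name_for_html` is made up for this example, and it assumes the form and the database connection are already using UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): returns the name escaped
// for HTML output, or null when the bytes are not valid UTF-8.
function name_for_html(string $name): ?string
{
    // With the /u modifier, preg_match() fails on invalid UTF-8 input,
    // so an empty pattern works as a cheap validity check.
    if (preg_match('//u', $name) !== 1) {
        return null;
    }
    // Escape using the same charset the page declares.
    return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

// Declare the charset before any output (skipped on the command line).
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$safe = name_for_html('Hélène');
echo $safe === null ? 'Invalid name encoding' : $safe, "\n";
```

A name submitted in ISO-8859-1, such as "H\xE9l\xE8ne", would be rejected here rather than displayed as garbage, which makes an encoding mismatch easier to spot.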
.josh Posted April 23, 2009

I agree with "don't just strip them out". If you want to stick with the "get rid of them" route instead of the "make sure they display" route, I would give the user the chance to enter something in the "correct" format rather than stripping characters out yourself: check whether they are there, tell the user they can't be used, and have them re-enter the name, that sort of thing.

if (!preg_match('~^[a-z ]+$~i', $name)) {
    // name contains something other than a-z, A-Z or space: give an error
}
Daniel0 Posted April 23, 2009

*snip*

Ø and Å aren't O and A, but OE and AA. The same goes for their lowercase variants.

There is another problem with your function. If, for instance, you have the word "Æble" (apple in Danish), it would be transliterated into Aeble. However, if it's in all caps, ÆBLE, then it should be AEBLE. Your function doesn't take that into account. You would have to figure out the case of the other characters as well. Otherwise you could end up with something like AeBLE or AEble, both of which look stupid. The same goes for all the other letters that stand for more than one letter.

Niel and CV, it could be the case that he needs it in a URL (e.g. /user/Daniel). In that case he might only want the letters A-Z without any sort of diacritics.
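The case problem described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `transliterate_case_aware` is made up, the input is assumed to be UTF-8, the map covers just the three Danish letters mentioned, and the rule "look at the next letter, or the previous one at the end of a word" is one simple heuristic among several possible ones.

```php
<?php
// Hypothetical helper (not from the thread): expands Æ, Ø and Å to two
// letters and picks "AE" or "Ae" depending on the neighbouring letter.
function transliterate_case_aware(string $text): string
{
    $map = [
        'Æ' => 'AE', 'Ø' => 'OE', 'Å' => 'AA',
        'æ' => 'ae', 'ø' => 'oe', 'å' => 'aa',
    ];

    // Split into UTF-8 characters; preg_split() returns false on invalid input.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return $text;
    }

    $out = '';
    foreach ($chars as $i => $char) {
        if (!isset($map[$char])) {
            $out .= $char;
            continue;
        }
        $replacement = $map[$char];
        // Only uppercase source letters need a decision between "AE" and "Ae".
        if ($replacement === strtoupper($replacement)) {
            // Prefer the following character, fall back to the preceding one.
            $neighbour = $chars[$i + 1] ?? $chars[$i - 1] ?? '';
            if (preg_match('/^\p{Lu}$/u', $neighbour) !== 1) {
                // Neighbour is not an uppercase letter: use title case.
                $replacement = $replacement[0] . strtolower(substr($replacement, 1));
            }
        }
        $out .= $replacement;
    }
    return $out;
}

echo transliterate_case_aware('Æble ÆBLE'), "\n"; // Aeble AEBLE
```

A letter standing alone, with no neighbour at all, comes out in title case here; whether that is the right call depends on where the result is used.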
.josh Posted April 23, 2009

Well, that's why I opt for telling the user that non-<insert good stuff here> characters are not allowed, and letting the user pick an alternative.
Daniel0 Posted April 23, 2009

Well, that's easy to say for someone whose primary language is English. English doesn't really use characters other than a through z. Many other languages use various diacritics to give different meanings. Compare these words in Spanish, for instance: año (year) vs. ano (anus), papá (dad) vs. papa (potato (or pope)).
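If names are validated rather than stripped, one way to avoid rejecting words like año or papá is a Unicode-aware pattern instead of `[a-z ]`. The following is a minimal sketch, assuming UTF-8 input; the function name `is_valid_name` and the decision to also allow apostrophes and hyphens are choices made for this example, not something the thread specifies.

```php
<?php
// Hypothetical helper (not from the thread): accepts letters from any
// script, combining marks, spaces, apostrophes and hyphens.
function is_valid_name(string $name): bool
{
    // \p{L} = any letter, \p{M} = combining mark.
    // u = treat pattern and subject as UTF-8, D = "$" matches only at the very end.
    return preg_match('~^[\p{L}\p{M}\' -]+$~uD', $name) === 1;
}

var_dump(is_valid_name('año'));           // bool(true)
var_dump(is_valid_name('Jean-François')); // bool(true)
var_dump(is_valid_name('R2D2'));          // bool(false)
```

Because `preg_match()` fails on invalid UTF-8 when the `u` modifier is set, badly encoded input is rejected as well.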
fatmikey Posted April 28, 2009

Thanks for your help everyone! Really appreciate it.