ocpaul20 Posted January 30, 2012

I have a string of mixed Japanese or Chinese and Western characters, and I would like to keep the Western characters and remove the others. Is there any way to detect which UTF-8 characters are Japanese/Chinese and remove them?

I tried this, which I found on another thread, but my version of PHP (5.3) will not allow the \u in the following:

echo "<BR>".preg_replace('/[^\u4E00-\u9FFF]+/', '', $string);

I have also tried various combinations of \p (these, plus others):

echo "<BR>".preg_replace('/\p{Bopomofo}+/u', '', $string);
echo "<BR>".preg_replace('/\p{Hiragana}+/u', '', $string);

and the \x{} option tells me the numbers are too large:

echo "<BR>".preg_replace('/[^\x{4E00}-\x{9FFF}]+/', '', $string);

My string for this example is below, but any combination of Western and Japanese/Chinese characters is possible in the title.

2011.12.06 19:00-20:00 / ふくいちライブカメラ (Live Fukushima Nuclear Plant Cam) | Uploaded: 06 Dec 2011

Thanks for any help.
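For what it's worth, here is a minimal sketch of one way this could be approached, assuming the string really is valid UTF-8 and PCRE was built with Unicode support. Two things stand out in the attempts above: PCRE has no \u escape, and \x{...} values above FF are only accepted when the pattern carries the /u modifier. Also, a negated class like [^...] removes everything except the listed range, which is the opposite of the goal. The function name strip_cjk, the sample title, and the exact set of ranges are assumptions for illustration, not a complete list of CJK code points.

```php
<?php
// Hypothetical helper (not from the original post): removes Chinese/Japanese
// characters from a UTF-8 string and leaves everything else untouched.
function strip_cjk($text)
{
    // The trailing /u makes PCRE treat pattern and subject as UTF-8; without
    // it, \x{...} values above FF are rejected as too large.
    // Assumed coverage (adjust to taste):
    //   \p{Han}            Chinese characters / kanji
    //   \p{Bopomofo}       Bopomofo (Zhuyin)
    //   \x{3000}-\x{30FF}  CJK punctuation, Hiragana, Katakana
    //   \x{FF65}-\x{FF9F}  halfwidth Katakana
    $pattern = '/[\p{Han}\p{Bopomofo}\x{3000}-\x{30FF}\x{FF65}-\x{FF9F}]+/u';

    $result = preg_replace($pattern, '', $text);

    // preg_replace() returns null on failure (for example, invalid UTF-8),
    // so fall back to the unmodified input in that case.
    return $result === null ? $text : $result;
}

// Shortened, illustrative title; this file must itself be saved as UTF-8.
$title = '19:00-20:00 / カメラ (Live Cam)';
echo strip_cjk($title), "\n";
```

The Western text, digits, and ASCII punctuation are kept as they are, so any spaces that surrounded the removed characters remain; a follow-up pass that collapses repeated spaces may be wanted.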