PHP - Detecting and removing Japanese/Chinese characters in a title string


ocpaul20

I have a string that mixes Japanese or Chinese characters with Western ones, and I would like to keep the Western characters and remove the others. Is there any way to detect which UTF-8 characters are Japanese/Chinese and remove them?

 

I tried this pattern, which I found in another thread, but my version of PHP (5.3) will not accept the \u escapes in it:

echo "<BR>".preg_replace('/[^\u4E00-\u9FFF]+/', '', $string);

 

I have also tried various combinations of \p{...} (these, plus others):

echo "<BR>".preg_replace('/\p{Bopomofo}+/u', '', $string);
echo "<BR>".preg_replace('/\p{Hiragana}+/u', '', $string);

and with the \x{} form PHP tells me the numbers are too large:

echo "<BR>".preg_replace('/[^\x{4E00}-\x{9FFF}]+/', '', $string);

 

My string for this example is below, but any combination of Western and Japanese/Chinese characters is possible in the title.

2011.12.06 19:00-20:00 / ふくいちライブカメラ (Live Fukushima Nuclear Plant Cam) | Uploaded: 06 Dec 2011

 

Thanks for any help.
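
For reference, here is a minimal sketch of one way the attempts above might be made to work. It is not a definitive answer. Two things seem to be going wrong: PCRE does not understand \uXXXX escapes, and it reports \x{...} values above 0xFF as too large unless the pattern carries the /u modifier. Also, a negated class such as [^\x{4E00}-\x{9FFF}] deletes everything except the CJK characters, which is the opposite of what is wanted. The function name strip_cjk and the exact list of ranges are assumptions made for illustration. The ranges cover the Hiragana and Katakana blocks, CJK Unified Ideographs plus Extension A, and half-width Katakana, and may need extending for other scripts. The sketch also assumes the input is valid UTF-8 and that PCRE was built with UTF-8 support.

```php
<?php
// Hypothetical helper (not from the original post): removes Japanese/Chinese
// characters from a UTF-8 string and leaves everything else in place.
function strip_cjk($title)
{
    // Assumed ranges, chosen for illustration:
    //   \x{3040}-\x{30FF}  Hiragana and Katakana blocks
    //   \x{3400}-\x{4DBF}  CJK Unified Ideographs Extension A
    //   \x{4E00}-\x{9FFF}  CJK Unified Ideographs
    //   \x{FF66}-\x{FF9F}  half-width Katakana
    // The /u modifier puts PCRE in UTF-8 mode so that \x{...} may exceed 0xFF.
    $pattern = '/[\x{3040}-\x{30FF}\x{3400}-\x{4DBF}\x{4E00}-\x{9FFF}\x{FF66}-\x{FF9F}]+/u';

    $result = preg_replace($pattern, '', $title);
    if ($result === null) {
        // preg_replace returns null on error, e.g. when the input is not valid UTF-8.
        return $title;
    }

    // Collapse the run of spaces left behind where characters were removed.
    return trim(preg_replace('/\s{2,}/', ' ', $result));
}

$title = '2011.12.06 19:00-20:00 / ふくいちライブカメラ (Live Fukushima Nuclear Plant Cam) | Uploaded: 06 Dec 2011';
echo strip_cjk($title), "\n";
// prints: 2011.12.06 19:00-20:00 / (Live Fukushima Nuclear Plant Cam) | Uploaded: 06 Dec 2011
```

If the PCRE build also has Unicode property support, a class such as [\p{Han}\p{Hiragana}\p{Katakana}] with /u might be a shorter alternative. Explicit ranges are used here because they only depend on UTF-8 mode.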
