
Japanese+Regex+PHP getting Kanji characters


exo_duz


Hi all,

 

Been scratching my head about this all week.

 

Is there a way to write a regex that matches all Japanese Kanji characters? I am building a PHP website that uses pretty URLs and would like to keep Japanese text in the URI. The problem is that with mb_ereg_replace(), the multibyte regex function, I can only match Hiragana and Katakana.

 

http://jp2.php.net/mb_ereg_replace

 

The first example on that page shows how to do this for Hiragana and Katakana, but not for Kanji. Is there a way to do it for Kanji as well?

 

Thanks a lot for your help.

Thanks to effigy for his input, I figured it out.

 

For all those having trouble I did this:

 

I took the Unicode ranges from this website, which lists the code points for all Japanese characters:

http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml

 

// Convert Japanese characters (full-width/half-width conversion)
$url = mb_convert_kana($url, "asKHV");

// Replace every run of symbols with "+"
// Table provided at http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml
$pattern = '/[^\wぁ-ゔァ-ヺー\x{4E00}-\x{9FAF}_\-]+/u';
$url = preg_replace($pattern, '+', $url);

 

Just in case anyone ever needs to do this.

 

The code first converts the characters using the mb_convert_kana function (http://php.net/mb_kana_convert), then replaces every run of other characters, Japanese symbols included, with "+". Only word characters, Hiragana, Katakana, Kanji, underscores and hyphens are left.
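
For anyone who wants to drop this into a project, here is a minimal sketch of the two steps wrapped in a function. The function name slugify_japanese is hypothetical, and so is the trim() at the end (it strips a leading or trailing "+", which the posted code does not do). The pattern is the one posted above. One assumption differs from the post: the mode string is "asKV" rather than "asKHV", because "K" and "H" both act on half-width katakana, so the sketch keeps only the katakana conversion. It also assumes the input is UTF-8 and that the mbstring extension is loaded.

```php
<?php
// Hypothetical helper; a sketch, not a drop-in library function.
function slugify_japanese(string $text): string
{
    // a: full-width alphanumerics -> half-width
    // s: full-width space -> half-width space
    // K: half-width katakana -> full-width katakana
    // V: merge voiced sound marks into the preceding kana (used with K)
    $text = mb_convert_kana($text, 'asKV', 'UTF-8');

    // Anything that is not a word character, hiragana, katakana,
    // the long-vowel mark, a kanji in U+4E00..U+9FAF, "_" or "-"
    // is collapsed into a single "+".
    $pattern = '/[^\wぁ-ゔァ-ヺー\x{4E00}-\x{9FAF}_\-]+/u';
    $slug = preg_replace($pattern, '+', $text);

    // preg_replace() returns null on failure (for example invalid UTF-8).
    if ($slug === null) {
        return '';
    }

    // Assumption: leading/trailing separators are not wanted in a URL.
    return trim($slug, '+');
}

echo slugify_japanese('日本語 の テスト!'), "\n"; // 日本語+の+テスト
```

PCRE also understands Unicode script properties such as \p{Han}, \p{Hiragana} and \p{Katakana} when the u modifier is set, so those may be an alternative to spelling out the ranges by hand if the table above does not cover a character you need.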

 

Hope this helps anyone having this problem.

Nice. I'm bookmarking this thread. I went back and looked at the project I was working on (it's currently on hold), and I hadn't come up with a solution that worked yet, so it's good to know this. I spent a fair bit of time on Japanese sites trying to find out whether anyone Japanese had come up with a solution, and I didn't find anything that worked particularly well.
