Japanese+Regex+PHP getting Kanji characters



Hi all,

 

Been scratching my head about this all week.

 

Is there a way to write a regex that matches all Japanese Kanji characters? I am building a PHP website that uses pretty URLs and would like to keep Japanese text in the URI. The problem is that with mb_ereg_replace(), the multibyte regex function, I can only match Hiragana and Katakana.

 

http://jp2.php.net/mb_ereg_replace

 

According to the first example on that page, you can match the kana but not the Kanji. Is there a way to match Kanji as well?

 

Thanks a lot for your help.

Thanks to effigy for his input I figured it out.

 

For anyone else having trouble, here is what I did.

I took the character ranges from this site, which lists the Unicode code points for Japanese characters:

http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml

 

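// Normalise widths: full-width alphanumerics and spaces to half-width,
// half-width kana to full-width, and merge voiced sound marks
$url = mb_convert_kana($url, "asKHV");

// Replace every run of characters that is not a word character, Hiragana,
// Katakana, the long vowel mark, Kanji, underscore or hyphen with "+"
// Ranges taken from http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml
$pattern = '/[^\wぁ-ゔァ-ヺー\x{4E00}-\x{9FAF}_\-]+/u';
$url = preg_replace($pattern, '+', $url);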

 

Just in case anyone ever needs to do this.

 

The code first normalises the characters with the mb_convert_kana function (http://php.net/mb_kana_convert), then replaces every run of Japanese symbols and other punctuation with a "+", leaving only Hiragana, Katakana, Kanji, word characters, underscores and hyphens.
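
For reuse, the two steps could be wrapped in a small function. This is only a sketch: the name slugify_japanese is made up for illustration, and passing "UTF-8" explicitly to mb_convert_kana is an assumption about the input encoding rather than something the original snippet does.

```php
<?php
// Hypothetical helper name; wraps the normalise-then-replace steps described above.
function slugify_japanese(string $url): string
{
    // Full-width alphanumerics and spaces become half-width ("a", "s"),
    // half-width kana become full-width ("K", "H"), voiced marks are merged ("V").
    // Assumption: the input is UTF-8.
    $url = mb_convert_kana($url, "asKHV", "UTF-8");

    // Replace each run of characters outside the allowed set with a single "+".
    $pattern = '/[^\wぁ-ゔァ-ヺー\x{4E00}-\x{9FAF}_\-]+/u';

    // preg_replace returns null on failure (for example invalid UTF-8),
    // so fall back to an empty string in that case.
    return preg_replace($pattern, '+', $url) ?? '';
}

echo slugify_japanese("ＰＨＰ　入門"), "\n";      // full-width letters and space
echo slugify_japanese("東京タワー 観光!"), "\n";  // ASCII space and punctuation
```

The first call should print PHP+入門 and the second 東京タワー+観光+. A trailing "+" is left when the input ends in punctuation, so you may want to trim() it with '+' afterwards.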

 

Hope this helps anyone having this problem.
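
As a possible alternative to hard-coding code point ranges, PCRE also understands Unicode script properties when the /u modifier is set. The sketch below is not the pattern used above; the function name and the ASCII allow-list are assumptions for illustration. \p{Han} covers Kanji outside the \x{4E00}-\x{9FAF} block as well, and the long vowel mark ー is listed explicitly because it is not reliably treated as part of the Katakana script.

```php
<?php
// Hypothetical helper; the name and the ASCII allow-list are illustrative only.
function keep_japanese_and_ascii(string $text): string
{
    // Keep Kanji, Hiragana, Katakana, the long vowel mark, ASCII letters and digits,
    // underscore and hyphen; replace every other run of characters with "+".
    $pattern = '/[^\p{Han}\p{Hiragana}\p{Katakana}ーA-Za-z0-9_\-]+/u';

    // Fall back to an empty string if preg_replace fails (for example invalid UTF-8).
    return preg_replace($pattern, '+', $text) ?? '';
}

echo keep_japanese_and_ascii("漢字 ひらがな カタカナ!"), "\n";
```

This should print 漢字+ひらがな+カタカナ+.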

Nice. I'm bookmarking this thread. I went back and looked at the project I was working on (it's currently on hold), and I hadn't come up with a solution that worked yet, so it's good to know this. I spent a fair bit of time on Japanese sites trying to find out whether anyone there had come up with a solution, and I didn't find anything that worked particularly well.
