Can a "character code" be distinguished for non-English, in the 1-65,536 range?

Quantainne · January 17, 2008

Hello coders,

When I pull Japanese phrases out of a MySQL database (with UTF-8 set as MySQL's character set, PHP's internal encoding also set to UTF-8, and php_mbstring.dll enabled as an extension in php.ini), it's helpful to separate the results into two groups according to each character's code number: traditional characters with more strokes in one group, and the simpler characters plus English in the other. I haven't been able to find whether or where PHP does this, though Flash's ActionScript does it by calling the charCodeAt( ) method on the string.

If I had a statement or if-test that would separate characters with codes below 13,000 from those above 13,000, that would be tremendous. A typical example is the character 癖, which ActionScript and browsers recognize as the character with code 30,294. In case that character shows up as gibberish above, it is what a browser renders from the string &#30294;.
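To make the kind of test I mean concrete, here is a minimal sketch, assuming the strings arrive as UTF-8 and the mbstring extension is loaded. The names `char_code` and `split_by_code` are hypothetical helpers, not built-in PHP functions, and the 13,000 cutoff is just the value from above. It converts each character to UCS-4BE (one 32-bit integer per character) and reads that integer back with `unpack`:

```php
<?php
// Hypothetical helper: return the Unicode code point of the first
// character of a UTF-8 string (comparable to ActionScript's charCodeAt
// for characters in the 0-65,535 range).
function char_code($char) {
    // UCS-4BE stores every character as one 32-bit big-endian integer.
    $ucs4 = mb_convert_encoding($char, 'UCS-4BE', 'UTF-8');
    // 'N' reads an unsigned 32-bit big-endian integer; unpack indexes from 1.
    $parts = unpack('N', $ucs4);
    return $parts[1];
}

// Hypothetical helper: split a UTF-8 string into characters whose code
// is below the threshold and characters whose code is at or above it.
function split_by_code($text, $threshold = 13000) {
    $low = array();
    $high = array();
    // The 'u' modifier makes preg_split treat the subject as UTF-8,
    // so multi-byte characters are not cut into separate bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($chars as $ch) {
        if (char_code($ch) < $threshold) {
            $low[] = $ch;
        } else {
            $high[] = $ch;
        }
    }
    return array($low, $high);
}

// $simple holds 'a', 'b', 'c'; $complex holds '癖' (code 30,294).
list($simple, $complex) = split_by_code('abc癖');
```

If I read the Unicode charts correctly, hiragana and katakana sit at roughly 12,352 to 12,543 and the main kanji block starts at 19,968, so a 13,000 cutoff would put kana and English on one side and kanji on the other.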

Any advice would save this project ...
