
Heya. I know most people use Regex for validating things, but for a simple form these look like they'd come in handy:

 

http://us.php.net/manual/en/ref.ctype.php

 

I don't think I've ever seen these used in any open source application I've dug through, and I wonder why that is?

Sure, ctype comes in handy for validation, though how useful it is largely depends on what you are validating.

But the functions are definitely useful (and, like most other built-in PHP functions, faster than the equivalent regex too).

 

As for the frequency of use in open source applications, beats me.

All I know is that they have been designed to do a specific task, and do it well.

Yes, I use ctype. It's faster than the corresponding regex: for example, ctype_digit() instead of \d for testing that a form field is all digits.
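As a rough illustration of the form-field check described above, here is a minimal sketch. The helper name is_all_digits() is made up for this example, and the expected results assume the default 'C' locale.

```php
<?php
// Hypothetical helper: true only when $value is a non-empty string of digits.
function is_all_digits(string $value): bool
{
    // ctype_digit() returns false for an empty string, and for anything
    // containing a sign, a decimal point or whitespace.
    return ctype_digit($value);
}

var_dump(is_all_digits('12345')); // bool(true)
var_dump(is_all_digits('12.5'));  // bool(false)
var_dump(is_all_digits('-5'));    // bool(false)
var_dump(is_all_digits(''));      // bool(false)
```

Form input always arrives as strings, so the string type hint fits that use case; integers would need casting to a string first.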

 

Granted, there is one caveat with stuff like regex shorthand character classes (like \d, \w) and ctype functions... and that is the locale settings (see setlocale).

In regex, we see \d (and likewise ctype_digit()) as checking for 0-9. However, depending on your default locale, this may not be the case.

 

For instance, if I check my default locale:

echo setlocale(LC_ALL, '0');

On my system, this returns:

LC_COLLATE=C;LC_CTYPE=English_United States.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C

Notice that most of the categories are set to 'C' (if I understand correctly, this is the minimal default locale that C / C++ programs start with). In my case, however, LC_CTYPE is set to English_United States.1252. This is the kick in the teeth for ctype and regex shorthand character classes: all of a sudden, \d or ctype_digit() matches more than simply 0-9 (superscript digits such as ³ and ¹ are included).

 

Likewise, in PCRE, \w is at the very least equal to a-zA-Z0-9_ but due to my locale, it also matches superscript digits, not to mention accented characters.
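To look at just the category that matters here, you can query LC_CTYPE on its own. This is only a sketch: the output depends entirely on your system, and the warning text is my own wording.

```php
<?php
// Passing '0' as the locale only queries the current setting; nothing changes.
$ctype = setlocale(LC_CTYPE, '0');

echo 'LC_CTYPE is currently: ' . $ctype . "\n";

// With anything other than 'C' (or 'POSIX'), ctype_*() and the \d / \w
// shorthands may be influenced by the locale's character tables.
if ($ctype !== 'C' && $ctype !== 'POSIX') {
    echo "ctype functions and \\d / \\w may match more than ASCII here.\n";
}
```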

 

Consider the following:

 

 

$str = '123erTuÊu³mÖx¹';

echo (ctype_alnum($str))? $str . ' passes ctype_alnum()!' . "<br />\n" :  $str . ' does NOT pass ctype_alnum()' . "<br />\n";

$str2 = '1235³730¹';

echo (ctype_digit($str2))? $str2 . ' passes ctype_digit()!' . "<br />\n":  $str2 . ' does NOT pass ctype_digit()' . "<br />\n";

$str3 = '12aFÃ35³73_0¹';

echo (preg_match('#^\w+$#', $str3))? $str3 . ' passes regex \w!' . "<br />\n":  $str3 . ' does NOT pass regex \w' . "<br />\n";

 

 

On my system, the output results are as follows:

123erTuÊu³mÖx¹ passes ctype_alnum()!
1235³730¹ passes ctype_digit()!
12aFÃ35³73_0¹ passes regex \w!

 

Clearly, this goes against what ctype_alnum(), ctype_digit() and \w stand for (no thanks to the locale settings). If you rely on such functionality, one quick fix is to change the LC_CTYPE category from your default setting to 'C':

 

 

setlocale(LC_CTYPE, 'C');

$str = '123erTuÊu³mÖx¹';

echo (ctype_alnum($str))? $str . ' passes ctype_alnum()!' . "<br />\n" :  $str . ' does NOT pass ctype_alnum()' . "<br />\n";

$str2 = '1235³730¹';

echo (ctype_digit($str2))? $str2 . ' passes ctype_digit()!' . "<br />\n":  $str2 . ' does NOT pass ctype_digit()' . "<br />\n";

$str3 = '12aFÃ35³73_0¹';

echo (preg_match('#^\w+$#', $str3))? $str3 . ' passes regex \w!' . "<br />\n":  $str3 . ' does NOT pass regex \w' . "<br />\n";

 

 

Now none of those $str variables validate. With LC_CTYPE set to 'C', \d and ctype_digit() will in fact only match 0-9, ctype_alnum() will only match a-zA-Z0-9, and \w will only match a-zA-Z0-9_.
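If you would rather not leave LC_CTYPE changed for the rest of the script, one option is to switch to 'C' only around the check and then put the old value back. This is a sketch, not a drop-in solution: is_ascii_digits() is a made-up name, and "\xB3" and "\xB9" are the Windows-1252 bytes for ³ and ¹.

```php
<?php
// Hypothetical helper: run ctype_digit() under the 'C' locale, then restore
// whatever LC_CTYPE setting was active before the call.
function is_ascii_digits(string $value): bool
{
    $previous = setlocale(LC_CTYPE, '0'); // query only, no change
    setlocale(LC_CTYPE, 'C');
    try {
        return ctype_digit($value);
    } finally {
        // Restore the caller's locale even though we return from the try block.
        if ($previous !== false) {
            setlocale(LC_CTYPE, $previous);
        }
    }
}

var_dump(is_ascii_digits('1235730'));         // bool(true)
var_dump(is_ascii_digits("1235\xB3730\xB9")); // bool(false)
```

Keep in mind that the locale is set per process, so on a threaded web server a temporary switch like this can briefly affect other requests too.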

Granted, results will most likely vary from locale to locale. In my case, I have to keep all this in the back of my mind if I am going to depend on those things to check for what is expected of them.
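Another way to sidestep the locale question is to skip the shorthands and spell out the ASCII ranges yourself, since explicit ranges like [0-9] are matched by byte value and do not depend on LC_CTYPE. A minimal sketch, with made-up helper names and the Windows-1252 bytes written as escapes so the file encoding does not matter:

```php
<?php
// Hypothetical helpers: explicit ASCII ranges instead of \d and \w.
function is_digits_only(string $value): bool
{
    // The D modifier stops $ from matching just before a trailing newline.
    return preg_match('/^[0-9]+$/D', $value) === 1;
}

function is_word_chars_only(string $value): bool
{
    return preg_match('/^[A-Za-z0-9_]+$/D', $value) === 1;
}

var_dump(is_digits_only('1235730'));                       // bool(true)
var_dump(is_digits_only("1235\xB3730\xB9"));               // bool(false)
var_dump(is_word_chars_only('12aF35_0'));                  // bool(true)
var_dump(is_word_chars_only("12aF\xC335\xB373_0\xB9"));    // bool(false)
```

This gives up the speed advantage of ctype, but the result is the same whatever the server's locale happens to be.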
