Replace Characters Not Listed In A Whitelist Array


matthewtbaker

Hi,

 

I have created an array with symbols that I'd like to allow in a string of text.

 

$symbols[] = '>'; // Greater Than or Close Angle Bracket
$symbols[] = '<'; // Less Than or Open Angle Bracket
$symbols[] = '/'; // Forward Slash
$symbols[] = '\\'; // Back Slash
$symbols[] = '&'; // Ampersand
$symbols[] = '£'; // Pound Sterling
$symbols[] = '$'; // Dollar
$symbols[] = '"'; // Quotation Marks (no backslash needed inside single quotes)
$symbols[] = '\''; // Apostrophe
$symbols[] = '#'; // Hash
$symbols[] = ' '; // Space
$symbols[] = '.'; // Period

 

I would now like to use something like str_replace or preg_replace (whichever is best) to replace any character that is not included within the array $symbols with "".

 

I know there are other ways to filter characters, but they aren't suitable considering who will be maintaining the code... :-\

 

I'm hoping for something like this, but obviously a version that works:

$myText = str_replace(!$symbols, "", $myText);
$myText = preg_replace(!$symbols, "", $myText);

 

 

Thank you for your help.

Link to comment
Share on other sites

Further to my original question, I have found a similar solution, but it does not work properly.

 

$symbols = array(); //WHITELIST OF SYMBOLS

 $symbols[] = 'a-z'; // Lowercase A to Z
 $symbols[] = 'A-Z'; // Uppercase A to Z
 $symbols[] = '0-9'; // Numbers 0 to 9

 $symbols[] = '>'; // Greater Than or Close Angle Bracket
 $symbols[] = '<'; // Less Than or Open Angle Bracket
 $symbols[] = '/'; // Forward Slash
 $symbols[] = '\\'; // Back Slash
 $symbols[] = '&'; // Ampersand
 $symbols[] = '£'; // Pound Sterling
 $symbols[] = '$'; // Dollar
 $symbols[] = '"'; // Quotation Marks (no backslash needed inside single quotes)
 $symbols[] = '\''; // Apostrophe
 $symbols[] = '#'; // Hash
 $symbols[] = ' '; // Space
 $symbols[] = '\.'; // Period
 $symbols[] = '\,'; // Comma
 $symbols[] = '_'; // Underscore
 $symbols[] = '-'; // Hyphen or Dash
 $symbols[] = '@'; // At
 $symbols[] = '%'; // Percentage
 $symbols[] = '?'; // Question Mark
 $symbols[] = '*'; // Asterisk
 $symbols[] = '('; // Open Parenthesis
 $symbols[] = ')'; // Close Parenthesis
 $symbols[] = '+'; // Plus
 $symbols[] = '='; // Equals
 $symbols[] = '{'; // Open Brace
 $symbols[] = '}'; // Close Brace
 $symbols[] = '['; // Open Square Bracket
 $symbols[] = ']'; // Close Square Bracket
 $symbols[] = '~'; // Tilde

 $myText = preg_replace("/[^" . $symbols . "]^" . $symbols . "]?/iu", "", $myText);

 

It does not keep all of my whitelist symbols.
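
For anyone following along, the array has to be turned into a single string before it can sit inside the [^...] character class. The following is only a minimal sketch of one way to do that, not necessarily the solution that was posted in this thread. It assumes every whitelist entry is one literal character, that letters and digits are written straight into the pattern, and that the file is saved as UTF-8 (needed for £ together with the u modifier). The shortened symbol list and the sample text are made up for illustration.

```php
<?php
// Literal characters to allow; each entry is exactly one character.
$symbols = array('>', '<', '/', '\\', '&', '£', '$', '"', '\'', '#', ' ', '.', ',', '_', '-', '@');

// Escape every entry so characters such as \ ] - ^ lose their special meaning
// inside [...]. '/' is passed as the delimiter so that it gets escaped too.
$escaped = array_map(function ($s) {
    return preg_quote($s, '/');
}, $symbols);

// Letters and digits are added as ranges directly in the pattern.
$pattern = '/[^a-zA-Z0-9' . implode('', $escaped) . ']/u';

$myText = "Hello, World! £5 <b>ok</b> ~`|";
$myText = preg_replace($pattern, '', $myText);

echo $myText, "\n"; // the !, ~, ` and | characters are stripped
```

Keeping the ranges out of the array avoids the problem of preg_quote escaping the hyphen in entries such as 'a-z'.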

Edited by matthewtbaker
Link to comment
Share on other sites

Thanks Maniac!

 

I have had some problems with these entries:

 

$symbols[] = 'a-z'; // Lowercase A to Z
$symbols[] = 'A-Z'; // Uppercase A to Z
$symbols[] = '0-9'; // Numbers 0 to 9

 

So I have had to list each letter individually. Do you know how I can just have a simple a-z, 0-9 in my array?

 

Much appreciated.

 

 

Also, if you could explain why the implode is needed for the line to work, I'd be most grateful.

 

Matt

Link to comment
Share on other sites

Listing them individually is the easiest way, but at this point you're allowing most of the ASCII range. You might as well make a blacklist, or specify your ranges by hand using an ASCII chart.
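
As a rough illustration of the "ranges by hand" idea, here is a sketch that allows all printable ASCII (0x20 to 0x7E on an ASCII chart) plus the pound sign. The exact range is an assumption picked for the example, not something taken from the original whitelist, and the file is assumed to be saved as UTF-8.

```php
<?php
// \x20-\x7E covers space through tilde, i.e. printable ASCII.
// £ is outside ASCII, so it is listed separately and the u modifier is used.
$pattern = '/[^\x20-\x7E£]/u';

// Sample input containing an accented letter, a tab and a newline.
$input = "Caf\u{00E9} £5\tok\n";
$clean = preg_replace($pattern, '', $input);

echo $clean, "\n"; // the accented letter, tab and newline are stripped
```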

 

If you use an array as a string, it becomes the word "Array". implode() joins the array's elements into one string, so the actual values end up in the pattern.
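
A small sketch of that difference, using a cut-down version of the array from the question (the variable names $direct and $joined are just for this example):

```php
<?php
$symbols = array('a-z', 'A-Z', '0-9');

// Using the array directly in a string context yields the literal text "Array".
// PHP also raises an "Array to string conversion" notice or warning,
// which is silenced here with @ to keep the output clean.
$direct = @('[^' . $symbols . ']');

// implode() joins the elements with the given separator (an empty string here).
$joined = '[^' . implode('', $symbols) . ']';

echo $direct, "\n"; // [^Array]
echo $joined, "\n"; // [^a-zA-Z0-9]
```

So without implode, the character class only whitelists the letters A, r, a and y, which is why most of the text disappears.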

Link to comment
Share on other sites

I think it's preg_quote that's messing up the ranges: it escapes the hyphen, so 'a-z' becomes 'a\-z'.

 

IMO, using a blacklist is a bad idea unless you're limiting all input to the ASCII range, and even then it would be pretty big.

 

A solution could be to keep ranges in a separate array, and not preg_quote those.
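
Roughly what that could look like, as a sketch only. The $ranges array, the $quote helper and the sample input are names and data invented for this example.

```php
<?php
// Ranges are already valid character-class syntax, so they stay unescaped.
$ranges  = array('a-z', 'A-Z', '0-9');
// Literal characters are escaped one by one so "-" and "." stay literal.
$symbols = array('-', '.', '_', '@', ' ');

// Hypothetical helper: escape for a pattern delimited by '/'.
$quote = function ($s) {
    return preg_quote($s, '/');
};

// What goes wrong if a range is quoted: the hyphen gets a backslash.
echo preg_quote('a-z'), "\n"; // a\-z

$class   = implode('', $ranges) . implode('', array_map($quote, $symbols));
$pattern = '/[^' . $class . ']/';

$result = preg_replace($pattern, '', 'user-name@example.com (test)!');
echo $result, "\n"; // user-name@example.com test
```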

Link to comment
Share on other sites

You could avoid listing them out by hand by using the range function:

$symbols = array(); //WHITELIST OF SYMBOLS

$symbols += range('a', 'z'); // Lowercase A to Z
$symbols += range('A', 'Z'); // Uppercase A to Z
$symbols += range('0', '9'); // Numbers 0 to 9

$symbols[] = '>'; // Greater Than or Close Angle Bracket
$symbols[] = '<'; // Less Than or Open Angle Bracket
...

 

Link to comment
Share on other sites

$symbols += range('a', 'z');

 

This is spot on, many many thanks! Finally 3 days of constant code hacking has ended with clean fresh code. Amen.

Link to comment
Share on other sites

I wasn't aware you could add to an array like that. (+=) That actually works? I've always used array_merge. Or is that new?

 

$symbols = array(); //WHITELIST OF SYMBOLS

$symbols += range('a', 'z'); // Lowercase A to Z
$symbols += range('A', 'Z'); // Uppercase A to Z
$symbols += range('0', '9'); // Numbers 0 to 9

$symbols[] = '>'; // Greater Than or Close Angle Bracket
$symbols[] = '<'; // Less Than or Open Angle Bracket
print_r($symbols);

When I run that code, I get:

Array ( [0] => a [1] => b [2] => c [3] => d [4] => e [5] => f [6] => g [7] => h [8] => i [9] => j [10] => k [11] => l [12] => m [13] => n [14] => o [15] => p [16] => q [17] => r [18] => s [19] => t [20] => u [21] => v [22] => w [23] => x [24] => y [25] => z [26] => > [27] => < )

Edited by Jessica
Link to comment
Share on other sites

You can use + with arrays. I always forget it's not the same as array_merge, though: it's a union by key, so any element whose key already exists in the left-hand array is skipped. array_merge is what you'd need here.

$symbols = array(); //WHITELIST OF SYMBOLS

$symbols = array_merge($symbols, range('a', 'z')); // Lowercase A to Z
$symbols = array_merge($symbols, range('A', 'Z')); // Uppercase A to Z
$symbols = array_merge($symbols, range('0', '9')); // Numbers 0 to 9

$symbols[] = '>'; // Greater Than or Close Angle Bracket
$symbols[] = '<'; // Less Than or Open Angle Bracket
print_r($symbols);
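
To make the difference easy to see side by side, here is a shortened sketch with three-letter ranges (the variable names $union and $merged are only for this example):

```php
<?php
// Union: keys that already exist on the left are kept, duplicates on the right are ignored.
$union = array();
$union += range('a', 'c'); // adds keys 0, 1, 2
$union += range('A', 'C'); // keys 0, 1, 2 already exist, so nothing is added

// array_merge: numeric keys are renumbered, so every value is appended.
$merged = array_merge(range('a', 'c'), range('A', 'C'));

echo implode('', $union), "\n";  // abc
echo implode('', $merged), "\n"; // abcABC
```

array_merge also accepts several arrays in one call, so the three separate merge lines could be collapsed into one if preferred.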

Edited by kicken
Link to comment
Share on other sites
