matthewtbaker Posted October 8, 2012

Hi,

I have created an array of symbols that I'd like to allow in a string of text.

    $symbols[] = '>';  // Greater Than or Close Angle Bracket
    $symbols[] = '<';  // Less Than or Open Angle Bracket
    $symbols[] = '/';  // Forward Slash
    $symbols[] = '\\'; // Back Slash
    $symbols[] = '&';  // Ampersand
    $symbols[] = '£';  // Pound Sterling
    $symbols[] = '$';  // Dollar
    $symbols[] = '"';  // Quotation Marks
    $symbols[] = '\''; // Apostrophe
    $symbols[] = '#';  // Hash
    $symbols[] = ' ';  // Space
    $symbols[] = '.';  // Period

I would now like to use something like str_replace or preg_replace (whichever is best) to replace any character that is not in the $symbols array with "".

I know there are other ways to filter characters, but they aren't suitable considering who will be maintaining the code... :-\

I'm hoping for something like the following, but obviously in a form that works:

    $myText = str_replace(!$symbols, "", $myText);
    $myText = preg_replace(!$symbols, "", $myText);

Thank you for your help.

https://forums.phpfreaks.com/topic/269216-replace-characters-not-listed-in-a-whitelist-array/
matthewtbaker (Author) Posted October 8, 2012 (edited)

Further to my original question, I have found a similar solution, but it does not work properly.

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols[] = 'a-z'; // Lowercase A to Z
    $symbols[] = 'A-Z'; // Uppercase A to Z
    $symbols[] = '0-9'; // Numbers 0 to 9
    $symbols[] = '>';   // Greater Than or Close Angle Bracket
    $symbols[] = '<';   // Less Than or Open Angle Bracket
    $symbols[] = '/';   // Forward Slash
    $symbols[] = '\\';  // Back Slash
    $symbols[] = '&';   // Ampersand
    $symbols[] = '£';   // Pound Sterling
    $symbols[] = '$';   // Dollar
    $symbols[] = '"';   // Quotation Marks
    $symbols[] = '\'';  // Apostrophe
    $symbols[] = '#';   // Hash
    $symbols[] = ' ';   // Space
    $symbols[] = '.';   // Period
    $symbols[] = ',';   // Comma
    $symbols[] = '_';   // Underscore
    $symbols[] = '-';   // Hyphen or Dash
    $symbols[] = '@';   // At
    $symbols[] = '%';   // Percentage
    $symbols[] = '?';   // Question Mark
    $symbols[] = '*';   // Asterisk
    $symbols[] = '(';   // Open Parenthesis
    $symbols[] = ')';   // Close Parenthesis
    $symbols[] = '+';   // Plus
    $symbols[] = '=';   // Equals
    $symbols[] = '{';   // Open Brace
    $symbols[] = '}';   // Close Brace
    $symbols[] = '[';   // Open Bracket
    $symbols[] = ']';   // Close Bracket
    $symbols[] = '~';   // Tilde

    $myText = preg_replace("/[^" . $symbols . "]^" . $symbols . "]?/iu", "", $myText);

It does not keep all of my whitelisted symbols.
ManiacDan Posted October 8, 2012

Use the second example and do:

    $myText = preg_replace("/[^" . preg_quote(implode('',$symbols), '/') . "]/i", "", $myText);
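A minimal sketch of the one-liner above in use. The short whitelist and the sample text are hypothetical, and the /u modifier is an addition (not part of the reply above) that assumes the script and its input are UTF-8, so that '£' is handled as a single character:

```php
<?php
// Hypothetical whitelist for illustration; each entry is one literal character.
$symbols = array('a', 'b', 'c', '1', '2', '3', ' ', '.', '-', ']', '\\', '£');

// implode() joins the entries into one string; preg_quote() escapes regex
// metacharacters, plus the '/' delimiter passed as its second argument.
$class = preg_quote(implode('', $symbols), '/');

// [^...] matches any single character that is NOT in the class.
// Assumption: UTF-8 input, hence the added /u modifier.
$pattern = '/[^' . $class . ']/u';

$myText  = 'abc-123 £x.y]z\\!';
$cleaned = preg_replace($pattern, '', $myText);

echo $cleaned, "\n"; // prints: abc-123 £.]\
```

Everything not on the list ('x', 'y', 'z' and '!') is stripped, while characters that are special inside a character class (']', '-', '\') survive because preg_quote() escaped them.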
matthewtbaker (Author) Posted October 8, 2012

Thanks Maniac!

I have had some problems with my:

    $symbols[] = 'a-z'; // Lowercase A to Z
    $symbols[] = 'A-Z'; // Uppercase A to Z
    $symbols[] = '0-9'; // Numbers 0 to 9

So I have had to list each letter individually. Do you know how I can just have a simple a-z, 0-9 in my array? Much appreciated.

Also, if you could explain why the implode is needed for the line to work, I'd be most grateful.

Matt
ManiacDan Posted October 8, 2012

Listing them individually is the easiest way, but at this point you're allowing most of the ASCII range. You might as well make a blacklist, or specify your ranges by hand using an ASCII chart.

If you use an array as a string, it becomes the word "Array". implode turns an array into a string so you can actually see the values.
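To illustrate the point about "Array", here is a small sketch with a hypothetical three-character whitelist. It compares the pattern built by concatenating the array directly with the one built through implode(); the @ operator is used only to silence the "Array to string conversion" notice or warning that PHP raises:

```php
<?php
// Hypothetical whitelist for illustration.
$symbols = array('a', 'b', '.');

// Concatenating the array itself inserts the literal word "Array".
// PHP raises an "Array to string conversion" notice/warning, silenced by @.
$broken = @('/[^' . $symbols . ']/');

// implode() joins the actual values; preg_quote() then escapes them.
$working = '/[^' . preg_quote(implode('', $symbols), '/') . ']/';

echo $broken, "\n";  // prints: /[^Array]/
echo $working, "\n"; // prints: /[^ab\.]/

// The broken pattern is still a valid regex, which makes the bug easy to miss:
// it keeps only the letters A, r, a and y.
echo preg_replace($broken, '', 'Hello array'), "\n"; // prints: array
echo preg_replace($working, '', 'a.b.c'), "\n";      // prints: a.b.
```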
xyph Posted October 8, 2012

I think it's preg_quote that's messing up the ranges (it escapes the hyphen as \-).

IMO, using a blacklist is a bad idea unless you're limiting all input to the ASCII range, and even then it would be pretty big.

A solution could be to keep ranges in a separate array, and not preg_quote those.
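One way the separate-array idea might look, as a sketch only. The array names $ranges and $symbols, the symbol list, and the sample text are assumptions made for illustration:

```php
<?php
// Ranges are written in character-class syntax and are NOT passed through
// preg_quote(), which would turn 'a-z' into the literal sequence 'a\-z'.
$ranges = array('a-z', 'A-Z', '0-9');

// Literal symbols (hypothetical selection) are escaped as before.
$symbols = array('.', ',', '_', '-', ' ', '/');

// Unescaped ranges first, then the escaped literals.
$class   = implode('', $ranges) . preg_quote(implode('', $symbols), '/');
$pattern = '/[^' . $class . ']/';

$myText  = 'Hello, World_42! <b>/x-y</b>';
$cleaned = preg_replace($pattern, '', $myText);

echo $cleaned, "\n"; // prints: Hello, World_42 b/x-y/b
```

Keeping the two lists apart means the maintainer can still add a symbol with a single line, while the ranges stay readable as ranges.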
matthewtbaker (Author) Posted October 8, 2012

Thanks for the information, chaps. I'm going to stick with the whitelist approach as I don't want to risk missing a character.
kicken Posted October 8, 2012

You could avoid listing them out by hand by using the range function:

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols += range('a', 'z'); // Lowercase A to Z
    $symbols += range('A', 'Z'); // Uppercase A to Z
    $symbols += range('0', '9'); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Open Angle Bracket
    $symbols[] = '<'; // Less Than or Close Angle Bracket
    ...
ManiacDan Posted October 8, 2012

xyph wrote: "I think it's preg_quote that's messing up the ranges (\-)."

It definitely is. Kicken's solution is the most correct, if OP is really just trying to whitelist 75% of the ASCII chart and disallow diacritical marks and UTF-8 characters.
xyph Posted October 8, 2012

Well, there are 256 characters if you count the extended 8-bit range, so he's nowhere near half-way.
matthewtbaker (Author) Posted October 8, 2012

    $symbols += range('a', 'z');

This is spot on, many many thanks! Finally 3 days of constant code hacking has ended with clean fresh code. Amen.
Jessica Posted October 8, 2012 (edited)

I wasn't aware you could add to an array like that (+=). Does that actually work? I've always used array_merge. Or is that new?

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols += range('a', 'z'); // Lowercase A to Z
    $symbols += range('A', 'Z'); // Uppercase A to Z
    $symbols += range('0', '9'); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Open Angle Bracket
    $symbols[] = '<'; // Less Than or Close Angle Bracket
    print_r($symbols);

When I run that code, I get:

    Array ( [0] => a [1] => b [2] => c [3] => d [4] => e [5] => f [6] => g [7] => h [8] => i [9] => j [10] => k [11] => l [12] => m [13] => n [14] => o [15] => p [16] => q [17] => r [18] => s [19] => t [20] => u [21] => v [22] => w [23] => x [24] => y [25] => z [26] => > [27] => < )
kicken Posted October 8, 2012 (edited)

You can use + with arrays. I always forget it's not the same as array_merge though; it's a union, so entries whose keys already exist in the left-hand array are skipped. array_merge is what you'd need here.

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols = array_merge($symbols, range('a', 'z')); // Lowercase A to Z
    $symbols = array_merge($symbols, range('A', 'Z')); // Uppercase A to Z
    $symbols = array_merge($symbols, range('0', '9')); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Open Angle Bracket
    $symbols[] = '<'; // Less Than or Close Angle Bracket
    print_r($symbols);
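The difference between the two operations can be seen in a short sketch. The variable names are hypothetical and the ranges are shortened to three entries each to keep the output small:

```php
<?php
$letters = range('a', 'c'); // keys 0, 1, 2
$digits  = range('0', '2'); // keys 0, 1, 2 (range() returns integers here)

// Union: keys 0..2 already exist on the left, so nothing from $digits is added.
$union = $letters + $digits;

// Merge: numeric keys are renumbered, so the values are appended.
$merged = array_merge($letters, $digits);

echo count($union), "\n";        // prints: 3
echo count($merged), "\n";       // prints: 6
echo implode('', $merged), "\n"; // prints: abc012
```

This matches the print_r output reported earlier in the thread: the uppercase letters and digits share keys with the lowercase letters, so the union silently drops them.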