matthewtbaker Posted October 8, 2012

Hi,

I have created an array of symbols that I'd like to allow in a string of text.

    $symbols[] = '>';  // Greater Than or Close Angle Bracket
    $symbols[] = '<';  // Less Than or Open Angle Bracket
    $symbols[] = '/';  // Forward Slash
    $symbols[] = '\\'; // Back Slash
    $symbols[] = '&';  // Ampersand
    $symbols[] = '£';  // Pound Sterling
    $symbols[] = '$';  // Dollar
    $symbols[] = '"';  // Quotation Marks
    $symbols[] = '\''; // Apostrophe
    $symbols[] = '#';  // Hash
    $symbols[] = ' ';  // Space
    $symbols[] = '.';  // Period

I would now like to use something like str_replace or preg_replace (whichever is best) to replace any character that is not included in the $symbols array with "". I know there are other ways to filter characters, but they aren't suitable considering who will be maintaining this... :-\

I'm hoping for something like the following, but obviously in a form that works:

    $myText = str_replace(!$symbols, "", $myText);
    $myText = preg_replace(!$symbols, "", $myText);

Thank you for your help.

Link to comment https://forums.phpfreaks.com/topic/269216-replace-characters-not-listed-in-a-whitelist-array/
matthewtbaker Posted October 8, 2012 Author

Further to my original question, I have found a similar solution, but it does not work properly.

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols[] = 'a-z'; // Lowercase A to Z
    $symbols[] = 'A-Z'; // Uppercase A to Z
    $symbols[] = '0-9'; // Numbers 0 to 9
    $symbols[] = '>';   // Greater Than or Close Angle Bracket
    $symbols[] = '<';   // Less Than or Open Angle Bracket
    $symbols[] = '/';   // Forward Slash
    $symbols[] = '\\';  // Back Slash
    $symbols[] = '&';   // Ampersand
    $symbols[] = '£';   // Pound Sterling
    $symbols[] = '$';   // Dollar
    $symbols[] = '\"';  // Quotation Marks
    $symbols[] = '\'';  // Apostrophe
    $symbols[] = '#';   // Hash
    $symbols[] = ' ';   // Space
    $symbols[] = '\.';  // Period
    $symbols[] = '\,';  // Comma
    $symbols[] = '_';   // Underscore
    $symbols[] = '-';   // Hyphen or Dash
    $symbols[] = '@';   // At
    $symbols[] = '%';   // Percentage
    $symbols[] = '?';   // Question Mark
    $symbols[] = '*';   // Asterisk
    $symbols[] = '(';   // Open Bracket
    $symbols[] = ')';   // Close Bracket
    $symbols[] = '+';   // Plus
    $symbols[] = '=';   // Equals
    $symbols[] = '{';   // Open Brace
    $symbols[] = '}';   // Close Brace
    $symbols[] = '[';   // Open Square Bracket
    $symbols[] = ']';   // Close Square Bracket
    $symbols[] = '~';   // Tilde

    $myText = preg_replace("/[^" . $symbols . "]^" . $symbols . "]?/iu", "", $myText);

It does not keep all of my whitelisted symbols.
ManiacDan Posted October 8, 2012

Use the second example and do:

    $myText = preg_replace("/[^" . preg_quote(implode('',$symbols), '/') . "]/i", "", $myText);
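As a rough illustration of the line above, here is a minimal sketch of how implode and preg_quote combine into a negated character class. The short whitelist and the sample text are made up for the demo, and it assumes the array holds only single literal characters (an entry such as 'a-z' would have its hyphen escaped by preg_quote and stop being a range).

```php
<?php
// Hypothetical whitelist of literal characters only (no 'a-z' style ranges).
$symbols = array('a', 'b', 'c', '1', '2', '3', ' ', '.', '-', '/', ']');

// implode() joins the array into one string; preg_quote() escapes regex
// metacharacters and, via its second argument, the '/' delimiter.
$class = preg_quote(implode('', $symbols), '/');

// [^...] matches any single character that is NOT in the whitelist.
$pattern = '/[^' . $class . ']/';

// Sample input, invented for this demo.
$myText = 'abc-123 / [x] <y>.';
$clean  = preg_replace($pattern, '', $myText);

echo $clean, "\n"; // abc-123 / ] .
```

The i modifier from the reply is left out here because every allowed character is listed explicitly; adding it would also let through the upper-case forms of the listed letters.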
matthewtbaker Posted October 8, 2012 Author

Thanks Maniac!

I have had some problems with my:

    $symbols[] = 'a-z'; // Lowercase A to Z
    $symbols[] = 'A-Z'; // Uppercase A to Z
    $symbols[] = '0-9'; // Numbers 0 to 9

So I have had to list each letter individually. Do you know how I can just have a simple a-z, 0-9 in my array? Much appreciated.

Also, if you could explain why the implode is needed for the line to work, I'd be most grateful.

Matt
ManiacDan Posted October 8, 2012

Listing them individually is the easiest way, but at this point you're allowing most of the ASCII range. You might as well make a blacklist, or specify your ranges by hand using an ASCII chart.

If you use an array as a string, it becomes the word "Array". Implode turns an array into a string so you can actually see the values.
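The point about "Array" can be seen in a small sketch like the one below. The three-letter whitelist is invented for the demo, and the @ operator is used only to silence the "Array to string conversion" notice or warning that PHP raises in this situation.

```php
<?php
// Hypothetical three-character whitelist.
$symbols = array('a', 'b', 'c');

// Concatenating an array directly yields the literal text "Array".
// PHP also raises an "Array to string conversion" notice/warning;
// @ suppresses it here only to keep the demo output clean.
$broken = @('/[^' . $symbols . ']/');

// implode() joins the elements into the string the pattern actually needs.
$fixed = '/[^' . implode('', $symbols) . ']/';

echo $broken, "\n"; // /[^Array]/
echo $fixed, "\n";  // /[^abc]/
```

With the broken pattern, the character class would whitelist only the letters A, r, a and y, which would explain why most of the intended symbols were being stripped.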
xyph Posted October 8, 2012

> Listing them individually is the easiest way, but at this point you're allowing most of the ASCII range. You might as well make a blacklist, or specify your ranges by hand using an ASCII chart.
>
> If you use an array as a string, it becomes the word "Array". Implode turns an array into a string so you can actually see the values.

I think it's preg_quote that's messing up the ranges (\-).

IMO, using a blacklist is a bad idea unless you're limiting all input to the ASCII range, and even then it would be pretty big.

A solution could be to keep ranges in a separate array, and not preg_quote those.
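The separate-array idea could look something like the sketch below. The names $ranges and $literals and the sample text are assumptions made for illustration, not code from the thread.

```php
<?php
// Ranges are hand-written character-class fragments and are NOT quoted,
// because preg_quote() would turn 'a-z' into 'a\-z' (three literal chars).
$ranges = array('a-z', 'A-Z', '0-9');

// Literal symbols ARE quoted, so characters such as ']' or '\' cannot
// close or otherwise alter the character class.
$literals = array('<', '>', '/', '\\', '&', '#', ' ', '.', ',', '-', ']');

$class   = implode('', $ranges) . preg_quote(implode('', $literals), '/');
$pattern = '/[^' . $class . ']/';

// Sample input, invented for this demo.
$myText = 'Tom & Jerry: <b>50%</b> off!';
$clean  = preg_replace($pattern, '', $myText);

echo $clean, "\n"; // Tom & Jerry <b>50</b> off
```

If multibyte characters such as '£' are added to the literals, the u modifier would be needed as well, assuming both the source file and the input are UTF-8.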
matthewtbaker Posted October 8, 2012 Author

Thanks for the information, chaps. I'm going to stick with the whitelist approach, as I don't want to risk missing a character.
kicken Posted October 8, 2012

You could avoid listing them out by hand by using the range function:

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols += range('a', 'z'); // Lowercase A to Z
    $symbols += range('A', 'Z'); // Uppercase A to Z
    $symbols += range('0', '9'); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Close Angle Bracket
    $symbols[] = '<'; // Less Than or Open Angle Bracket
    ...
ManiacDan Posted October 8, 2012

> I think it's preg_quote that's messing up the ranges (\-).

It definitely is. Kicken's solution is the most correct, if OP is really just trying to whitelist 75% of the ASCII chart and disallow diacritical marks and UTF-8 characters.
xyph Posted October 8, 2012

Well, there are 256 characters once you count the 8-bit extended ASCII range, so he's nowhere near half-way.
matthewtbaker Posted October 8, 2012 Author

> You could avoid listing them out by hand by using the range function:
>
>     $symbols = array(); // WHITELIST OF SYMBOLS
>     $symbols += range('a', 'z'); // Lowercase A to Z
>     $symbols += range('A', 'Z'); // Uppercase A to Z
>     $symbols += range('0', '9'); // Numbers 0 to 9
>     $symbols[] = '>'; // Greater Than or Close Angle Bracket
>     $symbols[] = '<'; // Less Than or Open Angle Bracket
>     ...

    $symbols += range('a', 'z');

This is spot on, many many thanks! Finally, 3 days of constant code hacking have ended with clean, fresh code. Amen.
Jessica Posted October 8, 2012

I wasn't aware you could add to an array like that (+=). That actually works? I've always used array_merge. Or is that new?

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols += range('a', 'z'); // Lowercase A to Z
    $symbols += range('A', 'Z'); // Uppercase A to Z
    $symbols += range('0', '9'); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Close Angle Bracket
    $symbols[] = '<'; // Less Than or Open Angle Bracket
    print_r($symbols);

When I run that code, I get:

    Array ( [0] => a [1] => b [2] => c [3] => d [4] => e [5] => f [6] => g
    [7] => h [8] => i [9] => j [10] => k [11] => l [12] => m [13] => n
    [14] => o [15] => p [16] => q [17] => r [18] => s [19] => t [20] => u
    [21] => v [22] => w [23] => x [24] => y [25] => z [26] => > [27] => < )
kicken Posted October 8, 2012

You can use + with arrays. I always forget it's not the same as array_merge though; it's a union, so elements whose keys already exist in the left-hand array are skipped. array_merge is what you'd need here.

    $symbols = array(); // WHITELIST OF SYMBOLS
    $symbols = array_merge($symbols, range('a', 'z')); // Lowercase A to Z
    $symbols = array_merge($symbols, range('A', 'Z')); // Uppercase A to Z
    $symbols = array_merge($symbols, range('0', '9')); // Numbers 0 to 9
    $symbols[] = '>'; // Greater Than or Close Angle Bracket
    $symbols[] = '<'; // Less Than or Open Angle Bracket
    print_r($symbols);
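To make the difference concrete, here is a small sketch comparing the two operations and then feeding the merged whitelist into the preg_quote/implode pattern suggested earlier in the thread. The variable names $union and $merged and the sample text are invented for the demo.

```php
<?php
// Union (+): keys 0..25 already exist after the first range,
// so the second range contributes nothing.
$union = array();
$union += range('a', 'z');
$union += range('A', 'Z');

// array_merge(): numeric keys are renumbered, so values are appended.
$merged = array_merge(range('a', 'z'), range('A', 'Z'), range('0', '9'));

echo count($union), "\n";  // 26
echo count($merged), "\n"; // 62

// Build the whitelist pattern from the merged array plus a space.
$merged[] = ' ';
$pattern  = '/[^' . preg_quote(implode('', $merged), '/') . ']/';

// Sample input, invented for this demo.
$result = preg_replace($pattern, '', 'Hello, World 2012!');
echo $result, "\n"; // Hello World 2012
```

Because every character is listed individually, nothing in the whitelist depends on a hyphen range surviving preg_quote.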
This topic is now archived and is closed to further replies.