
Figuring out censor.php


SaranacLake


So I have been reviewing a codebase that I stepped away from for a couple of years.  Surprisingly (or not), so far I understand almost all of my code.

But I have a function that **** out dirty words, and there is one section of the code that isn't making sense.  (I added var_dumps, but I'm just not seeing what it is doing.)

Below is slightly simplified code - since my real code uses the database plus a bunch of words I can't post here!

I have added comments where I'm hung up...

<?php
  // Initialize Variables.
  $patternsArray = array();
  $replacementsArray = array();
  $tempArray = array();
  $matchesArray = array();
	  
  // Un-censored text.
  $text = "Cassie, slipped on some glass and cut her ass!<br>
           'Goshdamnit! Who left this damn mess out?' she cried.<br>
           'Well, SHOOT, you should be more careful!' Mary laughed.<br>
           'Hello! Why don't you clean up after yourself!' Cassie screamed.<br>
           'Go to HELL!' Mary yelled back.<br>
           Then suddenly, blood started shooting everywhere.";

  // Create Replacements.
  $replacementsArray[0] = '***';                  // ass
  $replacementsArray[1] = '****';                 // damn
  $replacementsArray[2] = '****';                 // shoot
  $replacementsArray[3] = '****';                 // hell
  
  
  // Create Patterns.
  $patternsArray[0] = "/\b(ass)(s?)\b/i";       // Word pattern
  $patternsArray[1] = "/damn/i";                // Substring pattern
  $patternsArray[2] = "/\b(shoot)(s?)\b/i";     // Word pattern
  $patternsArray[3] = "/\b(hell)(s?)\b/i";      // Word pattern

  // Sort Arrays.
//  ksort($patternsArray);
//  ksort($replacementsArray);

  // What does this do????
  foreach ($patternsArray as $pattern){
    preg_match_all($pattern, $text, $tempArray, PREG_PATTERN_ORDER);            // (1)
//    preg_match_all($pattern, $text, $matchesArray, PREG_PATTERN_ORDER);       // (2)
    $matchesArray = array_merge($matchesArray, $tempArray[0]);                  // (3)
    /* If I comment (1) and (3) and uncomment (2),
     * then var_dump($matchesArray) displays nothing.
     * Not really sure what this block of code does...
     */
  }

  // Clean text.
  $cleanText = preg_replace($patternsArray, $replacementsArray, $text, $limit=-1, $allCount);

  ksort($matchesArray);
  
  echo "<b>DIRTY TEXT:</b><br>
        $text";
  
  var_dump($patternsArray);
  var_dump($replacementsArray);
  var_dump($tempArray);
  var_dump($matchesArray);
  
  echo "<b>CLEAN TEXT:</b><br>
        $cleanText";

?>
	

 

In the test I run locally - again I couldn't really post that here - my function seems to work, but I don't see how the parts I commented above work/help get the end results I want.

The goal of reading all of my code line-by-line is to UNDERSTAND it, so that when I go to code my final module, I know my head is in the game.

Thanks.

 

Edited by SaranacLake
Link to comment
Share on other sites

Is it just me or did you add more code and then try to understand what that code does? Your question seems to be about what's happening with the foreach loop, but that loop does absolutely nothing useful. It doesn't need to be there at all. The only parts that matter here are (1) the pattern and replacement arrays and (2) the preg_replace().

Link to comment
Share on other sites

20 minutes ago, requinix said:

Is it just me or did you add more code and then try to understand what that code does? Your question seems to be about what's happening with the foreach loop, but that loop does absolutely nothing useful. It doesn't need to be there at all. The only parts that matter here are (1) the pattern and replacement arrays and (2) the preg_replace().

I don't follow your observation.

I used a foreach loop to loop through my $patternsArray - which contains the regex used to match a given bad word - and then inside the loop I use that given bad word regex in...

	    preg_match_all($pattern, $text, $tempArray, PREG_PATTERN_ORDER);            // (1)
    $matchesArray = array_merge($matchesArray, $tempArray[0]);                  // (3)
	

...to create the $matchesArray which I use to log each bad word a user used and I log that in my database.  (That code is not listed above.)

 

So this line of code actually *** out the bad words...

	    // Clean text.
    $cleanText = preg_replace($patternsArray, $replacementsArray, $text, $limit=-1, $allCount);
	

 

But I was asking about the code under "  // What does this do????" because I do not understand why I was saving the bad words in $tempArray and then using array_merge to get things into $matchesArray.

 

Follow me now?

 

P.S.  I wrote this function in 2015, and while I unit-tested all of my code and it was working, I don't understand what I was doing above, and wonder if there is a better way?

Edited by SaranacLake
Link to comment
Share on other sites

13 minutes ago, SaranacLake said:

But I was asking about the code under "  // What does this do????" because I do not understand why I was saving the bad words in $tempArray and then using array_merge to get things into $matchesArray.

I don't know why you did that either. It doesn't accomplish anything. $tempArray and $matchesArray don't do anything, the code using them doesn't do anything, and all of that is pointless.

Maybe it was leftover debugging code?

Link to comment
Share on other sites

Here is an updated file that will show where I'm getting confused...

	<?php
  // Initialize Variables.
  $patternsArray = array();
  $replacementsArray = array();
  $tempArray = array();
  $matchesArray = array();
	  
  // Un-censored text.
  $text = "Cassie, slipped on some glass and cut her ass!<br>
           'Goshdamnit! Who left this damn mess out?' she cried.<br>
           'Well, SHOOT, you should be more careful!' Mary laughed.<br>
           'Hello! Why don't you clean up after yourself!' Cassie screamed.<br>
           'Go to HELL!' Mary yelled back.<br>
           Then suddenly, blood started shooting everywhere.XXX";
	    
  // Create Patterns.
  $patternsArray[0] = "/\b(ass)(s?)\b/i";       // Word pattern
  $patternsArray[1] = "/damn/i";                // Substring pattern
  $patternsArray[2] = "/\b(shoot)(s?)\b/i";     // Word pattern
  $patternsArray[3] = "/\b(hell)(s?)\b/i";      // Word pattern
//  $patternsArray[4] = "/XXX/i";                 // Substring pattern

  // Create Replacements.
  $replacementsArray[0] = '***';                  // ass
  $replacementsArray[1] = '****';                 // damn
  $replacementsArray[2] = '****';                 // shoot
  $replacementsArray[3] = '****';                 // hell
//  $replacementsArray[4] = '***';                  // XXX

  // Sort Arrays.
  ksort($patternsArray);
  ksort($replacementsArray);

  // What does this do????
  foreach ($patternsArray as $pattern){
    preg_match_all($pattern, $text, $tempArray, PREG_PATTERN_ORDER);            // (1)
//    preg_match_all($pattern, $text, $matchesArray, PREG_PATTERN_ORDER);       // (2)
    $matchesArray = array_merge($matchesArray, $tempArray[0]);                  // (3)
    /* If I comment (1) and (3) and uncomment (2),
     * then var_dump($matchesArray) displays nothing.
     * Not really sure what this block of code does...
     */
  }

  // Clean text.
  $cleanText = preg_replace($patternsArray, $replacementsArray, $text, $limit=-1, $allCount);
  
  echo "<b>DIRTY TEXT:</b><br>
        $text";
  
  var_dump($patternsArray);
  var_dump($replacementsArray);
  var_dump($tempArray);
  var_dump($matchesArray);
  
  echo "<b>CLEAN TEXT:</b><br>
        $cleanText";
	?>
	

 

Run that code as is, and for var_dump($tempArray) you will see...

	array 0 => array 0 => 'HELL'
	array 1 => array 0 => 'HELL'
	

 

Uncomment these two lines...

	//  $patternsArray[4] = "/XXX/i";                 // Substring pattern
 
	 
	//  $replacementsArray[4] = '***';                  // XXX
	

 

And then for var_dump($tempArray) you will see...

	array 0 => array 0 => 'XXX'
	

 

In either case, the $matchesArray seems to yield the results I really care about.

 

So why is $tempArray changing values but it doesn't seem to impact $matchesArray?

And, therefore, what does this line of code really do?

	    preg_match_all($pattern, $text, $tempArray, PREG_PATTERN_ORDER);            // (1)
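For what it's worth, the behaviour described above is consistent with preg_match_all() replacing the contents of its third argument on every call. A minimal sketch, assuming a shortened sample text and pattern list (both hypothetical, not the real word list): $tempArray only ever holds the result for the pattern that ran last, while array_merge() copies each pattern's full matches into $matchesArray before they are overwritten.

```php
<?php
// Hypothetical sample data, not the original post's text or word list.
$text     = 'What the HELL? Damn, that was close.';
$patterns = array('/\b(hell)(s?)\b/i', '/damn/i', '/XXX/i');

$tempArray    = array();
$matchesArray = array();

foreach ($patterns as $pattern) {
    // preg_match_all() REPLACES $tempArray on every call.
    // With PREG_PATTERN_ORDER, $tempArray[0] lists the full matches and
    // $tempArray[1], $tempArray[2], ... list what each (group) captured.
    preg_match_all($pattern, $text, $tempArray, PREG_PATTERN_ORDER);

    // Copy this pattern's full matches into the running list before the
    // next iteration overwrites $tempArray.
    $matchesArray = array_merge($matchesArray, $tempArray[0]);
}

var_dump($tempArray);     // only the result for the last pattern, '/XXX/i'
var_dump($matchesArray);  // every full match from every pattern
```

That would explain why adding or removing the last pattern changes what var_dump($tempArray) shows without changing what ends up in $matchesArray.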
	

 

Link to comment
Share on other sites

7 minutes ago, requinix said:

I don't know why you did that either. It doesn't accomplish anything. $tempArray and $matchesArray don't do anything, the code using them doesn't do anything, and all of that is pointless.

Maybe it was leftover debugging code?

I just explained why I am creating $matchesArray...

I use that to gather the "bad words" from a given user post and log them into my database under their user ID.

I should have added the comment //What does this do?

I know what the loop does, what I don't understand is this...

 

    /* If I comment (1) and (3) and uncomment (2),
     * then var_dump($matchesArray) displays nothing.
     * Not really sure why this code behaves as it does...
     */
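
One plausible explanation, sketched under the assumption that the last pattern in the real list finds nothing in the text: with variant (2), $matchesArray is itself the output argument, so each call discards what the previous pattern found and only the final pattern's (possibly empty) result survives the loop. The text and patterns below are hypothetical.

```php
<?php
// Hypothetical sample data; the last pattern deliberately matches nothing.
$text     = 'Go to HELL! Damn it.';
$patterns = array('/\b(hell)(s?)\b/i', '/damn/i', '/XXX/i');

$matchesArray = array();

foreach ($patterns as $pattern) {
    // Variant (2): $matchesArray is the output argument, so every call
    // throws away the matches collected for the previous pattern.
    preg_match_all($pattern, $text, $matchesArray, PREG_PATTERN_ORDER);
}

// Only the result for '/XXX/i' is left: one empty list of full matches.
var_dump($matchesArray);
```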

Link to comment
Share on other sites

Do you really want an explanation? Or are you just trying to figure out why it's not working the way you want?

Because if you just want to make this work, there is a much easier way of doing the whole thing: use preg_replace_callback() to run your own code every time it finds a match. Also means you don't need the replacement array since you're just replacing the word with a suitable number of asterisks, and str_repeat() can create them easily enough.
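
A minimal sketch of that suggestion, built around the lines quoted later in this thread (the closure signature, list($word) and $matchesArray[]). The sample text and the two patterns are assumptions for illustration; this is not necessarily the code as originally posted.

```php
<?php
// Hypothetical sample text and a shortened, hypothetical pattern list.
$text     = "'Goshdamnit! Who left this damn mess out?' 'Go to HELL!'";
$patterns = array('/\b(hell)(s?)\b/i', '/damn/i');

$matchesArray = array();   // every bad word found, kept for logging later

$cleanText = preg_replace_callback(
    $patterns,
    function ($matches) use (&$matchesArray) {
        // $matches[0] is the full text of the current match.
        list($word) = $matches;
        $matchesArray[] = $word;

        // Replace the word with the same number of asterisks.
        return str_repeat('*', strlen($word));
    },
    $text
);

echo $cleanText, "\n";
var_dump($matchesArray);
```

Each pattern is applied to the whole text in turn, so the words are collected pattern by pattern rather than strictly in reading order.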

Link to comment
Share on other sites

26 minutes ago, requinix said:

Do you really want an explanation? Or are you just trying to figure out why it's not working the way you want?

Yes, I am trying to understand how my code works...

 

26 minutes ago, requinix said:

Because if you just want to make this work, there is a much easier way of doing the whole thing: use preg_replace_callback() to run your own code every time it finds a match. Also means you don't need the replacement array since you're just replacing the word with a suitable number of asterisks, and str_repeat() can create them easily enough.

Thanks for the code.  Have been looking at it, but am not entirely following it even after looking things up at php.net...

1.) What is $matches?  And where does it come from?

 

2.) What is "use"?

 

3.) Please explain what this is doing...

	function($matches) use (&$matchesArray) {
	

 

4.) I am guessing that $matches is an array.  But what is this doing?

	list($word) = $matches;
	

I am used to seeing list used like this...

	$info = array('coffee', 'brown', 'caffeine');
	list($drink, $color, $power) = $info;
	

 

5.) Am not getting this...

	        list($word) = $matches;
        $matchesArray[] = $word;
	

 

Link to comment
Share on other sites

P.S.  I'm always interested in learning better ways to do things, but I'm not sure your code will help, because I need to capture each bad word and ultimately write it into a table, which is why I created a $matchesArray.

It doesn't look like your code would allow me to grab a bad word at a time and then insert it into a table, but I could be wrong.

(Fwiw, my code does work, I just don't understand how the couple of lines mentioned above actually work.)
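
For what it's worth, an array of matched words can still be logged one word at a time, however it was collected. A rough sketch, assuming a hypothetical logBadWord() helper and an in-memory array standing in for the real table (a real version would run an INSERT instead):

```php
<?php
// Hypothetical stand-in for the real database table.
$badWordLog = array();

// Hypothetical helper; a real version would run an INSERT statement.
function logBadWord(array &$log, $userId, $word)
{
    $log[] = array('user_id' => $userId, 'word' => strtolower($word));
}

$userId       = 42;                             // hypothetical user ID
$matchesArray = array('HELL', 'damn', 'damn');  // hypothetical collected words

// One "row" per bad word, in the order the words were collected.
foreach ($matchesArray as $word) {
    logBadWord($badWordLog, $userId, $word);
}

var_dump($badWordLog);
```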

Link to comment
Share on other sites

1 hour ago, SaranacLake said:

1.) What is $matches?  And where does it come from?

Read the documentation. The answer is there. If you looked once, look again.
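
As a rough illustration of what the documentation describes: preg_replace_callback() calls the function once per match and passes that match in as the argument. The input sentence and the $seen variable are hypothetical.

```php
<?php
$seen = array();   // hypothetical: records what the callback receives

preg_replace_callback(
    '/\b(shoot)(s?)\b/i',
    function ($matches) use (&$seen) {
        // Called once per match. $matches[0] is the full match,
        // $matches[1] the first (group), $matches[2] the second.
        $seen[] = $matches;
        return $matches[0];   // put the matched text back unchanged
    },
    'Well, SHOOT! He shoots, he scores.'
);

var_dump($seen);
```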

 

1 hour ago, SaranacLake said:

2.) What is "use"?

Also in the documentation. Find the section that talks about user-defined functions.

 

1 hour ago, SaranacLake said:

3.) Please explain what this is doing...


	function($matches) use (&$matchesArray) {
	

This is the same question as 1 and 2.
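
A small sketch of the difference the & makes, using hypothetical variable names: "use" imports a variable from the surrounding scope into an anonymous function, by value unless it is prefixed with &.

```php
<?php
$byValue     = array();
$byReference = array();

// Without &, the closure works on its own copy of the variable.
$addCopy = function ($word) use ($byValue) {
    $byValue[] = $word;          // changes the private copy only
};

// With &, the closure works on the outer variable itself.
$addShared = function ($word) use (&$byReference) {
    $byReference[] = $word;
};

$addCopy('damn');
$addShared('damn');

var_dump($byValue);      // still empty
var_dump($byReference);  // contains 'damn'
```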

 

1 hour ago, SaranacLake said:

4.) I am guessing that $matches is an array.  But what is this doing?


	list($word) = $matches;
	

list() has documentation.

 

1 hour ago, SaranacLake said:

I am used to seeing list used like this...


	$info = array('coffee', 'brown', 'caffeine');
	list($drink, $color, $power) = $info;
	

And my code is doing the same thing.
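
To make the comparison concrete, a short sketch with hypothetical values shaped like one regex match: list() fills its variables from index 0 upward and ignores any elements it has no variable for.

```php
<?php
// Hypothetical values: full match, first group, second group.
$matches = array('shoots', 'shoot', 's');

list($word) = $matches;                  // takes index 0 only
list($full, $stem, $suffix) = $matches;  // takes all three

var_dump($word);    // the full match
var_dump($stem);    // the first group
```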

 

1 hour ago, SaranacLake said:

5.) Am not getting this...


	        list($word) = $matches;
        $matchesArray[] = $word;
	

 

Read the documentation for list() to know what the first line does. Read the documentation for arrays to know what the second line does.
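
In short, and with hypothetical starting contents: assigning to an array with empty square brackets appends the value at the next integer key.

```php
<?php
$matchesArray = array('HELL');   // hypothetical starting contents

$word = 'damn';
$matchesArray[] = $word;         // empty brackets: append to the end

var_dump($matchesArray);
```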

 

Frankly, I find it hard to believe you have these kinds of questions after more than 600 posts on this forum. They are very basic features of PHP, and even if you don't recognize some of them, you should know how to find out what they are without having someone tell you to read the documentation 5 times.
And I suspect that after you do learn about how my code works, you'll also have found your answers to how your code worked.

Link to comment
Share on other sites
