Why Are These Functions Causing MASSIVE Memory Problems? Please Help!


steven fullman

Hi,

I have a script with some options.

I use regex to replace patterns in strings, but I seem to be using it incorrectly, because the script very quickly exceeds PHP's memory_limit (by several orders of magnitude).

This is strange, because I'm dealing with maybe 10 simultaneous strings of 500 words max.

I'm clearly causing some kind of runaway recursion, but I can't see how...

Any help you could give me to program this better (or tell me where I'm going wrong) would be most appreciated :)

 



$string = "this would be about 500 words long";
$parts = $string; // $parts would normally be a substring of $string;

wp_wordmash($parts);
wp_synonymize($string);
wp_keyword2url($string);

//html stuff follows here...


function wp_wordmash($parts) {

$wordlist = file_get_contents('dictionary.txt', true);
$dictionary = explode(",", $wordlist);
$htmldictionary = array();	
foreach($dictionary as $dicword) {
	$htmldictionary[] = wp_htmlcode($dicword);
	$htmldictionary_u[] = wp_htmlcode(strtoupper($dicword));
	$htmldictionary_u1[] = wp_htmlcode(ucfirst($dicword));
	$htmldictionary_ucwords[] = wp_htmlcode(ucwords($dicword));
} 	
for($i=0;$i<count($dictionary);$i++){

	$parts = preg_replace("/\b$dictionary[$i]\b/", $htmldictionary[$i], $parts);
	$parts = preg_replace("/\b" . strtoupper($dictionary[$i]) . "\b/", $htmldictionary_u[$i], $parts);
	$parts = preg_replace("/\b" . ucfirst($dictionary[$i]) . "\b/", $htmldictionary_u1[$i], $parts);
	$parts = preg_replace("/\b" . ucwords($dictionary[$i]) . "\b/", $htmldictionary_ucwords[$i], $parts);
}	
return $parts;
}

function wp_htmlcode($string) { 

$buffer= NULL;	
for($i=0;$i<strlen($string);$i++) { 
	$buffer .= "&#" . ord($string{$i}) . ";"; 
} 	
return $buffer;
}

function wp_synonymize($string){

$buffer=$string;
$synonymfile = file_get_contents('synonyms.txt', true);
$synonyms = explode("\n", $synonymfile);
for($i=0;$i<count($synonyms);$i++){
	$synonymlist = explode(",", $synonyms[$i]);
	$oldword = $synonymlist[0];
	$synonym = $synonymlist[1];
	$synonym = str_replace("\r", '', $synonym);
	$buffer = preg_replace("/\b$oldword\b/", $synonym, $buffer);
	$buffer = preg_replace("/\b" . strtoupper($oldword) . "\b/", strtoupper($synonym), $buffer);
	$buffer = preg_replace("/\b" . ucfirst($oldword) . "\b/", ucfirst($synonym), $buffer);
	$buffer = preg_replace("/\b" . ucwords($oldword) . "\b/", ucwords($synonym), $buffer);
	}
return $buffer;
}

function wp_keyword2url($string){

$buffer=$string;
$keyword2urlfile = file_get_contents('keyword2url.txt', true);
$keywords = explode("\n", $keyword2urlfile);
for($i=0;$i<count($keywords);$i++){
	$keywordlist = explode(",", $keywords[$i]);
	$keyword = $keywordlist[0];
	$url = $keywordlist[1];
	$url = str_replace("\r", '', $url);
	$buffer = preg_replace("/\b$keyword\b/", '<a href = "' . $url . '">' . $keyword . '</a>', $buffer);
	$buffer = preg_replace("/\b" . strtoupper($keyword) . "\b/", '<a href = "' . $url . '">' . strtoupper($keyword) . '</a>', $buffer);
	$buffer = preg_replace("/\b" . ucfirst($keyword) . "\b/", '<a href = "' . $url . '">' . ucfirst($keyword) . '</a>', $buffer);
	$buffer = preg_replace("/\b" . ucwords($keyword) . "\b/", '<a href = "' . $url . '">' . ucwords($keyword) . '</a>', $buffer);
	}
return $buffer;
}

 

As I say, the string passed to these functions is typically < 500 words.

 

I've also included the comparison files (dictionary.txt, synonyms.txt and keyword2URL.txt)...HERE

 

 

I hope you can help...I'm 99% certain I'm using preg_replace() wrong...because if I substitute it with str_replace() then my memory issues disappear.

 

Problem is, I like preg_replace() because it gives me word-boundary (\b) matching.

 

I'm just obviously doing it wrong!

 

Any thoughts?

 

Kind regards,

Steve

 

P.S. Please feel free to mock & laugh at me...as long as you can show me a better way!  :D

 

And if you need any more info, please ask

 

 


The preg_* functions use more resources and should be used sparingly; explode() and str_replace() are better alternatives when looping over data.

I'm assuming the dictionary has over 500 words in it, so:

 

 for($i=0;$i<count($dictionary);$i++){
      
      $parts = preg_replace("/\b$dictionary[$i]\b/", $htmldictionary[$i], $parts);
      $parts = preg_replace("/\b" . strtoupper($dictionary[$i]) . "\b/", $htmldictionary_u[$i], $parts);
      $parts = preg_replace("/\b" . ucfirst($dictionary[$i]) . "\b/", $htmldictionary_u1[$i], $parts);
      $parts = preg_replace("/\b" . ucwords($dictionary[$i]) . "\b/", $htmldictionary_ucwords[$i], $parts);
   } 

 

runs four preg_replace() calls for every word in the dictionary. Change them all to str_replace().
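Before dropping word boundaries entirely, it may be worth checking what the list files actually contain. The thread never confirms this, so treat it as one possible cause only: if dictionary.txt, synonyms.txt or keyword2url.txt has a blank line or a trailing comma, the search word is an empty string, and the pattern becomes "/\b\b/", which matches at every word boundary in the text. In wp_keyword2url() that would insert a link at each boundary, and the inserted markup adds new boundaries for the next pass. The sketch below assumes "old,new" lines as in the posted code; wp_load_pairs() is a hypothetical helper, not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): parse "old,new" lines,
// skipping blank or malformed rows instead of turning them into patterns.
function wp_load_pairs(string $contents): array
{
    $pairs = [];
    // \R matches \n, \r\n and \r, so no separate "\r" clean-up is needed.
    foreach (preg_split('/\R/', $contents) as $line) {
        $fields = explode(',', trim($line), 2);
        if (count($fields) < 2 || trim($fields[0]) === '') {
            continue; // blank line or missing column: skip it
        }
        $pairs[trim($fields[0])] = trim($fields[1]);
    }
    return $pairs;
}

// What an empty search word does: the pattern matches at every word boundary.
$demo = preg_replace('/\b' . '' . '\b/', '#', 'two words');
// $demo is now "#two# #words#"

// Inline sample data stands in for file_get_contents('synonyms.txt').
$pairs = wp_load_pairs("big,large\r\n\r\nfast,quick\r\n");
$text = 'A big, fast car (bigger is not matched).';
foreach ($pairs as $old => $new) {
    // preg_quote() escapes regex metacharacters and the "/" delimiter.
    $text = preg_replace('/\b' . preg_quote($old, '/') . '\b/', $new, $text);
}
echo $text, PHP_EOL;
```

The blank line in the sample data is skipped rather than compiled into a pattern, and preg_quote() keeps entries such as "c++" or "a/b" from being read as regex syntax.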


Thanks dreamwest,

 

One of the reasons I'm using preg is that I can specify word boundaries...str_replace seems too limited in that way (i.e. I want to match the EXACT word only, not the same string inside other words).

I've tried using spaces to delimit the exact pattern, but that falls over at the beginning and end of sentences...and next to commas, etc...

 

Is there a way around these limitations?
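One way to keep word boundaries while avoiding four preg_replace() calls per list entry is to build a single alternation pattern and let a callback look up the replacement. This is a minimal sketch, not something proposed in the thread: wp_replace_words() and its $map argument (lowercase search word => replacement) are hypothetical names, and it assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper (not from the thread): replace whole words in one pass.
// $map keys are lowercase search words; values are their replacements.
function wp_replace_words(string $text, array $map): string
{
    if ($map === []) {
        return $text; // an empty alternation would match everywhere
    }
    // Quote each word so regex metacharacters are treated literally.
    $alternatives = array_map(
        fn ($word) => preg_quote((string) $word, '/'),
        array_keys($map)
    );
    // One pattern: \b(?:word1|word2|...)\b, matched case-insensitively.
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';

    $result = preg_replace_callback($pattern, function (array $m) use ($map) {
        $found = $m[0];
        $new = $map[strtolower($found)] ?? $found;
        if ($found === strtoupper($found)) {
            return strtoupper($new); // ALL CAPS in, ALL CAPS out
        }
        if ($found === ucfirst($found)) {
            return ucfirst($new);    // Capitalised in, capitalised out
        }
        return $new;
    }, $text);

    // preg_replace_callback() returns null on a PCRE error; fall back to input.
    return $result ?? $text;
}

echo wp_replace_words('Big cars, BIG bills, bigger big.', ['big' => 'large']), PHP_EOL;
```

The text is scanned once regardless of how many words are in the list, and "bigger" is left alone because \b requires a boundary after "big". Two caveats: if one entry is a prefix of another multi-word entry (say "big" and "big deal"), the order of the alternatives decides which one wins, and a very large word list may need to be split into several patterns.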

 

