
Hi, the following code replaces every keyword with a linked keyword. However, I would like to use a regex to avoid touching keywords when they fall within links.

ex: if the keyword is orange

<br>orange</br> would become

<br><a href = "somelink">orange</a></br>

However, I'd like to avoid the following.

original string:

<a href = "http://somedomain.com/orange">yadayada</a>

string after preg_replace:

<a href = "http://somedomain.com/<a href = "somelink">orange</a>">yadayada</a>

code:

global $mainframe;

$content = $article->text;
$current = JURI::current();

// Comma-separated keywords and their link targets, read from the plugin parameters.
$keywords = $this->params->def('keywords');
$words = explode(",", $keywords);

$keywordLinks = $this->params->def('links');
$Links = explode(",", $keywordLinks);

$number = count($words);

// Build the replacement anchor for each keyword.
for ($i = 0; $i < $number; $i++)
{
    $Links[$i] = '<a href = "../plugins/content/keyword.php?destination='.urlencode($Links[$i]).'&location='.urlencode($current).'&keyword='.$words[$i].'">'.$words[$i].'</a>';
}

// Turn each keyword into a whole-word pattern.
for ($i = 0; $i < $number; $i++)
{
    $words[$i] = '/\b'.$words[$i].'\b/';
    /*$words[$i] = str_replace(' ', '', $words[$i]);*/
}

$content = preg_replace($words, $Links, $content);

$article->text = $content;

Actually, I think it's the slash in the URLs which is throwing the current pattern off.

reviews/payroll-relief-from-accountantsworld.html

payroll would be replaced in the above string; maybe it's due to the slash preceding the keyword (payroll)?
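
For anyone who wants to check this in isolation, here is a minimal sketch that uses only the sample path and keyword from this post. As far as PCRE is concerned, \b matches at any transition between a word character (letter, digit, underscore) and a non-word character, so both the slash and the hyphen count as boundaries and the keyword is seen as a whole word. The variable names are just for illustration.

```php
<?php
// Sample path and keyword taken from the post above.
$path    = 'reviews/payroll-relief-from-accountantsworld.html';
$pattern = '/\bpayroll\b/';

// preg_match() returns 1 on a match and 0 on no match.
$matched = preg_match($pattern, $path, $m, PREG_OFFSET_CAPTURE);

// "/" and "-" are non-word characters, so \b succeeds on both sides.
echo $matched ? "matched at offset {$m[0][1]}\n" : "no match\n";

// By contrast, \b does not match in the middle of a longer word.
$insideWord = preg_match($pattern, 'reviews/payrolls.html');
echo $insideWord ? "matched\n" : "no match\n";
```

So the slash does not break the pattern; it simply does not protect the keyword either, since \b knows nothing about HTML attributes.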

Nothing to do with the slash. But I got the solution for you. I've utilized a negative lookahead:

 

for ($i = 0; $i < $number; $i++)
{
    $words[$i] = '/' . preg_quote($words[$i], '/') . '(?![^<]*?>)/';
    /*$words[$i] = str_replace(' ', '', $words[$i]);*/
}

It searches for the keyword and, when found, looks at the characters that follow. If a > can be reached without passing a < first, the lookahead fails and the keyword is skipped, because it sits inside an HTML tag (i.e. between < and >). If a < comes first, or there is no > at all, the pattern matches and the keyword is replaced.

 

Basically, keywords found between a set of <> aren't replaced. Hope that's what you're looking for :)
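
Here is a small self-contained sketch of the same idea, run against the two example strings from the first post. Note that linkKeyword() is a hypothetical helper made up for this illustration, not part of Joomla or the original plugin, and it assumes PHP 7 or later. It keeps the \b anchors from the original code, which the pattern above leaves out, so that "orange" is not linked inside "oranges".

```php
<?php
// Hypothetical helper (not from the thread): link a keyword unless it sits inside a tag.
function linkKeyword(string $html, string $keyword, string $url): string
{
    // \b keeps whole-word matching; (?![^<]*>) rejects a match that is
    // followed by a ">" with no "<" in between, i.e. text inside a tag.
    $pattern = '/\b' . preg_quote($keyword, '/') . '\b(?![^<]*>)/';
    $link = '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($keyword) . '</a>';

    // A callback avoids "$1"-style references being interpreted in the replacement.
    return preg_replace_callback($pattern, function (array $matches) use ($link) {
        return $link;
    }, $html);
}

$plain = '<br>orange</br>';
$inTag = '<a href = "http://somedomain.com/orange">yadayada</a>';

echo linkKeyword($plain, 'orange', 'somelink'), "\n";
// <br><a href="somelink">orange</a></br>

echo linkKeyword($inTag, 'orange', 'somelink'), "\n";
// unchanged: the keyword is inside the href attribute
```

One caveat: the lookahead only protects text inside a tag. A keyword that appears as the visible text of an existing link, e.g. <a href="x">orange</a>, is still followed by a < before any >, so it would be wrapped in a second link.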

 

 
