
[SOLVED] Auto link first instance of keywords


imekul


Hi all!

 

Here's what I'm trying to do.  I have a blog (www.famteam.com/today/), and I want to add automatic hyperlinks to different names.  Say I mention the name "Luke" in the blog.  I want that to be automatically changed to [iurl=#]Luke[/iurl].  But I only want this to happen for the first instance of the keywords, regardless of where in the page the keyword first pops up.

 

Now, here's my current setup:

$look_for is an array with the words to search for (names, such as Luke, Mark, et cetera)

$replace_with is another array, with the corresponding replacement names (that is, a hyperlinked "Luke," hyperlinked "Mark," etc)

$mainbody is the text of each post in the blog.

 

Then I use this:  $mainbody = preg_replace($look_for, $replace_with, $mainbody, 1);

 

This works perfectly.  The problem is, I pull 50 posts from the database, and so the preg_replace function is getting run 50 times.  As a result of that, obviously, it's re-replacing the keywords in each post.  Basically, what I would like to happen is for the autolinks to only occur once per keyword for the entire page.  The method I have now has them occur once per post.

 

Does this make sense?  Basically, I just want each keyword to only be linked one time, in its first instance, throughout the entire page. 
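To make the problem concrete, here is a minimal sketch of the per-post behavior. The patterns, the "#" URLs, and the two posts are placeholders made up for the illustration, not the real blog data.

```php
<?php
// Hypothetical sample data; patterns, URLs, and posts are placeholders.
$look_for     = array('/\bMark\b/', '/\bLuke\b/');
$replace_with = array('<a href="#">Mark</a>', '<a href="#">Luke</a>');
$posts = array(
    "Mark's picks were closer than Luke's.",
    "Sorry, Mark, I have to side with Luke.",
);

$linked = array();
foreach ($posts as $mainbody) {
    // The limit of 1 is counted per call, so it starts over for every post.
    $linked[] = preg_replace($look_for, $replace_with, $mainbody, 1);
}

// Each post ends up with its own linked "Mark" and "Luke".
print_r($linked);
```

Both posts come back with two links each, which is the once-per-post result described above rather than once per page.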

 

Thanks for any help you can provide! :)

Link to comment
Share on other sites

Ah, good idea. :)

 

Basically, say I have these two following sentences (each being pulled from a MySQL table):

 

#1 - Well, Mark's Super Bowl picks were definitely closer than Luke's.

#2 - Sorry, Mark, I can't go along with you on that one.  I'm going to have to side with Luke.  Nathan, on the other hand, has a really interesting prediction

 

Imagine each is a separate post in a blog (to see what I'm talking about, go to www.famteam.com/today/).  Anyway, when it's all said and done, I want those to be formatted to look like:

 

#1 - Well, [iurl=#]Mark[/iurl]'s Super Bowl picks were definitely closer than [iurl=#]Luke[/iurl]'s.

#2 - Sorry, Mark, I can't go along with you on that one.  I'm going to have to side with Luke.  [iurl=#]Nathan[/iurl], on the other hand, has a really interesting prediction.

 

The first instance of each keyword would get a link, and then the rest of the times it comes up, it's just plain text.

 

Hope this makes it a little clearer as to what I'm trying to accomplish!

Link to comment
Share on other sites

ahhh, I think i gotcha

 

<?php
$posts=array(
  "Well, Mark's Super Bowl picks were definitely closer than Luke's.",
  "Sorry, Mark, I can't go along with you on that one.  I'm going to have to side with Luke.  Nathan, on the other hand, has a really interesting prediction"
);

$wordlinks=array(
"Mark" => "http://www.mark.com",
"Luke" => "http://www.luke.com",
"Nathan" => "http://www.nathan.com"
);

function DoWordLink($match)
{
global $wordlinks;

$rpl=$match[1];
if(isset($wordlinks[$rpl]))
{
	$rpl="<A HREF=$wordlinks[$rpl]>$rpl</A>";
	unset($wordlinks[$match[1]]);
}
return $rpl;
}

foreach($wordlinks as $key => $val)
$wl[]=$key;
$pm="/((?<=[\s^])(?:" . implode('|',$wl) .")(?=[\.\!\?\,\'\s$]?))/m";

header("Content-type: text/plain");


echo "$pm\n\n";
$ctr=0;
foreach($posts as $key => $mainbody)
{
echo ++$ctr . ") $mainbody\n";
$mainbody=preg_replace_callback($pm, 'DoWordLink', $mainbody)  ;
echo ++$ctr . ") $mainbody\n";
}
?>	

 

came back with

/((?<=[\s^])(?:Mark|Luke|Nathan)(?=[\.\!\?\,\'\s$]?))/m

 

1) Well, Mark's Super Bowl picks were definitely closer than Luke's.

2) Well, <A HREF=http://www.mark.com>Mark</A>'s Super Bowl picks were definitely closer than <A HREF=http://www.luke.com>Luke</A>'s.

3) Sorry, Mark, I can't go along with you on that one.  I'm going to have to side with Luke.  Nathan, on the other hand, has a really interesting prediction

4) Sorry, Mark, I can't go along with you on that one.  I'm going to have to side with Luke.  <A HREF=http://www.nathan.com>Nathan</A>, on the other hand, has a really interesting prediction

 

The first line of output is just there for checking the regex pattern, but it looks like it works :)

Question is, do you understand how this works?

 

 

Link to comment
Share on other sites

Hey, I think that works! :) Wow -- thanks.

 

Now, how would I add more posts to the array?  Of course, when it's in use, I'll use a while loop, and each time, the body of the post (the text to be searched) will be $mainbody.  So can I just replace $posts with $mainbody?  Or do I need to do something else in order to have it work with each individual post, or...? 

 

Also, just to experiment, I tried adding a third value, a third sentence, to the $posts array, but that one didn't seem to be searched like the other two were.

 

I could very well be missing something obvious, and, to be honest, I don't understand most of the code you posted.  I understand the basics of the preg_replace, but I'm afraid trying to read through your function is just making my head swim a little bit... ;) I'd love to learn this better, though; it looks like there's a tremendous amount of potential with this type of thing.

 

Thanks for taking the time to write it out, though!  It looks like it's very close to what I'm looking for.

Link to comment
Share on other sites

1st Step) Build your wordlinks array

$wordlinks maps each keyword to its URL, simple.

You can build $wordlinks any way you want. Since this was an example, I just used a predefined list.

You can use any other method to build this list: db, file, what have you.

$wordlinks=array(
"Mark" => "http://www.mark.com",
"Luke" => "http://www.luke.com",
"Nathan" => "http://www.nathan.com"
);

 

2nd Step) Build the regex pattern

Simply move the keys from $wordlinks into an array of their own,

and build a regex pattern with those words.

I should have used array_keys instead of a foreach loop.

 

Notes on the pattern:

(?<=[\s^]) - a lookbehind meant to require whitespace or start of string before the word (not part of the match)

(?:implode('|',$wl)) - a non-capturing group with our wordlinks as alternatives

(?=[\.\!\?\,\'\s$]?) - a lookahead meant to require . ! ? , ' whitespace or end of string after the word

$wl=array_keys($wordlinks);
$pm="/((?<=[\s^])(?:" . implode('|',$wl) .")(?=[\.\!\?\,\'\s$]?))/m";

 

The function DoWordLink checks for the existence of a keyword in our wordlinks array. If it exists,

it replaces the word with the link and removes the word from the list (so on the next call it fails the check).
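If the callback part is the confusing bit, this small sketch shows what preg_replace_callback hands to the function on each match. It is only an illustration: the sample sentence, the plain \b word boundaries, and the bracket replacement are all made up here and are not the pattern from this thread.

```php
<?php
// Records what preg_replace_callback passes to the callback for each match.
$seen = array();
$callback = function ($match) use (&$seen) {
    // $match[0] is the whole match, $match[1] the first capture group.
    $seen[] = $match[1];
    return '[' . $match[1] . ']';   // the return value replaces the match
};

$text = 'Mark met Luke, then Mark left.';
$out  = preg_replace_callback('/\b(Mark|Luke)\b/', $callback, $text);

echo $out, "\n";
print_r($seen);
```

The callback runs once per match, in order, so it sees Mark, Luke, Mark. That is why a function that remembers which words it has already handled can treat the first occurrence differently from the rest.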

 

yes, u can use it with a db

 

like so

 

// `when` is a reserved word in MySQL, so it is quoted with backticks
$res=mysql_query("Select who,`when`,comment from posts where id=$thisblogid");
while($row=mysql_fetch_assoc($res))
{
  // run the replacement on this row's comment text
  $comment=preg_replace_callback($pm, 'DoWordLink', $row['comment']);
?>
From: <?=$row['who']?><br>
When: <?=$row['when']?><br>
Comment: <?=$comment?><br><br>
<?php
}

 

Now, if you need to use the wordlinks over and over for some reason,

copy $wordlinks into a tmp array and restore it when done:

 

$tmp=$wordlinks;
$comment=preg_replace_callback($pm, 'DoWordLink', $comment);
$wordlinks=$tmp;
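Another way to get the same effect, sketched on the assumption that a closure is acceptable in place of the global: pass the list into a function by value. PHP copies arrays when they are passed by value, so the function uses up its own copy and the caller's $wordlinks is never changed. The name linkPage is hypothetical, and \b boundaries are used only to keep the pattern short.

```php
<?php
// Hypothetical helper: links the first occurrence of each keyword across
// all of $texts. $wordlinks is passed by value, so the caller's list is kept.
function linkPage(array $texts, array $wordlinks)
{
    $quoted = array();
    foreach (array_keys($wordlinks) as $word) {
        $quoted[] = preg_quote($word, '/');   // escape regex metacharacters
    }
    $pm = '/\b(' . implode('|', $quoted) . ')\b/';

    $callback = function ($match) use (&$wordlinks) {
        $word = $match[1];
        if (!isset($wordlinks[$word])) {
            return $word;               // already linked: leave as plain text
        }
        $link = '<a href="' . $wordlinks[$word] . '">' . $word . '</a>';
        unset($wordlinks[$word]);       // only touches this function's copy
        return $link;
    };

    $result = array();
    foreach ($texts as $text) {
        $result[] = preg_replace_callback($pm, $callback, $text);
    }
    return $result;
}

$wordlinks = array('Luke' => 'http://www.luke.com');
$first  = linkPage(array('Luke and Luke'), $wordlinks);
$second = linkPage(array('Luke again'), $wordlinks);   // list is still intact
print_r($first);
print_r($second);
```

Each call starts from the full list, so there is nothing to save and restore by hand.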

 

$posts is irrelevant, just some sample text, thrown in, so to give the demo something to work with

Link to comment
Share on other sites

That seems to work great!  The only problem I can see is that when a person's name (a keyword) is the first word in a post (that is, when the sentence would start out "Luke, are you going to..."), then that keyword is skipped over, is not autolinked.

 

So as a simple example, if the post contained "Luke Luke," the end result would be "Luke [iurl=#]Luke[/iurl]": the first one is skipped and only the second is linked.  And, as far as I can tell, this only applies to words at the beginning of a post. 

 

I think it's something to do with the regular expressions and the delimiters, but, honestly, $pm="/((?<=[\s^])(?:" . implode('|',$wl) .")(?=[\.\!\?\,\'\s$]?))/m"; doesn't make a whole lot of sense to me.

 

Methinks it's time to learn regular expressions. ;)

Link to comment
Share on other sites

first things first

 

Problem 1:

So as a simple example, if the post contained "Luke Luke," the end result would be "Luke [iurl=#]Luke[/iurl]": the first one is skipped and only the second is linked.  And, as far as I can tell, this only applies to words at the beginning of a post.

 

Problem 2:

is it simple to make the names case-insensitive?  So LUKE, Luke, luKe, and luke would all be linked?

 

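I did some rearranging of the pattern and added the case-insensitivity flag as well. Inside a character class, ^ and $ are literal characters, not anchors, so the old pattern never matched at the start of a post. I moved them out into alternations, and dropped the trailing ? that made the suffix check optional.

 

$pm="/((?<=\s|^)(?:" . implode('|',$wl) .")(?=\.|\!|\?|\,|\'|\s|$))/im";

this seems to work better :)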
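A quick way to see the difference between the two patterns, as a minimal sketch. The keyword list is cut down to just "Luke", and the test strings are made up for the example.

```php
<?php
$text = 'Luke Luke';

// Inside [...], ^ is an ordinary character, so this lookbehind needs a
// whitespace character or a literal caret in front of the name.
preg_match('/(?<=[\s^])Luke/', $text, $m, PREG_OFFSET_CAPTURE);
$oldOffset = $m[0][1];

// Outside the class, ^ is the start-of-line anchor again.
preg_match('/(?<=\s|^)Luke/', $text, $m, PREG_OFFSET_CAPTURE);
$newOffset = $m[0][1];

// The trailing ? made the old lookahead optional, so it accepted anything.
$oldSuffix = preg_match('/Luke(?=[\.\!\?\,\'\s$]?)/', 'Lukewarm');
$newSuffix = preg_match('/Luke(?=\.|\!|\?|\,|\'|\s|$)/', 'Lukewarm');

echo "old prefix first matches at offset $oldOffset\n";
echo "new prefix first matches at offset $newOffset\n";
echo "old suffix matches 'Lukewarm': $oldSuffix, new suffix: $newSuffix\n";
```

The old prefix first matches at offset 5 (the second "Luke") and the new one at offset 0. The old suffix check also lets "Lukewarm" through, while the new one rejects it.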

 

 

Problem 2 continuation.

 

We need to change the $wordlinks keys to lowercase:

$wordlinks=array(
"mark" => "http://www.mark.com",
"luke" => "http://www.luke.com",
"nathan" => "http://www.nathan.com"
);
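If the list already exists with mixed-case keys (or comes from a db), retyping it is not the only option. A minimal sketch, assuming the same three placeholder entries, using PHP's array_change_key_case:

```php
<?php
// Original list with mixed-case keys (placeholder URLs).
$wordlinks = array(
    'Mark'   => 'http://www.mark.com',
    'Luke'   => 'http://www.luke.com',
    'Nathan' => 'http://www.nathan.com',
);

// Lowercase every key in one call instead of editing the list by hand.
$wordlinks = array_change_key_case($wordlinks, CASE_LOWER);

// Array keys are case-sensitive, so only the lowercase form is found.
var_dump(isset($wordlinks['luke']));   // bool(true)
var_dump(isset($wordlinks['Luke']));   // bool(false)
```

The second check is the reason the lookup has to lowercase the matched word too.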

 

Then use strtolower in our check, because Luke, lUke, and LUKE would be treated as different array keys, so we standardize them as lowercase.

So we want to keep the original match string for display, plus a lowercase copy to use as the array key :)

function DoWordLink($match)
{
global $wordlinks;

$rpl=$match[1];
if(isset($wordlinks[$lrpl=strtolower($rpl)]))
{
	$rpl="<A HREF=$wordlinks[$lrpl]>$rpl</A>";
	unset($wordlinks[$match[1]]);
}
return $rpl;
}

Link to comment
Share on other sites

Wow, that works great!  Everything seems to be perfect, with a small exception.  It seems like when there is a keyword with punctuation after it (e.g., Luke, or Luke. or Luke!), then each and every instance of that word is linked.

 

Honestly, it's not a huge deal, because it's rarer for a word like that to come up.  Out of curiosity, though, I'm wondering why that is, if you happen to know.

 

Either way, thanks a million for all of the help.  It's been great!  I'm going to try to dissect the $pm="/((?<=\s|^)(?:" . implode('|',$wl) .")(?=\.|\!|\?|\,|\'|\s|$))/im"; line and try to make some sense of it. ;D

Link to comment
Share on other sites

Regex has always given me headaches as well when learning it.

Then I started using Expresso.

It's a regex development tool; it can help you build, analyze, study, and learn regex.

It's one of those tools you can't do without when getting started with regex.

 

Expresso and PHP patterns:

PHP regex adds delimiters and options

/(actual regex pattern)/regex options

 

so when moving a pattern between Expresso and PHP, remember to add or remove these. :)
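As a small illustration of that wrapping, here is a sketch that takes the bare pattern from this thread (with the keyword list typed out by hand as an assumption) and turns it into the PHP form:

```php
<?php
// Bare regex, as it would be typed into a regex tool (no delimiters).
$bare = '((?<=\s|^)(?:mark|luke|nathan)(?=\.|\!|\?|\,|\'|\s|$))';

// PHP form: delimiter + regex + delimiter + option letters.
// i = case-insensitive, m = ^ and $ also match at line breaks.
$pm = '/' . $bare . '/im';

echo $pm, "\n";
var_dump(preg_match($pm, 'Hello, LUKE!'));   // int(1)
```

If the bare pattern itself contains a /, it has to be escaped as \/ in the PHP version, or a different delimiter such as ~ can be used instead.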

Link to comment
Share on other sites

Terrific!  Thanks for the lead, buddy. :)

 

By the way, any idea how I could make that change on the quirk with the keywords being followed by certain punctuation marks (like , and . and !) always being linked instead of just the first instance?

 

Thanks again.  I'm going to check out Expresso right now.

Link to comment
Share on other sites

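I couldn't replicate the problem you describe.

 

I did find a problem in the code though, in the function to be precise. It wasn't removing the keyword from the array: unset() used the original-case match ($match[1]) instead of the lowercase key ($lrpl), so a capitalized keyword like Luke stayed in the list and got linked every time.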

 

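function DoWordLink($match)
{
global $wordlinks;

$rpl=$match[1];   // the keyword exactly as it appears in the text
// look the keyword up by its lowercase form
if(isset($wordlinks[$lrpl=strtolower($rpl)]))
{
	$rpl="<A HREF=$wordlinks[$lrpl]>$rpl</A>";
	unset($wordlinks[$lrpl]);   // remove by the lowercase key so later matches stay plain text
}
return $rpl;
}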
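Putting the corrected function and the rearranged pattern together, here is a short end-to-end check. The two posts are made up to cover the cases raised in this thread (a name at the start of a post, mixed case, and names followed by punctuation), and the list is cut down to two placeholder entries.

```php
<?php
// Lowercase keys, as described above (placeholder URLs).
$wordlinks = array(
    'mark' => 'http://www.mark.com',
    'luke' => 'http://www.luke.com',
);

function DoWordLink($match)
{
    global $wordlinks;

    $rpl = $match[1];
    if (isset($wordlinks[$lrpl = strtolower($rpl)])) {
        $rpl = "<A HREF=$wordlinks[$lrpl]>$rpl</A>";
        unset($wordlinks[$lrpl]);   // remove by the lowercase key
    }
    return $rpl;
}

$wl = array_keys($wordlinks);
$pm = "/((?<=\s|^)(?:" . implode('|', $wl) . ")(?=\.|\!|\?|\,|\'|\s|$))/im";

// Hypothetical posts: names at the start and before punctuation.
$posts = array(
    "Luke, are you going? LUKE! luke.",
    "Mark agrees with Luke.",
);
$out = array();
foreach ($posts as $mainbody) {
    $out[] = preg_replace_callback($pm, 'DoWordLink', $mainbody);
}
print_r($out);
```

Only the first "Luke" and the first "Mark" come back linked. Every later occurrence stays plain text, whatever its case or trailing punctuation.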
