Trouble with Regular Expressions

iamX · March 13, 2006

Hey there,

Hope someone can help me with the following.

I'm creating a script that indexes links on pages to create easy sitemaps.
I have already stripped all tags except <a>,
and I use preg_match_all to collect all the links with the following:

$matched = preg_match_all("{m=more&id=(.*?)>(.*?)</a>}", $txtonly, $match);

But now the problem:
Some links have a graphic before them, wrapped in the same m=more&id=###> link.

Since the <img> tags have been stripped, that link is now empty, so when I rebuild the links the result is:
<a href=sitemap.php?m=more&id=###></a> and an
<a href=sitemap.php?m=more&id=###>Link name</a>

I tried replacing the second (.*?) with (.+?), but then the result is:
<a href=sitemap.php?m=more&id=###></a><a href=sitemap.php?m=more&id=###>Link name</a>
(double result)
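
The double result described above can be reproduced with a small sketch. The sample string and the ID 123 are assumptions made up for illustration; only the pattern comes from the post. It suggests that (.+?) does not skip the empty link but stretches across it until it reaches the first </a> that follows some text:

```php
<?php
// Hypothetical sample input: the stripped <img> left an empty link
// in front of the real one. The ID 123 is made up for illustration.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>';

// Pattern from the post, with the second group changed to (.+?).
preg_match_all('{m=more&id=(.*?)>(.+?)</a>}', $txtonly, $match);

echo count($match[0]), "\n"; // 1
echo $match[1][0], "\n";     // 123
echo $match[2][0], "\n";     // </a><a href=sitemap.php?m=more&id=123>Link name
```

Because the second group now contains the leftover markup, rebuilding the link from the two groups prints both anchors back to back.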

I get no results when matching on [a-zA-Z0-9].

Am I missing something or approaching this the wrong way?

wickning1 · March 13, 2006

$matched = preg_match_all("{m=more&id=(.*?)>([^<]+?)</a>}", $txtonly, $match);

Try that.

iamX · March 13, 2006

QUOTE(wickning1 @ Mar 13 2006, 03:26 PM)
$matched = preg_match_all("{m=more&id=(.*?)>([^<]+?)</a>}", $txtonly, $match);

Try that.

Thanks for replying, wickning1 :)

Unfortunately, I still get the double results:
one without a link name (where the <img> was) and one with the correct link, as follows:

<a href=sitemap.php?m=more&id=###></a><a href=sitemap.php?m=more&id=###>Linkname</a>

Maybe there is a way to separate id=### and the link name,
so that when an ID is the same, it only results in one link?
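
One way the idea of separating the ID from the link name might look is sketched below. This is not code from the thread: the sample string, the IDs, and the $links array are hypothetical, and the character classes [^>]* and [^<]* are an assumption about what IDs and link names contain. Each link is captured on its own, empty names are skipped, and only one entry is kept per ID:

```php
<?php
// Hypothetical sample input; the IDs and link names are made up.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>'
         . '<a href=sitemap.php?m=more&id=456>Other link</a>';

// [^>]* keeps the ID inside its own tag; [^<]* allows an empty link name.
preg_match_all('{m=more&id=([^>]*)>([^<]*)</a>}', $txtonly, $match, PREG_SET_ORDER);

$links = array(); // hypothetical map: ID => link name
foreach ($match as $m) {
    $id   = $m[1];
    $name = $m[2];
    // Skip empty links and keep only the first named link per ID.
    if ($name !== '' && !isset($links[$id])) {
        $links[$id] = $name;
    }
}

foreach ($links as $id => $name) {
    echo "<a href=sitemap.php?m=more&id=$id>$name</a>\n";
}
```

With this input the loop prints one link for ID 123 and one for ID 456.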

iamX · March 13, 2006

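Aah, I found it!

The first (.*?) was the troublemaker: even though it is lazy, it can still run past the > of the empty link and swallow everything up to the next one.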

My code came out like this:

$matched = preg_match_all("{m=more&id=([A-Z0-9]+?)>([^<]+?)</a>}", $txtonly, $match);

No double hrefs anymore :)
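
A quick check of the final pattern, as a sketch on made-up input (the sample string and the ID 123 are assumptions). Since [A-Z0-9] cannot match >, the first group has to stop at the end of the ID, and since [^<]+? needs at least one character of link text, the empty link cannot match at all:

```php
<?php
// Hypothetical sample input with an empty link in front of the real one.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>';

// Final pattern from the post: the ID is limited to [A-Z0-9] and the
// link name to characters other than "<".
$matched = preg_match_all('{m=more&id=([A-Z0-9]+?)>([^<]+?)</a>}', $txtonly, $match);

echo $matched, "\n";     // 1
echo $match[1][0], "\n"; // 123
echo $match[2][0], "\n"; // Link name
```

Note that [A-Z0-9] only covers digits and uppercase letters, so an ID containing lowercase letters would need a wider class such as [A-Za-z0-9].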
