iamX Posted March 13, 2006

Hey there,

Hope someone can help me with the following. I'm creating a script that indexes the links on pages to create easy sitemaps. I have already stripped all tags except <a>, and I use preg_match_all to collect the links:

$matched = preg_match_all("{m=more&id=(.*?)>(.*?)</a>}", $txtonly, $match);

Now the problem: some links have a graphic before them that uses the same m=more&id=###> link. Since the <img> tags have been stripped, that anchor is empty, so the rebuilt output contains

<a href=sitemap.php?m=more&id=###></a>

and

<a href=sitemap.php?m=more&id=###>Link name</a>

I tried replacing the second (.*?) with (.+?), but then the result is:

<a href=sitemap.php?m=more&id=###></a><a href=sitemap.php?m=more&id=###>Link name</a>

(a double result). I get no result when matching for [a-zA-Z0-9].

Am I missing something, or approaching this the wrong way?

Link to comment https://forums.phpfreaks.com/topic/4827-trouble-with-regular-expresspions/
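A minimal sketch that reproduces the empty-link match described above. The sample value of $txtonly is an assumption (the real page content is not shown in the post); only the pattern is taken from the post.

```php
<?php
// Hypothetical input: what $txtonly might look like once the <img> inside
// the first anchor has been stripped, leaving an empty <a></a> pair.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>';

// Pattern from the post; { and } serve as the regex delimiters.
$matched = preg_match_all("{m=more&id=(.*?)>(.*?)</a>}", $txtonly, $match);

// $match[1] holds the captured ids, $match[2] the captured link names.
for ($i = 0; $i < $matched; $i++) {
    printf("id=%s name=[%s]\n", $match[1][$i], $match[2][$i]);
}
```

With this input the pattern matches twice: once with an empty name (the stripped image link) and once with "Link name", which is where the empty <a></a> in the rebuilt sitemap comes from.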
wickning1 Posted March 13, 2006

$matched = preg_match_all("{m=more&id=(.*?)>([^<]+?)</a>}", $txtonly, $match);

Try that.
iamX Posted March 13, 2006 (Author)

QUOTE (wickning1 @ Mar 13 2006, 03:26 PM):
$matched = preg_match_all("{m=more&id=(.*?)>([^<]+?)</a>}", $txtonly, $match);
Try that.

Thanks for replying, wickning1 :)

Unfortunately I still get the double results: one without a link name (where the <img> was) and one with the correct link, as follows:

<a href=sitemap.php?m=more&id=###></a><a href=sitemap.php?m=more&id=###>Linkname</a>

Maybe there is a way to separate id=### and the link name, so that when an ID occurs more than once it only produces one result?
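The idea of keeping one result per ID could be sketched roughly as follows. This is not what the thread ends up using; the sample input and the $links array are assumptions made for illustration, and the pattern is the original one from the first post.

```php
<?php
// Hypothetical input with one empty duplicate anchor and two real links.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>'
         . '<a href=sitemap.php?m=more&id=456>Other page</a>';

// PREG_SET_ORDER returns one array per match: [full match, id, name].
preg_match_all("{m=more&id=(.*?)>(.*?)</a>}", $txtonly, $sets, PREG_SET_ORDER);

// Keep one entry per id and skip matches whose link name is empty.
$links = [];
foreach ($sets as $set) {
    $id   = $set[1];
    $name = $set[2];
    if ($name === '') {
        continue; // empty anchor left behind by a stripped <img>
    }
    if (!isset($links[$id])) {
        $links[$id] = $name; // first non-empty name wins
    }
}

// Rebuild the sitemap links from the de-duplicated list.
foreach ($links as $id => $name) {
    echo "<a href=sitemap.php?m=more&id=$id>$name</a>\n";
}
```

Keying the array by ID means a repeated ID can never produce a second link, at the cost of a second pass over the matches.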
iamX Posted March 13, 2006 (Author)

Aah, I found it! The first (.*?) was the troublemaker: it is lazy, but nothing stops it from matching across a > and into the next link. My code came out like this:

$matched = preg_match_all("{m=more&id=([A-Z0-9]+?)>([^<]+?)</a>}", $txtonly, $match);

No double hrefs anymore :)
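To illustrate why restricting the first group helps, here is a small comparison of the suggested pattern and the final one. The sample input and the names $loose and $strict are assumptions for the sketch; the two patterns are copied from the posts above.

```php
<?php
// Hypothetical input: an empty anchor followed by the real link.
$txtonly = '<a href=sitemap.php?m=more&id=123></a>'
         . '<a href=sitemap.php?m=more&id=123>Link name</a>';

// Suggested pattern: the first group may contain any character, so after
// [^<]+? fails on the empty anchor, backtracking lets the group grow
// across "></a><a href=..." until a non-empty link name follows.
preg_match_all("{m=more&id=(.*?)>([^<]+?)</a>}", $txtonly, $loose);

// Final pattern: the first group only accepts A-Z and 0-9, so it cannot
// cross a ">" and the empty anchor simply does not match.
preg_match_all("{m=more&id=([A-Z0-9]+?)>([^<]+?)</a>}", $txtonly, $strict);

echo "loose id:  ", $loose[1][0], "\n";
echo "strict id: ", $strict[1][0], "\n";
```

Both patterns produce a single match here, but the loose one captures the whole empty anchor inside its "id" group, which is why rebuilding the link from it still prints two hrefs.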
This topic is now archived and is closed to further replies.