eeart Posted January 4, 2011

I have 2 regex questions related to hyperlinks:

1. I am trying to add a class to hyperlinks with a certain URL. What I have now is this:

    $pattern = "/<a href=[\'\"](" . $url . ")[\'\"]>(.*)<\/a>/is";
    $replace = "<a href=\"$1\" class=\"". $class_name ."\">$2</a>";
    $html = preg_replace($pattern, $replace, $html);

But this only works if the original <a> tag has no special attributes, such as an id, style, target, title or another class. If it already has a class, I want to replace it with my new class.

2. To find all hyperlinks in a text I am using the following preg_match_all:

    preg_match_all(
        '#<a\s
            (?:(?= [^>]* href="   (?P<href>  [^"]*) ")|)
            (?:(?= [^>]* title="  (?P<title> [^"]*) ")|)
            (?:(?= [^>]* class="  (?P<class> [^"]*) ")|)
            (?:(?= [^>]* target=" (?P<target>[^"]*) ")|)
            [^>]*>
            (?P<text>[^<]*)
            </a>
        #xi',
        $html, $matches, PREG_SET_ORDER
    );

This works well, but it doesn't find links with single quotes. So it'll find <a href="something"> but not <a href='something'>.

Can someone help me with this? I have been struggling with this for a while. Thank you.

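On the single-quote problem in question 2, one possible approach is to capture the opening quote character and require the same character to close the value with a backreference. The following is only a rough sketch under assumptions: the sample $html is made up, and the pattern is a simplified one that extracts just the href and the link text, not the title/class/target lookaheads of the original.

```php
<?php
// Hypothetical sample input mixing both quote styles.
$html = '<a href="one.html" title="First">One</a> <a href=\'two.html\'>Two</a>';

// (["\']) captures the opening quote as group 1. The backreference \1 then
// requires the same character to close the value, and the lazy .*? stops at
// the first such closing quote.
$pattern = '#<a\b[^>]*?\shref=(["\'])(?P<href>.*?)\1[^>]*>(?P<text>[^<]*)</a>#is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m['href'], ' => ', $m['text'], PHP_EOL;
}
```

The same idea should carry over to each lookahead in the original pattern, but every attribute then needs its own quote-capturing group, which changes the group numbering.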
sasa Posted January 5, 2011

1.

    <?php
    $class_name = 'my_class';
    $html = '<a id="23" href="sa\'sa.com" class="yyy" bla bla>xx</a>';

    // Step 1: strip any existing class attribute from the <a> tag.
    // A backreference has no effect inside a character class (in [^\2] the \2
    // is read as an octal character code), so a lazy .*? followed by \2 is
    // used to stop at the matching closing quote.
    $pattern1 = '/(<a [^>]*) class=([\'"]).*?\2([^>]*>)/is';
    $replace1 = '\1\3';

    // Step 2: insert the new class right after the href attribute.
    $pattern2 = '/(<a [^>]*href=([\'"]).*?\2)([^>]*)/is';
    $replace2 = '\1 class="'. $class_name .'"\3';

    $html = preg_replace($pattern1, $replace1, $html);
    echo $html = preg_replace($pattern2, $replace2, $html);
    ?>

eeart Posted January 5, 2011

Thank you sasa, but I don't want to remove all existing classes, and I only want to add the new class to hyperlinks with a specific URL. The purpose is to mark hyperlinks with a broken or bad URL. So, for example, a text may have 20 hyperlinks and I only want to add a class "broken_link" to all hyperlinks with the URL http://www.badlink.com

johnny86 Posted January 7, 2011

Maybe you should consider using simplehtmldom for that?

eeart Posted January 7, 2011

"Maybe you should consider using simplehtmldom for that?"

That HTML DOM parser is brilliant! Thank you so much, johnny86. This does exactly what I need!

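For readers wondering what the parser-based route might look like, here is a minimal sketch. Note that it does not use simplehtmldom itself: it swaps in PHP's bundled DOMDocument class, and the sample markup and the variable names ($badUrl, $className, $result) are assumptions made for illustration. It adds the class only to links whose href equals the bad URL and keeps any classes already present.

```php
<?php
// Hypothetical inputs for illustration.
$html = '<p><a id="x" class="nav" href="http://www.badlink.com">bad</a> '
      . '<a href=\'http://example.com\'>ok</a></p>';
$badUrl    = 'http://www.badlink.com';
$className = 'broken_link';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

foreach ($doc->getElementsByTagName('a') as $link) {
    // The parser has already normalised the quoting, so a plain string
    // comparison covers both href="..." and href='...'.
    if ($link->getAttribute('href') !== $badUrl) {
        continue;
    }
    // Keep the existing classes and append the new one if it is missing.
    $classes = preg_split('/\s+/', $link->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (!in_array($className, $classes, true)) {
        $classes[] = $className;
    }
    $link->setAttribute('class', implode(' ', $classes));
}

// saveHTML() serialises the whole document, including the html/body
// wrapper that loadHTML() adds around a fragment.
$result = $doc->saveHTML();
echo $result;
```

Because the parser handles attribute order and quoting, none of the regex edge cases from the earlier posts apply here, at the cost of the output being re-serialised rather than edited in place.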
sasa Posted January 7, 2011

Change $pattern2 to:

    // preg_quote() escapes the slashes and dots in $url so they are matched
    // literally and do not end the pattern at the "/" delimiter.
    $pattern2 = '/(<a [^>]*href=([\'"])' . preg_quote($url, '/') . '\2)([^>]*)/is';
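If the regex route is preferred and existing classes should be kept rather than removed, one way to combine the pieces is preg_replace_callback. This is a sketch under assumptions, not a tested drop-in: the sample $html is invented, and it assumes the href in the markup is written exactly as $url.

```php
<?php
// Hypothetical inputs for illustration.
$html = '<a href="http://www.badlink.com" class="nav">bad</a> '
      . '<a id="x" href=\'http://www.badlink.com\'>bad too</a> '
      . '<a href="http://example.com">ok</a>';
$url        = 'http://www.badlink.com';
$class_name = 'broken_link';

// Match the whole opening <a ...> tag whose href equals $url, in either
// quote style. preg_quote() makes the URL safe to embed in the pattern.
$pattern = '/<a\b[^>]*\shref=([\'"])' . preg_quote($url, '/') . '\1[^>]*>/is';

$html = preg_replace_callback($pattern, function (array $m) use ($class_name) {
    $tag   = $m[0];
    $count = 0;
    // If the tag already has a class attribute, append the new class to it.
    $tag = preg_replace(
        '/(\sclass=([\'"]))(.*?)\2/is',
        '$1$3 ' . $class_name . '$2',
        $tag,
        1,
        $count
    );
    if ($count === 0) {
        // Otherwise add a class attribute just before the closing ">".
        $tag = substr($tag, 0, -1) . ' class="' . $class_name . '">';
    }
    return $tag;
}, $html);

echo $html;
```

Working on the opening tag alone inside the callback keeps the two concerns separate: the outer pattern decides which links qualify, and the inner replacement decides how the class is added.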