mac_gabe Posted November 9, 2010

Hi,

I have a page of HTML and I'm trying to use preg_replace to select a block of text out of the whole page. I've done it successfully once, to select some other text from that page, but in this example I end up selecting everything (i.e. too much).

My source HTML page looks something like this:

    a bunch of html
    <a href="category-accentors.php" class="blog-category-link-enabled">Accentors (2)</a><br />
    <a href="category-african-barbets.php" class="blog-category-link-enabled">African Barbets (3)</a><br />
    ...loads more links and a few divs...
    <a href="category-xenops.php" class="blog-category-link-enabled">Xenops (2)</a><br />
    more html

I want to select everything from <a href="category-accentors.php" ... to ... Xenops (2)</a><br />, discard the rest, and place the selection in a new PHP/HTML page. The (2) after Xenops is a variable number.

This is the preg_replace pattern I'm using:

    $pattern_eng_bird_cat= '/\<a href="category-accentors\.php"(.*?)Xenops \((\d+)\)\<\/a\>\<br \/\>/';
    $replace_eng_bird_cat= '<a href="category-accentors.php"$1Xenops ($2)</a><br />';
    $eng_bird_cat= preg_replace($pattern_eng_bird_cat, $replace_eng_bird_cat, $categories);
    // should return list of English bird names and links from Accentors to Xenops
    echo $eng_bird_cat;

I'm new to this and have tried searching and following as many links as possible, but I just can't work out where I'm going wrong. Any help gratefully received.
mac_gabe Posted November 9, 2010 (Author)

Oh wait - I think I've just seen a problem. I forgot to put something in the pattern to search for the unwanted HTML. Now I've got:

    $pattern_eng_bird_cat= '/(.*?)\<a href="category-accentors\.php"(.*?)Xenops \((\d+)\)\<\/a\>\<br \/\>(.*?)/';
    $replace_eng_bird_cat= '<a href="category-accentors.php"$2Xenops ($3)</a><br />';

which is slightly better - it excludes the initial unwanted HTML - but still returns the final unwanted HTML.
mac_gabe Posted November 9, 2010 (Author)

I've removed the variable number to simplify things. This still doesn't work (it returns everything after Xenops, in addition to everything from Accentors to Xenops):

    $pattern_eng_bird_cat= '/(.*?)\<a href="category-accentors\.php"(.*?)Xenops(.*?)/';
    $replace_eng_bird_cat= '<a href="category-accentors.php"$2Xenops</a><br />';
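A note on why the tail survives: preg_replace only rewrites the part of the subject that the pattern matched, and a lazy (.*?) at the very end of a pattern is satisfied by zero characters, so the match stops right after "Xenops" and everything after it is left untouched. The sketch below reproduces this with the pattern from the post above. The one-line $categories value with HEAD and TAIL markers is a made-up stand-in for the real page, not the poster's actual data.

```php
<?php
// Hypothetical one-line stand-in for the real page held in $categories.
$categories = 'HEAD <a href="category-accentors.php" class="x">Accentors (2)</a><br /> '
            . '<a href="category-xenops.php" class="x">Xenops (2)</a><br /> TAIL';

$replace = '<a href="category-accentors.php"$2Xenops</a><br />';

// Lazy trailing group: (.*?) can match nothing, so the match ends at "Xenops"
// and the unmatched remainder of the page is copied through unchanged.
$lazy = preg_replace('/(.*?)<a href="category-accentors\.php"(.*?)Xenops(.*?)/', $replace, $categories);

// Greedy trailing group: (.*) runs to the end of the line, so the remainder
// becomes part of the match and is replaced along with everything else.
$greedy = preg_replace('/(.*?)<a href="category-accentors\.php"(.*?)Xenops(.*)/', $replace, $categories);

echo $lazy, "\n";   // still ends with the unwanted TAIL
echo $greedy, "\n"; // ends with Xenops</a><br />
```

In both cases the leading (.*?) has to grow until the Accentors link is found, which is why the initial unwanted HTML is already gone.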
mac_gabe Posted November 9, 2010 (Author)

OK, I've finally worked it out. Trial and error is a wonderful thing!

I removed the ? marks, so (.*) instead of (.*?), and it works like a dream.

I only started putting the ? marks in because it worked better with them in another search. I have no real idea what they do or why, other than that one form is "greedy" and the other isn't.
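For anyone landing here later: a quantifier such as .* is greedy (it takes as much as it can and then backs off), while .*? is lazy (it takes as little as it can and then grows). Since the goal is to pull one block out of the page rather than rewrite the page, preg_match may be a simpler fit than preg_replace. The following is only a sketch under assumptions: the multi-line $categories value is invented sample data, and the s modifier is added on the assumption that the real page spans several lines (without it, . does not match a newline).

```php
<?php
// Greedy vs lazy on a small made-up string.
preg_match('/<b>.*<\/b>/',  '<b>one</b> and <b>two</b>', $g); // greedy: runs to the last </b>
preg_match('/<b>.*?<\/b>/', '<b>one</b> and <b>two</b>', $l); // lazy: stops at the first </b>

// Hypothetical multi-line stand-in for the real page held in $categories.
$categories = "<div>header</div>\n"
    . "<a href=\"category-accentors.php\" class=\"blog-category-link-enabled\">Accentors (2)</a><br />\n"
    . "<a href=\"category-african-barbets.php\" class=\"blog-category-link-enabled\">African Barbets (3)</a><br />\n"
    . "<a href=\"category-xenops.php\" class=\"blog-category-link-enabled\">Xenops (2)</a><br />\n"
    . "<div>footer</div>\n";

// No leading or trailing catch-all groups are needed: preg_match returns only
// the matched text. The s modifier lets . cross line breaks.
$pattern = '/<a href="category-accentors\.php".*?Xenops \(\d+\)<\/a><br \/>/s';

if (preg_match($pattern, $categories, $m)) {
    $eng_bird_cat = $m[0]; // the block from the Accentors link to the Xenops link
} else {
    $eng_bird_cat = '';    // start or end marker not found
}

echo $eng_bird_cat;
```

Here the lazy .*? is the safer choice in the middle of the pattern, because it stops at the first "Xenops (n)</a><br />" instead of the last one on the page.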