prosper99 Posted February 16, 2010

Hi Guys/Gals,

I'm writing a basic web page scraper and have run into a (mental) roadblock. I need to scrape a basic web page and search its contents for a specific string. Once I find that string, I need to extract a 10-digit number that sits on the same line where the string was found. I'm not able to call sed to do this, so is there a way to do it with just PHP?

Here is an example of the scrape code I use to pull the site:

$url = 'http://anywebsite.com/list/sus/';
$output = file_get_contents($url);

Now, in that output there will be a line like the one below, containing the string I need to search for, "matching text". Is there a way to extract just the 10-digit number and assign it to a variable?

href="http://site.website.com/list/tor/1234567890.html">matching text</a> - <span class="p"> pic</span>

Thanks for any assistance you can provide.

Cheers, Rob
premiso Posted February 16, 2010

You will want to use preg_match:

preg_match('~\.com/list/tor/([0-9]{10})\.html~', $output, $match);
$num = $match[1];

Untested, but it should get the 10-digit number, pending any mistakes I may have made.
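The pattern above keys on the URL path instead of the "matching text" string. If the number has to come from the same line as that string, a line-by-line check is one option. The following is a minimal sketch, not a tested drop-in: the helper name extractIdNearText is hypothetical, the sample markup is assumed from the line quoted in the question (with a second made-up line added for contrast), and a hard-coded string stands in for the result of file_get_contents($url).

```php
<?php
// Hypothetical helper: returns the first 10-digit number found on a line
// that also contains $needle, or null if no such line exists.
function extractIdNearText(string $html, string $needle): ?string
{
    // Split on any newline style so each line can be checked on its own.
    foreach (preg_split('/\R/', $html) as $line) {
        if (strpos($line, $needle) === false) {
            continue; // the search string is not on this line
        }
        // The lookarounds reject digit runs longer than 10 characters.
        if (preg_match('/(?<![0-9])([0-9]{10})(?![0-9])/', $line, $match)) {
            return $match[1];
        }
    }
    return null;
}

// Assumed sample input standing in for file_get_contents($url).
$output = <<<'HTML'
<a href="http://site.website.com/list/tor/0987654321.html">other text</a>
<a href="http://site.website.com/list/tor/1234567890.html">matching text</a> - <span class="p"> pic</span>
HTML;

$num = extractIdNearText($output, 'matching text');
echo $num, PHP_EOL; // prints 1234567890
```

Checking for null before using $num avoids a notice when the page layout changes and the string is no longer present.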
prosper99 Posted February 16, 2010

Thank you for the quick response, works great!