smith.james0 Posted April 28, 2012
Hi, I have tried to write a script that collects the web links from an HTML page, puts them in an array, and compares that array with an array of URLs from my db. I am using this expression to find the URLs in the page: /href="([^\s"]+)/ It works, but it returns http://www.google.com when I want google.com. I have searched the net, but most of the expressions I came across either don't work or return image URLs as well. Can anyone help? James
xyph Posted April 28, 2012
Use an HTML parser like DOMDocument to grab all of the anchor tags in a page.
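A minimal sketch of the DOMDocument suggestion, for anyone landing here later. It is not code from the thread: the function name extractLinkHosts, the sample HTML, and the $known array (a stand-in for the URLs loaded from the db) are all made up for illustration. It assumes the goal is the bare host name (google.com rather than http://www.google.com), so it runs each href through parse_url() and strips a leading "www.". Only anchor tags are read, so image URLs are not picked up.

```php
<?php
// Hypothetical helper (not from the thread): returns the unique host names
// linked from <a href="..."> tags in an HTML string.
function extractLinkHosts(string $html): array
{
    $doc = new DOMDocument();

    // Real-world markup is often malformed; collect libxml warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html); // assumes $html is not empty
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hosts = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        $host = parse_url($href, PHP_URL_HOST);

        // Relative links, mailto: links, and anchors without href have no host.
        if (!is_string($host) || $host === '') {
            continue;
        }

        // Assumption: "www.google.com" and "google.com" should count as the same site.
        $host = preg_replace('/^www\./i', '', strtolower($host));
        $hosts[$host] = true; // array keys de-duplicate the hosts
    }

    return array_keys($hosts);
}

// Sample input: one external link, one image (ignored), one relative link (ignored).
$html = '<p><a href="http://www.google.com/search?q=php">Search</a> '
      . '<img src="http://example.org/logo.png"> '
      . '<a href="/about">About</a></p>';

$found = extractLinkHosts($html);

// Stand-in for the array of URLs loaded from the database.
$known = ['google.com', 'php.net'];

// Hosts that appear both in the page and in the db list.
$matches = array_values(array_intersect($found, $known));

print_r($matches);
```

If the db stores full URLs instead of bare hosts, the same parse_url() and "www." normalisation would need to be applied to those values before comparing.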