xor83, posted March 2, 2009:
How can I extract URLs from a web page, keeping only the URLs from one specific site? For example, for "www.abc.com/32432/file.zip" it should search for abc.com, and the extension can be zip, rar, or 001. Any help?
https://forums.phpfreaks.com/topic/147633-extract-url/
br0ken, posted March 2, 2009:
Use file_get_contents($url) to get the HTML of the website in question. Then loop through that content looking for <a tags. Each time you find an anchor tag, use substr() and strpos() to extract the link, and add it to an array.
Once you've done this, go through each link and check whether it is from the domain you're looking for. You can use the parse_url() function to do this, or a combination of strpos() and substr().
If you need more information on the functions above, please see these links:
http://uk2.php.net/substr
http://uk.php.net/strpos
http://uk2.php.net/function.file-get-contents
http://uk3.php.net/function.parse-url
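The steps br0ken describes could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names extractLinks() and filterByDomain() are hypothetical, the sample HTML is made up, and the sketch assumes anchors are written as `<a ` followed by a double-quoted href. Note that parse_url() only reports a host when the link includes a scheme such as http://, so a bare "www.abc.com/..." link would be skipped here.

```php
<?php
// Hypothetical helper: collect href values from anchor tags using only
// stripos(), strpos() and substr(). Assumes double-quoted href attributes.
function extractLinks(string $html): array
{
    $links = [];
    $offset = 0;
    // Find each "<a " in turn (case-insensitive).
    while (($start = stripos($html, '<a ', $offset)) !== false) {
        $end = strpos($html, '>', $start);
        if ($end === false) {
            break; // unterminated tag, stop scanning
        }
        $tag = substr($html, $start, $end - $start + 1);
        $hrefPos = stripos($tag, 'href="');
        if ($hrefPos !== false) {
            $valueStart = $hrefPos + strlen('href="');
            $valueEnd = strpos($tag, '"', $valueStart);
            if ($valueEnd !== false) {
                $links[] = substr($tag, $valueStart, $valueEnd - $valueStart);
            }
        }
        $offset = $end + 1; // continue after this tag
    }
    return $links;
}

// Hypothetical helper: keep links whose host is $domain or a subdomain of it.
function filterByDomain(array $links, string $domain): array
{
    $kept = [];
    $suffix = '.' . $domain;
    foreach ($links as $link) {
        $host = parse_url($link, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // no host found (e.g. link without a scheme)
        }
        $host = strtolower($host);
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            $kept[] = $link;
        }
    }
    return $kept;
}

// In real use the HTML would come from file_get_contents($url).
$html = '<a href="http://www.abc.com/32432/file.zip">a</a> '
      . '<a href="http://example.org/x.zip">b</a>';
$links = filterByDomain(extractLinks($html), 'abc.com');
print_r($links);
```

Matching on the host suffix rather than searching the whole URL for "abc.com" avoids picking up links that merely mention the domain in their path or query string.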
a-scripts.com, posted March 2, 2009:
I think regular expressions are your friend for this. Using preg_match_all() and a suitable pattern, you can get exactly what you need.
premiso, posted March 2, 2009:

    <?php
    // Sample input: three anchor tags pointing at www.abc.com.
    $string = '<a href="http://www.abc.com/rand/file.r01">
    <a href="http://www.abc.com/rand/file.r02">
    <a href="http://www.abc.com/rand/file.r03">';

    // Capture everything between "www.abc.com/" and the closing quote.
    // The dots are escaped so they match literal dots only.
    preg_match_all('~www\.abc\.com/(.+?)"~is', $string, $matches);

    echo "<pre>" . print_r($matches, 1) . "</pre>";
    ?>

Rough example, but it should work. Outputs:

    Array
    (
        [0] => Array
            (
                [0] => www.abc.com/rand/file.r01"
                [1] => www.abc.com/rand/file.r02"
                [2] => www.abc.com/rand/file.r03"
            )

        [1] => Array
            (
                [0] => rand/file.r01
                [1] => rand/file.r02
                [2] => rand/file.r03
            )

    )
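The original question also asked for specific extensions (zip, rar, 001), which the rough example above does not check. One possible variation is sketched below; the helper name findArchiveLinks(), its parameters, and the sample HTML are hypothetical and not from the thread. It assumes each link ends right after the extension, so a URL with a query string such as file.zip?x=1 would not match.

```php
<?php
// Hypothetical helper: return links on $domain that end in one of $extensions.
function findArchiveLinks(string $html, string $domain, array $extensions): array
{
    // Escape the domain and extensions so they are matched literally.
    $host = preg_quote($domain, '~');
    $exts = implode('|', array_map(function ($e) {
        return preg_quote($e, '~');
    }, $extensions));

    // (?<![\w.-])   : the link must not start in the middle of another host name
    // (?:https?://)? and (?:www\.)? : optional scheme and optional "www."
    // [^"'\s<>]+    : path characters up to a quote, whitespace or angle bracket
    // (?=...)       : the extension must be followed by a delimiter or end of input
    $pattern = '~(?<![\w.-])(?:https?://)?(?:www\.)?' . $host
             . '/[^"\'\s<>]+\.(?:' . $exts . ')(?=["\'\s<>]|$)~i';

    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}

$html = '<a href="http://www.abc.com/32432/file.zip">z</a> '
      . '<a href="http://www.abc.com/rand/file.r01">r</a> '
      . '<a href="http://notabc.com/x.rar">n</a> '
      . '<a href="www.abc.com/a/b.001">p</a>';

$found = findArchiveLinks($html, 'abc.com', ['zip', 'rar', '001']);
print_r($found);
```

Here the full match in $matches[0] is returned without the trailing quote, because the delimiter is checked with a lookahead instead of being consumed by the pattern.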
xor83 (author), posted March 3, 2009:
Thanks, all, for your replies. premiso: I tested many samples and none of them worked the way I wanted, but the code you gave me works exactly the way I wanted. Thanks, guru...
This topic is now archived and is closed to further replies.