xsuck91 Posted December 17, 2011
I need some help scraping links from a specified page. For example, given a page like http://br.4ce.info/, I want to scrape all the links on that page and show them in a WordPress widget on another blog. Can you help me with this? I don't want to use an iframe; I think cURL would be better. Thanks.
paparts Posted December 17, 2011
Here is what I use to crawl websites and extract the links; I think you can use this:
<?php
// Fetch the page; file_get_contents() returns false on failure.
$input = @file_get_contents('http://www.icpep.org');
if ($input === false) {
    die('Could not fetch the page.');
}

// Match <a ... href="...">...</a>; the href value ends up in group 2.
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";

// Accept only absolute http(s)/ftp URLs.
// eregi() was removed in PHP 7, so this uses preg_match() with the i flag.
$urlregex = "~^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?\$~i";

if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        if (preg_match($urlregex, $match[2])) {
            echo trim($match[2]) . "<br />";
        }
    }
}
?>
QuickOldCar Posted December 17, 2011
The above code will only fetch the link itself, not the title of the link or whether it was an image. It also would not handle any self (relative) links.
If your goal is just to display exactly what is on that page without using an iframe, this will do:
<?php
$input = @file_get_contents('http://br.4ce.info/');
if (!$input) {
    echo "No Recommended Sites";
} else {
    echo $input;
}
?>
This will not work for all pages, but for your example I believe it is the easiest route.
I do have piles of code for getting links in many different ways, fixing relative links, and parsing images/links/data. Using DOM or something like simplehtmldom would be a good way to go.
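For reference, here is a minimal sketch of the DOM route combined with cURL, as asked for in the first post. It is only one way to do it, not code from anyone in this thread. The function names fetch_html(), resolve_url() and extract_links() are made up for illustration, and the relative-link handling is deliberately simplified: it ignores ports, "../" segments and <base> tags.

```php
<?php
// Hypothetical helper: download a page with cURL; returns null on failure.
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);
    curl_close($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: turn $href into an absolute URL (simplified rules).
function resolve_url(string $href, string $baseUrl): string
{
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $href)) {
        return $href; // already has a scheme
    }
    $base = parse_url($baseUrl);
    $root = $base['scheme'] . '://' . $base['host'];
    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href; // protocol-relative
    }
    if (strpos($href, '/') === 0) {
        return $root . $href; // root-relative
    }
    // Otherwise resolve against the directory of the base path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;
}

// Collect every http(s) link on the page together with its visible text.
function extract_links(string $html, string $baseUrl): array
{
    if ($html === '') {
        return [];
    }
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // skip empty hrefs and in-page anchors
        }
        $url = resolve_url($href, $baseUrl);
        if (!preg_match('~^https?://~i', $url)) {
            continue; // skip mailto:, javascript:, etc.
        }
        $links[] = ['url' => $url, 'text' => trim($a->textContent)];
    }
    return $links;
}

// Usage inside a widget (uncomment to run against the page from the first post):
// $html = fetch_html('http://br.4ce.info/');
// foreach (extract_links($html ?? '', 'http://br.4ce.info/') as $link) {
//     echo '<a href="' . htmlspecialchars($link['url']) . '">'
//        . htmlspecialchars($link['text']) . '</a><br />';
// }
```

Unlike the regex version, this keeps the link text and also picks up relative links. Escaping with htmlspecialchars() before echoing matters here because the scraped page is untrusted input.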