qam47 Posted May 17, 2015

I am trying to fetch the HTML source code ... let me explain:

1: On the news page of the website there are headline links.
2: Go inside those links and fetch the HTML of that page.
3: With Simple DOM, fetch only the article image.
4: Output all the images.

But I am stuck on the 2nd part. Here is the code:

<?php
require('simple_html_dom.php');

$url = 'http://www.goal.com/en/news/archive/1/';
$site_url = 'http://www.goal.com/';
$html = file_get_html($url);
$links = array();

// List Title Links of on the page...........
// --------------------------------------------
foreach($html->find('.imgBox') as $tl) {
    $url_inner = $tl->find('a', 0)->href;

    // Inside of the Title Links
    // --------------------------------------------
    $innerpage = file_get_contents($site_url . $url_inner);
    $html_innerpage = file_get_html($innerpage);
    echo $html_innerpage;
}
?>
Ch0cu3r Posted May 17, 2015 (edited)

For step 2, all you need to do is pass $site_url . $url_inner to file_get_html():

$html_innerpage = file_get_html($site_url . $url_inner); // get the article html

For step 3, use $tl->find() to find the article image ( .article-image ). For step 4, echo the image.
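One thing to watch when building the URL with $site_url . $url_inner: $site_url ends in a slash, so if the href also starts with one, the result contains a doubled slash. Here is a minimal sketch of a helper that avoids that. join_url() is a hypothetical name (it is not part of simple_html_dom), the example hrefs are made up, and it assumes every relative href is meant to be relative to the site root.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): join a base URL and an
// href without producing a doubled or missing slash.
// Assumption: relative hrefs are relative to the site root.
function join_url(string $base, string $href): string
{
    // Hrefs that are already absolute (http:// or https://) are returned as-is.
    if (preg_match('#^https?://#i', $href)) {
        return $href;
    }
    // Strip trailing slashes from the base and leading slashes from the href,
    // then join them with exactly one slash.
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

$site_url = 'http://www.goal.com/';

// Made-up hrefs for illustration; both lines print the same URL.
echo join_url($site_url, '/en/news/example-article'), "\n";
echo join_url($site_url, 'en/news/example-article'), "\n";
```

The result of join_url($site_url, $url_inner) could then be passed to file_get_html() in place of the plain concatenation.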
qam47 Posted May 17, 2015

Thanks for the reply, but I am still confused. Here is what I did so far:

<?php
require('simple_html_dom.php');

$url = 'http://www.goal.com/en/news/archive/1/';
$site_url = 'http://www.goal.com/';
$html = file_get_html($url);
$links = array();

// List Title Links of on the page...........
// --------------------------------------------
foreach($html->find('.imgBox') as $tl) {
    $url_inner = $tl->find('a', 0)->href;

    // Inside of the Title Links
    // --------------------------------------------
    $innerpage = file_get_html($site_url . $url_inner);
    $images = $tl->find('. article-image', 0);
    $item['image'] = $images;
    $allimg[] = $item;
}

foreach($allimg as $tl){
    echo '
    <item>
        <images>' . $tl['image'] . '</images>
    </item>
    ';
}
?>
Ch0cu3r Posted May 17, 2015

Sorry, I meant to use $innerpage->find('.article-image'), not $tl->find('.article-image').

However, I think step 2 actually means to get the thumbnail image shown with the headline link, and not the actual image from the article of the headline? In which case the foreach loop needs to be:

// get headline link in <div class="imgBox">
foreach($html->find('.imgBox') as $lt) {
    // get the article url from the anchor tag href attribute
    $headline_link = $lt->find('a', 0)->href;

    // get the image url from the image tag src attribute
    $headline_image = $lt->find('img', 0)->src;

    $allimg[]['image'] = $headline_image;
}

Maybe you need to clarify what step 2 means.
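For anyone who wants to try the thumbnail loop without hitting the live site, here is a rough offline sketch of the same idea. Note that it swaps simple_html_dom for PHP's built-in DOMDocument and DOMXPath, so it is not the code from this thread. The sample markup, the headline_images() name and the 'link' key are assumptions made up for illustration; only the imgBox class and the a/img lookups come from the posts above. It also skips any box that is missing a link or an image instead of failing on it.

```php
<?php
// Made-up sample markup standing in for the archive page (an assumption,
// not the real goal.com HTML).
$sample = <<<HTML
<div class="imgBox"><a href="/en/news/one"><img src="/img/one.jpg"></a></div>
<div class="imgBox"><a href="/en/news/two"></a></div>
<div class="other"><a href="/en/news/three"><img src="/img/three.jpg"></a></div>
HTML;

// Hypothetical helper: collect the headline link and thumbnail src from every
// element carrying the imgBox class.
function headline_images(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector ".imgBox".
    $boxes = $xpath->query(
        '//*[contains(concat(" ", normalize-space(@class), " "), " imgBox ")]'
    );

    $allimg = [];
    foreach ($boxes as $box) {
        // First <a> and first <img> inside this box, or null when absent.
        $link = $xpath->query('.//a', $box)->item(0);
        $img  = $xpath->query('.//img', $box)->item(0);
        if ($link === null || $img === null) {
            continue; // skip boxes without both a link and an image
        }
        $allimg[] = [
            'link'  => $link->getAttribute('href'),
            'image' => $img->getAttribute('src'),
        ];
    }
    return $allimg;
}

print_r(headline_images($sample));
```

With the sample above, only the first box yields an entry: the second has no image and the third does not carry the imgBox class.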