Mau-Ro Posted August 1, 2016 (edited)

My script doesn't work correctly: it returns more results than expected. I have been looking for a solution for more than a week. Here is an example:

<?php
include 'simple_html_dom.php';

$city = ('trento,rovereto');
$keyword = ('bar,hotel,cinema,geometra');
$array_city_pre = explode(',', $city);
$array_keyword_pre = explode(',', $keyword);

$list_pre = array("http://$city3.virgilio.it/ricercalocale/searchbu?ref=LOCAL_VIRGILIO&fz=0&usr=1&sgg=0&qs=&dove=&cl=022205&ll=&geoll=&viall=&stxt=$keyword3");

foreach ($array_city_pre as $city3)
foreach ($array_keyword_pre as $keyword3) {
    $list_pre[] = "http://$city3.virgilio.it/ricercalocale/searchbu?ref=LOCAL_VIRGILIO&fz=0&usr=1&sgg=0&qs=&dove=&cl=022205&ll=&geoll=&viall=&stxt=$keyword3";
    if (!empty($list_pre)) {
        foreach ($list_pre as $url_pre) {
            if (!empty($url_pre)) {
                $html_pre = file_get_html($url_pre);
                if (!empty($html_pre)) {
                    foreach ($html_pre->find('//*[@id="search-content"]/section[2]/div/h1/span[1]') as $pre_page); {
                        $page_1 = ($pre_page->innertext / 20);
                        $page = ceil($page_1);
                        $max_page = 100;
                        if ($page >= $max_page) {
                            $page_finale = $max_page;
                        }
                        if ($page < $max_page) {
                            $page_finale = $page;
                        }
                        /* print some output for debugging */
                        echo "citta:$city3 key:$keyword3 pagina:$page_finale</br>";
                        $list = array("http://$city3.virgilio.it/ricercalocale/searchbu?ref=LOCAL_VIRGILIO&fz=0&usr=1&sgg=0&qs=&dove=&cl=022205&ll=&geoll=&viall=&stxt=$keyword3&page=$number");
                        if (!empty($list)) {
                            foreach (range(1, $page_finale) as $number) {
                                $list[] = "http://$city3.virgilio.it/ricercalocale/searchbu?ref=LOCAL_VIRGILIO&fz=0&usr=1&sgg=0&qs=&dove=&cl=022205&ll=&geoll=&viall=&stxt=$keyword3&page=$number";
                            }
                            if (!empty($list)) {
                                foreach ($list as $url) {
                                    echo "$url</br>";
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
?>

Here is the output analysis of the error: http://www.federweb.com/NETBEANS/FSCRAPER/OUTPUT-en.html

Please give me some ideas or help. Thanks very much!
Mauro IT

Edited August 1, 2016 by Mau-Ro

Link to comment: https://forums.phpfreaks.com/topic/301716-help-me-php-simple_htm_dom-for-recursive-scraping/
dalecosp Posted August 2, 2016 (edited)

There are some problems with your script, as near as I can tell at first glance.

1. The array $list_pre was initialized with a URL that included an empty variable. Either just initialize an empty array, or make sure that $keyword3 has a value when the array is initialized.

2. The first "foreach" loop isn't followed by a bracket. I'm not sure what the parser will do with that; it might not matter, but for consistency and legibility I would recommend that every foreach use brackets.

3. You have an errant semicolon here: foreach ($html_pre->find('//*[@id="search-content"]/section[2]/div/h1/span[1]') as $pre_page); It's possible that's the source of the issue.

Edited August 2, 2016 by dalecosp
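To make the three points above concrete, here is a minimal sketch, not a drop-in fix. It starts from empty arrays, puts braces on every loop, and builds each city/keyword pair's page URLs exactly once. In the posted code, the inner loop over $list_pre sits inside the keyword loop, so each new keyword walks every URL collected so far again; that looks like a plausible reason for the extra results, though I have not run the original against the live site. The names buildSearchUrl, pageCount, buildPageUrls and $countResults are hypothetical and do not appear in the original post. $countResults is a stand-in for the file_get_html() lookup of the result count, replaced here by a stub so the sketch runs without network access.

```php
<?php
// Hypothetical helper (not in the original post): builds one search URL.
function buildSearchUrl(string $city, string $keyword, ?int $page = null): string
{
    $url = "http://{$city}.virgilio.it/ricercalocale/searchbu?ref=LOCAL_VIRGILIO"
         . "&fz=0&usr=1&sgg=0&qs=&dove=&cl=022205&ll=&geoll=&viall=&stxt={$keyword}";
    return $page === null ? $url : "{$url}&page={$page}";
}

// Hypothetical helper: pages needed for $total results, capped at $maxPage.
// The 20-per-page and 100-page limits are taken from the original script.
function pageCount(int $total, int $perPage = 20, int $maxPage = 100): int
{
    return (int) min($maxPage, ceil($total / $perPage));
}

// Builds the page URLs for every city/keyword pair, each pair handled once.
// $countResults is an assumed callable returning the number of results;
// in the real script it would wrap the file_get_html() lookup.
function buildPageUrls(array $cities, array $keywords, callable $countResults): array
{
    $urls = [];                              // start empty: no placeholder entry
    foreach ($cities as $city) {             // braces on every loop
        foreach ($keywords as $keyword) {
            $pages = pageCount($countResults($city, $keyword));
            // A for loop yields nothing when $pages is 0,
            // whereas range(1, 0) would return [1, 0].
            for ($page = 1; $page <= $pages; $page++) {
                $urls[] = buildSearchUrl($city, $keyword, $page);
            }
        }
    }
    return $urls;
}

// Effect of the stray semicolon from point 3: the foreach body is empty,
// so the braced block runs once, with $item left at the last element.
$seen = [];
foreach ([1, 2, 3] as $item); {
    $seen[] = $item;
}
// $seen is now [3], not [1, 2, 3]

// Usage with a stub result counter (made-up numbers, for illustration only).
$stubCount = function (string $city, string $keyword): int {
    return $keyword === 'bar' ? 45 : 20;
};
$urls = buildPageUrls(['trento', 'rovereto'], ['bar', 'hotel'], $stubCount);
echo implode("\n", $urls), "\n";
```

On point 2: a foreach without braces governs the single statement that follows it, which here is the whole inner foreach, so the nesting in the posted script still behaves as intended. Braces mainly help readability.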