dilbertone Posted November 26, 2010

Hello dear PHP friends,

I am currently working on a small parser project and need solutions for two parts:

a. the fetching part
b. the parser part

Here are the target URLs. For an overview, see:

http://dms-schule.bildung.hessen.de/index.html
http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html

Search by pressing the "type" button and then selecting all schools with the mouse. The result is 2400 schools.

To help with the target, here are some detail pages on the same server:

http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=9009
http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=9742
http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=9871

As you can see, I have to iterate over the pages with a function (a loop):

http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=1000 to 10000

After fetching the pages I have to check which ones are empty; those do not need to be parsed.

I want to do this with curl-multi, since as far as I can tell it is the most advanced way to do this. I see that I have an array that can be filled, but I have to think about the string concatenation; I guess it needs to be somewhat sophisticated. This one does not fit:

for($i=1;$i<=$match[1];$i++) { $url = "http://www.example.com/page?page={$i}";

Besides this, I have an array that I can fill.
Can you help me run this in a loop with the following script?

<?php
/************************************\
 * Multi interface in PHP with curl  *
 * Requires PHP 5.0, Apache 2.0 and  *
 * Curl                              *
 *************************************
 * Written By Cyborg 19671897        *
 * Bugfixed by Jeremy Ellman         *
\***********************************/

$urls = array(
    "http://www.google.com/",
    "http://www.altavista.com/",
    "http://www.yahoo.com/"
);

$mh = curl_multi_init();
$conn = array();
$res = array();

foreach ($urls as $i => $url) {
    $conn[$i] = curl_init($url);
    curl_setopt($conn[$i], CURLOPT_RETURNTRANSFER, 1);  // return data as string
    curl_setopt($conn[$i], CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
    curl_setopt($conn[$i], CURLOPT_MAXREDIRS, 2);       // maximum redirects
    curl_setopt($conn[$i], CURLOPT_CONNECTTIMEOUT, 10); // connect timeout in seconds
    curl_multi_add_handle($mh, $conn[$i]);
}

// Run all transfers; wait for socket activity instead of spinning the CPU
do {
    $n = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh, 1.0);
    }
} while ($active);

foreach ($urls as $i => $url) {
    $res[$i] = curl_multi_getcontent($conn[$i]);
    curl_multi_remove_handle($mh, $conn[$i]);
    curl_close($conn[$i]);
}
curl_multi_close($mh);

print_r($res);
?>
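One way to fill the array for the loop asked about above is sketched below. It is a minimal sketch, assuming the IDs run from 1000 to 10000 as stated in the post. The helper names (`buildSchoolUrls`, `fetchAll`, `dropEmpty`) are hypothetical, and so is the "empty page" check: the post does not say what an empty result page on this server looks like, so the check simply treats a blank body as empty.

```php
<?php
// Base URL taken from the post; only the numeric ID at the end changes.
const BASE_URL = 'http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=';

// Build id => URL pairs by plain string concatenation (hypothetical helper).
function buildSchoolUrls(int $first, int $last): array
{
    $urls = [];
    foreach (range($first, $last) as $id) {
        $urls[$id] = BASE_URL . $id;
    }
    return $urls;
}

// Fetch one batch of URLs in parallel; returns id => body (hypothetical helper).
function fetchAll(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $id => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
        curl_multi_add_handle($mh, $ch);
        $handles[$id] = $ch;
    }

    // Drive the transfers; curl_multi_select() waits for activity instead of spinning.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh, 1.0);
        }
    } while ($active && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $id => $ch) {
        $bodies[$id] = (string) curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $bodies;
}

// Assumption: "empty" means a blank body; adapt once the real empty page is known.
function dropEmpty(array $bodies): array
{
    return array_filter($bodies, fn($body) => trim($body) !== '');
}
```

A run over the whole range could then go batch by batch, for example `foreach (array_chunk(buildSchoolUrls(1000, 10000), 20, true) as $batch) { $pages = dropEmpty(fetchAll($batch)); }`. The batch size of 20 is an arbitrary choice, meant only to avoid opening roughly 9000 connections at once.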
requinix Posted November 26, 2010

Question: Do the people running that hessen.de site know you're going to take information from it? Have they specifically told you it's okay?

"As you can see, I have to iterate over the pages with a function (a loop): http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=1000 to 10000. After fetching the pages I have to check which ones are empty; those do not need to be parsed."

That is a horrible idea. Get a list of schools from the site - one way or another.
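The suggestion to start from a list of schools could be sketched as follows. This assumes the search result page links each school with the same `show_school=<id>` query parameter seen in the detail URLs; whether the listing HTML actually looks like that is an assumption, and `extractSchoolIds` is a hypothetical helper.

```php
<?php
// Collect the distinct school IDs linked from a listing page's HTML.
// Assumption: links carry the show_school=<id> parameter seen in the detail URLs.
function extractSchoolIds(string $html): array
{
    // \b lets the parameter follow "?", "&" or an HTML-escaped "&amp;".
    preg_match_all('/\bshow_school=(\d+)/', $html, $matches);
    $ids = array_unique(array_map('intval', $matches[1]));
    sort($ids); // sort() also reindexes the array from 0
    return $ids;
}
```

Only the IDs found this way would then need to be fetched, so no request is spent on school numbers that do not exist.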