dodgeitorelse Posted October 7, 2011 Share Posted October 7, 2011 Hi guys, I have a php file that will go to a site and scrape the data I need. However this site is setup to use pagination so when I try to scrape all the players names I have to do separate queries to search each page. Is there a way to find out by using code how many pages there are and query all the pages at same time? I use this code <?php //first page //turn error reporting on libxml_use_internal_errors(true); //get data from this page $dom = new DOMDocument; $dom->loadHTMLFile('http://www.gametracker.com/server_info/76.73.3.42:1716/top_players/?searchipp=50#search'); $xpath = new DOMXPath($dom); // Get the total player count $rows2 = $xpath->query('//div[@class="block774"]/div'); // Get the rows from the search list $rows = $xpath->query('//table[@class="table_lst table_lst_spn"]/tr'); for ($i=1; $i<$rows->length-1; $i++) { $row = $rows->item($i); // Get the columns for a row $cols = $row->getElementsByTagName('td'); // Get the player rank (1st column) echo 'Rank:'.trim($cols->item(0)->textContent).PHP_EOL; // Get the player name (2nd column) echo 'Name:'.trim($cols->item(1)->textContent).PHP_EOL; // Get the player score (3rd column, actually 4th but number 3 is hidden) echo 'Score:'.trim($cols->item(3)->textContent).PHP_EOL; echo "<br />"; } ?> <?php //secondpage //turn error reporting on libxml_use_internal_errors(true); //get data from this page $dom = new DOMDocument; $dom->loadHTMLFile('http://www.gametracker.com/server_info/76.73.3.42:1716/top_players/?searchipp=50&searchpge=2#search'); $xpath = new DOMXPath($dom); // Get the rows from the search list $rows = $xpath->query('//table[@class="table_lst table_lst_spn"]/tr'); for ($i=1; $i<$rows->length-1; $i++) { $row = $rows->item($i); // Get the columns for a row $cols = $row->getElementsByTagName('td'); // Get the player rank (1st column) echo 'Rank:'.trim($cols->item(0)->textContent).PHP_EOL; // Get the player name (2nd column) echo 'Name:'.trim($cols->item(1)->textContent).PHP_EOL; // Get the player score (3rd column) echo 'Score:'.trim($cols->item(3)->textContent).PHP_EOL; echo "<br />"; } ?> I also have to go to that website first to see how many pages there are so I can have enough queries. Quote Link to comment Share on other sites More sharing options...
jcbones Posted October 7, 2011 Share Posted October 7, 2011 Why can't you scrape the links, and then use those to continue? Quote Link to comment Share on other sites More sharing options...
Psycho Posted October 7, 2011 Share Posted October 7, 2011 Look for this: <a href="/server_info/76.73.3.42:1716/top_players/?searchipp=50&searchpge=2#search"> > </a> That is the > button for the next page. The searchpage value will be variable. But, if that link exists, that means there is another page. When there are no more pages the > on the page is not a link <span> > </span> So, as long as that link exists you know there is another page. So, just create a while() loop to continue getting the next page as long as that link exists. Quote Link to comment Share on other sites More sharing options...
dodgeitorelse Posted October 7, 2011 Author Share Posted October 7, 2011 ok thank you both, I will give this a try. I will let you know results. Thanks Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.