kinitex Posted September 11, 2012

http://www.forexfbi.com/best-forex-brokers-comparison/

I am trying to grab the top 5 rows under the header, which are sorted by column 1 (rating) from highest to lowest. The problem is that my script grabs the first 5 rows before the sort takes place, so it's obviously not getting the 5 top-rated rows. I am using tablesorter.com to sort.

<table id="t15" class="tablesorter {sortlist: [[1,1]]}">

http://www.forexfbi.com/bestbrokers.php

Here is the script I'm using to get the rows:

    <?php
    include("bestbrokers_array.php");
    require('simple_html_dom.php');

    $table = array();
    //$html->load_file('http://www.forexfbi.com/best-forex-brokers-comparison/');
    $html = file_get_html('http://www.forexfbi.com/best-forex-brokers-comparison/');
    $num=1;
    foreach($html->find('tr') as $row) {
        if($num <= 6) {
            $rating = $row->find('td',1)->innertext;
            $name = $row->find('td',0)->plaintext;
            $table[$name][$rating] = true;
            $num++;
        }
    }
    html_show_array($table);
    ?>

and bestbrokers_array.php:

    <?php
    function do_offset($level){
        $offset = ""; // offset for subarray
        for ($i=1; $i<$level;$i++){
            $offset = $offset . "<td></td>";
        }
        return $offset;
    }

    function show_array($array, $level, $sub){
        if (is_array($array) == 1){ // check if input is an array
            foreach($array as $key_val => $value) {
                $offset = "";
                if (is_array($value) == 1){ // array is multidimensional
                    echo "<tr>";
                    $offset = do_offset($level);
                    echo $offset . "<td>" . $key_val . "</td>";
                    show_array($value, $level+1, 1);
                }
                else{ // (sub)array is not multidim
                    if ($sub != 1){ // first entry for subarray
                        echo "<tr nosub>";
                        $offset = do_offset($level);
                    }
                    $sub = 0;
                    echo $offset . "<td main ".$sub." width=\"120\">" . $key_val . "</td>";
                    echo "</tr>\n";
                }
            } //foreach $array
        }
        else{ // argument $array is not an array
            return;
        }
    }

    function html_show_array($array){
        echo "<table cellspacing=\"0\" border=\"2\">\n";
        show_array($array, 1, 0);
        echo "</table>\n";
    }
    ?>
Psycho Posted September 11, 2012

If the content is not sorted before the JS runs, then you will need to grab ALL the rows, dump them into a multi-dimensional array, then sort it yourself. I'd use usort() to sort the records to my liking.
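A minimal sketch of the usort() approach described above, assuming the scraped rows have already been collected into an array of name/rating pairs. The sample data and the `rating` and `name` keys are hypothetical; the real values would come from the scraper.

```php
<?php
// Hypothetical sample rows standing in for the scraped table.
$rows = array(
    array('name' => 'Broker A', 'rating' => 3.5),
    array('name' => 'Broker B', 'rating' => 4.8),
    array('name' => 'Broker C', 'rating' => 2.0),
    array('name' => 'Broker D', 'rating' => 4.1),
    array('name' => 'Broker E', 'rating' => 4.9),
    array('name' => 'Broker F', 'rating' => 1.2),
);

// Sort by rating, highest first. The callback returns an integer that is
// negative, zero, or positive, as usort() expects.
usort($rows, function ($a, $b) {
    if ($a['rating'] == $b['rating']) {
        return 0;
    }
    return ($a['rating'] > $b['rating']) ? -1 : 1;
});

// Keep only the five highest-rated rows.
$top5 = array_slice($rows, 0, 5);

foreach ($top5 as $row) {
    echo $row['name'] . ': ' . $row['rating'] . "\n";
}
```

Because the sort happens in PHP after all rows are read, it does not matter in which order the rows appear in the page source.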
kinitex (Author) Posted September 11, 2012

"If the content is not sorted before the JS runs, then you will need to grab ALL the rows, dump them into a multi-dimensional array, then sort it yourself. I'd use usort() to sort the records to my liking."

I'm not a good enough coder yet to implement that. I was lucky enough just to scrape this script together from tidbits of code around the web and a few tweaks. There has to be a simple solution here that I'm not seeing.
Christian F. Posted September 11, 2012

That is the simple solution, unless you want to write a JS parser to interact with a DOMDocument object?
kinitex (Author) Posted September 11, 2012

Ok so how do I go about implementing this?
Psycho Posted September 11, 2012

"Ok so how do I go about implementing this?"

". . . grab ALL the rows, dump into a multi-dimensional array, then sort it yourself. I'd use a usort() to sort the records to my liking."

Are you expecting us to write the code for you? You already figured out how to get the first 6 records, so extend that to get all the records into an array. Then look into asort(), or another array sorting function, to see how to sort the result. So at least make an attempt. Then, if you have specific problems, post back with the step you are trying to achieve, what you have implemented, and what is happening.
kinitex (Author) Posted September 11, 2012

I've got this...

    <?php
    require('simple_html_dom.php');

    function scraping_table() {
        // create HTML DOM
        $html = file_get_html('http://www.forexfbi.com/best-forex-brokers-comparison/');
        foreach($html->find('tr') as $row) {
            // get name
            $item['name'] = $row->find('td',0)->plaintext;
            // get rating
            $item['rating'] = $row->find('td',1)->innertext;
            $ret[] = $item;
        }
        // clean up memory
        $html->clear();
        unset($html);
        return $ret;
    }

    $ret = scraping_table();
    asort($ret,SORT_NUMERIC);
    echo '<table><thead><tr><th>Broker</th><th>Rating</th></tr></thead><tbody>';
    foreach($ret as $v) {
        echo '<tr>';
        echo '<td>'.$v['name'].'</td>';
        echo '<td>'.$v['rating'].'</td>';
        echo '</tr>';
    }
    echo '</tbody></table>';
    ?>

Which returns like this http://www.forexfbi.com/bestbrokers2.php which obviously isn't right. It is sorting, I'm just not sure in what way lol, it seems to be kind of random.
kinitex (Author) Posted September 12, 2012

Still looking for help on this issue, thanks!
Psycho Posted September 12, 2012

"Which returns like this http://www.forexfbi.com/bestbrokers2.php"

For the love of all that is pure and good in this world, put some freaking line breaks in your output. All of your output in that page (starting from the table tag) is on a single line. How do you debug HTML issues? Do something like this:

    foreach($ret as $v) {
        echo "<tr>\n";
        echo "<td>{$v['name']}</td>\n";
        echo "<td>{$v['rating']}</td>\n";
        echo "</tr>\n";
    }

So, as to your problem. You need to analyze the content you are getting and determine how you need to manipulate it. All you are doing is grabbing the raw HTML and trying to use it as data. You need to strip out the HTML and any unnecessary content. You want to sort on rating, correct? Well, the cell that has the rating has a TON of code for the images of the stars, including plenty of JavaScript events. You only need the rating so you can use it for ordering the results. So, the first thing I would do is use strip_tags() on the values. Then see if the result is what you need to sort properly.

The second problem is that asort() will not work on a multi-dimensional array. You can either use usort() as I suggested (or another alternative that works for multi-dimensional arrays), or you could use a different trick since you are only using two values: put the name as the key and the rating as the value. Then you can and should use asort().
This probably won't be perfect, but it will get you closer:

    function scraping_table($url) {
        // create HTML DOM
        require_once('simple_html_dom.php');
        $html = file_get_html($url);
        $result = array();
        foreach($html->find('tr') as $row) {
            $nameCell = $row->find('td',0);
            $ratingCell = $row->find('td',1);
            // skip rows that do not have two <td> cells,
            // such as a header row made of <th> cells
            if (!$nameCell || !$ratingCell) {
                continue;
            }
            $name = strip_tags(trim($nameCell->plaintext));
            $rating = strip_tags(trim($ratingCell->innertext));
            $result[$name] = $rating;
        }
        // clean up memory
        $html->clear();
        unset($html);
        //Sort and return the results
        asort($result, SORT_NUMERIC);
        return $result;
    }

    $data = scraping_table('http://www.forexfbi.com/best-forex-brokers-comparison/');

    echo "<table>\n";
    echo " <thead>\n";
    echo " <tr><th>Broker</th><th>Rating</th></tr>\n";
    echo " </thead>\n";
    echo " <tbody>\n";
    foreach($data as $name => $rating) {
        echo "<tr>\n";
        echo "<td>{$name}</td>\n";
        echo "<td>{$rating}</td>\n";
        echo "</tr>\n";
    }
    echo " </tbody>\n";
    echo "</table>\n";
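A possible follow-up to the name-as-key trick, sketched under assumptions: the text left in the rating cell after strip_tags() is not known here, so the sample strings below and the `extract_rating()` helper are hypothetical. The sketch pulls the first number out of each cell, sorts highest first with arsort() (the original post asked for highest to lowest, whereas asort() sorts lowest first), and keeps the top five.

```php
<?php
// Hypothetical cell text after strip_tags(); real cells may differ.
$raw = array(
    'Broker'   => 'Rating',        // a header row that slipped through
    'Broker A' => '3.5 stars',
    'Broker B' => ' 4.8',
    'Broker C' => '2',
    'Broker D' => '4.1 out of 5',
    'Broker E' => '4.9',
    'Broker F' => '1.2',
);

// Hypothetical helper: return the first number found in the text, or null.
function extract_rating($text)
{
    if (preg_match('/\d+(?:\.\d+)?/', $text, $m)) {
        return (float) $m[0];
    }
    return null;
}

$ratings = array();
foreach ($raw as $name => $text) {
    $value = extract_rating($text);
    if ($value !== null) {
        $ratings[$name] = $value; // rows without a number are dropped
    }
}

// arsort() sorts by value in descending order and keeps the name keys.
arsort($ratings, SORT_NUMERIC);

// First five entries; passing true keeps the keys attached to the values.
$top5 = array_slice($ratings, 0, 5, true);

foreach ($top5 as $name => $rating) {
    echo "{$name}: {$rating}\n";
}
```

One caveat of using the name as the key is that two rows with the same broker name would overwrite each other, which is a reason to prefer the usort() variant if duplicates are possible.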
Christian F. Posted September 12, 2012

Try with a larger set of source data, as the current set of 2 items doesn't really say a whole lot about the sorting.
kinitex (Author) Posted September 12, 2012

Ok, I'm willing to pay someone to finish this off for me. There are a few other small features I'd like added that I have no idea how to implement. Please PM me if you're interested.
kinitex (Author) Posted September 12, 2012

This would be so much easier if I could just wait for the sort to take place before the DOM parser grabs the rows, but I can't figure that out either.
Psycho Posted September 12, 2012

"This would be so much easier if I could just wait for the sort to take place before dom parser grabs the rows but I can't figure that out either."

It doesn't work like that! The DOM class reads the contents of the HTML; it does not try to render the content and then run JavaScript. What you view in your browser is the content after the browser has modified it by running JavaScript. So, what you are suggesting is not an option.
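To illustrate the point about server-side parsing, here is a small sketch. It uses PHP's built-in DOMDocument instead of simple_html_dom so that it runs without extra files, and the inline HTML is a hypothetical stand-in for the page source. The `sortlist` hint in the class attribute is just text to the parser, so the rows come back in source order.

```php
<?php
// Hypothetical page source: the table as the server sends it, unsorted.
$html = '<table id="t15" class="tablesorter {sortlist: [[1,1]]}">'
      . '<tr><th>Broker</th><th>Rating</th></tr>'
      . '<tr><td>Broker A</td><td>3.5</td></tr>'
      . '<tr><td>Broker B</td><td>4.8</td></tr>'
      . '<tr><td>Broker C</td><td>2.0</td></tr>'
      . '</table>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$names = array();
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = $tr->getElementsByTagName('td');
    if ($cells->length < 2) {
        continue; // the header row here uses <th>, so it has no <td> cells
    }
    $names[] = trim($cells->item(0)->textContent);
}

// The order matches the markup; no JavaScript sort was ever executed.
echo implode(', ', $names) . "\n";
```

Any ordering by rating therefore has to be done in PHP after the rows are read.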