
Grabbing HTML Table Rows That Are Sorted, But Results Aren't Sorted


kinitex


http://www.forexfbi.com/best-forex-brokers-comparison/

 

I am trying to grab the top 5 rows under the header, which are sorted by column 1 (rating) from highest to lowest. The problem is that my script grabs the first 5 rows before the sort takes place, so it doesn't get the 5 top-rated rows. I am using tablesorter.com to sort.

 

<table id="t15" class="tablesorter {sortlist: [[1,1]]}">

 

http://www.forexfbi.com/bestbrokers.php

 

Here is my script I'm using to get the code:

 

<?php 
include("bestbrokers_array.php");
require('simple_html_dom.php'); 

$table = array(); 

//$html->load_file('http://www.forexfbi.com/best-forex-brokers-comparison/');
$html = file_get_html('http://www.forexfbi.com/best-forex-brokers-comparison/'); 

$num = 1;
foreach ($html->find('tr') as $row) {
    if ($num <= 6) {
        $rating = $row->find('td', 1)->innertext;
        $name = $row->find('td', 0)->plaintext;

        $table[$name][$rating] = true;
        $num++;
    }
}

html_show_array($table);

?> 

 

and bestbrokers_array.php

 

<?php
function do_offset($level){
    $offset = "";             // offset for subarry 
    for ($i=1; $i<$level;$i++){
    $offset = $offset . "<td></td>";
    }
    return $offset;
}

function show_array($array, $level, $sub){
    if (is_array($array) == 1){          // check if input is an array
       foreach($array as $key_val => $value) {
           $offset = "";
           if (is_array($value) == 1){   // array is multidimensional
           echo "<tr>";
           $offset = do_offset($level);
           echo $offset . "<td>" . $key_val . "</td>";
           show_array($value, $level+1, 1);
           }
           else{                        // (sub)array is not multidim
           if ($sub != 1){          // first entry for subarray
               echo "<tr nosub>";
               $offset = do_offset($level);
           }
           $sub = 0;
           echo $offset . "<td main ".$sub." width=\"120\">" . $key_val . 
               "</td>"; 
           echo "</tr>\n";
           }
       } //foreach $array
    }  
    else{ // argument $array is not an array
        return;
    }
}

function html_show_array($array){
  echo "<table cellspacing=\"0\" border=\"2\">\n";
  show_array($array, 1, 0);
  echo "</table>\n";
}
?> 

 

Link to comment
Share on other sites

If the content is not sorted before the JavaScript runs, then you will need to grab ALL the rows, dump them into a multi-dimensional array, then sort it yourself. I'd use a usort() to sort the records to my liking.
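A rough sketch of that approach, assuming the rows have already been scraped into an array of name/rating pairs. The sample data is made up and the `<=>` operator needs PHP 7 or later:

```php
<?php
// Hypothetical sample rows; real values would come from the scraped table.
$rows = array(
    array('name' => 'Broker A', 'rating' => '3.5'),
    array('name' => 'Broker B', 'rating' => '4.8'),
    array('name' => 'Broker C', 'rating' => '4.1'),
);

// Sort by rating, highest first. The float casts make the comparison
// numeric rather than string-based.
usort($rows, function ($a, $b) {
    return (float) $b['rating'] <=> (float) $a['rating'];
});

// Keep only the top N rows (5 in the original question; 2 for this sample).
$top = array_slice($rows, 0, 2);

foreach ($top as $row) {
    echo $row['name'] . ': ' . $row['rating'] . "\n";
}
```

Swapping `$a` and `$b` inside the callback would give lowest-first instead.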

 

I'm not a good enough coder yet to implement that. I was lucky enough just to scrape this script together from tidbits of code around the web and a few tweaks. There has to be a simple solution here that I'm not seeing.

Link to comment
Share on other sites

Ok so how do I go about implementing this?

 

. . .  grab ALL the rows, dump into a multi-dimensional array, then sort it yourself. I'd use a usort() to sort the records to my liking.

 

Are you expecting us to write the code for you? You already figured out how to get the first 6 records, so extend that to get all the records into an array. Then look into asort(), or another array sorting function, as to how to sort the result. So at least make an attempt - then if you have some specific problems post back with what specific step you are trying to achieve, what you have implemented and what is happening.

Link to comment
Share on other sites

I've got this...

 

<?php 

require('simple_html_dom.php'); 

function scraping_table() {

    // create HTML DOM
    $html = file_get_html('http://www.forexfbi.com/best-forex-brokers-comparison/');

    foreach ($html->find('tr') as $row) {

        // get name
        $item['name'] = $row->find('td', 0)->plaintext;
        // get rating
        $item['rating'] = $row->find('td', 1)->innertext;

        $ret[] = $item;
    }

    // clean up memory
    $html->clear();
    unset($html);

    return $ret;
}


$ret = scraping_table();
asort($ret,SORT_NUMERIC);


echo '<table><thead><tr><th>Broker</th><th>Rating</th></tr></thead><tbody>';
foreach($ret as $v) {
    echo '<tr>';
    echo '<td>'.$v['name'].'</td>';

    echo '<td>'.$v['rating'].'</td>';
    echo '</tr>';
}
echo '</tbody></table>';



?> 

 

Which returns like this

 

http://www.forexfbi.com/bestbrokers2.php

 

which obviously isn't right. It is sorting, I'm just not sure in what way lol. It seems to be kinda random.

Link to comment
Share on other sites

Which returns like this

 

http://www.forexfbi.com/bestbrokers2.php

 

For the love of all that is pure and good in this world, put some freaking line breaks in your output. All of your output in that page (starting from the table tag) is on a single line. How do you debug HTML issues?

 

Do something like this:

foreach($ret as $v) {
    echo "<tr>\n";
    echo "<td>{$v['name']}</td>\n";
    echo "<td>{$v['rating']}</td>\n";
    echo "</tr>\n";
}

 

So, as to your problem. You need to analyze the content you are getting and determine how you need to manipulate it. All you are doing is grabbing the raw HTML and trying to use it as data. You need to strip out the HTML and any unnecessary content. You are wanting to sort on rating, correct? Well, the cell that has the rating has a TON of code for the images of the stars including plenty of JavaScript events. You only need/want the rating so you can use that for ordering the results.
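To illustrate the kind of cleanup meant here, a minimal sketch that pulls a numeric rating out of a cell's HTML. The markup in `$cellHtml` and the helper name `extract_rating` are made up for the example; the real page's cell may look different, and if the rating only exists as star images with no number in the text, this will not find it:

```php
<?php
// Made-up cell markup standing in for the real rating cell.
$cellHtml = '<span class="stars"><img src="star.png" onmouseover="x()" alt=""></span> 4.5';

// Hypothetical helper: strip the tags, then take the first number left over.
function extract_rating($html)
{
    $text = trim(strip_tags($html));
    // Match the first integer or decimal number in the remaining text.
    if (preg_match('/\d+(?:\.\d+)?/', $text, $m)) {
        return (float) $m[0];
    }
    return null; // no numeric rating found in the text
}

var_dump(extract_rating($cellHtml));
```

Returning a float (or null) rather than the raw string keeps the later sort numeric.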

 

So, the first thing I would do is use strip_tags() on the values. Then see if the result is what you need to sort properly. The second problem is that asort() will not sort a multi-dimensional array by one of its inner values. You can either use usort() as I suggested (or an alternative that works for multi-dimensional arrays), or you could use a different trick since you are only using two values: put the name as the key and the rating as the value. Then you can and should use asort() (or arsort() if you want the highest rating first).

 

This probably won't be perfect, but it will get you closer

function scraping_table($url)
{
    // create HTML DOM
    require_once('simple_html_dom.php');
    $html = file_get_html($url);

    $result = array();
    foreach($html->find('tr') as $row)
    {
        // Skip rows without at least two <td> cells (e.g. the header row)
        $cells = $row->find('td');
        if (count($cells) < 2) { continue; }

        $name = trim(strip_tags($cells[0]->plaintext));
        $rating = trim(strip_tags($cells[1]->innertext));
        $result[$name] = $rating;
    }

    // clean up memory
    $html->clear();
    unset($html);

    //Sort highest rating first and return the results
    arsort($result, SORT_NUMERIC);
    return $result;
}


$data = scraping_table('http://www.forexfbi.com/best-forex-brokers-comparison/');

echo "<table>\n";
echo "  <thead>\n";
echo "    <tr><th>Broker</th><th>Rating</th></tr>\n";
echo "  </thead>\n";
echo "  <tbody>\n";
foreach($data as $name => $rating)
{
    echo "<tr>\n";
    echo "<td>{$name}</td>\n";
    echo "<td>{$rating}</td>\n";
    echo "</tr>\n";
}
echo "  </tbody>\n";
echo "</table>\n";
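Since the original question was about the top 5 only, here is a small sketch of trimming a name => rating array down to the five highest entries. The broker names and ratings are invented for the example, and using the name as the key assumes no two brokers share a name:

```php
<?php
// Hypothetical name => rating data, as scraped strings.
$data = array(
    'Broker A' => '3.5',
    'Broker B' => '4.8',
    'Broker C' => '4.1',
    'Broker D' => '2.0',
    'Broker E' => '4.9',
    'Broker F' => '3.9',
    'Broker G' => '1.5',
);

// Sort by value, highest first, keeping the name keys attached.
arsort($data, SORT_NUMERIC);

// Take the first five entries; the last argument preserves the keys.
$topFive = array_slice($data, 0, 5, true);

foreach ($topFive as $name => $rating) {
    echo "{$name}: {$rating}\n";
}
```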

Link to comment
Share on other sites

This would be so much easier if I could just wait for the sort to take place before dom parser grabs the rows but I can't figure that out either.

 

It doesn't work like that! The DOM class is reading the contents of the HTML; it does not try to render the content and then run JavaScript. What you view in your browser is the content after the browser has modified it by running JavaScript. So, what you are suggesting is not an option.
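Because the parser only ever sees the raw HTML, the sort has to be reproduced on the server. One possible sketch: read the sort instruction out of the table's class attribute shown in the first post and apply the same column and direction in PHP. The helper name `parse_sortlist` is made up, and the reading of the second number (1 as descending, 0 as ascending) is an assumption based on tablesorter's sortList convention:

```php
<?php
// Hypothetical helper: parse a "sortlist: [[column, direction]]" setting out
// of a class attribute string. Returns null when no such setting is present.
function parse_sortlist($classAttr)
{
    $pattern = '/sortlist:\s*\[\[\s*(\d+)\s*,\s*(\d+)\s*\]\]/i';
    if (!preg_match($pattern, $classAttr, $m)) {
        return null;
    }
    return array(
        'column'     => (int) $m[1], // zero-based column index
        // Assumption: 1 means descending and 0 means ascending.
        'descending' => ((int) $m[2]) === 1,
    );
}

$sort = parse_sortlist('tablesorter {sortlist: [[1,1]]}');
print_r($sort);
```

For the table in this thread that would mean sorting on the second cell (index 1, the rating) from highest to lowest.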

Link to comment
Share on other sites
