Jump to content

Recommended Posts

I am fetching an HTML table using the Simple HTML DOM Parser. I get $col[0], $col[1], $col[3] and $col[4] for each row in the HTML table, these are columns. This is ran through a foreach to loop through every row. For testing purposes, I output the following:

 

$col[0] --- $col[1] --- $col[2] --- $col[3]

 

11:23:26 PM ---- MEDICAL ---- 41ST ST ---- E19

11:23:26 PM ---- MEDICAL ---- 41ST ST ---- M12

10:45:26 PM ---- MEDICAL ---- MAIN ST ---- E20

10:45:26 PM ---- MEDICAL ---- MAIN ST ---- M62

10:45:26 PM ---- MEDICAL ---- MAIN ST ---- T20

09:49:19 PM ---- RESCUE ---- BROADWAY AVE ---- E27

 

The first three colums are always identical for the grouping. The third column is what is different. I want to combine the data like so:

 

11:23:26 PM ---- MEDICAL ---- 41ST ST ---- E19, M12

10:45:26 PM ---- MEDICAL ---- MAIN ST ---- E20, M62, T20

09:49:19 PM ---- RESCUE ---- BROADWAY AVE ---- E27

 

How would I detect that "if columns 1, 2 and 3 are the same, then take the data from column 4 for each row"? Note that the number of rows could fluctuate between 1 to around 10-15 on average. Each row specifies a different apparatus/unit (this is a fire department listing of incidents), but this wastes a lot of vertical space and is overly redundant.

 

My specific code is below:

 



<?php

require 'simple_html_dom.php';

$url = "http://www.website.com/html_table.html";

$sHtml = file_get_contents($url);

$oHTML = str_get_html($sHtml);
$oTRs = $oHTML->find('table tr');
$aData = array();

foreach($oTRs as $oTR) {
$aRow = array();
$oTDs = $oTR->find('td');

foreach($oTDs as $oTD) {
$aRow[] = trim($oTD->plaintext);
}

echo $aRow[0];
echo " ---- ";
echo $aRow[1];
echo " ---- ";
echo $aRow[2];
echo " ---- ";
echo $aRow[3];
echo "<br>\n";

$aData[] = $aRow;
}

?>

Edited by gamefreak13

I can't edit the title of the thread, but this does not involve any MySQL whatsoever at this point. I will eventually be inserting the data into a MySQL database, but that has nothing to do with my question, so please disregard the MySQL title.

In an effort to make this easier for those helping, this may just be a simple matter of working with PHP Arrays. Based on the above code, I removed all the ECHO's and changed it to print_r($aRow); and got this array. Now if only I knew how to work with arrays. :(

Array
(
)
Array
(
    [0] => 1:01:27 AM
    [1] => MEDICAL
    [2] => MAIN ST
    [3] => E3
)
Array
(
    [0] => 1:01:27 AM
    [1] => MEDICAL
    [2] => MAIN ST
    [3] => M36
)
Array
(
    [0] => 12:35:51 AM
    [1] => RESCUE
    [2] => BROADWAY AVE
    [3] => E20
)
Array
(
    [0] => 12:35:51 AM
    [1] => RESCUE
    [2] => BROADWAY AVE
    [3] => M31
)
Array
(
    [0] => 12:35:51 AM
    [1] => RESCUE
    [2] => BROADWAY AVE
    [3] => M20
)
Array
(
    [0] => 12:16:45 AM
    [1] => MEDICAL
    [2] => 2ND ST
    [3] => E17
)
Array
(
    [0] => 12:16:45 AM
    [1] => MEDICAL
    [2] => 2ND ST
    [3] => M62
)

Edited by gamefreak13

You really need to learn arrays as they are the primary data type of PHP and you will be using them extensively.

 

The idea is to use the identifying data as the keys to arrays of arrays. Your unique data would be time and address.

 

Here's an example of one way to do it.

<?php

// The data we'll use
$data = array(
    array(),
    array('1:01:27 AM', 'MEDICAL', 'MAIN ST', 'E3'),
    array('1:01:27 AM', 'MEDICAL', 'MAIN ST', 'M36'),
    array('12:35:51 AM', 'RESCUE', 'BROADWAY AVE', 'E20'),
    array('12:35:51 AM', 'RESCUE', 'BROADWAY AVE', 'M31'),
    array('1:01:27 AM', 'RESCUE', 'MAIN ST', 'R51')
);



/* Data type to use

array(
    'time' => array(
        'location' => array(
            'call_type' => array(
                'apparatus' => array()
            )
        )
    )
);

*/

// Print data before
print("Before: " . print_r($data, true));

// Final data array
$mappedData = array();

// Iterate through the data
foreach($data as $entry)
{
    if (empty($entry))
        continue; // skip empty entry

    if (count($entry) != 4)
        continue; // skip invalid data

    // Copy the array into meaningful variable names
    list($time, $call_type, $location, $apparatus) = $entry;

    // Is there a mapped entry for this time?
    if (empty($mappedData[$time]))
    {
        // No, so where this is brand new, create a whole new
        // entry in mapped data
        $mappedData[$time] = array(
            $location => array(
                $call_type => array(
                    $apparatus
                )
            )
        );
    }
    else
    {
        // There's a mapped entry for this time.
        // So what about location?
        if (empty($mappedData[$time][$location]))
        {
            // No mapped entry for this location, so add
            // a new call type and the apparatus
            $mappedData[$time][$location] = array(
                $call_type => array($apparatus));
        }
        else
        {
            // There's an entry for this time and location,
            // so add a new apparatus entry
            $mappedData[$time][$location][$call_type][] = $apparatus;
        }
    }
}

// Print what the array looks like now
print("\nAfter: " . print_r($mappedData, true));
print("\n\nPretty:\n");

// Pretty-up the display
// Iterate over the mapped data
// First the time is a key to the location array
foreach ($mappedData as $time=>$location)
{
    // The address is the key to the call_type array
    foreach ($location as $address => $call_type)
    {
        // The call_type is the key to the apparatus array
        foreach ($call_type as $call => $apparatus)
        {
            // Create a comma-separated string out of the apparatus array
            print("$time -- $address -- $call -- " . implode(', ', $apparatus) . "\n");
        }
    }
}

When I run this at the command line, I get

Before: Array
(
    [0] => Array
        (
        )
    [1] => Array
        (
            [0] => 1:01:27 AM
            [1] => MEDICAL
            [2] => MAIN ST
            [3] => E3
        )
    [2] => Array
        (
            [0] => 1:01:27 AM
            [1] => MEDICAL
            [2] => MAIN ST
            [3] => M36
        )
    [3] => Array
        (
            [0] => 12:35:51 AM
            [1] => RESCUE
            [2] => BROADWAY AVE
            [3] => E20
        )
    [4] => Array
        (
            [0] => 12:35:51 AM
            [1] => RESCUE
            [2] => BROADWAY AVE
            [3] => M31
        )
    [5] => Array
        (
            [0] => 1:01:27 AM
            [1] => RESCUE
            [2] => MAIN ST
            [3] => R51
        )
)

After: Array
(
    [1:01:27 AM] => Array
        (
            [MAIN ST] => Array
                (
                    [MEDICAL] => Array
                        (
                            [0] => E3
                            [1] => M36
                        )
                    [RESCUE] => Array
                        (
                            [0] => R51
                        )
                )
        )
    [12:35:51 AM] => Array
        (
            [BROADWAY AVE] => Array
                (
                    [RESCUE] => Array
                        (
                            [0] => E20
                            [1] => M31
                        )
                )
        )
)


Pretty:
1:01:27 AM -- MAIN ST -- MEDICAL -- E3, M36
1:01:27 AM -- MAIN ST -- RESCUE -- R51
12:35:51 AM -- BROADWAY AVE -- RESCUE -- E20, M31

You say you intend to store the data in a database table. I hope you are NOT considering storing values like "E20, M31" in a single field.

 

This should do it

$data = array (
    array('11:23:26 PM' , 'MEDICAL' , '41ST ST' , 'E19'),
    array('11:23:26 PM' , 'MEDICAL' , '41ST ST' , 'M12'),
    array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'E20'),
    array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'M62'),
    array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'T20'),
    array('09:49:19 PM' , 'RESCUE' , 'BROADWAY AVE' , 'E27')
);

$new = array();

foreach ($data as $arr) {
    $key = join('|', array_slice($arr,0,3));
    if (isset($new[$key])) {
        $new[$key][3][] = $arr[3];
    } else {
        $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3]));
    }
}

echo '<pre>';
foreach ($new as $arr) {
    printf("%-12s | %-12s | %-15s | %s\n", $arr[0], $arr[1], $arr[2], join(', ', $arr[3]));
}
echo '</pre>';

/****************** RESULT ***********************************

11:23:26 PM  | MEDICAL      | 41ST ST         | E19, M12
10:45:26 PM  | MEDICAL      | MAIN ST         | E20, M62, T20
09:49:19 PM  | RESCUE       | BROADWAY AVE    | E27

 
**************************************************************/
Thank you both for your replies. In my effort to simplify/remove things for this thread, it seems I've made things more difficult.

 

Below Barand's code intermingled with my code that returns the array of data. I've done something horribly wrong, and can't get it to work.

 

Below is the entire PHP code (note the require_once of simple_html_dom.php though) and the raw HTML output when the PHP code is ran. If you need the common simple_html_dom.php file, it can be downloaded at http://sourceforge.net/projects/simplehtmldom/files/

 

Please replace all X's in the following URL for the actual source data (and therefore PHP array). PLEASE, do not post the actual link. I don't need search engines picking it up, and this forum doesn't allow editing of posts. 

 

And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL.

 



<?php

require 'simple_html_dom.php';

$sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/");

$oHTML = str_get_html($sHtml);
$oTRs = $oHTML->find('table tr');

foreach($oTRs as $oTR) {
$fdarray = array();
$oTDs = $oTR->find('td');

foreach($oTDs as $oTD) {
$fdarray[] = trim($oTD->plaintext);
}

}

print_r($fdarray);
echo "\n\n\n\n\n<hr>\n\n\n\n\n\n";

$new = array();

foreach ($fdarray as $arr) {
$key = join('|', array_slice($arr,0,3));
if (isset($new[$key])) {
$new[$key][3][] = $arr[3];
} else {
$new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3]));
}
}

foreach ($new as $arr) {
echo "\n\n\n\n\n\n\n<hr>\n\n";
echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n";
}

?>


Array
(
[0] => 6/8/2013 2:00:17 PM
[1] => Medical
[2] => TRIANA ST
[3] =>
[4] => M36
)

(note the [3] really was empty and is normal, that table column was empty for that specific row)



<hr>





<br />
<b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />







<hr>

6 --- / --- 8 --- /, i, A, , <br>

Edited by gamefreak13

And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL.

 

Don't. Instead of

 

 

+--------------+--------------+-----------------+---------------+
|  Time        |   Type       |  Address        |  Codes        |
+--------------+--------------+-----------------+---------------+
| 11:23:26 PM  | MEDICAL      | 41ST ST         | E19, M12      |
| 10:45:26 PM  | MEDICAL      | MAIN ST         | E20, M62, T20 |
| 09:49:19 PM  | RESCUE       | BROADWAY AVE    | E27           |
+--------------+--------------+-----------------+---------------+

normalize your data correctly like this

 

Table1
+----+--------------+--------------+-----------------+
| Id |  Time        |   Type       |  Address        |
+----+--------------+--------------+-----------------+
|  1 | 11:23:26 PM  | MEDICAL      | 41ST ST         |
|  2 | 10:45:26 PM  | MEDICAL      | MAIN ST         |
|  3 | 09:49:19 PM  | RESCUE       | BROADWAY AVE    |
+----+--------------+--------------+-----------------+

Table2
+----+--------------+----------+
| Id |  Table1_id   |  Code    |
+----+--------------+----------+
|  1 |      1       |  E19     |
|  2 |      1       |  M12     |
|  3 |      2       |  E20     |
|  4 |      2       |  M62     |
|  5 |      2       |  T20     |
|  6 |      3       |  E27     |
+----+--------------+----------+
 

 

Thank you both for your replies. In my effort to simplify/remove things for this thread, it seems I've made things more difficult.
 
Below Barand's code intermingled with my code that returns the array of data. I've done something horribly wrong, and can't get it to work.
 
Below is the entire PHP code (note the require_once of simple_html_dom.php though) and the raw HTML output when the PHP code is ran. If you need the common simple_html_dom.php file, it can be downloaded at http://sourceforge.net/projects/simplehtmldom/files/
 
Please replace all X's in the following URL for the actual source data (and therefore PHP array). PLEASE, do not post the actual link. I don't need search engines picking it up, and this forum doesn't allow editing of posts. 
 
And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL.
 
<?php

require 'simple_html_dom.php';

$sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/");

$oHTML = str_get_html($sHtml);
$oTRs = $oHTML->find('table tr');

foreach($oTRs as $oTR) {
	$fdarray = array();
	$oTDs = $oTR->find('td');

	foreach($oTDs as $oTD) {
		$fdarray[] = trim($oTD->plaintext);
	}

}

print_r($fdarray);
echo "\n\n\n\n\n<hr>\n\n\n\n\n\n";

$new = array();

foreach ($fdarray as $arr) {
    $key = join('|', array_slice($arr,0,3));
    if (isset($new[$key])) {
        $new[$key][3][] = $arr[3];
    } else {
        $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3]));
    }
}

foreach ($new as $arr) {
	echo "\n\n\n\n\n\n\n<hr>\n\n";
	echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n";
}

?>
Array
(
    [0] => 6/8/2013 2:00:17 PM
    [1] => Medical
    [2] => TRIANA ST
    [3] => 
    [4] => M36
)

(note the [3] really was empty and is normal, that table column was empty for that specific row)



<hr>





<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br />







<hr>

6 --- / --- 8 --- /, i, A, , <br>

 

 

Your original array had element [3] as the one that was to be grouped. You now say it is element [4]. Just change my code to use 4 instead of 3.

I've done that. I get several errors. Below is the code with the [3]'s changed to [4]'s. Note the last line of the output is "6 --- / --- 1 --- c, R, C, <br>". I can't make heads or tails of the output other than that it looks like the first 3 parts are from the 6/10/2013 date.

<?php

require 'simple_html_dom.php';

$sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/");

$oHTML = str_get_html($sHtml);
$oTRs = $oHTML->find('table tr');

foreach($oTRs as $oTR) {
	$fdarray = array();
	$oTDs = $oTR->find('td');

	foreach($oTDs as $oTD) {
		$fdarray[] = trim($oTD->plaintext);
	}

}

print_r($fdarray);
echo "\n\n\n\n\n<hr>\n\n\n\n\n\n";

$new = array();

foreach ($fdarray as $arr) {
    $key = join('|', array_slice($arr,0,3));
    if (isset($new[$key])) {
        $new[$key][4][] = $arr[4];
    } else {
        $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4]));
    }
}

foreach ($new as $arr) {
	echo "\n\n\n\n\n\n\n<hr>\n\n";
	echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[4])."<br>\n";
}

?>

And the output is..

Array
(
    [0] => 6/10/2013 9:44:50 PM
    [1] => Medical
    [2] => DANDRIDGE LN
    [3] => FAIRCROSS PL/Dead End
    [4] => M12
)





<hr>





<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />
<br />
<b>Warning</b>:  join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br />







<hr>

6 --- / --- 1 --- c, R, C, <br>

 

<?php
foreach ($fdarray as $arr) {
    $key = join('|', array_slice($arr,0,3));
    if (isset($new[$key])) {
        $new[$key][4][] = $arr[4];
    } else {
        $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4]));
    }
}
?>

 

That bit of code assumes that $fdarray is multi-dimensional, but your code before that only creates a single dimensional array. Update the code above to create a multi-dimensional array as is expected and it will work.

You were clearing the $fdarray each time around the loop so you only finished up with the last one. I am now storing the $fdarray s into the $data array for processing.

$oHTML = str_get_html($sHtml);
$oTRs = $oHTML->find('table tr');

$data = array();
foreach($oTRs as $oTR) {
    $fdarray = array();
    $oTDs = $oTR->find('td');
    if ($oTDs) {
        foreach($oTDs as $oTD) {
            $fdarray[] = trim($oTD->plaintext);
        }
        $data[] = $fdarray;
    }
}

echo '<pre>',print_r($data, true),'</pre>';

echo "<hr>";

$new = array();

foreach ($data as $arr) {
    $key = join('|', array_slice($arr,0,3));
    if (isset($new[$key])) {
        $new[$key][3][] = $arr[4];
    } else {
        $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4]));
    }
}

foreach ($new as $arr) {
    echo "<hr>";
    echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n";
}
This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.