gamefreak13 Posted June 8, 2013 Share Posted June 8, 2013 (edited) I am fetching an HTML table using the Simple HTML DOM Parser. I get $col[0], $col[1], $col[3] and $col[4] for each row in the HTML table, these are columns. This is ran through a foreach to loop through every row. For testing purposes, I output the following: $col[0] --- $col[1] --- $col[2] --- $col[3] 11:23:26 PM ---- MEDICAL ---- 41ST ST ---- E19 11:23:26 PM ---- MEDICAL ---- 41ST ST ---- M12 10:45:26 PM ---- MEDICAL ---- MAIN ST ---- E20 10:45:26 PM ---- MEDICAL ---- MAIN ST ---- M62 10:45:26 PM ---- MEDICAL ---- MAIN ST ---- T20 09:49:19 PM ---- RESCUE ---- BROADWAY AVE ---- E27 The first three colums are always identical for the grouping. The third column is what is different. I want to combine the data like so: 11:23:26 PM ---- MEDICAL ---- 41ST ST ---- E19, M12 10:45:26 PM ---- MEDICAL ---- MAIN ST ---- E20, M62, T20 09:49:19 PM ---- RESCUE ---- BROADWAY AVE ---- E27 How would I detect that "if columns 1, 2 and 3 are the same, then take the data from column 4 for each row"? Note that the number of rows could fluctuate between 1 to around 10-15 on average. Each row specifies a different apparatus/unit (this is a fire department listing of incidents), but this wastes a lot of vertical space and is overly redundant. My specific code is below: <?php require 'simple_html_dom.php'; $url = "http://www.website.com/html_table.html"; $sHtml = file_get_contents($url); $oHTML = str_get_html($sHtml); $oTRs = $oHTML->find('table tr'); $aData = array(); foreach($oTRs as $oTR) { $aRow = array(); $oTDs = $oTR->find('td'); foreach($oTDs as $oTD) { $aRow[] = trim($oTD->plaintext); } echo $aRow[0]; echo " ---- "; echo $aRow[1]; echo " ---- "; echo $aRow[2]; echo " ---- "; echo $aRow[3]; echo "<br>\n"; $aData[] = $aRow; } ?> Edited June 8, 2013 by gamefreak13 Quote Link to comment Share on other sites More sharing options...
gamefreak13 Posted June 8, 2013 Author Share Posted June 8, 2013 I can't edit the title of the thread, but this does not involve any MySQL whatsoever at this point. I will eventually be inserting the data into a MySQL database, but that has nothing to do with my question, so please disregard the MySQL title. Quote Link to comment Share on other sites More sharing options...
gamefreak13 Posted June 8, 2013 Author Share Posted June 8, 2013 (edited) In an effort to make this easier for those helping, this may just be a simple matter of working with PHP Arrays. Based on the above code, I removed all the ECHO's and changed it to print_r($aRow); and got this array. Now if only I knew how to work with arrays. Array ( ) Array ( [0] => 1:01:27 AM [1] => MEDICAL [2] => MAIN ST [3] => E3 ) Array ( [0] => 1:01:27 AM [1] => MEDICAL [2] => MAIN ST [3] => M36 ) Array ( [0] => 12:35:51 AM [1] => RESCUE [2] => BROADWAY AVE [3] => E20 ) Array ( [0] => 12:35:51 AM [1] => RESCUE [2] => BROADWAY AVE [3] => M31 ) Array ( [0] => 12:35:51 AM [1] => RESCUE [2] => BROADWAY AVE [3] => M20 ) Array ( [0] => 12:16:45 AM [1] => MEDICAL [2] => 2ND ST [3] => E17 ) Array ( [0] => 12:16:45 AM [1] => MEDICAL [2] => 2ND ST [3] => M62 ) Edited June 8, 2013 by gamefreak13 Quote Link to comment Share on other sites More sharing options...
boompa Posted June 8, 2013 Share Posted June 8, 2013 You really need to learn arrays as they are the primary data type of PHP and you will be using them extensively. The idea is to use the identifying data as the keys to arrays of arrays. Your unique data would be time and address. Here's an example of one way to do it. <?php // The data we'll use $data = array( array(), array('1:01:27 AM', 'MEDICAL', 'MAIN ST', 'E3'), array('1:01:27 AM', 'MEDICAL', 'MAIN ST', 'M36'), array('12:35:51 AM', 'RESCUE', 'BROADWAY AVE', 'E20'), array('12:35:51 AM', 'RESCUE', 'BROADWAY AVE', 'M31'), array('1:01:27 AM', 'RESCUE', 'MAIN ST', 'R51') ); /* Data type to use array( 'time' => array( 'location' => array( 'call_type' => array( 'apparatus' => array() ) ) ) ); */ // Print data before print("Before: " . print_r($data, true)); // Final data array $mappedData = array(); // Iterate through the data foreach($data as $entry) { if (empty($entry)) continue; // skip empty entry if (count($entry) != 4) continue; // skip invalid data // Copy the array into meaningful variable names list($time, $call_type, $location, $apparatus) = $entry; // Is there a mapped entry for this time? if (empty($mappedData[$time])) { // No, so where this is brand new, create a whole new // entry in mapped data $mappedData[$time] = array( $location => array( $call_type => array( $apparatus ) ) ); } else { // There's a mapped entry for this time. // So what about location? if (empty($mappedData[$time][$location])) { // No mapped entry for this location, so add // a new call type and the apparatus $mappedData[$time][$location] = array( $call_type => array($apparatus)); } else { // There's an entry for this time and location, // so add a new apparatus entry $mappedData[$time][$location][$call_type][] = $apparatus; } } } // Print what the array looks like now print("\nAfter: " . print_r($mappedData, true)); print("\n\nPretty:\n"); // Pretty-up the display // Iterate over the mapped data // First the time is a key to the location array foreach ($mappedData as $time=>$location) { // The address is the key to the call_type array foreach ($location as $address => $call_type) { // The call_type is the key to the apparatus array foreach ($call_type as $call => $apparatus) { // Create a comma-separated string out of the apparatus array print("$time -- $address -- $call -- " . implode(', ', $apparatus) . "\n"); } } } When I run this at the command line, I get Before: Array ( [0] => Array ( ) [1] => Array ( [0] => 1:01:27 AM [1] => MEDICAL [2] => MAIN ST [3] => E3 ) [2] => Array ( [0] => 1:01:27 AM [1] => MEDICAL [2] => MAIN ST [3] => M36 ) [3] => Array ( [0] => 12:35:51 AM [1] => RESCUE [2] => BROADWAY AVE [3] => E20 ) [4] => Array ( [0] => 12:35:51 AM [1] => RESCUE [2] => BROADWAY AVE [3] => M31 ) [5] => Array ( [0] => 1:01:27 AM [1] => RESCUE [2] => MAIN ST [3] => R51 ) ) After: Array ( [1:01:27 AM] => Array ( [MAIN ST] => Array ( [MEDICAL] => Array ( [0] => E3 [1] => M36 ) [RESCUE] => Array ( [0] => R51 ) ) ) [12:35:51 AM] => Array ( [BROADWAY AVE] => Array ( [RESCUE] => Array ( [0] => E20 [1] => M31 ) ) ) ) Pretty: 1:01:27 AM -- MAIN ST -- MEDICAL -- E3, M36 1:01:27 AM -- MAIN ST -- RESCUE -- R51 12:35:51 AM -- BROADWAY AVE -- RESCUE -- E20, M31 Quote Link to comment Share on other sites More sharing options...
Barand Posted June 8, 2013 Share Posted June 8, 2013 You say you intend to store the data in a database table. I hope you are NOT considering storing values like "E20, M31" in a single field. This should do it $data = array ( array('11:23:26 PM' , 'MEDICAL' , '41ST ST' , 'E19'), array('11:23:26 PM' , 'MEDICAL' , '41ST ST' , 'M12'), array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'E20'), array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'M62'), array('10:45:26 PM' , 'MEDICAL' , 'MAIN ST' , 'T20'), array('09:49:19 PM' , 'RESCUE' , 'BROADWAY AVE' , 'E27') ); $new = array(); foreach ($data as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][3][] = $arr[3]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3])); } } echo '<pre>'; foreach ($new as $arr) { printf("%-12s | %-12s | %-15s | %s\n", $arr[0], $arr[1], $arr[2], join(', ', $arr[3])); } echo '</pre>'; /****************** RESULT *********************************** 11:23:26 PM | MEDICAL | 41ST ST | E19, M12 10:45:26 PM | MEDICAL | MAIN ST | E20, M62, T20 09:49:19 PM | RESCUE | BROADWAY AVE | E27 **************************************************************/ Quote Link to comment Share on other sites More sharing options...
gamefreak13 Posted June 8, 2013 Author Share Posted June 8, 2013 (edited) Thank you both for your replies. In my effort to simplify/remove things for this thread, it seems I've made things more difficult. Below Barand's code intermingled with my code that returns the array of data. I've done something horribly wrong, and can't get it to work. Below is the entire PHP code (note the require_once of simple_html_dom.php though) and the raw HTML output when the PHP code is ran. If you need the common simple_html_dom.php file, it can be downloaded at http://sourceforge.net/projects/simplehtmldom/files/ Please replace all X's in the following URL for the actual source data (and therefore PHP array). PLEASE, do not post the actual link. I don't need search engines picking it up, and this forum doesn't allow editing of posts. And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL. <?php require 'simple_html_dom.php'; $sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/"); $oHTML = str_get_html($sHtml); $oTRs = $oHTML->find('table tr'); foreach($oTRs as $oTR) { $fdarray = array(); $oTDs = $oTR->find('td'); foreach($oTDs as $oTD) { $fdarray[] = trim($oTD->plaintext); } } print_r($fdarray); echo "\n\n\n\n\n<hr>\n\n\n\n\n\n"; $new = array(); foreach ($fdarray as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][3][] = $arr[3]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3])); } } foreach ($new as $arr) { echo "\n\n\n\n\n\n\n<hr>\n\n"; echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n"; } ?> Array ( [0] => 6/8/2013 2:00:17 PM [1] => Medical [2] => TRIANA ST [3] => [4] => M36 ) (note the [3] really was empty and is normal, that table column was empty for that specific row) <hr> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <hr> 6 --- / --- 8 --- /, i, A, , <br> Edited June 8, 2013 by gamefreak13 Quote Link to comment Share on other sites More sharing options...
Barand Posted June 9, 2013 Share Posted June 9, 2013 And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL. Don't. Instead of +--------------+--------------+-----------------+---------------+ | Time | Type | Address | Codes | +--------------+--------------+-----------------+---------------+ | 11:23:26 PM | MEDICAL | 41ST ST | E19, M12 | | 10:45:26 PM | MEDICAL | MAIN ST | E20, M62, T20 | | 09:49:19 PM | RESCUE | BROADWAY AVE | E27 | +--------------+--------------+-----------------+---------------+ normalize your data correctly like this Table1 +----+--------------+--------------+-----------------+ | Id | Time | Type | Address | +----+--------------+--------------+-----------------+ | 1 | 11:23:26 PM | MEDICAL | 41ST ST | | 2 | 10:45:26 PM | MEDICAL | MAIN ST | | 3 | 09:49:19 PM | RESCUE | BROADWAY AVE | +----+--------------+--------------+-----------------+ Table2 +----+--------------+----------+ | Id | Table1_id | Code | +----+--------------+----------+ | 1 | 1 | E19 | | 2 | 1 | M12 | | 3 | 2 | E20 | | 4 | 2 | M62 | | 5 | 2 | T20 | | 6 | 3 | E27 | +----+--------------+----------+ Quote Link to comment Share on other sites More sharing options...
Barand Posted June 10, 2013 Share Posted June 10, 2013 Thank you both for your replies. In my effort to simplify/remove things for this thread, it seems I've made things more difficult. Below Barand's code intermingled with my code that returns the array of data. I've done something horribly wrong, and can't get it to work. Below is the entire PHP code (note the require_once of simple_html_dom.php though) and the raw HTML output when the PHP code is ran. If you need the common simple_html_dom.php file, it can be downloaded at http://sourceforge.net/projects/simplehtmldom/files/ Please replace all X's in the following URL for the actual source data (and therefore PHP array). PLEASE, do not post the actual link. I don't need search engines picking it up, and this forum doesn't allow editing of posts. And yes Barand, I intend to insert the list of units (e.g. "E21, M31, MR44") together as a text string in MySQL. <?php require 'simple_html_dom.php'; $sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/"); $oHTML = str_get_html($sHtml); $oTRs = $oHTML->find('table tr'); foreach($oTRs as $oTR) { $fdarray = array(); $oTDs = $oTR->find('td'); foreach($oTDs as $oTD) { $fdarray[] = trim($oTD->plaintext); } } print_r($fdarray); echo "\n\n\n\n\n<hr>\n\n\n\n\n\n"; $new = array(); foreach ($fdarray as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][3][] = $arr[3]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[3])); } } foreach ($new as $arr) { echo "\n\n\n\n\n\n\n<hr>\n\n"; echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n"; } ?> Array ( [0] => 6/8/2013 2:00:17 PM [1] => Medical [2] => TRIANA ST [3] => [4] => M36 ) (note the [3] really was empty and is normal, that table column was empty for that specific row) <hr> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/firedept/testing.php</b> on line <b>26</b><br /> <hr> 6 --- / --- 8 --- /, i, A, , <br> Your original array had element [3] as the one that was to be grouped. You now say it is element [4]. Just change my code to use 4 instead of 3. Quote Link to comment Share on other sites More sharing options...
gamefreak13 Posted June 11, 2013 Author Share Posted June 11, 2013 I've done that. I get several errors. Below is the code with the [3]'s changed to [4]'s. Note the last line of the output is "6 --- / --- 1 --- c, R, C, <br>". I can't make heads or tails of the output other than that it looks like the first 3 parts are from the 6/10/2013 date. <?php require 'simple_html_dom.php'; $sHtml = file_get_contents("http://axpxpxsx.sxaxnxdxixexgxox.gxoxvx/sxdxfxixrxexdxixsxpxaxtxcxhx/"); $oHTML = str_get_html($sHtml); $oTRs = $oHTML->find('table tr'); foreach($oTRs as $oTR) { $fdarray = array(); $oTDs = $oTR->find('td'); foreach($oTDs as $oTD) { $fdarray[] = trim($oTD->plaintext); } } print_r($fdarray); echo "\n\n\n\n\n<hr>\n\n\n\n\n\n"; $new = array(); foreach ($fdarray as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][4][] = $arr[4]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4])); } } foreach ($new as $arr) { echo "\n\n\n\n\n\n\n<hr>\n\n"; echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[4])."<br>\n"; } ?> And the output is.. Array ( [0] => 6/10/2013 9:44:50 PM [1] => Medical [2] => DANDRIDGE LN [3] => FAIRCROSS PL/Dead End [4] => M12 ) <hr> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: array_slice() expects parameter 1 to be array, string given in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <br /> <b>Warning</b>: join() [<a href='function.join'>function.join</a>]: Invalid arguments passed in <b>/home/mysite/public_html/test.php</b> on line <b>26</b><br /> <hr> 6 --- / --- 1 --- c, R, C, <br> Quote Link to comment Share on other sites More sharing options...
kicken Posted June 11, 2013 Share Posted June 11, 2013 <?php foreach ($fdarray as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][4][] = $arr[4]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4])); } } ?> That bit of code assumes that $fdarray is multi-dimensional, but your code before that only creates a single dimensional array. Update the code above to create a multi-dimensional array as is expected and it will work. Quote Link to comment Share on other sites More sharing options...
Barand Posted June 11, 2013 Share Posted June 11, 2013 You were clearing the $fdarray each time around the loop so you only finished up with the last one. I am now storing the $fdarray s into the $data array for processing. $oHTML = str_get_html($sHtml); $oTRs = $oHTML->find('table tr'); $data = array(); foreach($oTRs as $oTR) { $fdarray = array(); $oTDs = $oTR->find('td'); if ($oTDs) { foreach($oTDs as $oTD) { $fdarray[] = trim($oTD->plaintext); } $data[] = $fdarray; } } echo '<pre>',print_r($data, true),'</pre>'; echo "<hr>"; $new = array(); foreach ($data as $arr) { $key = join('|', array_slice($arr,0,3)); if (isset($new[$key])) { $new[$key][3][] = $arr[4]; } else { $new[$key] = array ($arr[0], $arr[1], $arr[2], array($arr[4])); } } foreach ($new as $arr) { echo "<hr>"; echo $arr[0]." --- ".$arr[1]." --- ".$arr[2]." --- ".join(', ', $arr[3])."<br>\n"; } Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.