Grab unformatted text, organize into arrays and dump data into a table.

drexnefex · February 26, 2009

Hello -

I am trying to grab a series of unformatted text files, parse them, organize specific parts of them into arrays, then dump those arrays into a table in specific locations.

I've made a few attempts, at least at the grabbing/parsing part, but have no clue how to proceed from there.

Attempt1 - http://nwbroweather.com/weather/nwac.php

Attempt2 - http://nwbroweather.com/weather/nwac2.php

Here's a link to one of the unformatted text files I'm trying to work with: http://nwac.us/products/OSOSK9

There will be about 8 of these files that I want to work with. They are not all structured the same.

The data in the text file contains temperatures at a few different elevations (among other things). I'd like to grab the most recent temps and dump them into a table. The most recent temps are located at the bottom of the data file.

Table would look something like this:

Location    Base Temp    Mid Temp    Top Temp
Site X      30°          24°         19°
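For the table step, here is a minimal sketch, assuming the temperatures have already been collected into an array keyed by site name. The function name `renderTempTable` and the array layout are hypothetical, and the values are just the sample figures from the table above.

```php
<?php
// Hypothetical helper: builds an HTML table from an array shaped like
// ['Site X' => ['base' => 30, 'mid' => 24, 'top' => 19]].
function renderTempTable(array $sites): string
{
    $html  = "<table>\n";
    $html .= "<tr><th>Location</th><th>Base Temp</th><th>Mid Temp</th><th>Top Temp</th></tr>\n";

    foreach ($sites as $name => $temps) {
        // Escape every value so odd characters cannot break the markup.
        $html .= sprintf(
            "<tr><td>%s</td><td>%s&deg;</td><td>%s&deg;</td><td>%s&deg;</td></tr>\n",
            htmlspecialchars((string) $name),
            htmlspecialchars((string) $temps['base']),
            htmlspecialchars((string) $temps['mid']),
            htmlspecialchars((string) $temps['top'])
        );
    }

    return $html . "</table>\n";
}

// Usage with the sample row from the post.
echo renderTempTable(['Site X' => ['base' => 30, 'mid' => 24, 'top' => 19]]);
```

Keeping the rendering separate from the parsing means each of the 8 files can get its own parsing rules while still feeding the same table.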

Any ideas would be greatly appreciated. Is there a better way to accomplish this task than with PHP? Is this even possible given the unstructured target data?

Thanks.

drexnefex · February 26, 2009

Here's the code for Attempt1:

<pre>
<?php

// NWAC product codes, one per station file.
$nwac = array('OSOALP', 'OSOCMT', 'OSOHUR', 'OSOMHM', 'OSOMSR', 'OSOMTB', 'OSOSK9', 'OSOWPS');

// foreach visits all 8 codes; the old "$counter < 7" loop skipped the last one.
foreach ($nwac as $code) {
    $ch = curl_init();
    // Set the URL and other appropriate options.
    curl_setopt($ch, CURLOPT_URL, "http://nwac.us/products/$code");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    // Fetch the file contents as a string.
    $output = curl_exec($ch);
    curl_close($ch);

    // Skip this station if the request failed.
    if ($output === false) {
        continue;
    }

    // Get the header.
    preg_match('%MM/DD.+?(?=-{2,})%s', $output, $matches);
    print_r($matches);

    // Separate the metadata from the data; if you don't,
    // the date "1-9-2007" will be picked up as data.
    // The separator is the line of hyphens.
    $data_pieces = preg_split('/^-{2,}\r?$/m', $output);
    $data_area = array_pop($data_pieces);

    // Get the rows.
    preg_match_all('%^[-.\d ]+\r?$%m', $data_area, $matches);
    print_r($matches);

    // Last row.
    print_r(array_pop($matches[0]));
}

?>
</pre>

And Attempt2:

<pre>
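<?php

// NWAC product codes, one per station file.
$nwac = array('OSOALP', 'OSOCMT', 'OSOHUR', 'OSOMHM', 'OSOMSR', 'OSOMTB', 'OSOSK9', 'OSOWPS');

// foreach visits all 8 codes; the old "$counter < 7" loop skipped the last one.
foreach ($nwac as $code) {
    // Read the file into an array of lines.
    $raw = file("http://nwac.us/products/$code");

    // Line 31 (index 30) is hard-coded here; split it on tabs.
    // explode() replaces split(), which was removed in PHP 7.
    $data = explode("\t", $raw[30]);

    // print_r() prints by itself; wrapping it in echo also printed a stray "1".
    print_r($data);
}
?>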
</pre>
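Both attempts stop once a line has been grabbed. A possible next step, sketched under assumptions: take the text after the last line of hyphens, keep the final row that contains only digits, dots, hyphens and spaces, and split it on runs of whitespace rather than on tabs. The function name `lastDataRow` is hypothetical. Which column holds which temperature still has to be read off each file's header, since the files are not all structured the same.

```php
<?php
// Hypothetical helper: returns the fields of the last numeric row in a
// product file as an array of strings, or an empty array if none is found.
function lastDataRow(string $text): array
{
    // Everything after the last line of hyphens is treated as data.
    $pieces   = preg_split('/^-{2,}\r?$/m', $text);
    $dataArea = array_pop($pieces);

    // A data row holds only digits, dots, hyphens and spaces,
    // and must contain at least one digit.
    if (!preg_match_all('/^[-. ]*\d[-.\d ]*\r?$/m', $dataArea, $matches)) {
        return [];
    }

    // The most recent reading is the last matching row.
    $last = trim(array_pop($matches[0]));

    // Split on any run of whitespace, so column padding does not matter.
    return preg_split('/\s+/', $last);
}
```

Called as `lastDataRow($output)` with the string from cURL, or `lastDataRow(implode('', $raw))` with the array from `file()`, it gives one flat array per station that can be indexed by column position.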
