[SOLVED] preg_split


drexnefex

hello.

i have a script that parses text files containing NOAA Buoy data.  recently NOAA changed the format of the header row contained in the text files.  as a result the script broke.

 

i'm not the author of the script, though i've heavily customized it by stumbling my way through php and regexes.

 

i've been tinkering with this for a while now and am completely stumped.

 

the script is attached.

 

here's the old header that the script read:

YYYY  MM  DD  hh  mm  WD  WSPD  GST  WVHT  DPD  APD  MWD  BARO  ATMP  WTMP  DEWP  VIS  PTDY  TIDE

 

and the new format (includes an extra line)

#YY  MM DD hh mm WDIR WSPD GST  WVHT  DPD  APD MWD  PRES  ATMP  WTMP  DEWP  VIS PTDY  TIDE

#yr  mo dy hr mn degT m/s  m/s    m  sec  sec degT  hPa  degC  degC  degC  mi  hPa    ft

 

here's a link to one of the files that the script would parse:

http://www.ndbc.noaa.gov/data/realtime2/44004.txt

 

can anyone point me in the right direction?

 

[attachment deleted by admin]

Link to comment
Share on other sites

Sounds like a pretty cool project. You've got the right idea in your script here. (And kudos on the classic Perl use of split too!) I don't see any commas in the data, so you can safely change this:

preg_split('/[\s,]+/', $line);

to this:

preg_split('/\s+/', $line);

The number of fields should line up too. You just need to skip over those commented header lines at the top. One way to do this would be:

if(preg_match('/^#/', $line) {
   continue;
}

Which essentially skips lines that begin with a '#'. These are all just data points so the headers are unlikely to change (and really as far as numbers go you don't care what they are). Just stick that in at line 202:

 

Line 201:

while((!feof($fpread)) && ($numLines < $maxReadings)) {
   if(preg_match('/^#/', $line) {
      continue; // Skip header lines
   }
   $line = eregi_replace("MM","N/A",$line);
   list($YYYY,$MM,$DD,$hh,$mm,$WD,$WSPD,$GST,$WVHT,$DPD,$APD,$MWD,$BARO,$ATMP,$WTMP,$DEWP,$VIS,$PTDY,$TIDE) = preg_split("/[\s,]+/", $line);
   # Format the date to MM-DD-YYYY or DD-MM-YYYY
   if($intlDateFormat == 0) {
      $formattedDate = $MM."-".$DD."-".$YYYY;
   } elseif($intlDateFormat != 0) {
      $formattedDate = $DD."-".$MM."-".$YYYY;
   }

 

Give that a shot and see if it helps!

Link to comment
Share on other sites

Thanks for the reply C4.

 

What does the regex '/\s+/' look for?  What is the difference between your suggestion and the original regex: '/[\s,]+/'  ?

 

I inserted your snippet into the script but got a parse error on this line (pretty sure it's the bracket causing the problem):      if(preg_match('/^#/', $line) {

I tried a bunch of different combinations of moving that bracket around but kept getting parse errors.

 

Any ideas what would cause the error?

 

Thanks for taking the time.

 

 

Link to comment
Share on other sites

Not a problem, glad to help. The way that preg_split (borrowed from Perl's split) works is that it splits a string into separate fields, and the regex you pass in specifies the separator. That's where the:

/\s+/

Comes in. Before, you had a character class (this thing [...]), which essentially says, "match any one of the characters in here". Since the comma was in there too, a field with a comma in it (you don't have any, and you wouldn't want this behavior if you did) would be split at the comma as if it were another separator.
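To make the difference concrete, here is a small sketch comparing the two patterns. The sample row is made up for illustration (it only follows the column layout of the new header), and `$withComma` is a hypothetical string used to show where the patterns diverge.

```php
<?php
// Sample row in the new NOAA layout (values are made up for illustration).
$line = "2008 03 14 12 50 230  7.0  9.0   1.5   8   5.6 225 1012.3  10.1  11.2   5.0   MM -1.2    MM";

// Split on runs of whitespace only.
$byWhitespace = preg_split('/\s+/', trim($line));

// Split on runs of whitespace and/or commas (the original pattern).
$byWhitespaceOrComma = preg_split('/[\s,]+/', trim($line));

// With no commas in the data, both patterns give the same 19 fields.
var_dump(count($byWhitespace));
var_dump($byWhitespace === $byWhitespaceOrComma);

// A field containing a comma shows the difference.
$withComma = "a,b c";
print_r(preg_split('/\s+/', $withComma));     // 2 fields: "a,b" and "c"
print_r(preg_split('/[\s,]+/', $withComma));  // 3 fields: "a", "b", "c"
```

The trim() call matters: a line that starts with whitespace would otherwise produce an empty first field. Passing the PREG_SPLIT_NO_EMPTY flag to preg_split is another way to drop empty fields.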

 

As for the parse error, that's a little strange. That code is bueno. Ah! Wait! I forgot to close that second parenthesis! It should look like this:

if(preg_match('/^#/', $line)) {
   continue;
}

Give that a shot, and change:

preg_split("/[\s,]+/", $line)

to

preg_split('/\s+/', $line)

We'll see how that treats ya.

Link to comment
Share on other sites

well. that fix at least got rid of the error....the browser just keeps chugging though, and i was getting CPU exceedance warnings.

 

so something is wrong.

 

here's something that might be an issue.  the header rows, both of them, are references for what type of data and what unit of measurement is in that particular column.  example: the $WVHT refers to wave height.  the unit of measurement is in meters. 

 

later on in the script there is some conversion code (meters to feet) that references this variable $WVHT.

 

so here's my new question:  does skipping over the two header rows, which contain the data type and unit of measurement, stop that information from getting passed to the meters to feet conversion part of this script?

 

something is causing this thing to chug forever....

 

here's the latest version, complete with your (C4) advice implemented.

 

anything stand out that might be causing this to fail?
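A guess at both questions, since the attached script is no longer available: in the snippet posted earlier, the `continue` is reached before anything in the loop reads a new line. If the read happens further down the loop body, `continue` jumps back to the loop condition without advancing the file, so `feof()` never becomes true and `$numLines` never grows. As for the headers, the columns are identified by their position in the row, so skipping the '#' lines should not take anything away from the conversion code. The sketch below shows that ordering (read, then skip, then split). `parseBuoyRows`, `$sample` and the data values are hypothetical stand-ins, and the array of lines replaces whatever the real script reads from the file.

```php
<?php
// Hypothetical helper: parse buoy rows from an array of lines.
// $lines stands in for whatever the real script reads from the file.
function parseBuoyRows($lines, $maxReadings)
{
    $rows = array();
    foreach ($lines as $line) {
        if (count($rows) >= $maxReadings) {
            break;
        }
        $line = trim($line);
        // The line has already been read here, so skipping it cannot stall the loop.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $fields = preg_split('/\s+/', $line);
        if (count($fields) < 19) {
            continue; // not a full data row
        }
        // Columns are identified by position, not by the header text.
        $wvht = $fields[8]; // wave height in metres, or "MM" when missing
        $rows[] = array(
            'date'    => $fields[1] . '-' . $fields[2] . '-' . $fields[0],
            'wvht_m'  => $wvht,
            'wvht_ft' => is_numeric($wvht) ? round((float) $wvht * 3.28084, 1) : 'N/A',
        );
    }
    return $rows;
}

// Two header lines from the thread plus two made-up data rows.
$sample = array(
    "#YY  MM DD hh mm WDIR WSPD GST  WVHT  DPD  APD MWD  PRES  ATMP  WTMP  DEWP  VIS PTDY  TIDE",
    "#yr  mo dy hr mn degT m/s  m/s    m  sec  sec degT  hPa  degC  degC  degC  mi  hPa    ft",
    "2008 03 14 12 50 230  7.0  9.0   2.0   8   5.6 225 1012.3  10.1  11.2   5.0   MM -1.2    MM",
    "2008 03 14 11 50 230  7.0  9.0    MM  MM    MM  MM 1012.5  10.0  11.2   5.0   MM   MM    MM",
);
$rows = parseBuoyRows($sample, 10);
print_r($rows);
```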

 

[attachment deleted by admin]

Link to comment
Share on other sites

  • 3 weeks later...

I did a little rewrite... and it's now working (for me on php5), but I changed too much to know where it was going wrong, although the fsockopen() followed by an fopen() looked a bit weird.

 

There are still a number of small improvements(!) outstanding - like managing file modification dates and timezones a little better - and perhaps putting in proper tags to replace values rather than using eval.

However it's all working...

 

Let me know if you have any problems with it.

 

 

 

[attachment deleted by admin]

Link to comment
Share on other sites

hey Jason!

 

thanks for the reply.

 

thanks for taking the time to tinker with this stuff.  it's way over my head.

 

I tried the attached scripts....they all errored out.

 

got the following error:  Fatal error: Call to undefined function: array_combine() in ....../phpBuoy.php on line 448

 

wow, you really modified the code.  im super curious to see how it works.

 

any ideas as to what's going wrong on my end?

 

 

Link to comment
Share on other sites

damn...i just noticed that my web host is running php4.4.6. 

 

i tried the script on a test server at my work (not the same machine that hosts my website) which is running the latest version of PHP...and it works great.

 

Any chance this could be tweaked to run on a php4 machine?

 

-S

Link to comment
Share on other sites

Hi

 

No problem, thanks for getting back - I'm glad it works... yes I did get a bit carried away...

 

The error is because array_combine is a php5 only function

 

- which is why I coded it like this...

 

function my_array_combine($keys,$values)
{
if (function_exists('array_combine')) {
	return array_combine($keys,$values);
} else {
	return php4_array_combine($keys,$values);
}
}
function php4_array_combine($keys,$values)
// if running PHP4 then you'll need this PHP5 function...
{
$result = array();
$keys = (array)$keys;
$i=0;
if (count($keys) == count($values)) {
	foreach ($keys as $key) {
		$result[$key] = $values($i);
		$i++;
	}
	return $result;
} else {
	return false;
}
}

 

In the hope that it would work on php4...

Ah - I see what I didn't do - actually use it :)

If you change array_combine to my_array_combine on lines 448 & 464 it should work...

 

alternatively use the file attached...

 

 

 

[attachment deleted by admin]

Link to comment
Share on other sites

Hi,

I also don't have access to PM's in this forum - or is there something I need to do to gain that? So I'll reply to your message here.

 

 

well, thanks Jason.

 

that one produces this error:

Fatal error: Call to undefined function: array() in /.../phpBuoy.php on line

 

thanks for your help man.

 

so are you using this script for anything or are you just really good at fixing php scripts?

Please can you quote the complete error message that you are getting - including line numbers etc. Have you been able to run this on php5?

 

Not using the script and apparently not any good at all if it's still crashing out...

Although it is working fine on my PC on php5 and on a linux test server on php5.

 

I just happened across your post when looking for some regular expression stuff... and thought - cool script - can't be that hard / won't take long to get that working... :)

I'm really just learning php and prefer to do something whilst learning.

 

 

 

[attachment deleted by admin]

Link to comment
Share on other sites

I 'googled' the error message that you were getting...

Fatal error: Call to undefined function: array() in /.../phpBuoy.php on line

which helped me find the (obvious?) error in this line...

374:   $result[$key] = $values($i);

 

so I fixed it to read...

 

374:   $result[$key] = $values[$i];

 

It should all work now... You'll need to remove the .txt extension of course
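For reference, here is the fallback with that fix applied, as a sketch rather than the exact contents of the attachment. With parentheses, `$values($i)` is treated as a call to a function named by the variable, which appears to be how "array()" ended up in the error message; square brackets make it an array index. The array_values() calls are an addition not in the original, so that position `$i` lines up in both arrays even if they are not numbered from zero.

```php
<?php
// Use the built-in array_combine() where it exists (PHP 5 and later),
// otherwise fall back to a hand-written version.
function my_array_combine($keys, $values)
{
    if (function_exists('array_combine')) {
        return array_combine($keys, $values);
    }
    return php4_array_combine($keys, $values);
}

function php4_array_combine($keys, $values)
{
    // Re-index both arrays so position $i refers to the same column in each.
    $keys   = array_values((array) $keys);
    $values = array_values((array) $values);
    if (count($keys) != count($values)) {
        return false;
    }
    $result = array();
    for ($i = 0; $i < count($keys); $i++) {
        $result[$keys[$i]] = $values[$i]; // square brackets: array index
    }
    return $result;
}

// Example: pair a few header names with made-up values from one row.
$reading = my_array_combine(array('WVHT', 'DPD', 'APD'), array('1.5', '8', '5.6'));
print_r($reading);
```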

 

[attachment deleted by admin]

Link to comment
Share on other sites
