RegEx - Lost in translation


4 replies to this topic

#1 Andre

  • Location: Singapore

Posted 26 May 2006 - 03:04 PM

I wrote a script that reads incoming data from a mailbox (IMAP via socket), and I now need to extract individual fields from that data. What is the best way to do this? I made several attempts with regular expressions but never got anything to work reliably...

I split the data at '\r\n', which gave me one array per data package. I then searched each array for keywords with strpos() [e.g. 'Date'] and used substr() to pull out the value [e.g. 18.11.01 00:04:04].

Maybe someone here has a better solution! The big problem is that some values don't have a fixed number of digits, so the fixed offsets and lengths I pass to substr() don't work reliably...
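
For reference, the strpos()/substr() approach described above can cope with variable-width values if the length is measured instead of hard-coded. A minimal sketch, assuming each label is followed by exactly one space and a colon as in the sample data in this post (valueAfter() is a hypothetical helper name, not part of the original script):

```php
<?php
// Hypothetical helper: returns the token that follows "$label :" in $text,
// or null if the label is not present.
// Assumption: exactly one space sits between the label and the colon.
function valueAfter(string $text, string $label): ?string
{
    $pos = strpos($text, $label . ' :');
    if ($pos === false) {
        return null;
    }
    // Move past the label, the space and the colon.
    $start = $pos + strlen($label) + 2;
    // Skip any spaces between the colon and the value.
    $start += strspn($text, ' ', $start);
    // Measure up to the next whitespace instead of assuming a fixed width.
    $length = strcspn($text, " \t\r\n", $start);
    return substr($text, $start, $length);
}

$line = '1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50';
echo valueAfter($line, 'LC'), "\n";
echo valueAfter($line, 'IQ'), "\n";
```

Values that contain a space, such as the date and time, get cut at the first space with this helper, which is one reason a regular expression is the more comfortable tool here.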



Desired data:
Date, Lc, Lat1 - Lon2, Pass duration, Altitude


Data Example
(Note: the real data contains additional whitespace that is missing in this post.)

1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50
Lat1 : 0.997N Lon1 : 86.470E Lat2 : 7.013N Lon2 : 60.194E
Nb mes : 006 Nb mes>-120dB : 000 Best level : -124 dB
Pass duration : 236s NOPC : 3
Calcul freq : 401 652669.6 Hz Altitude : 0 m
00 46 02 00
00 00 00 00
00 240 00 00
00 03 195 195


1234 Date : 18.11.01 01:36:18 LC : A IQ : 08
Lat1 : 0.998N Lon1 : 86.502E Lat2 : 10.384N Lon2 : 44.491E
Nb mes : 003 Nb mes>-120dB : 000 Best level : -131 dB
Pass duration : 282s NOPC : 3
Calcul freq : 401 652667.5 Hz Altitude : 0 m
00 46 08 00
00 00 00 00
00 248 00 00
00 00 00 218


3214 Date : 18.11.01 03:09:08 LC : 2 IQ : 58
Lat1 : 1.004N Lon1 : 86.530E Lat2 : 0.103N Lon2 : 90.360E
Nb mes : 012 Nb mes>-120dB : 000 Best level : -124 dB
Pass duration : 788s NOPC : 4
Calcul freq : 401 652664.2 Hz Altitude : 0 m
00 45 238 02
181 41 74 86
165 43 90 150
173 93 212 05

(There may be one additional whitespace character in front of the leading 4-digit number of each record.)
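
Since the sample holds several records in a row, a first step could be to cut the raw text into one string per record. A minimal sketch, assuming every record starts on a line of the form "<4 digits> Date :" with optional leading spaces or tabs (splitRecords() is a hypothetical name, and the shortened sample below only keeps two lines per record):

```php
<?php
// Hypothetical helper: splits a raw dump into one trimmed string per record.
// Assumption: a record begins on a line like "1234 Date :", optionally
// preceded by spaces or tabs.
function splitRecords(string $raw): array
{
    // The lookahead splits *before* each record header without consuming it.
    $parts = preg_split(
        '/^(?=[ \t]*\d{4}[ \t]+Date[ \t]*:)/m',
        $raw,
        -1,
        PREG_SPLIT_NO_EMPTY
    ) ?: [];
    // Trim each piece and drop anything that was only whitespace.
    return array_values(array_filter(array_map('trim', $parts), 'strlen'));
}

$raw = " 1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50\r\n"
     . "Pass duration : 236s NOPC : 3\r\n"
     . "\r\n"
     . "3214 Date : 18.11.01 03:09:08 LC : 2 IQ : 58\r\n"
     . "Pass duration : 788s NOPC : 4\r\n";

$records = splitRecords($raw);
echo count($records), "\n";
```

Each element of $records can then be handed to whichever field extraction is used, one record at a time.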

#2 kiss-o-matic


Posted 26 May 2006 - 08:54 PM

If the strings you are parsing vary in length, substr() is probably not a good choice. Regex is about the only way I can think of. I assume the roadblock you hit w/ your regex previously was b/c of newline characters? But you already split on those, so that shouldn't be the problem.
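
One way to sidestep both the variable widths and the newline question is a small pattern per field. A rough sketch, not the original poster's code: the key names and extractFields() are made up for illustration, and the date pattern assumes three two-digit groups as in the sample:

```php
<?php
// One small pattern per field. "\s*" tolerates any amount of whitespace
// (line breaks included) around the colon, and each value is matched by
// its shape rather than by a fixed width.
$patterns = [
    'date'     => '/Date\s*:\s*(\d{2}\.\d{2}\.\d{2}\s+\d{2}:\d{2}:\d{2})/',
    'lc'       => '/LC\s*:\s*(\S+)/',
    'pass'     => '/Pass duration\s*:\s*(\d+)s/',
    'altitude' => '/Altitude\s*:\s*(-?\d+)\s*m/',
];

// Hypothetical helper: applies every pattern to a single record.
function extractFields(string $record, array $patterns): array
{
    $out = [];
    foreach ($patterns as $name => $pattern) {
        // preg_match() returns 1 on a match; missing fields become null.
        $out[$name] = preg_match($pattern, $record, $m) === 1 ? $m[1] : null;
    }
    return $out;
}

$record = "1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50\r\n"
        . "Pass duration : 236s NOPC : 3\r\n"
        . "Calcul freq : 401 652669.6 Hz Altitude : 0 m\r\n";

print_r(extractFields($record, $patterns));
```

Because every field is looked up independently, a record with a missing or reordered field still yields the remaining values.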

#3 poirot

  • Location: Austin, TX

Posted 26 May 2006 - 09:54 PM

You can use strpos() to find where a label starts, and substr() or strstr() to cut out the text that follows it.
I'll be right back with some code.

But I think RegEx would be a better solution.
~ D Kuang

#4 poirot


Posted 26 May 2006 - 10:20 PM

OK, I am not good with RegEx, but at least it works:
<?php

// Date, Lc, Lat1 - Lon2, Pass duration, Altitude 

$str = '1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50
Lat1 : 0.997N Lon1 : 86.470E Lat2 : 7.013N Lon2 : 60.194E
Nb mes : 006 Nb mes>-120dB : 000 Best level : -124 dB
Pass duration : 236s NOPC : 3
Calcul freq : 401 652669.6 Hz Altitude : 0 m
00 46 02 00
00 00 00 00
00 240 00 00
00 03 195 195';

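// The /s modifier lets "." match newlines, so one pattern can span the whole record.
// Each (.*) captures the text between two labels; only some groups are used below.
if (!preg_match('/Date :(.*)LC :(.*)IQ :(.*)Lat1 :(.*)Lon1 :(.*)Lon2 :(.*)Nb mes :(.*)Pass duration :(.*)NOPC(.*)Altitude :(.*) m/s', $str, $m)) {
   exit('Record did not match the expected format.');
}

// Strip the whitespace that surrounds each captured value.
$m = array_map('trim', $m);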

$info = array ( 
   'date' => $m[1],
   'lc'   => $m[2], 
   'lat1' => $m[4], 
   'lon2' => $m[6], 
   'pd'   => $m[8], 
   'alt'  => $m[10],
);

echo '<pre>';
print_r($info);

?>
Array
(
    [date] => 18.11.01 00:04:04
    [lc] => 1
    [lat1] => 0.997N
    [lon2] => 60.194E
    [pd] => 236s
    [alt] => 0
)

~ D Kuang
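
A note on the pattern above: its greedy (.*) groups work for one record per string, but with several records in one string they would stretch across record boundaries. Below is a tighter variant as a sketch only. The group names and parseRecords() are my own choices, all four coordinates are captured in case "Lat1 - Lon2" was meant as a range, and the date pattern assumes three two-digit groups:

```php
<?php
// The x modifier lets the pattern be laid out over several lines;
// the s modifier lets ".*?" cross line breaks.
$pattern = '/
    Date\s*:\s*(?<date>\d{2}\.\d{2}\.\d{2}\s+\d{2}:\d{2}:\d{2})\s+
    LC\s*:\s*(?<lc>\S+)\s+
    IQ\s*:\s*\S+\s+
    Lat1\s*:\s*(?<lat1>\S+)\s+
    Lon1\s*:\s*(?<lon1>\S+)\s+
    Lat2\s*:\s*(?<lat2>\S+)\s+
    Lon2\s*:\s*(?<lon2>\S+)
    .*?Pass\s+duration\s*:\s*(?<pd>\d+)s
    .*?Altitude\s*:\s*(?<alt>-?\d+)\s*m
/sx';

// Hypothetical helper: returns one associative array per record found.
function parseRecords(string $raw, string $pattern): array
{
    preg_match_all($pattern, $raw, $matches, PREG_SET_ORDER);
    $records = [];
    foreach ($matches as $m) {
        // Keep only the named groups and drop the numeric duplicates.
        $records[] = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
    }
    return $records;
}

$raw = "1234 Date : 18.11.01 00:04:04 LC : 1 IQ : 50\n"
     . "Lat1 : 0.997N Lon1 : 86.470E Lat2 : 7.013N Lon2 : 60.194E\n"
     . "Nb mes : 006 Nb mes>-120dB : 000 Best level : -124 dB\n"
     . "Pass duration : 236s NOPC : 3\n"
     . "Calcul freq : 401 652669.6 Hz Altitude : 0 m\n"
     . "\n"
     . "1234 Date : 18.11.01 01:36:18 LC : A IQ : 08\n"
     . "Lat1 : 0.998N Lon1 : 86.502E Lat2 : 10.384N Lon2 : 44.491E\n"
     . "Nb mes : 003 Nb mes>-120dB : 000 Best level : -131 dB\n"
     . "Pass duration : 282s NOPC : 3\n"
     . "Calcul freq : 401 652667.5 Hz Altitude : 0 m\n";

$results = parseRecords($raw, $pattern);
print_r($results);
```

The lazy ".*?" stops at the nearest following label, so each match stays inside its own record as long as every record contains all of the labels.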

#5 Andre


Posted 27 May 2006 - 02:03 AM

Thank you very, very much! This is the perfect solution :-)
RegEx has always been a big mystery to me...




