[SOLVED] need help fixing regex


tomasd

Hi,

I have a regex function that extracts flight details from my data:

<?php
function regex_($data) {
    // Sample data:
    // Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive
    // Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214817:05Depart21:40Arrive
    $regex_fare = "(Regular FareAdult(\d+.\d{2}) (\w{3}))";
    $regex_date = "(\d{1,2} \w{3} \d{2})Flight";
    $regex_flight_and_depart = "FR (\d{3,4})(\d{2}:\d{2}).Depart";
    // $regex_depart = "(\d{2}:\d{2}).Depart";
    $regex_arrive = "(\d{2}:\d{2}).Arrive";
    preg_match_all("/$regex_fare|$regex_date|$regex_flight_and_depart|$regex_arrive/", $data, $result, PREG_PATTERN_ORDER);

    return $result;
}

?>

It all works just fine; the only problem is that my data sometimes differs, e.g.:

Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive
SPECIAL OFFERAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive
NO TAXESAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive

As a result, I often get only $regex_date, $regex_flight_and_depart and $regex_arrive matched, but not $regex_fare, because the fare label varies from record to record (Regular Fare|SPECIAL OFFER|NO TAXES).

My question is: how can I run my regex as one long regex, so that if one part of it does not match, nothing is matched?

I tried

                preg_match_all("/$regex_fare.$regex_date.$regex_flight_and_depart.$regex_arrive/", $data, $result,

                preg_match_all("/($regex_fare)($regex_date)($regex_flight_and_depart)($regex_arrive)/", $data, $result,

but no joy, any help is appreciated.

 

 

 


What do you expect to be between the patterns? A general separator is .*?.
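
To illustrate the separator suggestion, here is a minimal sketch, assuming the sub-patterns are written one after another inside a single pattern with `.*?` between them. The variable names and the sample line come from the first post; the escaped `\.` in the price and the optional `\s?` before Depart/Arrive are my own tightening, not part of the original patterns.

```php
<?php
// Sample line from the first post.
$data = 'Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive';

// Sub-patterns, following the names used in the thread.
$regex_fare   = 'Regular FareAdult(\d+\.\d{2}) (\w{3})';
$regex_date   = '(\d{1,2} \w{3} \d{2})Flight';
$regex_flight = 'FR (\d{3,4})(\d{2}:\d{2})\s?Depart';
$regex_arrive = '(\d{2}:\d{2})\s?Arrive';

// One pattern: every part must match, in this order, or nothing matches.
// ".*?" lazily skips whatever sits between two parts (here "Wed," before the date).
$pattern = "/$regex_fare.*?$regex_date.*?$regex_flight.*?$regex_arrive/";

// PREG_SET_ORDER groups the captures per record instead of per group.
preg_match_all($pattern, $data, $result, PREG_SET_ORDER);
print_r($result);
```

One caveat with this approach: on a long string holding many records, `.*?` is free to stretch from one record into the next, so it is probably safer to name the text between the parts explicitly when it is known.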

Hey, thanks for the tip. I tried:

preg_match_all("/$regex_fare.*?.$regex_date.*?.$regex_flight_and_depart.*?.$regex_arrive/", $data, $result, PREG_PATTERN_ORDER);

but the $result array comes back empty.

 

To clarify further on what I'm trying to do here...

The problem with my regex is that it keeps matching $regex_date, $regex_flight_and_depart and $regex_arrive, and writing those values to the array, even when $regex_fare = "(Regular FareAdult(\d+.\d{2}) (\w{3}))" does not match. That happens when the fare field in $data looks different, e.g. SPECIAL OFFERAdult or NO TAXESAdult instead of Regular FareAdult. My $data contains ~10 results and they are not all the same, i.e. I have 3 Regular FareAdult, 2 SPECIAL OFFERAdult and some NO TAXESAdult. What I'm trying to achieve is to parse all and only the results starting with Regular FareAdult from $data, then all and only the results starting with SPECIAL OFFERAdult, and lastly all results starting with NO TAXESAdult... but with my current syntax my regex is treated like 4 regexes, not 1.
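
One way to read the goal described above is "run the same full-record pattern once per fare label". The sketch below does that under some assumptions: the three sample lines from the first post are glued together as test data, `$rest` and `$byFare` are names I made up for illustration, and the rest of the record is matched as one fixed sequence.

```php
<?php
// The three sample lines from the first post, concatenated.
$data = 'Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive'
      . 'SPECIAL OFFERAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive'
      . 'NO TAXESAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive';

// Everything after the fare label, written as one sequence (hypothetical name).
$rest = 'Adult(\d+\.\d{2}) (\w{3})\w{3},\s?(\d{1,2} \w{3} \d{2})'
      . 'FlightFR (\d{3,4})(\d{2}:\d{2})\s?Depart(\d{2}:\d{2})\s?Arrive';

$byFare = [];
foreach (['Regular Fare', 'SPECIAL OFFER', 'NO TAXES'] as $label) {
    // preg_quote() escapes any regex metacharacters in the label.
    $pattern = '/' . preg_quote($label, '/') . $rest . '/';
    preg_match_all($pattern, $data, $matches, PREG_SET_ORDER);
    // Only complete records that start with this label end up here.
    $byFare[$label] = $matches;
}
print_r($byFare);
```

Because the label and the rest of the record form a single sequence, a record with a different label contributes nothing to that label's list.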


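Why do you have the extra periods in there? Inside the quoted string you do not need a concatenation operator between the patterns; each extra period is a regex wildcard that has to match one more character.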

 

"...with my current syntax my regex is treated like 4 regexes, not 1."

 

This is what alternation does. In another post I had given you a single expression. Do we need to return to that and make modifications?
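
A small sketch of the difference between alternation and a plain sequence, using a SPECIAL OFFER line from the first post. The shortened sub-patterns `$fare` and `$arrive` are illustrative stand-ins, not the exact patterns from the thread.

```php
<?php
// A record whose fare label is not "Regular Fare" (from the first post).
$data = 'SPECIAL OFFERAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive';

$fare   = 'Regular FareAdult\d+\.\d{2} \w{3}';
$arrive = '\d{2}:\d{2}\s?Arrive';

// Alternation: either branch may match on its own, so the arrival
// time is still found although the fare branch never matches.
$alternation = preg_match_all("/$fare|$arrive/", $data, $a);

// Sequence: the fare must match first and the arrival time later in
// the same match; one missing part means no match at all.
$sequence = preg_match_all("/$fare.*?$arrive/", $data, $s);

var_dump($alternation, $sequence);
```

With `|` the engine is satisfied by whichever branch fits at a given position, which is why the partial results keep showing up in the array.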

 

"What I'm trying to achieve is to parse all and only the results starting with Regular FareAdult from $data, then all and only the results starting with SPECIAL OFFERAdult, and lastly all results starting with NO TAXESAdult..."

 

Are you parsing each line individually, or the entire string?


hey thanks for your reply ;)

 

"This is what alternation does. In another post I had given you a single expression. Do we need to return to that and make modifications?"

Quite possibly. Could you please tell me how I can extract the bolded parts from the following:

Regular FareAdult19.99 GBPWed,22 Oct 08FlightFR 214407:00Depart11:35Arrive

if possible could you please reuse my original syntax?

 

 

"Are you parsing each line individually, or the entire string?"

My $data contains stripped HTML, which I'm getting via cURL:

 

...

DayNext Day »Select A FlightSelect a FlightSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20306:30 Depart07:45 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20508:10 Depart09:25 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20709:35 Depart10:50 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 21711:15 Depart12:30 ArriveRegular FareAdult24.99 GBPThu, 25 Sep 08FlightFR 21112:00 Depart13:15 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 22515:45 Depart17:00 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 29517:10 Depart18:25 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 22718:30 Depart19:45 ArriveRegular FareAdult9.99 GBPThu, 25 Sep 08FlightFR 29319:35 Depart20:50 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 29721:45 Depart23:00 ArriveSelect Your Flights and C...


Try this, dealing with one flight at a time:

 

<pre>
<?php
$data = 'DayNext Day »Select A FlightSelect a FlightSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20306:30 Depart07:45 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20508:10 Depart09:25 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20709:35 Depart10:50 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 21711:15 Depart12:30 ArriveRegular FareAdult24.99 GBPThu, 25 Sep 08FlightFR 21112:00 Depart13:15 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 22515:45 Depart17:00 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 29517:10 Depart18:25 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 22718:30 Depart19:45 ArriveRegular FareAdult9.99 GBPThu, 25 Sep 08FlightFR 29319:35 Depart20:50 ArriveSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 29721:45 Depart23:00 ArriveSelect Your Flights and C...';
$flights = preg_split('/(?<=FR)/', $data);
print_r($flights);
foreach ($flights as $flight) {
	### Parse here.
}
?>
</pre>
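
For what the "Parse here" step could look like, here is a rough sketch. Note that it splits after "Arrive" rather than after "FR" as in the snippet above, on the assumption that "Arrive" closes every flight record, so that each chunk holds one whole record. The array keys and the shortened `$data` are illustrative choices of mine.

```php
<?php
// Shortened copy of the scraped string from the thread.
$data = 'Select a FlightSPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20306:30 Depart07:45 Arrive'
      . 'Regular FareAdult24.99 GBPThu, 25 Sep 08FlightFR 21112:00 Depart13:15 Arrive'
      . 'Select Your Flights and C...';

// Assumption: "Arrive" closes every flight record, so split right after it.
$chunks = preg_split('/(?<=Arrive)/', $data, -1, PREG_SPLIT_NO_EMPTY);

$pattern = '/(Regular Fare|SPECIAL OFFER|NO TAXES)Adult(\d+\.\d{2}) (\w{3})'
         . '\w{3},\s?(\d{1,2} \w{3} \d{2})FlightFR (\d{3,4})'
         . '(\d{2}:\d{2})\s?Depart(\d{2}:\d{2})\s?Arrive/';

$flights = [];
foreach ($chunks as $chunk) {
    // Chunks without a complete record (page header or footer) are skipped.
    if (preg_match($pattern, $chunk, $m)) {
        $flights[] = [
            'fare'     => $m[1],
            'price'    => $m[2],
            'currency' => $m[3],
            'date'     => $m[4],
            'flight'   => $m[5],
            'depart'   => $m[6],
            'arrive'   => $m[7],
        ];
    }
}
print_r($flights);
```

Since each chunk is matched on its own, a record that fails one part of the pattern cannot borrow a date or time from its neighbour.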


"Try this, dealing with one flight at a time:"

 

Hey, thanks for the code. I've tried running it and it looks like it might work. I'm a little worried I might run into other problems if I abandon my current method, because of what happens after the data is written to the array...

Could you please tell me (as you already did once) how I can change my current regex so that if one part of it doesn't match, nothing matches?


ok finally I got it...

 

$regex = "Regular FareAdult(\d+.\d{2}) (\w{3})(\w{3},.)(\d{1,2} \w{3} \d{2})FlightFR (\d{3,4})(\d{2}:\d{2}).Depart(\d{2}:\d{2}).Arrive";
preg_match_all("/($regex)/", $data, $result, PREG_PATTERN_ORDER);

 

Can somebody please tell me how I can pass the sub-patterns below

$regex_price = "Regular FareAdult(\d+.\d{2}) (\w{3})";
$regex_date = "(\w{3},.)(\d{1,2} \w{3} \d{2})";
$regex_flight = "FlightFR (\d{3,4})";
$regex_depart = "(\d{2}:\d{2}).Depart";
$regex_arrive = "(\d{2}:\d{2}).Arrive";

to

preg_match_all("/(??????)/", $data, $result, PREG_PATTERN_ORDER);

?

thanks!
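
A possible answer, offered as a sketch rather than the definitive way: PHP interpolates variables inside a double-quoted string, so the five sub-patterns can simply be written back to back inside the delimiters. The sub-patterns are the ones from the post above (in single quotes, which leaves their content unchanged); the two-record `$data` is a shortened sample taken from the scraped string earlier in the thread.

```php
<?php
// Two records from the scraped string quoted earlier in the thread.
$data = 'SPECIAL OFFERAdult0.00 GBPThu, 25 Sep 08FlightFR 20306:30 Depart07:45 Arrive'
      . 'Regular FareAdult24.99 GBPThu, 25 Sep 08FlightFR 21112:00 Depart13:15 Arrive';

// The five sub-patterns from the post.
$regex_price  = 'Regular FareAdult(\d+.\d{2}) (\w{3})';
$regex_date   = '(\w{3},.)(\d{1,2} \w{3} \d{2})';
$regex_flight = 'FlightFR (\d{3,4})';
$regex_depart = '(\d{2}:\d{2}).Depart';
$regex_arrive = '(\d{2}:\d{2}).Arrive';

// Interpolate them back to back; no separator is needed because the
// parts are adjacent in the data. Braces keep the variable names unambiguous.
$pattern = "/({$regex_price}{$regex_date}{$regex_flight}{$regex_depart}{$regex_arrive})/";

preg_match_all($pattern, $data, $result, PREG_PATTERN_ORDER);
print_r($result);
```

With PREG_PATTERN_ORDER, `$result[0]` and `$result[1]` both hold the full record (because of the outer parentheses), and the price, currency, weekday, date, flight number, departure and arrival follow in `$result[2]` to `$result[8]`.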

 
