Getting the info I need?


Nodral

Hi All

 

I'm absolutely cr@p at regex and to be quite honest I'm also being a bit lazy here as I'm on a short timescale and haven't got hours to read through loads of tutorials at the moment.  (I will be doing soon though coz it's not fair on you guys out there)

 

I have a huge text file which I need to search through to find the string REGUK01.  It will only ever appear once in the file.  Two lines before it there is a percentage, and that is the value I need to pull out.  E.g. it's 69.1% this time, but it changes on a daily basis and I receive a new text file every day.

 

Any thoughts / ideas / help?


If you could provide a couple of examples of the text that will be parsed it would be helpful. Is the percentage the only thing on the next line, or is it embedded in other text? Is the percentage always in the format dd.d%, or can there be a single digit (or no digits) before the decimal? Is there always 1 digit after the decimal? Can there be a percentage on the same line as "REGUK01"?


Without more info, I can't be certain this will fulfill your needs, but this might work for you:

if (preg_match('#REGUK01.*?(\d{1,2}(\.\d+)?%)#s', $text, $match)) {
    $percent = $match[1]; // first percentage found after REGUK01
}

 

Notes:

 

1. It finds the first "percentage" that follows after "REGUK01". So if you have a percentage on the same line, it will find that one instead of the one on the next line. For that matter, if the first match is somewhere after the next line, it will find that as well.

 

2. It will match a "percentage" that is in any of the following formats:

 

1%

12%

1.2%

1.23% (or any number of digits following the decimal)

12.3%

12.34% (or any number of digits following the decimal)

 

There must be one or two digits at the beginning. The decimal is optional, and when it exists there must be one or more digits that follow after it.
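
To make the notes above concrete, here is a minimal sketch that runs the same pattern against a few made-up strings. The sample inputs are hypothetical and not taken from the real report file; they are only there to show which part of each string ends up in the capture group. The last sample shows one side effect of the one-or-two-digit limit: a three-digit value is only partially captured.

```php
<?php
// Same pattern as in the post above, single-quoted so the backslashes
// reach the regex engine untouched.
$pattern = '#REGUK01.*?(\d{1,2}(\.\d+)?%)#s';

// Hypothetical inputs, invented for illustration only.
$samples = [
    "REGUK01\nLondon\n1%",   // single digit, on a later line
    "REGUK01 12.34%",        // percentage on the same line
    "REGUK01\n100.0%",       // three digits before the decimal
];

$results = [];
foreach ($samples as $sample) {
    // Store the captured percentage, or null when nothing matched.
    $results[] = preg_match($pattern, $sample, $match) ? $match[1] : null;
}

print_r($results); // the third entry comes out as "00.0%", not "100.0%"
```

If values of 100% or more can occur, widening the quantifier to `\d{1,3}` is one possible adjustment.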


Here is an excerpt from the textfile showing the bit I want. 

 

   
      Month to Date Summary UK
      From: 01/08/2011 00:00
      To: 18/08/2011 00:00
     



       
          Responses
     Overall
      Satisfaction
     Rent Next
      Time
     Recommend
          
      GB
     UK
     4344
     8.6
     8.6
     8.5
     69.1% <--------------------THIS IS THE FIGURE I NEED TO PASS TO MY SCRIPT AND CHANGES DAY BY DAY
     
      REGUK01 <-----------------THIS ONLY EVER APPEARS HERE
     London
     611
     8.5
     8.5
     8.4
     65.7%
     
      TERUK11
     Heathrow 
     253
     8.4
     8.4
     8.2
     61.9%
     
      LHRT01
     London 
     252
     8.4
     8.4
     8.2
     61.8%
     
      LHRT10
     Heathrow 
     1
     10.0
     10.0
     10.0
     100.0%
     
      TERUK12
     Central London Territory
     200
     8.7
     8.7
     8.6
     72.5%


I don't know of a good method to check for number of line breaks since they can be different between OSes and such. Your example above has "REGUK01" after the very first percent. I could provide a regex to find the very first percent in the file, but I have a suspicion that might not always be correct. Someone might have a regex solution for you, but I'm not sure how to work it out. The only solution I can come up with would be to iterate through each line.

 

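$lines = file('filename.txt');
$percent = null; // stays null if REGUK01 is not found
foreach ($lines as $index => $line)
{
    if (strpos($line, 'REGUK01') !== false)
    {
        // The percentage sits two lines above the REGUK01 line
        if ($index >= 2)
        {
            $percent = trim($lines[$index - 2]);
        }
        break;
    }
}
echo "Percent: {$percent}";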
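
On the line-break worry: PCRE has a `\R` escape that matches `\n`, `\r\n` or `\r`, so a regex can count line breaks regardless of OS. Below is a rough sketch of that idea, assuming the percentage always sits exactly two line breaks before REGUK01 with only spaces or tabs in between. The function name `percentBeforeRegion` and the shortened sample strings are made up for this example.

```php
<?php
// Hypothetical helper (not from the thread): returns the percentage found
// exactly two line breaks before REGUK01, or null when there is no match.
function percentBeforeRegion(string $text): ?string
{
    // \R matches any line-break style; \h* skips spaces and tabs around it.
    $pattern = '/(\d{1,3}(?:\.\d+)?%)\h*\R\h*\R\h*REGUK01/';
    return preg_match($pattern, $text, $m) ? $m[1] : null;
}

// Shortened, invented samples with Unix and Windows line endings.
$unix    = "8.5\n69.1%\n     \n      REGUK01\nLondon";
$windows = "8.5\r\n69.1%\r\n     \r\n      REGUK01\r\nLondon";

var_dump(percentBeforeRegion($unix), percentBeforeRegion($windows));
```

Both calls should give the same result, since `\R` treats `\r\n` as a single line break.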


<?php 

$pattern = '/([0-9]{1,3}(?:\.[0-9]+){0,1})%\s+REGUK01/';

$subject = getData();

preg_match($pattern, $subject, $matches);

print_r( $matches );

function getData() {
return <<<HEREDOC
   
      Month to Date Summary UK
      From: 01/08/2011 00:00
      To: 18/08/2011 00:00
     



       
          Responses
     Overall
      Satisfaction
     Rent Next
      Time
     Recommend
          
      GB
     UK
     4344
     8.6
     8.6
     8.5
     69.1%
     
      REGUK01
     London
     611
     8.5
     8.5
     8.4
     65.7%
     
      TERUK11
     Heathrow 
     253
     8.4
     8.4
     8.2
     61.9%
     
      LHRT01
     London 
     252
     8.4
     8.4
     8.2
     61.8%
     
      LHRT10
     Heathrow 
     1
     10.0
     10.0
     10.0
     100.0%
     
      TERUK12
     Central London Territory
     200
     8.7
     8.7
     8.6
     72.5%
HEREDOC;
}

?>

 

Hope that helps.

 

In English:


([0-9]{1,3}(?:\.[0-9]+){0,1})%\s+REGUK01

Match the regular expression below and capture its match into backreference number 1 «([0-9]{1,3}(?:\.[0-9]+){0,1})»
   Match a single character in the range between “0” and “9” «[0-9]{1,3}»
      Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
   Match the regular expression below without capturing it (non-capturing group) «(?:\.[0-9]+){0,1}»
      Between zero and one times, as many times as possible, giving back as needed (greedy) «{0,1}»
      Match the character “.” literally «\.»
      Match a single character in the range between “0” and “9” «[0-9]+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “%” literally «%»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the characters “REGUK01” literally «REGUK01»
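
For completeness, a small sketch of how the captured value might be pulled out of `$matches` and turned into a number. It reuses the pattern from the script above; the shortened `$subject` string is a stand-in for the real file contents, and the float conversion is an assumption about what the calling script wants.

```php
<?php
// Pattern from the script above: group 1 holds the number without the % sign.
$pattern = '/([0-9]{1,3}(?:\.[0-9]+){0,1})%\s+REGUK01/';

// Shortened stand-in for the real file contents (illustration only).
$subject = "     8.5\n     69.1%\n     \n      REGUK01\n     London";

if (preg_match($pattern, $subject, $matches)) {
    // $matches[0] is the whole match, $matches[1] the captured number.
    $percent = (float) $matches[1];
} else {
    $percent = null; // REGUK01 or the percentage before it was not found
}

var_dump($percent);
```

In a real script `$subject` would more likely come from something like `file_get_contents()` on the daily file.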

