
Reading between two lines in text file


Valkrider


I need to read all the rows between two specific lines of a text file into an array so that I can process them, then repeat this for each such block until the end of the file. The file is too large to read into an array in one go, and the number of rows between the key lines varies.

This is what I have so far to read the first set of rows into the array, but it only reads the first row when in this case there are 15. Any pointers as to where I am going wrong, please?

if ($_POST['srch']=="Import") {
                            //open file for reading
                            $family_rec=[];
                            $file_handle = fopen($fn, "r");
                            if(!$file_handle){echo "Could not open file";}
                            while(!feof($file_handle)) {
                                $line = fgets($file_handle);
                                if ((strpos($line, "0 @") !== false) && (strpos($line, "FAM") !== false)) {
                                    $famref = explode("@", $line);
                                    $line = fgets($file_handle);
                                    //echo $line."<br>";
                                    do{
                                        $family_rec[]=$line;
                                        $line = fgets($file_handle);
                                        //echo $line."<br>";
                                    }
                                    while((strpos($line, "0 @") !== false) && (strpos($line, "FAM") !== false));
                                    print_r($family_rec);
                                }//end of family

                            }//end of while not eof

                        }//end of import

 


Sorry, here is an example of two records:

0 @F1@ FAM
1 MARR
2 _SHAR @I3@
3 ROLE Witness
3 SOUR @S277@
2 DATE 27 SEP 1975
2 PLAC Neasden, London, NW2
2 ADDR St Catherine's Church
2 SOUR @S277@
2 HUSB
3 AGE 24y
2 WIFE
3 AGE 25y
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I67@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID A8FB7A6D2AC6D5118349525400DA6D5E773C
1 CHAN
2 DATE 8 JAN 2016
3 TIME 15:25:29
0 @F2@ FAM
1 MARR
2 _SHAN H.W. Piper
3 ROLE Witness
3 SOUR @S275@
2 DATE 27 MAR 1948
2 PLAC Gosport, Hampshire
2 ADDR Christ Church, The Parish Church
2 SOUR @S275@
3 DATA
4 DATE MAR 1948
4 TEXT Gosport 6b 544
2 HUSB
3 AGE 26y
2 WIFE
3 AGE 20y
1 HUSB @I3@
1 WIFE @I4@
1 CHIL @I1@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID 77FB7A6D2AC6D5118349525400DA6D5E462C
1 CHAN
2 DATE 10 DEC 2016
3 TIME 10:35:10

This is another record that is much shorter and in a different order to the other two:

0 @F69@ FAM
1 MARR
2 DATE 18 APR 2008
2 PLAC Datchet, Berkshire
1 HUSB @I67@
1 WIFE @I254@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID D3452E8B662E584AB5F3295593A5EBF04087
1 CHAN
2 DATE 29 SEP 2014
3 TIME 07:20:27

 


So you want to start reading data at the 0 @F*@ FAM line, then stop when you either hit the next FAM line or hit the end of the file, right?

 

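<?php

$fp = fopen('valkrider.txt', 'r');
if ($fp === false){
    exit('Could not open file');
}

$record = [];
// fgets() returns false at end of file, so test its result rather than feof();
// otherwise a spurious empty line is appended to the last record.
while (($line = fgets($fp)) !== false){
    $line = trim($line);

    if (isHeaderLine($line)){
        // A header line closes the record gathered so far (if any).
        if (!empty($record)){
            doSomethingWithRecord($record);
            $record = [];
        }
    }

    $record[] = $line;
}
fclose($fp);

// Process the final record, which has no header line after it.
if (!empty($record)){
    doSomethingWithRecord($record);
}


// True when the line starts a new record, i.e. begins with "0 @F".
function isHeaderLine($line){
    return strncmp($line, '0 @F', 4) === 0;
}

// Placeholder for whatever processing each record needs.
function doSomethingWithRecord($record){
    var_dump($record);
}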

This code works by gathering every line it reads into a record array until it encounters one of those header lines. When it finds a header line, it processes the previous record (if any), then starts a new record array. At the end of the loop it processes the final record if one exists.

I made the assumption that the header lines all begin with "0 @F". If that's not accurate, you'll have to expand on that condition. I'm also assuming there are no lines you need to ignore at the start/end of the file. Again, if that's not true, you'll need to make adjustments.
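As a side note on the loop in the original question: its do/while continues only while the next line is itself a FAM header, which is the opposite of what is wanted, so it stops after one row. It also reads past the next header, so the outer loop never sees it. Below is a minimal sketch of that loop with the condition inverted. The helper name isFamHeader and the in-memory sample data are assumptions made for illustration, and this version stops only at FAM headers, so anything after the last family would still end up in the last record.

```php
<?php
// Hypothetical helper mirroring the double strpos() test from the question,
// tightened so that "0 @" must be at the start of the line.
function isFamHeader(string $line): bool
{
    return strpos($line, '0 @') === 0 && strpos($line, 'FAM') !== false;
}

// Sample data held in memory so the sketch runs without a real file.
$fh = fopen('php://memory', 'w+');
fwrite($fh, "0 @F1@ FAM\n1 HUSB @I1@\n1 WIFE @I2@\n0 @F2@ FAM\n1 HUSB @I3@\n");
rewind($fh);

$families = [];
$line = fgets($fh);
while ($line !== false) {
    if (!isFamHeader($line)) {
        $line = fgets($fh);   // not at a family header yet, keep scanning
        continue;
    }
    $rows = [];
    // Keep reading while the next line is NOT a header
    // (the original condition tested the opposite).
    while (($line = fgets($fh)) !== false && !isFamHeader($line)) {
        $rows[] = rtrim($line, "\r\n");
    }
    $families[] = $rows;
    // $line now holds the next header (or false), so the outer loop
    // re-checks it instead of reading past it.
}
fclose($fh);

print_r($families);
```

The important detail is that the line which ends one record is carried over as the start of the next, rather than being thrown away by another fgets() call.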

 

 


Thanks for that. I will have a play with it, though I will need to adjust it.

In answer to your questions:

Yes, there is data before these rows start, which I can already process and extract the data from successfully.

No, the record will not always start with "0 @F". It will always start with "0 @" and will have "FAM" later in the line, hence the double strpos in my code.

Yes, there is data after the last record, BUT I am not interested in that.

Thanks once again for this.
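The adjustments described in this post could be sketched roughly as follows. This is only an illustration under some assumptions: the function name readFamilyRecords and its [id, rows] return shape are made up for the example, the FAM tag is assumed to follow the @...@ id directly as it does in the sample records, and every record (including the unwanted data before and after the families) is assumed to start with a level-0 line. A generator is used so that only one record is held in memory at a time.

```php
<?php
/**
 * Yields one family record at a time as [id, rows].
 * Hypothetical helper; the name and return shape are illustrative only.
 */
function readFamilyRecords($handle): Generator
{
    $id = null;   // id of the family being collected, null when outside one
    $rows = [];
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");
        if (strncmp($line, '0 ', 2) === 0) {
            // Any level-0 line closes the family record in progress.
            if ($id !== null) {
                yield [$id, $rows];
            }
            // Start collecting again only for "0 @<id>@ FAM" lines.
            $id = preg_match('/^0 @([^@]+)@ FAM\b/', $line, $m) ? $m[1] : null;
            $rows = [];
            continue;
        }
        if ($id !== null) {
            $rows[] = $line;   // a row belonging to the current family
        }
    }
    if ($id !== null) {
        yield [$id, $rows];    // last family when the file ends inside one
    }
}

// Usage with in-memory sample data instead of a real file.
$h = fopen('php://memory', 'w+');
fwrite($h, "0 HEAD\n1 SOUR X\n0 @I1@ INDI\n1 NAME A\n"
         . "0 @F1@ FAM\n1 HUSB @I1@\n"
         . "0 @F69@ FAM\n1 MARR\n2 DATE 18 APR 2008\n0 TRLR\n");
rewind($h);

$records = iterator_to_array(readFamilyRecords($h), false);
fclose($h);

print_r($records);
```

With a real file, a foreach over readFamilyRecords($file_handle) would process each family as it is read, and the lines before the first family and after the last one are skipped.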


2 hours ago, gw1500se said:

It appears that you are processing a GEDCOM file. Why not use a GEDCOM parser that has already been developed?

It is indeed a GEDCOM file. I tried that parser, but it didn't do what I want and it is no longer developed or supported. I tried several others and they are all in the same position, so I decided to write my own, as I only require a small subset of the GEDCOM tags.

