Reading between two lines in text file

Valkrider · January 10, 2019

I need to read all the rows between two specific lines of a text file into an array so that I can process them, and then repeat this for each following block until the end of the file. The file is too large to read into an array in one go. The number of rows between the key lines varies.

This is what I have so far to read the first set of rows into the array, but it only reads the first row when in this case there are 15. Any pointers as to where I am going wrong, please?

if ($_POST['srch']=="Import") {
    //open file for reading
    $family_rec=[];
    $file_handle = fopen($fn, "r");
    if(!$file_handle){echo "Could not open file";}
    while(!feof($file_handle)) {
        $line = fgets($file_handle);
        if ((strpos($line, "0 @") !== false) && (strpos($line, "FAM") !== false)) {
            $famref = explode("@", $line);
            $line = fgets($file_handle);
            //echo $line."<br>";
            do{
                $family_rec[]=$line;
                $line = fgets($file_handle);
                //echo $line."<br>";
            }
            while((strpos($line, "0 @") !== false) && (strpos($line, "FAM") !== false));
            print_r($family_rec);
        }//end of family

    }//end of while not eof

}//end of import

kicken · January 10, 2019

It would help if you included an example of the file you are reading and which lines are your start and stop lines.

Valkrider · January 10, 2019

Sorry, here is an example of two records:

0 @F1@ FAM
1 MARR
2 _SHAR @I3@
3 ROLE Witness
3 SOUR @S277@
2 DATE 27 SEP 1975
2 PLAC Neasden, London, NW2
2 ADDR St Catherine's Church
2 SOUR @S277@
2 HUSB
3 AGE 24y
2 WIFE
3 AGE 25y
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I67@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID A8FB7A6D2AC6D5118349525400DA6D5E773C
1 CHAN
2 DATE 8 JAN 2016
3 TIME 15:25:29
0 @F2@ FAM
1 MARR
2 _SHAN H.W. Piper
3 ROLE Witness
3 SOUR @S275@
2 DATE 27 MAR 1948
2 PLAC Gosport, Hampshire
2 ADDR Christ Church, The Parish Church
2 SOUR @S275@
3 DATA
4 DATE MAR 1948
4 TEXT Gosport 6b 544
2 HUSB
3 AGE 26y
2 WIFE
3 AGE 20y
1 HUSB @I3@
1 WIFE @I4@
1 CHIL @I1@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID 77FB7A6D2AC6D5118349525400DA6D5E462C
1 CHAN
2 DATE 10 DEC 2016
3 TIME 10:35:10

This is another record that is much shorter and in a different order to the other two:

0 @F69@ FAM
1 MARR
2 DATE 18 APR 2008
2 PLAC Datchet, Berkshire
1 HUSB @I67@
1 WIFE @I254@
1 SOUR @S1@
2 NOTE Record originated in...
1 _UID D3452E8B662E584AB5F3295593A5EBF04087
1 CHAN
2 DATE 29 SEP 2014
3 TIME 07:20:27

kicken · January 10, 2019

So you want to start reading data at the 0 @F*@ FAM line, then stop when you either hit the next FAM line or hit the end of the file, right?

<?php

$fp = fopen('valkrider.txt', 'r');

$record = [];
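// Loop on fgets() itself so the false returned at end of file
// is never trimmed into an empty line and added to the record.
while (($line = fgets($fp)) !== false){
    $line = trim($line);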

    if (isHeaderLine($line)){
        if (!empty($record)){
            doSomethingWithRecord($record);
            $record = [];
        }
    }

    $record[] = $line;
}

if (!empty($record)){
    doSomethingWithRecord($record);
}


function isHeaderLine($line){
    return strncmp($line, '0 @F',4) === 0;
}

function doSomethingWithRecord($record){
    var_dump($record);
}

This code works by just gathering every line read into a record array until it encounters one of those header lines. When it finds a header line it processes the previous record (if any) then starts a new record array. At the end of the loop it will process the final record if one exists.

I made the assumption that the header lines all begin with "0 @F". If that's not accurate, you'll have to expand on that condition. I'm also assuming there are no lines you need to ignore at the start or end of the file. Again, if that's not true, you'll need to make adjustments.
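If the IDs do not always start with "F", the condition could be loosened with a regular expression. The following is only a sketch: isFamilyHeader is a hypothetical name, and it assumes a header is a level-0 line of the form "0 @<id>@ FAM" that has already been trimmed. The INDI line in the usage example is made-up sample data.

```php
<?php

// Hypothetical replacement for isHeaderLine(): accepts any level-0 line
// shaped like "0 @<id>@ FAM", whatever the id between the @ signs is.
// Assumes $line has already been trimmed.
function isFamilyHeader(string $line): bool
{
    return preg_match('/^0\s+@[^@]+@\s+FAM\b/', $line) === 1;
}

var_dump(isFamilyHeader('0 @F69@ FAM'));   // bool(true)
var_dump(isFamilyHeader('1 HUSB @I67@'));  // bool(false)
var_dump(isFamilyHeader('0 @I3@ INDI'));   // bool(false)
```

Anchoring the pattern at the start of the line keeps it from matching a deeper line that merely contains "@" and "FAM" somewhere in its text.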

Valkrider · January 10, 2019

Thanks for that. I will have a play with it, although I will need to adjust it.

In answer to your questions:

Yes, there is data before these rows start, which I can already process and extract the data from successfully.

No, the record will not always start with "0 @F". It will always start with "0 @" and will have "FAM" later in the line, hence the double strpos in my code.

Yes, there is data after the last record, BUT I am not interested in that.

Thanks once again for this.
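One way those three adjustments might fit together is sketched below. It is not code from the thread: readFamilyRecords is a hypothetical name, and it assumes that every record in the file starts on a level-0 line, so any level-0 line that is not a family header ends the current family. The HEAD and TRLR lines in the sample are made-up data. A generator yields one record at a time, so the whole file is never held in memory.

```php
<?php

/**
 * Yields one family record at a time as an array of trimmed lines.
 * Assumptions (not confirmed in the thread): every record starts on a
 * level-0 line, and a family header looks like "0 @<id>@ FAM".
 */
function readFamilyRecords($handle): Generator
{
    $record = [];
    $inFamily = false;

    while (($line = fgets($handle)) !== false) {
        $line = trim($line);

        if (strncmp($line, '0 ', 2) === 0) {
            // Any level-0 line closes the record being collected.
            if ($inFamily && $record !== []) {
                yield $record;
            }
            $record = [];
            // Only collect lines again if this line opens a family.
            $inFamily = preg_match('/^0\s+@[^@]+@\s+FAM\b/', $line) === 1;
        }

        if ($inFamily && $line !== '') {
            $record[] = $line;
        }
    }

    // Flush the last record if the file ends inside a family.
    if ($inFamily && $record !== []) {
        yield $record;
    }
}

// Usage with a small in-memory sample instead of a real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "0 HEAD\n1 CHAR UTF-8\n0 @F1@ FAM\n1 HUSB @I1@\n0 @F2@ FAM\n1 WIFE @I4@\n0 TRLR\n");
rewind($handle);

foreach (readFamilyRecords($handle) as $record) {
    echo implode(' | ', $record), PHP_EOL;
}
fclose($handle);
```

Lines before the first family header and after the last family are skipped because nothing is collected while $inFamily is false.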

gw1500se · January 10, 2019

It appears that you are processing a GEDCOM file. Why not use a GEDCOM parser that has already been developed?

Valkrider · January 10, 2019

2 hours ago, gw1500se said:

It appears that you are processing a GEDCOM file. Why not use a GEDCOM parser that has already been developed?

It is indeed a GEDCOM file. I tried that parser, but it didn't do what I want and it is no longer developed or supported. I tried several others and they are all in the same position, so I decided to write my own, as I only require a small subset of the GEDCOM tags.
