trace a line in a text file


homer.favenir

Hi,

Can anyone please help me with this?

I have two fixed-length text files, file 1 and file 2, and I need to compare file 1 against file 2.

I already have a program that does this, using array_diff().

It reads both files line by line and compares them.

If a line differs, it echoes that line.

 

file 1

0003 B ALRENCO INC                                                                                                                                                                1
1059 MEISTER RD                                                               LORAIN                                                                                   324-4130  2
0003 B AL'S AUTO GLASS & CARMS RADIATOR INC                                                                                                                                       1
201 EAST BRIDGE ST                                                            ELYRIA                                                                                   322-6270  2
0003 R ALSTON JIMMIE                                                                                                                                                              1
222 16TH ST                                                                   ELYRIA                                                                                   323-7196  2
0003 R ALSTON MYRA P                                                                                                                                                              1
229 BELL AV                                                                                                                                                            322-1505  2
0003 R ALTEN BRAD & KAREN                                                                                                                                                         1
12399 HAWKE RD                                                                COLUMBIA STATION                                                                         236-6673  2
0003 R ALTEN KENNETH                                                                                                                                                              1

file 2

0003 B ALRENCO INC                                                                                                                                                               V1
1059 MEISTER RD                                                               LORAIN                                                                                   324-4130 V2
0003 B AL'S AUTO GLASS & CARMS RADIATOR INC                                                                                                                                      V1
201 EAST BRIDGE ST                                                            ELYRIA                                                                                   322-6270 V2
0003 R ALSTON BRANDY N                                                                                                                                                           V1
310 OBERLIN RD                                                                                                                                                         326-0369 V2
0003 R ALSTON JIMMIE                                                                                                                                                             V1
222 16TH ST                                                                   ELYRIA                                                                                   323-7196 V2
0003 R ALSTON MYRA P                                                                                                                                                             V1
229 BELL AV                                                                                                                                                            322-1505 V2
0003 R ALTEN BRAD & KAREN                                                                                                                                                        V1

In file 1, the line after 201 EAST BRIDGE ST is 0003 R ALSTON JIMMIE.

In file 2, the line after 201 EAST BRIDGE ST is 0003 R ALSTON BRANDY N, and 0003 R ALSTON JIMMIE only comes after the ALSTON BRANDY N record.

That means the ALSTON BRANDY N record was never entered in file 1.

So I need to skip ALSTON BRANDY N in file 2 and compare 0003 R ALSTON JIMMIE from file 1 with 0003 R ALSTON JIMMIE from file 2.

 

I hope I explained it well.

Can anyone please help?

Thanks.

 

Link to comment
Share on other sites

The text files differ quite a bit, 1/2 vs. V1/V2 in the last column etc, is the last column to be ignored from the comparison?  Are your real-world text files to be compared line by line exactly or every 2 lines as the records appear to be 2 lines long?  Is the order important or can the records appear anywhere in the text file to be considered valid?  What about records in file1 that don't appear in file2 (if that should ever occur)?

Link to comment
Share on other sites

The 1/2 and V1/V2 in the last column mark the record level: level 1 or level 2.

 

LEVEL 1

FIELD NAME                    LENGTH    COLUMN
Page                               4         1
Type Of Listing                    1         6
Business Name                     78         8

LEVEL 2

FIELD NAME                    LENGTH    COLUMN
Indent Number                      1         1
Address/Informational Text        78         2
City                              30        80
State                              2       111
Zip Code                           5       113
Toll Free Text                    40       124
Area Code                          3       165
Prefix Number                      3       169
Suffix Number                      4       174

 

It is like checking file1 against file2 for errors.

The two text files are compared field by field. The comparison works on arrays, so it goes line by line.

The problem is when file1 has a missing line: from the missing line onwards, every line is reported as an error.

See the file1 and file2 example in my previous post, where file1 is missing a line compared to file2.

 

please advise

 

thanks

 

 

Link to comment
Share on other sites

See if this works for your requirements:

<pre>
<?php
function showdiff($f1,$f2){
  // Strip the last column of each line, then reduce runs of spaces to one space.
  // The file names now come from the arguments instead of being hard-coded.
  $file1=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n|$)/','$1',file_get_contents($f1)));
  $file2=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n|$)/','$1',file_get_contents($f2)));
  // Group the text into 2-line records.
  preg_match_all('/.*?\r\n.*?(?:\r\n|$)/',$file1,$f1matches);
  preg_match_all('/.*?\r\n.*?(?:\r\n|$)/',$file2,$f2matches);
  // Report records of file 2 that appear nowhere in file 1.
  foreach($f2matches[0] as $f2line){
    if(!in_array($f2line,$f1matches[0])){
      echo "missing <font color=red>$f2line</font> from file 1.<br>";
    }
  }
  // Report records of file 1 that appear nowhere in file 2.
  foreach($f1matches[0] as $f1line){
    if(!in_array($f1line,$f2matches[0])){
      echo "missing <font color=red>$f1line</font> from file 2.<br>";
    }
  }
}
showdiff('file1.txt','file2.txt');
?>

 

output:

missing 0003 R ALSTON BRANDY N

310 OBERLIN RD 326-0369

from file 1.

 

 

The spacing is different between the two files so multiple spaces have been reduced to 1 space and the last column which also differs between the two files is ignored. 

 

The comparison is irrespective of location in the file, it is a success if a given 2-line record in file1 is located anywhere in file2 and vice versa.

 

Your last entry in both files was excluded from my testing since they included only line 1 and not line 2, making a comparison of both not possible.
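
The normalization described above (ignore the last column, reduce multiple spaces to one) can also be sketched with plain string handling on a single line. This is only an illustration: `normalizeLine()` is a made-up helper name, and it assumes the level tag (1, 2, V1, V2) is always the last space-separated token on the line.

```php
<?php
// Hypothetical helper: drop the trailing level tag and collapse runs of spaces,
// so that a line ending in "1" and the same line ending in "V1" compare as equal.
function normalizeLine(string $line): string
{
    $line = rtrim($line);                          // remove the line ending and trailing blanks
    $line = preg_replace('/\s+\S+$/', '', $line);  // drop the last token (assumed to be the level tag)
    return preg_replace('/ +/', ' ', $line);       // collapse multiple spaces into one
}

$a = normalizeLine("0003 B ALRENCO INC          1\r\n");
$b = normalizeLine("0003 B ALRENCO INC         V1\n");
var_dump($a === $b);
```

A line that consists of a single token has no whitespace before it, so this sketch leaves such a line unchanged.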

Link to comment
Share on other sites

OK. What I did is take a substring of each line to get the fields, e.g. name for level 1 and address etc. for level 2, and then compare each field of file1 with the same field of file2.

So errors are reported per field.

The problem is that if a line is missing from file1, every following line is matched against the wrong line. That is what happened to 0003 R ALSTON JIMMIE in file1: it was compared to 0003 R ALSTON BRANDY N in file2. Instead, the comparison should skip 0003 R ALSTON BRANDY N, because 0003 R ALSTON JIMMIE comes right after it in file2.

 

The missing line throws off the comparison of every line after it.

Hope I explained it well.

Please advise.

Thanks for the help, hope I didn't bother you.

Link to comment
Share on other sites

To do that level of detail comparison each line would need to be broken out into fields by a different method than what I used.

 

Your file1 uses \r\n as line separators while file2 only uses \n, so that was throwing off my code.  My code compares both lines of a record together, so its output is different from that of your program.
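
As a side note on the \r\n versus \n mismatch, one way to stop caring about which separator a file uses is to split on any of them. A minimal sketch, where `splitLines()` is a made-up helper name and not part of either program in this thread:

```php
<?php
// Hypothetical helper: split text into lines whatever the line-ending style.
function splitLines(string $text): array
{
    $lines = preg_split('/\r\n|\r|\n/', $text);
    // Drop the empty element left behind by a trailing line break.
    if (end($lines) === '') {
        array_pop($lines);
    }
    return $lines;
}

$windows = "line one\r\nline two\r\n";
$unix    = "line one\nline two\n";
var_dump(splitLines($windows) === splitLines($unix));
```

With both files split this way, the rest of the comparison no longer depends on how each file was saved.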

 

<pre>
<?php
function showdiff($f1,$f2){
	// file1 already uses \r\n; file2 uses \n, so convert it to \r\n first.
	// Then strip the last column of each line and reduce runs of spaces to one space.
	$file1=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n)/','$1',file_get_contents($f1)));
	$file2=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n)/','$1',preg_replace('/\n/',"\r\n",file_get_contents($f2))));
	// Group the text into complete 2-line records.
	$f1count=preg_match_all('/(?:.*?\r\n){2}/',$file1,$f1matches);
	$f2count=preg_match_all('/(?:.*?\r\n){2}/',$file2,$f2matches);
	// Report records of file 2 that appear nowhere in file 1.
	foreach($f2matches[0] as $f2line){
		if(!in_array($f2line,$f1matches[0])){
			echo "missing <font color=red>$f2line</font> from file 1.<br>";
		}
	}
	// Report records of file 1 that appear nowhere in file 2.
	foreach($f1matches[0] as $f1line){
		if(!in_array($f1line,$f2matches[0])){
			echo "missing <font color=red>$f1line</font> from file 2.<br>";
		}
	}
}
showdiff('file1.txt','file2.txt');
?>

Link to comment
Share on other sites

First I need to extract each field from each file.

Then I must compare them line by line; any difference is echoed and counted as an error.

But the problem is that when file1 has a missing line, the wrong line gets compared to file2.

You will notice in my image that the mismatch starts at line 92.

 

 

Link to comment
Share on other sites

This is actually a string question and not a regex question.

 

Here's the code required to parse the lines of the files into fields. I will leave it to you to work out the specifics of how you want to compare what.  The code and array output of the example (shown with file1, but file2 is parsed with the same code) should give you a start in the right direction.

 

<pre>

<?php
// Split one fixed-length line into named fields.
// substr() offsets are 0-based, i.e. the 1-based COLUMN from the layout minus 1.
function parseline($line){
	if(substr($line,178,1)=='2'){
		$result['level']=2;
		$result['indent']=substr($line,0,1);
		$result['address']=substr($line,1,78);
		$result['city']=substr($line,79,30);
		$result['state']=substr($line,110,2);
		$result['zip']=substr($line,112,5); // layout column 113 (was 113, one position too far)
		$result['tollfree']=substr($line,123,40);
		$result['areacode']=substr($line,164,3);
		$result['prefix']=substr($line,168,3);
		$result['suffix']=substr($line,172,4);
	} elseif(substr($line,178,1)=='1') {
		$result['level']=1;
		$result['page']=substr($line,0,4);
		$result['type']=substr($line,5,1);
		$result['name']=substr($line,7,78);
	} else {
		return false; // not a level 1 or level 2 line
	}
	return array_map('trim',$result);
}

$lines=file('file1.txt');
foreach($lines as $line){
	$fields=parseline($line);
	if($fields){
		echo "<hr>line:<br>$line<br>\$fields ";
		echo print_r($fields,true);
	}
}
?>
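
Once two lines have been parsed into arrays like the ones above, a field-by-field comparison could look roughly like this. It is a sketch only: `diffFields()` is a made-up helper name, and the "229 BELL AVE" value is invented sample data to show a mismatch, not something from the real files.

```php
<?php
// Hypothetical helper: list the fields whose values differ between two parsed lines.
// Both arguments are arrays shaped like the ones parseline() returns.
function diffFields(array $ke, array $qc): array
{
    $errors = [];
    foreach ($ke as $field => $value) {
        if ($field === 'level') {
            continue; // the level only says which layout applies
        }
        $other = $qc[$field] ?? null;
        if ($other !== $value) {
            $errors[$field] = ['file1' => $value, 'file2' => $other];
        }
    }
    return $errors;
}

// Invented sample data for illustration only.
$ke = ['level' => 2, 'address' => '229 BELL AV',  'city' => '', 'areacode' => '322'];
$qc = ['level' => 2, 'address' => '229 BELL AVE', 'city' => '', 'areacode' => '322'];
print_r(diffFields($ke, $qc));
```

An empty result means the two lines agree on every field, so only non-empty results need to be echoed as errors.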

Link to comment
Share on other sites

Hi,

I already have a script that extracts all the fields and compares each field of file1 with the same field of file2.

The problem is when the two files do not have the same number of lines: fields then get compared to the wrong fields.

In my example files, 0003 R ALSTON JIMMIE in file1 was compared to 0003 R ALSTON BRANDY N in file2, because file1 has a missing line and everything from ALSTON JIMMIE onwards is shifted.

<pre>
<?php
$kebase = '//asecasianas2/DS_Keying/STATS/KE-DIR/';
$qcbase = '//asecasianas2/DS_Keying/STATS/QC-DIR/';
$dir = 'sample';
$kedir = $kebase . $dir;
$qcdir = $qcbase . $dir;
$kefiles = scandir($kedir);
$qcfiles = scandir($qcdir);
$count_files = count($qcfiles);
echo $count_files . " " . "files found" . " " . "in" . " " . $dir . " " . "directory" . "<br>";

for($i = 2; $i < $count_files; $i++) // start at 2 to skip "." and ".."; the last valid index is count - 1
{

	echo $qcfiles[$i] . "<br>";
	$file1 = file($kedir . "/" . $kefiles[$i]);
	$file2 = file($qcdir . "/" . $qcfiles[$i]);
	$file1_count = count($file1);
	$file2_count = count($file2);

	$count = 1;
	for($x = 0; $x < $file2_count; $x++) // the last valid index is count - 1
	{
		$vtag1 = substr($file1[$x],178,1);
		$vtag2 = substr($file2[$x],178,1);
		$page1 = substr($file1[$x],0,4);
		$page2 = substr($file2[$x],0,4);
		$name1 = substr($file1[$x],7,78);
		$name2 = substr($file2[$x],7,78);
		$addr1 = substr($file1[$x], 1, 78);
		$addr2 = substr($file2[$x], 1, 78);
		$city1 = substr($file1[$x], 79, 30);
		$city2 = substr($file2[$x], 79, 30);
		$state1 = substr($file1[$x], 110, 2);
		$state2 = substr($file2[$x], 110, 2);
		$zc1 = substr($file1[$x], 112, 5);
		$zc2 = substr($file2[$x], 112, 5);
		$tf1 = substr($file1[$x], 123, 40);
		$tf2 = substr($file2[$x], 123, 40);
		$phone1  = substr($file1[$x], 164, 12);
		$phone2  = substr($file2[$x], 164, 12);
		if($vtag2 == 1)
		{
			$name_comp = strcmp($name1, $name2);
				if($name_comp != 0)
				{
					//echo "line".$count."error name" . "<br>";
					$err_stat = 1;
					$Error = 1;
					$err_line = $count;
					$err_name1 = $name1;
					$err_name2 = $name2;
					echo "Line" . $err_line . " " . "KE Name" . " " . $err_name1 . "<br> ";
					echo "       QC Name" . " " . $err_name2 . "<br>";
				}
		}else{
			$addr_comp = strcmp($addr1, $addr2);
			$city_comp = strcmp($city1, $city2);
			$state_comp = strcmp($state1, $state2);
			$zc_comp = strcmp($zc1, $zc2);
			$tf_comp = strcmp($tf1, $tf2);
			$phone_comp = strcmp($phone1, $phone2);
				if($addr_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_addr1 = $addr1;
					$err_addr2 = $addr2;
					echo "Line" . $err_line . " " . "KE address" . " " . $err_addr1 . "<br>";
					echo "       QC address" . " " . $err_addr2 . "<br>";
				}
				if($city_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_city1 = $city1;
					$err_city2 = $city2;
					echo "Line" . $err_line . " " . "KE City" . " " .  $err_city1 . "<br>";
					echo  "       QC City" . " " . $err_city2 . "<br>";
				}
				if($state_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_state1 = $state1;
					$err_state2 = $state2;
					echo "Line" . $err_line . " " . "KE State" . " " .  $err_state1 . "<br>";
					echo  "       QC State" . " " . $err_state2 . "<br>";
				}
				if($zc_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_zc1 = $zc1;
					$err_zc2 = $zc2;
					echo "Line" . $err_line . " " . "KE ZipCode" . " " .  $err_zc1 . "<br>";
					echo  "       QC ZipCode" . " " . $err_zc2 . "<br>";
				}
				if($tf_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_tf1 = $tf1;
					$err_tf2 = $tf2;
					echo "Line" . $err_line . " " . "KE Toll Free Text" . " " .  $err_tf1 . "<br>";
					echo  "       QC Toll Free Text" . " " . $err_tf2 . "<br>";
				}
				if($phone_comp != 0)
				{
					$Error = 1;
					$err_line = $count;
					$err_phone1 = $phone1;
					$err_phone2 = $phone2;
					echo "Line" . $err_line . " " . "KE phone" . " " .  $err_phone1 . "<br>";
					echo  "       QC phone" . " " . $err_phone2 . "<br>";
				}

		}
		$count++;

	}

}

?>
</pre>

 

I think I could use in_array() to look up each field of file1 in file2?

 

anyone please help

 

thanks in advance

Link to comment
Share on other sites
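
For the misalignment problem described above, one possible approach is to walk both files record by record and, on a mismatch, look a few records ahead on each side to find where they line up again. The sketch below is an assumption about how that could be done, not code from this thread: `alignRecords()`, `lookAhead()` and the `$window` limit are made-up names, and it assumes each record can be identified by one key string (here the business name). Repeated names close together could still make it resynchronise at the wrong place.

```php
<?php
// Hypothetical sketch: walk two lists of record keys in step and resynchronise
// when one side has records the other lacks.
function alignRecords(array $a, array $b, int $window = 5): array
{
    $pairs = [];   // each entry is [indexInA, indexInB]; null marks a record missing on that side
    $i = 0;
    $j = 0;
    $na = count($a);
    $nb = count($b);
    while ($i < $na && $j < $nb) {
        if ($a[$i] === $b[$j]) {
            $pairs[] = [$i++, $j++];
            continue;
        }
        // Is $a[$i] a little further down in $b? Then $b has extra records here.
        $skip = lookAhead($a[$i], $b, $j, $window);
        if ($skip > 0) {
            while ($skip-- > 0) { $pairs[] = [null, $j++]; }
            continue;
        }
        // Is $b[$j] a little further down in $a? Then $a has extra records here.
        $skip = lookAhead($b[$j], $a, $i, $window);
        if ($skip > 0) {
            while ($skip-- > 0) { $pairs[] = [$i++, null]; }
            continue;
        }
        // Neither side lines up again nearby: treat it as the same record with a keying error.
        $pairs[] = [$i++, $j++];
    }
    while ($i < $na) { $pairs[] = [$i++, null]; }
    while ($j < $nb) { $pairs[] = [null, $j++]; }
    return $pairs;
}

// Returns how many records to skip in $list (counting from $start) to reach $key, or 0 if not found.
function lookAhead(string $key, array $list, int $start, int $window): int
{
    $end = min(count($list), $start + $window + 1);
    for ($k = $start + 1; $k < $end; $k++) {
        if ($list[$k] === $key) {
            return $k - $start;
        }
    }
    return 0;
}

$file1 = ['ALRENCO INC', "AL'S AUTO GLASS & CARMS RADIATOR INC", 'ALSTON JIMMIE', 'ALSTON MYRA P'];
$file2 = ['ALRENCO INC', "AL'S AUTO GLASS & CARMS RADIATOR INC", 'ALSTON BRANDY N', 'ALSTON JIMMIE', 'ALSTON MYRA P'];
foreach (alignRecords($file1, $file2) as [$i, $j]) {
    if ($i === null) {
        echo "missing from file 1: {$file2[$j]}\n";
    } elseif ($j === null) {
        echo "missing from file 2: {$file1[$i]}\n";
    }
}
```

Pairs where both indexes are set are the ones worth comparing field by field; pairs with a null can be reported as missing records without disturbing the lines that follow.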
