homer.favenir Posted November 19, 2008

Hi, can anyone please help me with this? I have two fixed-length text files, file 1 and file 2, and I need to compare file 1 against file 2. I already have a program that does it using array_diff(): it reads each text file line by line and compares the lines. If there is a difference, it echoes it.

file 1
0003 B ALRENCO INC 1
1059 MEISTER RD LORAIN 324-4130 2
0003 B AL'S AUTO GLASS & CARMS RADIATOR INC 1
201 EAST BRIDGE ST ELYRIA 322-6270 2
0003 R ALSTON JIMMIE 1
222 16TH ST ELYRIA 323-7196 2
0003 R ALSTON MYRA P 1
229 BELL AV 322-1505 2
0003 R ALTEN BRAD & KAREN 1
12399 HAWKE RD COLUMBIA STATION 236-6673 2
0003 R ALTEN KENNETH 1

file 2
0003 B ALRENCO INC V1
1059 MEISTER RD LORAIN 324-4130 V2
0003 B AL'S AUTO GLASS & CARMS RADIATOR INC V1
201 EAST BRIDGE ST ELYRIA 322-6270 V2
0003 R ALSTON BRANDY N V1
310 OBERLIN RD 326-0369 V2
0003 R ALSTON JIMMIE V1
222 16TH ST ELYRIA 323-7196 V2
0003 R ALSTON MYRA P V1
229 BELL AV 322-1505 V2
0003 R ALTEN BRAD & KAREN V1

In file 1, the line after 201 EAST BRIDGE ST is 0003 R ALSTON JIMMIE. In file 2, the line after 201 EAST BRIDGE ST is 0003 R ALSTON BRANDY N, and 0003 R ALSTON JIMMIE comes after R ALSTON BRANDY N. That means file 1 forgot to enter R ALSTON BRANDY N.

Now I have to skip R ALSTON BRANDY N and compare 0003 R ALSTON JIMMIE from file 1 to 0003 R ALSTON JIMMIE in file 2.

I hope I explained it well. Can anyone please help? Thanks.

ddrudik Posted November 19, 2008

The text files differ quite a bit, e.g. 1/2 vs. V1/V2 in the last column. Is the last column to be ignored in the comparison? Are your real-world text files to be compared exactly line by line, or every 2 lines, since the records appear to be 2 lines long? Is the order important, or can a record appear anywhere in the text file and still be considered valid? What about records in file1 that don't appear in file2 (if that should ever occur)?

homer.favenir Posted November 20, 2008

The 1/2 and V1/V2 mark level 1 and level 2.

LEVEL 1
FIELD NAME                   LENGTH  COLUMN
Page                         4       1
Type Of Listing              1       6
Business Name                78      8

LEVEL 2
FIELD NAME                   LENGTH  COLUMN
Indent Number                1       1
Address/Informational Text   78      2
City                         30      80
State                        2       111
Zip Code                     5       113
Toll Free Text               40      124
Area Code                    3       165
Prefix Number                3       169
Suffix Number                4       174

It is like checking file1 against file2 for errors: compare the 2 text files field by field. The comparison works on arrays, so it is per line. The problem is when file1 has a missing line, because then everything from the missing line onwards is reported as an error. See the example of file1 and file2 in my previous post, where file 1 has a missing line compared to file2.

Please advise. Thanks.

ddrudik Posted November 20, 2008

See if this works for your requirements:

<pre>
<?php
function showdiff($f1,$f2){
    // drop the last column of every line, then reduce runs of spaces to one space
    $file1=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n|$)/','$1',file_get_contents($f1)));
    $file2=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n|$)/','$1',file_get_contents($f2)));
    // split each file into two-line records
    preg_match_all('/.*?\r\n.*?(?:\r\n|$)/',$file1,$f1matches);
    preg_match_all('/.*?\r\n.*?(?:\r\n|$)/',$file2,$f2matches);
    // records of file 2 that are not in file 1
    foreach($f2matches[0] as $f2line){
        if(!in_array($f2line,$f1matches[0])){
            echo "missing <font color=red>$f2line</font> from file 1.<br>";
        }
    }
    // records of file 1 that are not in file 2
    foreach($f1matches[0] as $f1line){
        if(!in_array($f1line,$f2matches[0])){
            echo "missing <font color=red>$f1line</font> from file 2.<br>";
        }
    }
}
showdiff('file1.txt','file2.txt');
?>
</pre>

output:
missing 0003 R ALSTON BRANDY N 310 OBERLIN RD 326-0369 from file 1.

The spacing is different between the two files, so multiple spaces have been reduced to 1 space, and the last column, which also differs between the two files, is ignored. The comparison is irrespective of location in the file: it is a success if a given 2-line record in file1 is located anywhere in file2 and vice versa. The last entry in both files was excluded from my testing since it included only line 1 and not line 2, making a comparison of both not possible.

homer.favenir Posted November 20, 2008

Did you compare those files per line, as an array? It doesn't work. It should trace where the missing line is. Please try this code on the attached txt file.

[attachment deleted by admin]

ddrudik Posted November 20, 2008

They were compared between the two files as two-line records (name etc. on line 1 and address etc. on line 2). preg_match_all put them into an array per file, and then I compared them between the two arrays.

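A minimal sketch of the two-line-record comparison described above, written with array_chunk() and array_diff() instead of the regular expressions in the posted code. The helper names normalizeLine() and toRecords() are hypothetical, the sample lines are shortened from the first post, and the sketch assumes every record is exactly two lines long.

```php
<?php
// Hypothetical helper: drop the trailing level marker (1/2 or V1/V2)
// and collapse runs of spaces so spacing differences are ignored.
function normalizeLine(string $line): string
{
    $line = preg_replace('/\s+\S+\s*$/', '', $line);
    return trim(preg_replace('/ +/', ' ', $line));
}

// Hypothetical helper: join consecutive line pairs into one record string.
function toRecords(array $lines): array
{
    $records = [];
    foreach (array_chunk(array_map('normalizeLine', $lines), 2) as $pair) {
        if (count($pair) === 2) { // ignore a trailing record with only one line
            $records[] = $pair[0] . ' | ' . $pair[1];
        }
    }
    return $records;
}

$file1 = [
    '0003 B ALRENCO INC 1',
    ' 1059 MEISTER RD LORAIN 324-4130 2',
    '0003 R ALSTON JIMMIE 1',
    ' 222 16TH ST ELYRIA 323-7196 2',
];
$file2 = [
    '0003 B ALRENCO INC V1',
    ' 1059 MEISTER RD LORAIN 324-4130 V2',
    '0003 R ALSTON BRANDY N V1',
    ' 310 OBERLIN RD 326-0369 V2',
    '0003 R ALSTON JIMMIE V1',
    ' 222 16TH ST ELYRIA 323-7196 V2',
];

// Records present in file 2 but absent from file 1, regardless of position.
$missingFromFile1 = array_values(array_diff(toRecords($file2), toRecords($file1)));
print_r($missingFromFile1);
```

Because the comparison is set-based, it reports which records are missing but not where in the file the gap occurs.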
homer.favenir Posted November 20, 2008

OK. What I did is take a substring of each line to get the fields, e.g. name for level 1 and address etc. for level 2, and compare each field from file1 to file2, so the errors are reported per field.

The problem is that if there is a missing line in file1, the matching of every later line is misplaced. That is what happened to 0003 R ALSTON JIMMIE of file1: it was compared to 0003 R ALSTON BRANDY N of file2. Instead it should skip 0003 R ALSTON BRANDY N, because 0003 R ALSTON JIMMIE comes right after 0003 R ALSTON BRANDY N in file2. What happens now is a mismatched comparison of each line because of the missing line.

Hope I explained it well. Please advise. Thanks for the help, hope I didn't bother you.

ddrudik Posted November 20, 2008

Please show what specific output you expect when comparing the two file samples as shown in your original question.

homer.favenir Posted November 20, 2008

Hi, here is the screenshot of my program.

[attachment deleted by admin]

ddrudik Posted November 20, 2008

To do that level of detail comparison, each line would need to be broken out into fields by a different method than what I used. Your file1 uses \r\n as line separators while file2 only uses \n, so that was throwing off my code. My code compares both lines of a record together, so its output is different from your program's.

<pre>
<?php
function showdiff($f1,$f2){
    // drop the last column of every line, then reduce runs of spaces to one space
    $file1=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n)/','$1',file_get_contents($f1)));
    // file 2 uses \n only, so convert it to \r\n before applying the same patterns
    $file2=preg_replace('/ +/',' ',preg_replace('/(.*?)\S+(?=\r\n)/','$1',preg_replace('/\n/',"\r\n",file_get_contents($f2))));
    // split each file into two-line records
    $f1count=preg_match_all('/(?:.*?\r\n){2}/',$file1,$f1matches);
    $f2count=preg_match_all('/(?:.*?\r\n){2}/',$file2,$f2matches);
    // records of file 2 that are not in file 1
    foreach($f2matches[0] as $f2line){
        if(!in_array($f2line,$f1matches[0])){
            echo "missing <font color=red>$f2line</font> from file 1.<br>";
        }
    }
    // records of file 1 that are not in file 2
    foreach($f1matches[0] as $f1line){
        if(!in_array($f1line,$f2matches[0])){
            echo "missing <font color=red>$f1line</font> from file 2.<br>";
        }
    }
}
showdiff('file1.txt','file2.txt');
?>
</pre>

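The mixed \r\n and \n line endings mentioned above can also be handled by splitting both files the same way before any comparison. The following is only a sketch: splitLines() is a hypothetical helper name, and the two sample strings stand in for the contents of the real files.

```php
<?php
// Hypothetical helper: split text into lines whatever the line ending is.
// In PCRE, \R matches \r\n, \n or \r as a single line break.
function splitLines(string $text): array
{
    $lines = preg_split('/\R/', $text);
    // A trailing line break leaves one empty element at the end; drop it.
    if (end($lines) === '') {
        array_pop($lines);
    }
    return $lines;
}

$windowsStyle = "0003 R ALSTON JIMMIE 1\r\n 222 16TH ST ELYRIA 323-7196 2\r\n";
$unixStyle    = "0003 R ALSTON JIMMIE 1\n 222 16TH ST ELYRIA 323-7196 2\n";

// Both inputs yield the same two lines, so later comparisons are not
// affected by which separator a file happens to use.
var_dump(splitLines($windowsStyle) === splitLines($unixStyle));
```

Normalising once at read time means the patterns that follow do not need to know which separator each file uses.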
homer.favenir Posted November 20, 2008

First I need to extract each field from each file, then I must compare them per line. When there is a difference it is echoed and counted as an error. The problem is that when file1 has a missing line, it compares the wrong line to file2. You will notice in my image that the mismatch of lines starts at line 92.

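One possible way to keep the two files aligned after a missing record is to look a few records ahead in file2 whenever the names differ, and to resume the comparison where the file1 name turns up again. This is only a sketch under stated assumptions: alignRecords() and its $lookahead parameter are hypothetical, each record is assumed to be already parsed into an array with 'name' and 'address' keys, and only records missing from file1 are resynchronised.

```php
<?php
// Hypothetical sketch: walk both record lists with separate indexes and
// resynchronise when file 1 ($ke) has skipped records present in file 2 ($qc).
function alignRecords(array $ke, array $qc, int $lookahead = 5): array
{
    $report = [];
    $i = 0;
    $j = 0;
    while ($i < count($ke) && $j < count($qc)) {
        if ($ke[$i]['name'] === $qc[$j]['name']) {
            // Same record on both sides: compare the remaining fields.
            if ($ke[$i]['address'] !== $qc[$j]['address']) {
                $report[] = "address differs for {$ke[$i]['name']}";
            }
            $i++;
            $j++;
            continue;
        }
        // Names differ: look for the file 1 name further down in file 2.
        $found = null;
        for ($k = $j + 1; $k <= $j + $lookahead && $k < count($qc); $k++) {
            if ($qc[$k]['name'] === $ke[$i]['name']) {
                $found = $k;
                break;
            }
        }
        if ($found !== null) {
            // Everything skipped in file 2 is missing from file 1.
            for (; $j < $found; $j++) {
                $report[] = "missing from file 1: {$qc[$j]['name']}";
            }
            continue; // indexes are aligned again
        }
        // Not found ahead: treat it as a keying error on the same record.
        $report[] = "name differs: {$ke[$i]['name']} vs {$qc[$j]['name']}";
        $i++;
        $j++;
    }
    for (; $j < count($qc); $j++) {
        $report[] = "missing from file 1: {$qc[$j]['name']}";
    }
    for (; $i < count($ke); $i++) {
        $report[] = "missing from file 2: {$ke[$i]['name']}";
    }
    return $report;
}

$ke = [
    ['name' => 'ALRENCO INC',   'address' => '1059 MEISTER RD'],
    ['name' => 'ALSTON JIMMIE', 'address' => '222 16TH ST'],
    ['name' => 'ALSTON MYRA P', 'address' => '229 BELL AV'],
];
$qc = [
    ['name' => 'ALRENCO INC',     'address' => '1059 MEISTER RD'],
    ['name' => 'ALSTON BRANDY N', 'address' => '310 OBERLIN RD'],
    ['name' => 'ALSTON JIMMIE',   'address' => '222 16TH ST'],
    ['name' => 'ALSTON MYRA P',   'address' => '229 BELL AV'],
];
print_r(alignRecords($ke, $qc));
```

The lookahead limit is a design choice: a small window keeps a genuinely mis-keyed name from being matched against an unrelated record far down the file. Extra records that exist only in file1 would need a symmetric lookahead, which this sketch leaves out.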
ddrudik Posted November 20, 2008

This is actually a string question and not a regex question. Here's the code required to parse the lines of the files into fields; I will leave it to you to work out the specifics of how you want to compare what. The code and the array output of the example (shown with file1, but file2 is parsed with the same code) should give you a start in the right direction.

<pre>
<?php
function parseline($line){
    // the level marker is read from offset 178 of the fixed-length line
    if(substr($line,178,1)=='2'){
        $result['level']=2;
        $result['indent']=substr($line,0,1);
        $result['address']=substr($line,1,78);
        $result['city']=substr($line,79,30);
        $result['state']=substr($line,110,2);
        // zip code is column 113 in the layout, i.e. 0-based offset 112
        $result['zip']=substr($line,112,5);
        $result['tollfree']=substr($line,123,40);
        $result['areacode']=substr($line,164,3);
        $result['prefix']=substr($line,168,3);
        $result['suffix']=substr($line,172,4);
    } elseif(substr($line,178,1)=='1') {
        $result['level']=1;
        $result['page']=substr($line,0,4);
        $result['type']=substr($line,5,1);
        $result['name']=substr($line,7,78);
    } else {
        // neither a level 1 nor a level 2 line
        return false;
    }
    return array_map('trim',$result);
}
$lines=file('file1.txt');
foreach($lines as $line){
    $fields=parseline($line);
    if($fields){
        echo "<hr>line:<br>$line<br>\$fields ";
        echo print_r($fields,true);
    }
}
?>
</pre>

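Once lines are parsed into named fields, the per-field comparison left open above could look roughly like the sketch below. It is an illustration, not the method used in the thread: the LEVEL2 constant copies four entries of the level 2 layout posted earlier (1-based column, length), and parseFixed(), diffFields() and makeLine() are hypothetical helpers, the last one existing only to build sample lines.

```php
<?php
// Part of the level 2 layout from the thread: field => [1-based column, length].
const LEVEL2 = [
    'address' => [2, 78],
    'city'    => [80, 30],
    'state'   => [111, 2],
    'zip'     => [113, 5],
];

// Hypothetical helper: cut a fixed-width line into trimmed, named fields.
function parseFixed(string $line, array $layout): array
{
    $fields = [];
    foreach ($layout as $name => [$column, $length]) {
        // substr() offsets are 0-based, layout columns are 1-based.
        $fields[$name] = trim((string) substr($line, $column - 1, $length));
    }
    return $fields;
}

// Hypothetical helper: list the fields whose values differ as [file1, file2].
function diffFields(array $a, array $b): array
{
    $diff = [];
    foreach ($a as $name => $value) {
        $other = $b[$name] ?? null;
        if ($value !== $other) {
            $diff[$name] = [$value, $other];
        }
    }
    return $diff;
}

// Hypothetical helper used only to build sample lines for this example.
function makeLine(array $values, array $layout): string
{
    $line = str_repeat(' ', 180);
    foreach ($values as $name => $value) {
        [$column, $length] = $layout[$name];
        $padded = str_pad(substr($value, 0, $length), $length);
        $line = substr_replace($line, $padded, $column - 1, $length);
    }
    return $line;
}

$keLine = makeLine(['address' => '222 16TH ST', 'city' => 'ELYRIA'], LEVEL2);
$qcLine = makeLine(['address' => '222 16TH ST', 'city' => 'LORAIN'], LEVEL2);

// Only the city differs between the two sample lines.
print_r(diffFields(parseFixed($keLine, LEVEL2), parseFixed($qcLine, LEVEL2)));
```

Keeping the layout in one table means a column correction is made in a single place rather than in every substr() call.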
homer.favenir Posted November 21, 2008

Hi, I already have a script that gets all the fields and compares each field from file1 to file2. The problem is when the number of lines in the two files does not match: it then compares a field to the wrong field. In my example, 0003 R ALSTON JIMMIE in file1 was compared to 0003 R ALSTON BRANDY N because file1 has a missing line starting from ALSTON JIMMIE.

<pre>
<?php
$kebase = '//asecasianas2/DS_Keying/STATS/KE-DIR/';
$qcbase = '//asecasianas2/DS_Keying/STATS/QC-DIR/';
$dir = 'sample';
$kedir = $kebase . $dir;
$qcdir = $qcbase . $dir;
$kefiles = scandir($kedir);
$qcfiles = scandir($qcdir);
$count_files = count($qcfiles);
echo $count_files . " " . "files found" . " " . "in" . " " . $dir . " " . "directory" . "<br>";

// start at 2 to skip the "." and ".." entries; stop before $count_files
for($i = 2; $i < $count_files; $i++)
{
    echo $qcfiles[$i] . "<br>";
    $file1 = file($kedir . "/" . $kefiles[$i]);
    $file2 = file($qcdir . "/" . $qcfiles[$i]);
    $file1_count = count($file1);
    $file2_count = count($file2);
    $count = 1;

    // line $x of file1 is always compared with line $x of file2
    for($x = 0; $x < $file2_count; $x++)
    {
        $vtag1 = substr($file1[$x],178,1);
        $vtag2 = substr($file2[$x],178,1);
        $page1 = substr($file1[$x],0,4);
        $page2 = substr($file2[$x],0,4);
        $name1 = substr($file1[$x],7,78);
        $name2 = substr($file2[$x],7,78);
        $addr1 = substr($file1[$x], 1, 78);
        $addr2 = substr($file2[$x], 1, 78);
        $city1 = substr($file1[$x], 79, 30);
        $city2 = substr($file2[$x], 79, 30);
        $state1 = substr($file1[$x], 110, 2);
        $state2 = substr($file2[$x], 110, 2);
        $zc1 = substr($file1[$x], 112, 5);
        $zc2 = substr($file2[$x], 112, 5);
        $tf1 = substr($file1[$x], 123, 40);
        $tf2 = substr($file2[$x], 123, 40);
        $phone1 = substr($file1[$x], 164, 12);
        $phone2 = substr($file2[$x], 164, 12);

        if($vtag2 == 1)
        {
            // level 1 line: compare the name
            $name_comp = strcmp($name1, $name2);
            if($name_comp != 0)
            {
                //echo "line".$count."error name" . "<br>";
                $err_stat = 1;
                $Error = 1;
                $err_line = $count;
                $err_name1 = $name1;
                $err_name2 = $name2;
                echo "Line" . $err_line . " " . "KE Name" . " " . $err_name1 . "<br> ";
                echo " QC Name" . " " . $err_name2 . "<br>";
            }
        }else{
            // level 2 line: compare address, city, state, zip, toll free text and phone
            $addr_comp = strcmp($addr1, $addr2);
            $city_comp = strcmp($city1, $city2);
            $state_comp = strcmp($state1, $state2);
            $zc_comp = strcmp($zc1, $zc2);
            $tf_comp = strcmp($tf1, $tf2);
            $phone_comp = strcmp($phone1, $phone2);
            if($addr_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_addr1 = $addr1;
                $err_addr2 = $addr2;
                echo "Line" . $err_line . " " . "KE address" . " " . $err_addr1 . "<br>";
                echo " QC address" . " " . $err_addr2 . "<br>";
            }
            if($city_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_city1 = $city1;
                $err_city2 = $city2;
                echo "Line" . $err_line . " " . "KE City" . " " . $err_city1 . "<br>";
                echo " QC City" . " " . $err_city2 . "<br>";
            }
            if($state_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_state1 = $state1;
                $err_state2 = $state2;
                echo "Line" . $err_line . " " . "KE State" . " " . $err_state1 . "<br>";
                echo " QC State" . " " . $err_state2 . "<br>";
            }
            if($zc_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_zc1 = $zc1;
                $err_zc2 = $zc2;
                echo "Line" . $err_line . " " . "KE ZipCode" . " " . $err_zc1 . "<br>";
                echo " QC ZipCode" . " " . $err_zc2 . "<br>";
            }
            if($tf_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_tf1 = $tf1;
                $err_tf2 = $tf2;
                echo "Line" . $err_line . " " . "KE Toll Free Text" . " " . $err_tf1 . "<br>";
                echo " QC Toll Free Text" . " " . $err_tf2 . "<br>";
            }
            if($phone_comp != 0)
            {
                $Error = 1;
                $err_line = $count;
                $err_phone1 = $phone1;
                $err_phone2 = $phone2;
                echo "Line" . $err_line . " " . "KE phone" . " " . $err_phone1 . "<br>";
                echo " QC phone" . " " . $err_phone2 . "<br>";
            }
        }
        $count++;
    }
}
?>
</pre>

I think I can use in_array() to search for each field of file1 in file2? Can anyone please help? Thanks in advance.