homer.favenir Posted June 3, 2008

Hi all. How can I compare two text files and extract the difference? For example, txt1 - txt2 = txt3, and if txt1 = txt2 then txt3 = 0. txt1 is the needle and txt2 is the haystack. My script is:

    <?php
    $currFile = file("Portfolio extracted/sent/UFPB Catalog_5-30-08.txt");
    $prevFile = file("Portfolio extracted/sent/UFPB Catalog_5-29-08.txt");
    ?>
    <?php
    function find_dup($str, $arr)
    {
        $key = array_search($str, $arr);
        echo $arr[$key] . "<br>";
    }
    ?>
    <html>
    <head>
    <title>compare</title>
    </head>
    <body>
    <table border = 1>
    <th>current file</th>
    <?php foreach($currFile as $line) {?>
    <tr>
    <td>
    <?php if(!find_dup($line, $prevFile)); ?>
    </td>
    </tr>
    <?php } ?>
    </table>
    </body>
    </html>

Please advise. Thanks.

samshel Posted June 3, 2008

Try array_diff() after converting the files into arrays and sorting them.

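A minimal sketch of what array_diff() does, using made-up in-memory lines rather than the real catalog files (the sample values are assumptions for illustration only). It suggests that sorting beforehand is not strictly needed, since array_diff() compares by value regardless of position:

```php
<?php
// Hypothetical sample data standing in for the lines of the two files.
$current  = ["apple", "banana", "cherry", "date"];
$previous = ["date", "banana", "apple"];

// array_diff() returns the values of the first array that do not occur in
// the second one. Position does not matter, and the keys of $current are kept.
$onlyInCurrent = array_diff($current, $previous);

// Renumber from 0 if the original keys are not wanted.
$reindexed = array_values($onlyInCurrent);

print_r($onlyInCurrent);
```

If both arrays hold the same lines, the result is an empty array, which matches the "txt3 = 0" case in the question.
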
kbh43dz_u Posted June 3, 2008

I haven't tested it, but give it a try:

    <?php
    $txt1 = file("../file1.txt");
    $txt2 = file("../file2.txt");
    for ($i = 0; $i < count($txt1); $i++) {
        // isset() guards against $txt2 being shorter than $txt1
        if (isset($txt2[$i]) && $txt1[$i] === $txt2[$i]) {
            // lines are the same
            echo $txt1[$i] . " == " . $txt2[$i];
        } else {
            // lines differ, or $txt2 has no such line
            echo "lines not the same.";
            break;
        }
    }
    ?>

But be aware: just one extra line at the top of one file shifts every following line, and everything is reported as invalid.

samshel Posted June 3, 2008

That's the reason I suggested array_diff(): it copes even if the array elements are not in the same order.

freeloader Posted June 3, 2008

Quoting kbh43dz_u: "But be aware: Just one newline on the top and everything is invalid."

Exactly. To avoid that:

    $txt1 = file("../file1.txt", FILE_IGNORE_NEW_LINES);
    $txt2 = file("../file2.txt", FILE_IGNORE_NEW_LINES);

kbh43dz_u Posted June 3, 2008

In reply to freeloader's FILE_IGNORE_NEW_LINES suggestion:

OK, to be more precise: just add a new line with content (a single space, say, not an empty line) at the top, and nothing will match anymore.

samshel Posted June 3, 2008

Don't you agree that array_diff() is a shorter, more accurate, and less troublesome option?

kbh43dz_u Posted June 3, 2008

It depends on the usage. If you want to abort your script as soon as a wrong line is detected, my version will perform better. With array_diff() you also have to put the file contents into arrays first, so both files have to be read completely before anything is compared. With a loop you can handle mismatches right away and decide how to continue after every line. So I don't think array_diff() is better in that case.

If you want all the differences in an array at the end (which is what array_diff() is made for), array_diff() will be the better choice.

But you are right, I think homer.favenir wants to extract the difference.

samshel Posted June 3, 2008

    $txt1 = file("../file1.txt");
    $txt2 = file("../file2.txt");

The code above reads the complete files into arrays anyway. Sorry if I got it wrong, but I think if you need to compare two files, you need to read them completely anyway.

kbh43dz_u Posted June 3, 2008

Quoting samshel: "above code is anyways reading the complete files and putting in array. sorry if i got it wrong, but i think if you need to compare 2 files, you need to read them completely anyways."

Yes, you are right, it puts the contents into arrays. But instead of file() you could also use fopen(), fread(), etc. and read the files line by line.

But you are right, I think homer.favenir wants to extract the difference.

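A rough sketch of the line-by-line idea, assuming fgets() is used to pull one line at a time from each file so the comparison can stop at the first mismatch. The function name first_mismatch() is hypothetical and not from the thread:

```php
<?php
/**
 * Hypothetical helper: returns the 1-based number of the first line at which
 * the two files differ, or null if they are identical line for line.
 */
function first_mismatch($pathA, $pathB)
{
    $a = fopen($pathA, 'r');
    $b = fopen($pathB, 'r');
    if ($a === false || $b === false) {
        throw new RuntimeException('Could not open one of the files.');
    }

    $lineNo = 0;
    $result = null;
    while (true) {
        $lineA = fgets($a);   // returns false at end of file
        $lineB = fgets($b);
        $lineNo++;

        if ($lineA === false && $lineB === false) {
            break;            // both files ended together: no difference
        }
        if ($lineA !== $lineB) {
            $result = $lineNo; // different content, or one file is shorter
            break;
        }
    }

    fclose($a);
    fclose($b);
    return $result;
}
```

This only holds two lines in memory at a time, but it is a positional comparison: one inserted line near the top still makes everything after it look different.
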
samshel Posted June 3, 2008

To compare two files completely we would need to read them completely anyway. Besides, the fread() approach needs the data to be in the same sequence in both files, whereas the array approach still works if the data is scattered and out of sequence. But as you said, it depends on the use.

homer.favenir Posted June 3, 2008 (Author)

Thanks guys! I only need to extract the lines that are not in the other text file; that's why I need to compare them. My text file has 62,200 lines. If line 1 is searched against all 62,200 lines, and then line 2 is searched against them again, and so on, it will take a long time or hang.

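On the worry about rescanning 62,200 lines for every line: a repeated array_search() does scan the whole array each time. One hand-written alternative, offered only as a sketch and not as what array_diff() does internally, is to turn the previous file's lines into array keys so each check is a single keyed lookup. The function name lines_not_in() is hypothetical:

```php
<?php
/**
 * Hypothetical helper: returns the lines of $current that do not occur in
 * $previous. The lines of $previous become array keys, so each membership
 * check is a keyed lookup instead of a scan of the whole array.
 */
function lines_not_in(array $current, array $previous)
{
    $seen = array_flip($previous);   // line text => its index in $previous
    $result = [];
    foreach ($current as $line) {
        if (!isset($seen[$line])) {
            $result[] = $line;       // not present in the previous file
        }
    }
    return $result;
}
```

It would be called with the arrays returned by file(), for example lines_not_in($currFile, $prevFile).
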
kbh43dz_u Posted June 3, 2008

Sure, in that case use array_diff(), of course. I thought you wanted to compare line by line (line 1 with line 1, line 2 with line 2, ...).

homer.favenir Posted June 3, 2008 (Author)

It works: I compared the files and it extracted the lines that differ. But I have one more question. It also extracts the last line even though it is not different from the source. I also need the result to be saved to a text file.

    <?php
    $currFile = file("Portfolio extracted/sent/UFPB Catalog_5-28-08.txt");
    $prevFile = file("Portfolio extracted/sent/UFPB Catalog_5-27-08.txt");
    ?>
    <html>
    <body>
    <table border=1>
    <tr>
    <td><?php print_r(array_diff($currFile, $prevFile)); ?>
    </td>
    </tr>
    </table>
    </body>
    </html>

Please advise. Thanks.

samshel Posted June 3, 2008

The last line is probably showing as different because of a newline character. You can write the result to a new text file using fwrite() (check the manual) and, while writing, discard differences that are only newline characters.

freeloader Posted June 3, 2008

As I said previously, use this:

    $currFile = file("Portfolio extracted/sent/UFPB Catalog_5-28-08.txt", FILE_IGNORE_NEW_LINES);
    $prevFile = file("Portfolio extracted/sent/UFPB Catalog_5-27-08.txt", FILE_IGNORE_NEW_LINES);

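Putting the two suggestions together, a possible sketch that reads both files with FILE_IGNORE_NEW_LINES, takes the difference, and saves it to a text file. It uses file_put_contents() in place of the fwrite() mentioned above, and the function name write_new_lines() and the output path argument are assumptions for illustration:

```php
<?php
/**
 * Hypothetical helper: writes every line of $currPath that is missing from
 * $prevPath to $outPath and returns the number of lines written.
 */
function write_new_lines($currPath, $prevPath, $outPath)
{
    // FILE_IGNORE_NEW_LINES strips the line ending from each element, so a
    // last line without a trailing newline compares equal to the same line
    // followed by one.
    $curr = file($currPath, FILE_IGNORE_NEW_LINES);
    $prev = file($prevPath, FILE_IGNORE_NEW_LINES);
    if ($curr === false || $prev === false) {
        throw new RuntimeException('Could not read one of the files.');
    }

    $diff = array_diff($curr, $prev);

    // One line per entry; an empty diff produces an empty file.
    $text = $diff ? implode("\n", $diff) . "\n" : '';
    file_put_contents($outPath, $text);

    return count($diff);
}
```

With the newline stripped from every element, the final line of one file should no longer be reported just because the other file ends with a line break.
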