
Find duplicate lines within a file?


dodgerfan


I need code that will scan a fairly large text file (10MB+) and either display or export all of the duplicate lines. I know how to remove the dupes by loading the file into an array and running array_unique on it, but I want to see which lines are duplicated rather than just remove them.

 

Any thoughts on how I might go about doing this?

https://forums.phpfreaks.com/topic/128864-find-duplicate-lines-within-a-file/

xylex,

 

Just to be clear, how would that work with this code...

 

<?php
// Load file into Array
$list = file('file.txt');

// Remove duplicates
$list = array_unique($list);

// Write back to file
file_put_contents('uniques.txt', implode('', $list));
?>

 

Thanks in advance.

Untested

<?php

// Load file into Array
$original = file('file.txt');

// Remove duplicates
$uniques = array_unique($original);
$removed = array_diff_key($orginal, $uniques);

// Write back to file
file_put_contents('uniques.txt', $uniques);
file_put_contents('removed.txt', $removed);

?>

 

Sorry, can't spell.

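<?php

// Load the file into an array, one element per line (line endings kept)
$original = file('file.txt');

// array_unique() keeps the first occurrence of each line and preserves keys
$uniques = array_unique($original);
// Keys missing from $uniques belong to the repeated occurrences
$removed = array_diff_key($original, $uniques);

// Write both lists out; file_put_contents() accepts an array and joins it
file_put_contents('uniques.txt', $uniques);
file_put_contents('removed.txt', $removed);

?>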
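
One thing to watch with the array_diff_key version: removed.txt gets every repeated occurrence, so a line that appears three times in file.txt is written twice. If what you want is each duplicated line listed once, with a count, something along these lines might work. It is only a sketch: findDuplicateLines is a made-up helper name, it assumes PHP 7.4+ (for the arrow function), and it strips line endings so that "foo\n" and "foo\r\n" count as the same line, which may or may not be what you want.

```php
<?php

/**
 * Hypothetical helper (not from this thread): returns every line that
 * occurs more than once in $path, mapped to how many times it occurs.
 */
function findDuplicateLines(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $counts = [];
    // Read one line at a time instead of loading the file with file()
    while (($line = fgets($handle)) !== false) {
        // Assumption: lines that differ only in their line ending are equal
        $line = rtrim($line, "\r\n");
        $counts[$line] = ($counts[$line] ?? 0) + 1;
    }
    fclose($handle);

    // Keep only lines seen more than once. Note that PHP turns keys such
    // as "123" into integers, so purely numeric lines come back as int keys.
    return array_filter($counts, fn (int $n): bool => $n > 1);
}
```

To export the result you could loop over the returned array and write one "count, tab, line" entry per duplicate to a file, or just print_r() it to have a look. The counts array still holds one entry per distinct line, so memory use depends on how many different lines the file has, not only on its size.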

 

 

