jackg Posted February 18, 2008

I need to sort and de-duplicate the terms in a very large file. It's too big for an array sort: I can't load a file of, say, 50,000 to 80,000 terms into an array. How can I approach doing this?

Thanks, Jackg
Louisiana
effigy Posted February 18, 2008

Have you tried Unix's sort -u?
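For reference, a minimal sketch of how that suggestion could be driven from a PHP script on a Unix-like host; the filenames here are only placeholders:

```php
<?php
// Hypothetical filenames; point these at your real data.
$input  = 'terms.txt';
$output = 'terms_sorted.txt';

// sort -u sorts the file and drops duplicate lines, entirely
// outside PHP's memory, so file size is no longer a problem.
$cmd = sprintf('sort -u -o %s %s',
               escapeshellarg($output), escapeshellarg($input));
exec($cmd, $unused, $status);

if ($status !== 0) {
    die("sort exited with status $status\n");
}
?>
```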
jackg (Author) Posted February 18, 2008

Yes, sort -u worked on Unix, but now do I have to re-write all my code for a Windows box? Damn. I can't get access to Windows commands -- not that I want them!

Thanks, Jackg
effigy Posted February 18, 2008

Perl and File::Sort perhaps? If not, there are Unix utilities available for Windows: Cygwin.
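If neither Perl nor Cygwin is available on the Windows box, the same job can be done in plain PHP with a classic external merge sort: sort the file in memory-sized chunks, write each sorted chunk to a temp file, then merge the chunks while skipping duplicates. A rough sketch, not from this thread; it assumes one term per line, and the chunk size and filenames are arbitrary:

```php
<?php
// External merge sort plus de-dup in plain PHP.
// Assumes one term per line; chunk size and filenames are examples.

function write_sorted_chunk(array $chunk)
{
    sort($chunk, SORT_STRING);                 // compare as strings, not numbers
    $tmp = tempnam(sys_get_temp_dir(), 'srt');
    file_put_contents($tmp, implode("\n", $chunk) . "\n");
    return $tmp;
}

function external_sort_unique($inFile, $outFile, $chunkLines = 10000)
{
    // Pass 1: sort the input in chunks small enough for memory.
    $in = fopen($inFile, 'r');
    $chunkFiles = array();
    $chunk = array();
    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue;                          // skip blank lines
        }
        $chunk[] = $line;
        if (count($chunk) >= $chunkLines) {
            $chunkFiles[] = write_sorted_chunk($chunk);
            $chunk = array();
        }
    }
    fclose($in);
    if ($chunk) {
        $chunkFiles[] = write_sorted_chunk($chunk);
    }

    // Pass 2: k-way merge of the sorted chunks, dropping duplicates.
    $handles = array();
    $heads = array();                          // current line from each chunk
    foreach ($chunkFiles as $i => $f) {
        $handles[$i] = fopen($f, 'r');
        $heads[$i] = rtrim(fgets($handles[$i]), "\r\n");
    }

    $out = fopen($outFile, 'w');
    $last = null;
    while ($heads) {
        // Pick the chunk whose current line sorts first.
        $minKey = null;
        foreach ($heads as $i => $h) {
            if ($minKey === null || strcmp($h, $heads[$minKey]) < 0) {
                $minKey = $i;
            }
        }
        if ($heads[$minKey] !== $last) {       // de-dup on the way out
            fwrite($out, $heads[$minKey] . "\n");
            $last = $heads[$minKey];
        }
        // Advance the chunk we just consumed; retire it at EOF.
        $next = fgets($handles[$minKey]);
        if ($next === false) {
            fclose($handles[$minKey]);
            unlink($chunkFiles[$minKey]);
            unset($heads[$minKey]);
        } else {
            $heads[$minKey] = rtrim($next, "\r\n");
        }
    }
    fclose($out);
}

// Example usage with hypothetical filenames:
external_sort_unique('terms.txt', 'terms_sorted.txt');
?>
```

With only 50,000 to 80,000 short terms, a single chunk may well fit under the memory limit, in which case pass 1 produces one sorted file and pass 2 just streams it out, de-duplicating as it goes.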
Archived
This topic is now archived and is closed to further replies.