Jiberish Posted April 12, 2007

I have a text file with millions of rows, and I want to collect the first word from each row. The first thing that came to mind was an array, but that would allocate so much memory that it isn't a practical solution. The reason I find an array interesting is that it would be extremely simple to compare it with another array at a later stage and find the words that exist in only one of the files. I have to go through the file sequentially in some way, but I'm not sure how. Do I use fread() or something like that? I suppose I could read all the words into a database and take it from there. Any other solutions? Thanks in advance!

https://forums.phpfreaks.com/topic/46768-process-large-textfile-using-php/
Glyde Posted April 12, 2007

Whether the words come from a database or a text file, they are still going to take up a lot of memory if you truly have millions of lines. What many people don't seem to understand is that a database is still a file, just as a text file is; the difference is that, instead of just reading through the file, a database lets you run queries against it. Although PHP will be able to work with an array that large, I can't guarantee your system won't slow down a little during the process. The most logical way I can think of doing it would be:

<?php
// Read the whole file into an array, one element per line, without trailing newlines.
$fileLines = file("textfile.txt", FILE_IGNORE_NEW_LINES);
foreach ($fileLines as $lineNum => $lineText) {
    // Keep only the text before the first space.
    $fileLines[$lineNum] = current(explode(" ", $lineText));
}
?>
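On the original question about going through the file sequentially: a minimal sketch, assuming the goal is to avoid holding every full line in memory at once. It reads one line at a time with fgets() instead of loading the whole file. The function name firstWords() is hypothetical and not from the thread. The distinct words themselves still have to fit in memory, so this only helps when the lines are much longer than their first word or when many first words repeat.

```php
<?php
// Hypothetical helper (not from the thread): collects the first word of each
// line while holding only one line of the file in memory at a time.
function firstWords(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $words = [];
    // fgets() returns one line per call and false at end of file.
    while (($line = fgets($handle)) !== false) {
        // strtok() skips leading whitespace and returns the text up to the
        // next whitespace character; it returns false for blank lines.
        $word = strtok($line, " \t\r\n");
        if ($word !== false) {
            $words[$word] = true; // store as a key so duplicates collapse
        }
    }
    fclose($handle);

    return $words;
}
```

Calling firstWords("textfile.txt") would return an array whose keys are the distinct first words. Note that PHP converts purely numeric keys such as "123" to integers, which may matter if the words are later compared strictly as strings.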
Jiberish Posted April 13, 2007

Thanks, I will work with it some and see what I can get working.
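For the later comparison step mentioned in the original question (finding the words that exist in only one of the two files), here is a rough sketch, assuming each file's first words have already been collected into a plain array of strings. The function name wordsInOnlyOne() and the sample data are made up for illustration. It flips the arrays so that the words become keys, which lets array_diff_key() do hash-based lookups rather than comparing every value against every other value.

```php
<?php
// Hypothetical helper: returns the words that appear in exactly one of the
// two lists. Assumes both arrays contain only strings (or integers), since
// array_flip() cannot use other types as keys.
function wordsInOnlyOne(array $a, array $b): array
{
    // Turn values into keys; duplicate words collapse to a single key.
    $setA = array_flip($a);
    $setB = array_flip($b);

    // Keys present in one set but missing from the other.
    $onlyA = array_keys(array_diff_key($setA, $setB));
    $onlyB = array_keys(array_diff_key($setB, $setA));

    return array_merge($onlyA, $onlyB);
}

// Made-up sample data standing in for the first words of two files.
$fileOne = ['apple', 'banana', 'cherry'];
$fileTwo = ['banana', 'cherry', 'date'];
print_r(wordsInOnlyOne($fileOne, $fileTwo)); // contains 'apple' and 'date'
```

With millions of distinct words, both sets still need to fit within PHP's memory_limit; if they don't, loading the words into database tables, as suggested in the first post, is a reasonable alternative.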