philipsbe Posted March 16, 2011

I need to process a large CSV file (40 MB, 300,000 rows). My existing code worked fine with smaller files but cannot cope with large ones. All I need to do is read a particular column from the file, count the total number of rows, and sum all the values in that column.

My existing code imports the whole CSV file into an array (using a class), then a foreach loop copies the required column's values into a second array. Once the data is in that array I can simply sum or count it. While this served me well for smaller files, the same approach fails on a large CSV file. I have already increased the memory allocated to PHP and raised max_execution_time, but the script just keeps running.

I am no PHP expert and usually get by with trial and error... your thoughts and help will be greatly appreciated.

Existing code (the class used is freely available, known as 'parsecsv': http://code.google.com/p/parsecsv-for-php/), run after the class has parsed the CSV file:

<?php
ini_set('max_execution_time', 3000);

$init = array(); // dummy array that will hold the column values
foreach ($csv->data as $key => $col) { // $csv->data is the array of rows produced by the class
    $sal = $col['SALARY'];   // the column to process from the CSV
    array_push($init, $sal); // push the value into the dummy array
}

$total_rows = count($init); // total number of values
echo "Total # of rows: " . $total_rows . "\n";
echo "Total Sum: " . array_sum($init) . "\n";
?>
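[For reference, the two limits mentioned above are runtime PHP settings. A minimal sketch of raising both at the top of a script; the values shown are illustrative examples, not recommendations:

<?php
// Raise the per-script memory ceiling and the execution time limit.
// Loading a 40 MB CSV into nested arrays can easily need several
// times the file size in memory, so even a generous limit may not help.
ini_set('memory_limit', '512M');
ini_set('max_execution_time', 3000);
?>

Even with higher limits, the array-of-rows approach keeps every row in memory at once, which is why it stops scaling as the file grows.]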
johnny86 Posted March 16, 2011

You could get better performance by reading only one line into memory at a time. Something like:

<?php
$handle = fopen('file.csv', 'r');
$salaryIndex = 0; // placeholder: set this to the index of the SALARY column
$totalSalary = 0;
$totalRows = 0;
while (($lineData = fgetcsv($handle)) !== FALSE) {
    if ($lineData[0] !== NULL) { // fgetcsv() returns array(NULL) for a blank line
        $totalRows++;
        $totalSalary += $lineData[$salaryIndex];
    }
}
fclose($handle);
?>

That way you don't hold a huge array in memory, which I think is what makes your script so slow. Another thing that could help is to split the file into several smaller files and read those one at a time to save memory.
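[As a rough illustration of the splitting idea mentioned above, here is a minimal sketch that breaks a large CSV into chunk files of a fixed number of lines; the file names and chunk size are arbitrary assumptions:

<?php
$chunkSize = 50000;            // lines per chunk (arbitrary)
$in = fopen('file.csv', 'r');  // assumed input file name
$header = fgets($in);          // keep the header row
$lineCount = 0;
$chunkNum = 0;
$out = null;

while (($line = fgets($in)) !== false) {
    if ($lineCount % $chunkSize === 0) {
        if ($out !== null) {
            fclose($out); // close the previous chunk
        }
        $chunkNum++;
        $out = fopen("file_part{$chunkNum}.csv", 'w');
        fwrite($out, $header); // repeat the header in every chunk
    }
    fwrite($out, $line);
    $lineCount++;
}

if ($out !== null) {
    fclose($out);
}
fclose($in);
?>

One caveat: fgets() is only safe here if no CSV field contains embedded newlines; otherwise a split could land in the middle of a record.]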
philipsbe Posted March 16, 2011 (Author)

Thank you, Johnny. Your code was handy, but since my team's needs are growing, I ended up loading the CSV data into a MySQL table using 'LOAD DATA INFILE'. I was amazed that the whole file loaded in under 10 seconds. Now we will manipulate the data in that table using SQL, which should be fairly quick. Thanks again for sharing your thoughts.
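[For readers who want to reproduce that route, here is a minimal sketch using PDO. The table name (salaries), column name (salary), file path, and connection details are all assumptions, and LOAD DATA LOCAL INFILE must be enabled on both the client and the server:

<?php
// Assumed connection details and schema; adjust to your setup.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass', array(
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
));

// Bulk-load the CSV. IGNORE 1 LINES skips the header row.
$pdo->exec("
    LOAD DATA LOCAL INFILE '/path/to/file.csv'
    INTO TABLE salaries
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
");

// The row count and the column sum then collapse into a single query.
$row = $pdo->query("SELECT COUNT(*) AS total_rows, SUM(salary) AS total_sum FROM salaries")
           ->fetch(PDO::FETCH_ASSOC);
echo "Total # of rows: {$row['total_rows']}\n";
echo "Total Sum: {$row['total_sum']}\n";
?>

This also explains the speed the poster saw: LOAD DATA INFILE is MySQL's bulk-import path and avoids per-row round trips entirely.]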