lopes_andre Posted December 21, 2010

Hi,

I have a function that strips lines out of files. I'm dealing with large files (more than 100 MB). The PHP memory limit is set to 256 MB, but the function that strips out the lines blows up on a 100 MB CSV file.

What the function must do is this. Originally I have the CSV like:

Copyright (c) 2007 MaxMind LLC. All Rights Reserved.
locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,"O1","","","",0.0000,0.0000,,
2,"AP","","","",35.0000,105.0000,,
3,"EU","","","",47.0000,8.0000,,
4,"AD","","","",42.5000,1.5000,,
5,"AE","","","",24.0000,54.0000,,
6,"AF","","","",33.0000,65.0000,,
7,"AG","","","",17.0500,-61.8000,,
8,"AI","","","",18.2500,-63.1667,,
9,"AL","","","",41.0000,20.0000,,

When I pass the CSV file to this function I get:

locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,"O1","","","",0.0000,0.0000,,
2,"AP","","","",35.0000,105.0000,,
3,"EU","","","",47.0000,8.0000,,
4,"AD","","","",42.5000,1.5000,,
5,"AE","","","",24.0000,54.0000,,
6,"AF","","","",33.0000,65.0000,,
7,"AG","","","",17.0500,-61.8000,,
8,"AI","","","",18.2500,-63.1667,,
9,"AL","","","",41.0000,20.0000,,

It only strips out the first line, nothing more. The problem is the performance of this function with large files: it blows up the memory. The function is:

public function deleteLine($line_no, $csvFileName)
{
    // This function strips a specific line from a file.
    // If a line is stripped it returns TRUE, else FALSE.
    //
    // e.g.
    // deleteLine(-1, 'xyz.csv'); // strip last line
    // deleteLine(1, 'xyz.csv');  // strip first line

    // Assign the file name
    $filename = $csvFileName;
    $strip_return = FALSE;

    $data = file($filename);
    $pipe = fopen($filename, 'w');
    $size = count($data);

    if ($line_no == -1) {
        $skip = $size - 1;
    } else {
        $skip = $line_no - 1;
    }

    for ($line = 0; $line < $size; $line++) {
        if ($line != $skip) {
            fputs($pipe, $data[$line]);
        } else {
            $strip_return = TRUE;
        }
    }

    return $strip_return;
}

Is it possible to refactor this function so that it does not blow past the 256 MB PHP memory limit? Give me some clues.

Best Regards,
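A minimal sketch of where the memory goes (the generated sample file is hypothetical, just large enough to show the effect): file() materialises every line of the file as a PHP string in an array, so peak usage grows with the file size, while fgets() streams one line at a time in near-constant memory.

```php
<?php
// Build a throwaway CSV-like file with 200,000 short lines.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
$fh  = fopen($tmp, 'w');
for ($i = 0; $i < 200000; $i++) {
    fwrite($fh, $i . ',"XX","","","",0.0000,0.0000,,' . "\n");
}
fclose($fh);

// Approach 1: file() loads the whole file into an array at once.
$before   = memory_get_usage();
$lines    = file($tmp);
$withFile = memory_get_usage() - $before;
unset($lines);

// Approach 2: fgets() holds only one line at a time.
$before = memory_get_usage();
$fh     = fopen($tmp, 'r');
$count  = 0;
while (fgets($fh) !== false) {
    $count++;
}
fclose($fh);
$withFgets = memory_get_usage() - $before;

unlink($tmp);
printf("file(): ~%d bytes, fgets(): ~%d bytes, lines read: %d\n",
       $withFile, $withFgets, $count);
```

On a 100 MB input the file() figure scales up to (and past) the 256 MB limit, which is exactly the failure reported in this thread.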
Psycho Posted December 21, 2010

I can give no guarantees since I don't have your files to work with, but I'll offer some suggestions.

First, don't run any code you don't need to. For example, you define the variable $size, but you only use it within an IF condition, so you only need to define it inside that condition. In fact, I wouldn't define it at all. The performance gain here is imperceptible, but it illustrates the logic I would use to approach a problem like this.

The real meat of the function is the for loop that iterates over every record in the array created by file(). But you are only removing one line, correct? So you don't even need a loop: just unset() the one line you don't want and write the array back out. Also, PHP uses zero-based indexes by default, so I would adjust the function to do the same (i.e. pass 0 to remove the first line, 1 to remove the second, etc.). That makes it consistent with how PHP operates and avoids some other possible issues. Give this a try:

public function deleteLine($line_no, $csvFileName)
{
    // This function strips a specific line from a file.
    // If a line is stripped it returns TRUE, else FALSE.
    //
    // e.g.
    // deleteLine(-1, 'xyz.csv'); // strip last line
    // deleteLine(-2, 'xyz.csv'); // strip 2nd-to-last line
    // deleteLine(0, 'xyz.csv');  // strip 1st line
    // deleteLine(1, 'xyz.csv');  // strip 2nd line

    // Define default return value
    $strip_return = false;

    // Parse file into an array (each element keeps its trailing newline)
    $lines = file($csvFileName);

    // Determine the index of the line to be removed
    $skip_index = ($line_no < 0) ? count($lines) + $line_no : $line_no;

    // Check whether that line exists
    if (isset($lines[$skip_index])) {
        // The line exists - remove it
        unset($lines[$skip_index]);
        $strip_return = true;

        // Open the file and write the new contents
        $pipe = fopen($csvFileName, 'w');
        // file() preserves each line's newline, so join with '' rather
        // than PHP_EOL (which would double the line breaks)
        fwrite($pipe, implode('', $lines));
        fclose($pipe);
    }

    return $strip_return;
}
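The negative-index arithmetic in Psycho's deleteLine() can be checked in isolation; a minimal sketch (the helper name is made up for illustration):

```php
<?php
// Map a possibly-negative line number onto a zero-based array index,
// using the same arithmetic as the $skip_index line above.
function lineIndex(int $line_no, int $count): int
{
    return ($line_no < 0) ? $count + $line_no : $line_no;
}

// With 10 lines (indexes 0..9):
var_dump(lineIndex(0, 10));   // first line        -> 0
var_dump(lineIndex(-1, 10));  // last line         -> 9
var_dump(lineIndex(-2, 10));  // second-to-last    -> 8
```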
lopes_andre Posted December 21, 2010 (Author)

Hi, thanks for the reply. Unfortunately the function still blows up on memory:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 34 bytes)

Are there other options?

Best Regards,
PFMaBiSmAd Posted December 21, 2010

Is there some reason you are not simply skipping over the first line at the point where you are actually using the data in the file?
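PFMaBiSmAd's point is that the unwanted first line can simply be skipped when the data is consumed, with no rewrite of the file at all. A minimal sketch of that approach (the sample file is built inline, in the MaxMind layout quoted earlier in the thread):

```php
<?php
// Skip the unwanted first line while reading, instead of rewriting the file.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp,
    "Copyright (c) 2007 MaxMind LLC. All Rights Reserved.\n" .
    "locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode\n" .
    "1,\"O1\",\"\",\"\",\"\",0.0000,0.0000,,\n" .
    "2,\"AP\",\"\",\"\",\"\",35.0000,105.0000,,\n");

$rows = [];
$fh = fopen($tmp, 'r');
fgets($fh);                     // throw away the copyright line
$header = fgetcsv($fh);         // column names
while (($row = fgetcsv($fh)) !== false) {
    $rows[] = array_combine($header, $row);
}
fclose($fh);
unlink($tmp);

echo $rows[0]['country'], "\n"; // O1
echo count($rows), "\n";        // 2
```

Since every row still streams through one at a time, this stays within a constant memory footprint no matter how large the CSV is.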
lopes_andre Posted December 22, 2010 (Author)

It is solved! What made the function blow up was the use of file(). Instead of file(), which pulls the entire contents into memory, the refactored function reads the file line by line and writes the kept lines to a temporary file:

public function deleteLine($line_no, $csvFileName)
{
    // This function strips a specific line from a file.
    // If a line is stripped it returns TRUE, else FALSE.
    //
    // e.g.
    // deleteLine(1, 'xyz.csv'); // strip first line

    $tmpFileName = tempnam(".", "csv");
    $strip_return = FALSE;

    $readFD  = fopen($csvFileName, 'r');
    $writeFD = fopen($tmpFileName, 'w');
    // TODO: check for fopen errors

    // With a streaming read the line count is not known up front, so the
    // original $line_no == -1 shortcut ("strip last line") cannot be
    // supported here without a first pass over the file.
    $skip = $line_no - 1;

    $line = 0;
    while (($buffer = fgets($readFD)) !== false) {
        if ($line != $skip) {
            fputs($writeFD, $buffer);
        } else {
            $strip_return = TRUE;
        }
        $line++;
    }

    // Close both file handles
    fclose($readFD);   // original file
    fclose($writeFD);  // temporary file

    // Replace the original file with the temporary one
    unlink($csvFileName);
    rename($tmpFileName, $csvFileName);

    return $strip_return;
}

So here is a way to edit large files using PHP.
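The streaming version above can be exercised end to end. A minimal sketch, with deleteLine rewritten as a plain function since the surrounding class is not shown in the thread:

```php
<?php
// Standalone version of the streaming deleteLine() posted above.
function deleteLine(int $line_no, string $csvFileName): bool
{
    $tmpFileName  = tempnam(sys_get_temp_dir(), "csv");
    $strip_return = false;

    $readFD  = fopen($csvFileName, 'r');
    $writeFD = fopen($tmpFileName, 'w');
    $skip    = $line_no - 1;

    $line = 0;
    while (($buffer = fgets($readFD)) !== false) {
        if ($line != $skip) {
            fputs($writeFD, $buffer);   // keep this line
        } else {
            $strip_return = true;       // drop this line
        }
        $line++;
    }
    fclose($readFD);
    fclose($writeFD);

    unlink($csvFileName);
    rename($tmpFileName, $csvFileName);
    return $strip_return;
}

// Usage: strip the copyright line from a small sample file.
$csv = tempnam(sys_get_temp_dir(), "csv");
file_put_contents($csv,
    "Copyright (c) 2007 MaxMind LLC. All Rights Reserved.\n" .
    "locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode\n" .
    "1,\"O1\",\"\",\"\",\"\",0.0000,0.0000,,\n");

$ok    = deleteLine(1, $csv);                      // strip the first line
$first = strtok(file_get_contents($csv), "\n");    // new first line
unlink($csv);

var_dump($ok);     // bool(true)
echo $first, "\n"; // locId,country,region,...
```

One caveat worth noting: unlink() followed by rename() is not atomic, so a reader hitting the file between the two calls could see it missing; rename() alone overwrites the destination on the same filesystem and avoids that window.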