kristo5747 Posted March 9, 2011

Hello,

I have no experience dealing with large files, so I am not sure what to do about this. I am trying to read several large files using file_get_contents(); the task is to clean and munge them using preg_replace().

My code runs fine on small files; however, the large files (40 MB) trigger a memory exhausted error:

PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 41390283 bytes)

I was thinking of using fread() instead, but I am not sure that will work either. Is there a workaround for this problem?

Thanks for your input.

Al.
Psycho Posted March 9, 2011

Can you provide some information on the operations you need to perform on the data? fread() might be an option, depending on how you need to manipulate the data.

If this is YOUR server, you can increase the allowed memory limit in the php.ini file.
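As a rough illustration of the fread() route: if an operation only ever touches one character at a time (deleting carriage returns, for example), the file can be copied in fixed-size chunks and no chunk boundary can split a match. The sketch below assumes exactly that kind of operation; the function name stripCarriageReturns and the 8192-byte chunk size are made up for the example.

```php
<?php
// Hypothetical helper (not from the thread): copies $inPath to $outPath in
// fixed-size chunks and deletes every carriage return ("\r") on the way.
// Safe only because the search string is a single byte, so a match can
// never straddle two chunks.
function stripCarriageReturns(string $inPath, string $outPath, int $chunkSize = 8192): void
{
    $in  = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file.');
    }

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left to read
        }
        fwrite($out, str_replace("\r", '', $chunk));
    }

    fclose($in);
    fclose($out);
}
```

Memory use stays around one chunk regardless of file size. Patterns that span several characters or lines cannot be handled this way without extra care at the chunk boundaries.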
kristo5747 Posted March 9, 2011

Quoting Psycho: "Can you provide some information on the operations you need to perform on the data? fread() might be an option depending on how you need to manipulate the data."

I run preg_replace() and str_replace() operations on the file and save the modified data into a new file. Again, this works on small files.

Quoting Psycho: "If this is YOUR server, you can increase the allowed memory limit in the php.ini file"

Not my server, sadly.
kristo5747 Posted March 9, 2011

Quoting Psycho: "Can you provide some information on the operations you need to perform on the data?"

This is my code:

    <?php
    error_reporting(E_ALL);

    ##get find() results and remove DOS carriage returns.
    ##The error is thrown on the next line for large files!
    $myData = file_get_contents("tmp11");
    $newData = str_replace("^M", "", $myData);

    ##cleanup Model-Manufacturer field.
    $pattern = '/(Model-Manufacturer:)(\n)(\w+)/i';
    $replacement = '$1$3';
    $newData = preg_replace($pattern, $replacement, $newData);

    ##cleanup Test_Version field and create comma delimited layout.
    $pattern = '/(Test_Version=)(\d).(\d).(\d)(\n+)/';
    $replacement = '$1$2.$3.$4 ';
    $newData = preg_replace($pattern, $replacement, $newData);

    ##cleanup occasional empty Model-Manufacturer field.
    $pattern = '/(Test_Version=)(\d).(\d).(\d) (Test_Version=)/';
    $replacement = '$1$2.$3.$4 Model-Manufacturer:N/A--$5';
    $newData = preg_replace($pattern, $replacement, $newData);

    ##fix occasional Model-Manufacturer being incorrectly wrapped.
    $newData = str_replace("--","\n",$newData);

    ##fix 'Binary file' message when find() utility cannot id file.
    $pattern = '/(Binary file).*/';
    $replacement = '';
    $newData = preg_replace($pattern, $replacement, $newData);

    $newData = removeEmptyLines($newData);

    ##replace colon with equal sign
    $newData = str_replace("Model-Manufacturer:","Model-Manufacturer=",$newData);

    ##file stuff
    $fh2 = fopen("tmp2","w");
    fwrite($fh2, $newData);
    fclose($fh2);

    ### Functions.

    ##Data cleanup
    function removeEmptyLines($string)
    {
        return preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);
    }
    ?>
jasonrichardsmith Posted March 9, 2011

If you are on a Linux server, sed is your friend.

http://www.grymoire.com/Unix/Sed.html
Psycho Posted March 10, 2011

Looking at your patterns, it appears that they need to find newline characters, so you wouldn't be able to read one line of data at a time and process it that way.

One option would be to create a "buffer". Start by reading five or more lines of data and process that data. Then remove the first line from the buffer, write it to a new file, and read a new line from the input file to add to the buffer. Process the buffer again, and repeat until you have read all the data from the input file.
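The buffer idea could be sketched roughly as follows. This is only an outline under some assumptions: the function name processWithLineBuffer, its parameters, and the default window of five lines are invented for illustration, and the transform passed in is assumed to be safe to run more than once on the same text, because each line is processed several times while it sits in the buffer.

```php
<?php
// Hypothetical helper (names are illustrative, not from the thread).
// Keeps a small window of lines in memory so that patterns spanning a
// line break can still match, then writes out the oldest line.
// Assumption: $transform gives the same result when re-applied to text it
// has already processed.
function processWithLineBuffer(string $inPath, string $outPath, callable $transform, int $window = 5): void
{
    $in  = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file.');
    }

    $buffer = '';
    while (($line = fgets($in)) !== false) {
        $buffer .= $line;
        if (substr_count($buffer, "\n") < $window) {
            continue; // keep filling until the window holds enough lines
        }

        $buffer = $transform($buffer);

        // Write out the first line and drop it from the buffer.
        $pos = strpos($buffer, "\n");
        if ($pos !== false) {
            fwrite($out, substr($buffer, 0, $pos + 1));
            $buffer = substr($buffer, $pos + 1);
        }
    }

    // Process and write whatever is left once the input is exhausted.
    fwrite($out, $transform($buffer));

    fclose($in);
    fclose($out);
}

// Possible usage with the file names from the posted script:
// processWithLineBuffer('tmp11', 'tmp2', function ($text) {
//     return preg_replace('/(Model-Manufacturer:)(\n)(\w+)/i', '$1$3', $text);
// });
```

One caveat: a greedy pattern such as (\n+) can match at the end of the buffer before the rest of the run of newlines has been read, so the output may differ from a whole-file run. A larger window reduces the chance of that but does not rule it out.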