efsnick Posted April 25, 2010

Hello all, I need to take certain rows from an 800 MB log file and insert them into a database. The log file is set up as follows:

This is a file
1234 Here it is
1435 $sql=Select * from table;
1436 hello world
1437 $sql=Select * from table;
1438 your mom
1439 your dad
1440 $sql=Select * from anywhere;
end of file;

I basically only need the rows with the $sql statement, ID and all. Here is the code I came up with; it works on a demo log file, but that file is only about 10 MB.

function logparse($logfile) {
    $handle  = file($logfile);
    $count   = count($handle);
    $new_arr = array();

    for ($i = 0; $i < $count; $i++) {
        $numbers   = substr($handle[$i], 0, 5);      // leading line ID
        $statement = strrchr($handle[$i], '$');      // everything from the $sql onwards
        $new_arr[$numbers] = $statement;
    }

    $str_arr = array_filter($new_arr);               // drop the lines with no $sql
    print_r($str_arr);
}

I know that file() will eat up a lot of memory, but this is a once-a-month type job. Should I be OK? I don't have the insert-to-db statement in here, but basically I am just putting it in one record at a time. Is there a better way?
teamatomic Posted April 25, 2010

800 MB is a kinda large file, I would think. You might be best to work on it daily: a cron job collecting the lines you want into a file, then appending the daily data onto a monthly log and clearing the daily log. Actually, handling large log files is better done by a shell script, as you don't have to worry about the script timing out, and I think it would be a bit faster.

HTH
teamatomic
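[Editor's note: a rough PHP-CLI sketch of the daily-collection idea teamatomic describes, for a script run from cron. The file paths and the '$sql=' match string are placeholders, not details from the thread.]

$daily   = '/var/log/app/daily.log';        // hypothetical daily log
$monthly = '/var/log/app/monthly_sql.log';  // hypothetical monthly collection file

$in  = fopen($daily, 'r');
$out = fopen($monthly, 'a');

while (($line = fgets($in)) !== false) {
    if (strpos($line, '$sql=') !== false) {  // keep only the interesting lines
        fwrite($out, $line);
    }
}

fclose($in);
fclose($out);
file_put_contents($daily, '');               // clear the daily log, as suggested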
cags Posted April 25, 2010

Rather than loading the file into an array with file() (which loads it all into memory) and looping through the array, open the file with fopen and read it one line at a time with fgets, parse that line, then move on. With regard to the script timing out, you can use set_time_limit to stop that happening.
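[Editor's note: a minimal sketch of the line-at-a-time approach cags describes, so only one line is ever held in memory. The log path and the '$sql=' match are assumptions for illustration.]

set_time_limit(0);                                // as suggested, stop the script timing out

$handle  = fopen('/path/to/query.log', 'r');      // hypothetical path
$results = array();

while (($line = fgets($handle)) !== false) {
    if (strpos($line, '$sql=') !== false) {
        $id        = substr($line, 0, 5);          // leading line ID, as in the original code
        $statement = strrchr($line, '$');          // the $sql statement itself
        $results[$id] = $statement;
        // ...or insert into the database here, one record at a time
    }
}

fclose($handle);
print_r($results);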
efsnick Posted April 25, 2010

Thanks for your replies. I am going to see what I can do with fgets and fopen.
ChemicalBliss Posted April 25, 2010

Hmm, what I would do is have the file split into smaller chunks. You can do this with PHP quite nicely. Then have a timer in the code: if the elapsed time is near the max execution time, reload the current script and continue where it left off. The only reason I never extend max_execution_time for PHP is that I like all my scripts to work on standard free web hosting (as compatible as possible).

Though if this is for some professional project, you would usually use a shell script on a *nix system, as teamatomic stated (much, much faster, with less load on the server processes).

-cb-
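[Editor's note: a rough sketch of the timer-and-resume idea ChemicalBliss mentions, not code from the thread. The script remembers the byte offset it reached and picks up from there on the next run; the 25-second budget and the state-file path are arbitrary assumptions.]

$logfile   = '/path/to/query.log';        // hypothetical path
$statefile = '/tmp/logparse.offset';      // remembers how far we got last time
$budget    = 25;                          // seconds, safely under a 30s max_execution_time
$start     = time();

$handle = fopen($logfile, 'r');
if (is_file($statefile)) {
    fseek($handle, (int) file_get_contents($statefile));   // resume from the last offset
}

while (($line = fgets($handle)) !== false) {
    if (strpos($line, '$sql=') !== false) {
        // process the line (parse it, insert it into the database, etc.)
    }
    if (time() - $start >= $budget) {                       // near the limit: checkpoint and stop
        file_put_contents($statefile, ftell($handle));
        exit;                                               // cron (or a reload) starts the next chunk
    }
}

if (is_file($statefile)) {
    unlink($statefile);                                     // finished the whole file
}
fclose($handle);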
salathe Posted April 26, 2010

An 800 MB file is not much of an issue so long as the script only tries to load small pieces of it at a time (e.g. a line); I've done similar processing of large log files before for analytics. Just to be clear, from the small sample given in the first post, you would only want the three lines (1435, 1437, 1440) which contain $sql?

P.S. Since it might be useful to know, are you familiar with working with objects / object-oriented programming?
efsnick Posted April 26, 2010

Salathe, you are correct. Just the lines with $sql in it. Also, I am very familiar with OOP.
efsnick Posted April 26, 2010

Just an update: the log file is actually set up like the following; I was mistaken about the $sql.

1645 Connect [email protected]
62541 Query set commit=1
24520 Query select * from table
24520 Quit
5819 Connect [email protected]
61558 Query set commit=1
1236 Query select * from table where

I only need the lines that have a select * statement.

thanks,
Nick
salathe Posted April 26, 2010

If you haven't already, I would urge taking a good look at the Standard PHP Library, in particular SplFileObject and FilterIterator. Using the features available there, this kind of task is often much simpler than the "old" way of fopen/fgets/fclose and complex conditions in loops.

For example, as a guide only (i.e. it may not be quite right for your needs), something like the following could iterate over each line of a file that matches a certain filter (in this case, the line contains "select * ") and do something useful with only those lines.

class SqlLogFilter extends FilterIterator
{
    public function accept()
    {
        // Keep only the lines containing "select * "
        return strpos($this->getInnerIterator()->current(), 'select * ') !== FALSE;
    }
}

$filter = new SqlLogFilter(new SplFileObject($logfile));

foreach ($filter as $line) {
    sscanf($line, "%d Query %[^\n]", $number, $statement);
    // Do whatever with $number and $statement
}

P.S. The format string for sscanf might be unclear; see the user notes under sscanf. You're of course completely free to use a different approach to getting at the number and statement!
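[Editor's note: since the original goal was inserting each matched row into a database one record at a time, the loop body above might end up looking something like this. The PDO DSN, credentials, and the sql_log(line_id, statement) table are purely illustrative assumptions, not details from the thread.]

$pdo  = new PDO('mysql:host=localhost;dbname=logs', 'user', 'pass');   // hypothetical connection
$stmt = $pdo->prepare('INSERT INTO sql_log (line_id, statement) VALUES (?, ?)');

foreach ($filter as $line) {
    if (sscanf($line, "%d Query %[^\n]", $number, $statement) === 2) {  // both values parsed
        $stmt->execute(array($number, $statement));                     // one record at a time
    }
}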
efsnick Posted April 26, 2010

Thanks for the direction Salathe, I will see what I can do with it.
efsnick Posted April 26, 2010

Salathe, thanks for adding FilterIterator and sscanf to my tool set; I don't know how I ever lived without them. Works perfectly, thanks!
salathe Posted April 26, 2010

You're very welcome. They're both super-useful tools that I make use of all the time.

P.S. Welcome to the site, I hope you can stick around. Feel free to introduce yourself.