DanSgt Posted November 12, 2013

Hello,

I am having a little trouble with a web server log file and wondered if anyone could give any advice. The code below opens the April log file and explodes each line into an array ($text_line_array). I can echo any element of the array, for example the bandwidth at index 5, which prints the 1000 bandwidth entries as the loop runs, but after that I can't figure out how to add these values together. I can't seem to get the syntax right for keeping a total across the loop.

A snippet of the log file is below:

103.239.234.105 -- [2007-04-01 00:42:21] "GET articles/learn_PHP_basics HTTP/1.0" 200 12729 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
51.4.190.113 -- [2007-04-01 02:07:04] "GET articles/php_classes_and_oop HTTP/1.0" 200 7674 "MSIE 7.0"
216.134.52.171 -- [2007-04-01 02:49:25] "GET articles/learn_PHP_basics HTTP/1.0" 200 12729 "MSIE 7.0"
97.212.128.181 -- [2007-04-01 03:31:46] "GET articles/using_regex_with_php HTTP/1.0" 200 12127 "Mozilla/4.0"
49.174.77.138 -- [2007-04-01 04:14:07] "GET about/contact.php HTTP/1.0" 200 7554 "Mozilla/4.0"
219.218.151.127 -- [2007-04-01 04:56:28] "GET reference/mysql_crib_sheet HTTP/1.0" 200 11109 "MSIE 7.0"
209.168.87.74 -- [2007-04-01 05:38:49] "GET articles/mysql_load_balancing HTTP/1.0" 200 3189 "MSIE 7.0"
79.214.145.94 -- [2007-04-01 06:21:11] "GET articles/mysql_load_balancing HTTP/1.0" 200 3189 "MSIE 7.0"
177.158.203.244 -- [2007-04-01 07:03:32] "GET docs/regex_crib_sheet HTTP/1.0" 200 12439 "Mozilla/4"

This is what I have so far:

    $handle = fopen('logs/april.log', 'r');
    while (!feof($handle)) {
        $text_line = fgets($handle, 1024);
        $notNeeded = array(' --','[',']','GET ',' HTTP/1.0');
        $text_line = str_replace($notNeeded,NULL,$text_line);
        $text_line_array = explode(' ',$text_line);
        $ipAddress = $text_line_array[0];
        $timestamp1 = $text_line_array[1];
        $timestamp2 = $text_line_array[2];
        $filename = $text_line_array[3];
        $statusCode = $text_line_array[4];
        $bandwidth = $text_line_array[5];
        $userAgent = $text_line_array[6];
    }

Do you have any ideas? I am trying to write a summary which displays: the total number of requests, the total number of requests from the articles directory, the total bandwidth consumed, and finally the number of 404 errors and the pages they occurred on.
AaronClifford Posted November 12, 2013 (edited)

I'm 100% sure that someone can do it better than this, but if it helps you on your way then all good.

    <?php
    $bandwidthItems = array();
    $articleItems = array();
    $fourZeroFourItems = array();
    $articleCount = 0;
    $fourZeroFourCount = 0;

    // Read The Log, One Entry Per Line, Skipping Blank Lines
    $data = file("logs.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

    // Get Total Number Of Rows
    $total = count($data);

    // Data Not Needed
    $notNeeded = array(' --', '[', ']', 'GET ', ' HTTP/1.0');

    foreach ($data as $item) {
        // Remove Unwanted Values, Then Split Up Data
        $item = str_replace($notNeeded, '', $item);
        $splitData = explode(" ", $item);

        // Build Bandwidth Array
        $bandwidthItems[] = $splitData[5];

        // Build Article Count
        if (strpos($splitData[3], "articles/") !== false) {
            $articleCount++;
            $articleItems[] = $splitData[3];
        }

        // Build 404 Pages (Store The Requested Page, Not The Status Code)
        if ($splitData[4] === "404") {
            $fourZeroFourCount++;
            $fourZeroFourItems[] = $splitData[3];
        }
    }

    // Output Data
    echo $total;                      // Total Number Of Requests
    print_r($bandwidthItems);         // All Bandwidth Values
    echo array_sum($bandwidthItems);  // Total Bandwidth
    echo $articleCount;               // Total Number Of Article Requests
    print_r($articleItems);           // Output All Article Items
    echo $fourZeroFourCount;          // Total Number Of 404 Errors
    print_r($fourZeroFourItems);      // Output All 404 Pages
    ?>

Edited November 12, 2013 by AaronClifford
dalecosp Posted November 12, 2013

That's a great start, AaronClifford! Kudos to you for helping.

The only observation I would make is that server log files are often very, very large, and you can create a very large memory structure and load your box's RAM if you keep all that data in an array. Unless it's critical to have all the bandwidth values in an array, I'd just create a var to hold the total and add the value for each line to said var on each iteration of the loop. One var will not take up nearly as much RAM (and array_sum might use a heckuva lotta CPU, too, on a big array).
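A minimal sketch of that running-total idea, assuming every entry follows the format in the snippet from the first post. The names `summarizeLog` and `readLogLines` and the regular expression are illustrative assumptions, not code from this thread:

```php
<?php
// Hypothetical helper: summarise log lines using running totals only.
// Assumed entry format (taken from the snippet in the first post):
//   IP -- [YYYY-MM-DD HH:MM:SS] "GET path HTTP/1.0" status bytes "user agent"
function summarizeLog(iterable $lines): array
{
    $summary = [
        'requests'        => 0,
        'articleRequests' => 0,
        'bandwidth'       => 0,
        'notFoundCount'   => 0,
        'notFoundPages'   => [], // page => number of 404 hits
    ];
    $pattern = '/^(\S+) -- \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+) "([^"]*)"/';

    foreach ($lines as $line) {
        if (!preg_match($pattern, trim($line), $m)) {
            continue; // skip blank or malformed lines
        }
        $path   = $m[4];
        $status = $m[5];
        $bytes  = (int) $m[6];

        $summary['requests']++;
        $summary['bandwidth'] += $bytes; // one integer instead of an array of values

        if (strpos($path, 'articles/') === 0) {
            $summary['articleRequests']++;
        }
        if ($status === '404') {
            $summary['notFoundCount']++;
            $summary['notFoundPages'][$path] = ($summary['notFoundPages'][$path] ?? 0) + 1;
        }
    }
    return $summary;
}

// Hypothetical generator: yields one line at a time, so the whole file
// is never held in memory.
function readLogLines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield $line;
        }
    } finally {
        fclose($handle);
    }
}
```

With the file from the first post this would be called as `$summary = summarizeLog(readLogLines('logs/april.log'));` followed by `print_r($summary);`. Memory use stays roughly flat however long the log is, apart from the list of distinct 404 pages.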
AaronClifford Posted November 13, 2013

Yeah, that makes sense actually. I'll have a little rewrite this evening.