phdphd Posted June 27, 2013

Hi all,

I am facing a big challenge right now. My website lets people search for events by a variety of criteria, one of them being the event date, chosen from a tabular calendar covering the next 30 days. It works very well, but I noticed that, from the user's point of view, building the calendar takes too much time when there are hundreds of events in the DB.

After retrieving the data (event IDs, start dates and end dates) from the events table, my PHP code performs a series of operations, the first of which builds a global array containing, for each event, two subarrays: one for the date range and one for the corresponding days of the week.

To get a better idea of where my code was particularly slow, I ran just the portion that builds the global array. It turns out that building the global array accounts for 95% of the whole processing time. Every operation before it (including data retrieval) and after it (including processing the created array and displaying the calendar) runs quickly.

Below is the PHP code that builds the global array and measures time and memory usage. The three arrays $dates_event['id_event'], $dates_event['date_b'] and $dates_event['date_f'] hold the event IDs, start dates and end dates retrieved from the DB. The test spans more than 8,000 events and shows the following results:

Array ( [memory] => 87.805824279785 [microtime] => 19.967195987701 )
Number of values in the final global array : 537404

As you can see, memory consumption and, above all, time are pretty high. Does anyone have an idea how I could significantly reduce both figures? Thank you for your help!

```php
$time = microtime(TRUE);
$mem = memory_get_usage();
$global_array = array();
$today = date('Y-m-d');
$todayonemonth = strtotime(date("Y-m-d", strtotime($today)) . " +1 month");
$todayonemonth = date("Y-m-d", $todayonemonth);
$step = '+1 day';
$format1 = 'd/m/Y';
$format2 = 'w';
foreach ($dates_event['id_event'] as $k => $v) {
    if ($dates_event['date_b'][$k] < $today) {
        $first = $today;
    } else {
        $first = $dates_event['date_b'][$k];
    }
    if ($dates_event['date_f'][$k] <= $todayonemonth) {
        $last = $dates_event['date_f'][$k];
    } else {
        $last = $todayonemonth;
    }
    $current = strtotime($first);
    $last = strtotime($last);
    $global_array[$v]['dates'] = array();
    $global_array[$v]['days'] = array();
    while ($current <= $last) {
        $global_array[$v]['dates'][] = date($format1, $current);
        $global_array[$v]['days'][]  = date($format2, $current);
        $current = strtotime($step, $current);
    }
}
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
echo '<br />Number of values in the final global array :' . count($global_array, COUNT_RECURSIVE);
```
ginerjm Posted June 27, 2013

You could place some code in your processing logic to measure the time each step takes, output it, and see where you need to rethink your code. Look up microtime() in the PHP manual; there are some good examples there of how to use it.
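A minimal sketch of that idea (the stage labels and usleep() workloads here are placeholders standing in for the poster's actual steps, not the original code):

```php
<?php
// Record how long each stage of the page takes with microtime(true).
$timings = array();

$start = microtime(true);
// ... stage 1: e.g. query the events table ...
usleep(10000); // placeholder workload (10 ms)
$timings['query'] = microtime(true) - $start;

$start = microtime(true);
// ... stage 2: e.g. build the calendar array ...
usleep(10000); // placeholder workload (10 ms)
$timings['build array'] = microtime(true) - $start;

// Report each stage's duration in seconds.
foreach ($timings as $stage => $seconds) {
    printf("%s: %.4f s\n", $stage, $seconds);
}
```

Wrapping each suspect step this way quickly shows which one dominates the total runtime.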
phdphd Posted June 27, 2013 (Author)

I am 99% convinced that the combination of foreach and while loops and/or the use of multidimensional arrays is responsible for the time elapsed. The problem is that I do not know of any alternative to these that would produce the same results.
AbraCadaver Posted June 27, 2013

What do you query: just 1 month, the entire year, all events in the DB?
mac_gyver Posted June 27, 2013

It seems like you are retrieving ALL the rows from your database, then looping through all of them? Your query should only match and return rows for events that span today through one month from today, i.e. the event start date or end date falls between today and one month from today, or the event spans the entire range (today is between the start date and the end date).

You might be able to make your existing code more efficient by having less code/testing. Assuming your query returns only the rows you are interested in, if you simply loop from date_b to date_f, adding one day inside the loop, you get entries for all the days of that event. You could then either ignore the dates outside today through one month from today (which you are probably doing anyway later in your code when you use the data), or remove the out-of-range date entries with a function of your own, applied via array_map() or array_walk() to operate on the entire array at once.
phdphd Posted June 27, 2013 (Author)

I query rows whose end date is greater than or equal to today (to ignore finished events) and whose start date is lower than today + 30 (to ignore events that start more than 30 days from today):

event_end_date >= CURDATE() AND event_start_date < DATE_ADD(CURDATE(), INTERVAL 1 MONTH)

This way I have all (but only) the IDs of events occurring during at least one day from today to today + 30. However, I think the issue is not the number of IDs retrieved. Here is some code you can play with that creates a multidimensional array of more or less the same size as the one mentioned in my first message, but with different results in terms of processing time and memory consumption:

```php
$limit = 250000;
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();
// we create a huge array
$arr = array();
for ($i = 0; $i <= $limit; $i++) {
    $arr['a'][] = 'blablabla';
    $arr['b'][] = 'blablabla';
}
echo '250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)<br />';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
echo '<br />COUNT_RECURSIVE :' . count($arr, COUNT_RECURSIVE);
```

Results:

Array ( [memory] => 51.591873168945 [microtime] => 0.54914712905884 ) COUNT_RECURSIVE : 500004

Here is another example, identical except that the second subarray gains an extra dimension:

```php
$limit = 250000;
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();
// we create a huge array
$arr = array();
for ($i = 0; $i <= $limit; $i++) {
    $arr['a'][]   = 'blablabla';
    $arr['b'][][] = 'blablabla';
}
echo '250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)<br />';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
echo '<br />COUNT_RECURSIVE :' . count($arr, COUNT_RECURSIVE);
```

Results:

Array ( [memory] => 91.647163391113 [microtime] => 0.49137902259827 ) COUNT_RECURSIVE : 750005

As you can see, the mere fact of adding or removing subarrays in the for loop changes the results. And if you place the for loop inside another for loop, you get "Fatal error: Allowed memory size of 134217728 bytes exhausted". This is why I think the issue is linked to the loop nesting and/or to the dimensions of the arrays.
mac_gyver Posted June 27, 2013

I was able to reproduce your 19+ second runtime, using 8000 random events, only when the date strings were missing leading zeros and the comparison therefore matched essentially all dates. With correctly formatted dates in my random events, I typically got a 2 second runtime for 8000 events, which changed by only a fraction of a second between 8000 random events and filtering out events that didn't overlap or span the today-to-one-month date range. What does the code producing the $dates_event arrays look like, or what is a sample of the data in your $dates_event array?
phdphd Posted June 27, 2013 (Author)

The code producing the $dates_event arrays looks as follows:

```php
$ids = array();
$date_b = array();
$date_f = array();
while ($tab = mysql_fetch_assoc($rs)) {
    array_push($ids, $tab['ids']);
    array_push($date_b, $tab['event_start_date']);
    array_push($date_f, $tab['event_end_date']);
}
$dates_event = array(
    'id_event' => $ids,
    'date_b'   => $date_b,
    'date_f'   => $date_f,
);
```

And here is a sample of the data in the $dates_event array:

```
Array
(
    [id_event] => Array ( [0] => 7 [1] => 10 [2] => 11 [3] => 12 [4] => 14 [5] => 15 ... [8280] => 8605 [8281] => 8606 )
    [date_b]   => Array ( [0] => 2013-02-13 [1] => 2012-12-09 [2] => 2012-12-09 [3] => 2012-12-09 [4] => 2012-12-11 [5] => 2012-12-13 ... [8280] => 2013-06-26 [8281] => 2013-06-26 )
    [date_f]   => Array ( [0] => 2014-01-02 [1] => 2013-12-31 [2] => 2013-12-31 [3] => 2013-12-31 [4] => 2014-01-12 [5] => 2013-12-31 ... [8280] => 2013-09-24 [8281] => 2013-09-24 )
)
```

Thanks again.
mac_gyver Posted June 27, 2013

As a continuation of my reply above: I generated 8000 in-range random events, and your posted code takes around 12 seconds to run on them. So if you mean that you have 8000 matching events in a 30-day range (which your post above confirms), I would say this is the base amount of time it is going to take to expand and fill in the dates between the start/end values (you might be able to reduce it fractionally). But who wants to see or sift through 8000 items at one time? If each event lasted only one day (many do not) and they were equally distributed, that would be 266 events listed on each day (and for your data, there could be many more than 266 listed on any one day).

It just occurred to me that you can probably use the range() function, or a user-written function called via array_map(), to avoid the looping entirely when filling in the dates between the start/end values.
Solution: mac_gyver Posted June 27, 2013

Version without the while() loop (just the right amount of code, and it runs very fast):

```php
$time = microtime(TRUE);
$mem = memory_get_usage();
$global_array = array();
$day = 60 * 60 * 24;
$today = date('Y-m-d');
$todayonemonth = date("Y-m-d", strtotime('+1 month'));

function date_fm1($val) { return date('d/m/Y', $val); }
function date_fm2($val) { return date('w', $val); }

foreach ($dates_event['id_event'] as $k => $v) {
    $first = $dates_event['date_b'][$k];
    if ($first < $today) {
        $first = $today;
    }
    $last = $dates_event['date_f'][$k];
    if ($last > $todayonemonth) {
        $last = $todayonemonth;
    }
    $list = range(strtotime($first), strtotime($last), $day);
    $global_array[$v]['dates'] = array_map('date_fm1', $list);
    $global_array[$v]['days']  = array_map('date_fm2', $list);
}
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
echo '<br />Number of values in the final global array :' . count($global_array, COUNT_RECURSIVE);
```
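One caveat worth noting about stepping by a fixed 86400 seconds: in timezones with DST, a 23- or 25-hour day can make the timestamps drift off midnight. A sketch of an alternative using DatePeriod (PHP 5.3+), which steps by calendar day instead; the start and end dates here are illustrative, not from the thread's data:

```php
<?php
// Build one event's 'dates' and 'days' lists by stepping one calendar day
// at a time. DatePeriod excludes its end date, so extend it by one day.
$first  = new DateTime('2013-06-27'); // sample start date
$last   = new DateTime('2013-07-02'); // sample end date
$period = new DatePeriod($first, new DateInterval('P1D'), $last->modify('+1 day'));

$dates = array();
$days  = array();
foreach ($period as $d) {
    $dates[] = $d->format('d/m/Y'); // same 'd/m/Y' format as the thread's code
    $days[]  = $d->format('w');     // '0' (Sunday) through '6' (Saturday)
}
print_r($dates); // 27/06/2013 through 02/07/2013, six dates in all
```

For the date ranges in this thread the range()-based solution works fine; this variant simply sidesteps the DST edge case by never doing timestamp arithmetic.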
phdphd Posted June 27, 2013 (Author)

Thank you very much mac_gyver!