Jump to content

Any idea of why so much time to generate date ranges ?


phdphd
Go to solution Solved by mac_gyver,

Recommended Posts

Hi All,
I am facing a big challenge right now. My website will let people look for events from a variety of criteria, one of them being the date of the events among the next 30 coming days from a tabular-looking calendar.

It works very well but I noticed that from the user's point of view it takes too much time to build the calendar if there are hundreds of events in the DB.
After retrieving the data (ID events, start and end dates) from the events table, my PHP code does a series of operations, the first one being building a global array containing -for each event- two subarrays : one for the date range and another one for the corresponding days of week.
To get a better idea of which points in my code the processing was particularly slow at, I ran just the code portion that builds the global array.
It turns out that the building of the global array accounts for 95% of the whole processing time.
Every operation before (including retrieval of the data) or after (including processing the created array, and displaying the calendar) is done quickly.
Below is the PHP code that builds the global array and tests time and memory usage. The 3 arrays $dates_event['id_event'], $dates_event['date_b'] and $dates_event['date_f'] store the event ids, event start dates and event end dates retreived from the DB.  The testing spans more than 8.000 events and shows the following results :
Array ( [memory] => 87.805824279785 [microtime] => 19.967195987701 )
Number of values in the final global array : 537404
As you can see, memory consumption and -above all- time are pretty high. Does anyone have an idea about how I could significantly reduce both figures ? Thank you for your help !

$time = microtime(TRUE);
$mem = memory_get_usage();

$global_array = array();
$today=date('Y-m-d');
$todayonemonth = strtotime(date("Y-m-d", strtotime($today)) . " +1 month");
$todayonemonth = date ("Y-m-d", $todayonemonth);
$step = '+1 day';
 $format1 = 'd/m/Y';
$format2 = 'w';
foreach ($dates_event['id_event'] as $k=>$v)
	{

	if ($dates_event['date_b'][$k]<$today)
		{
		$first=$today;
		}
	else
		{
		$first=$dates_event['date_b'][$k];
		}


	if ($dates_event['date_f'][$k]<=$todayonemonth)
		{
		$last=$dates_event['date_f'][$k];
		}
	else
		{
		$last=$todayonemonth;
		}

	$current = strtotime($first);
	$last = strtotime($last);
	$global_array[$v]['dates'] = array();
	$global_array[$v]['days'] = array();

	while($current <= $last) 
		{
		$global_array[$v]['dates'][]= date($format1, $current);
		$global_array[$v]['days'][] = date($format2, $current);
		$current = strtotime($step, $current);
		}
																				
	}

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));

echo '<br />Number of values in the final global array :' .count($global_array, COUNT_RECURSIVE);

 

Link to comment
Share on other sites

You could place some code in your processing logic to measure the time that each step takes and output that and see where you need to re-think your code.  Look up microtime in the php manual - some good examples there of how to use it.

Link to comment
Share on other sites

I am 99% convinced that the combination of foreach+while loops and/or the use of mutidimensional arrays are responsible for the time elapsed. The problem is I do not know about any other alternative to using these in order to obtain the same results.

Link to comment
Share on other sites

it seems like you are retrieving ALL the rows from your database, then looping through all of them? your query should only match and return rows for events that span today through one month from today, i.e. the event start date or the event end date is between today and one month from today or the event totally spans the date range ( today is between the start date and the end date.)

 


 

you might be able to make your existing code more efficient by having less code/testing. assuming your query returns only the rows you are interested in, if you simply loop from the date_b to date_f, adding one day inside the loop, this would give you entries for all the days of that event. you could then either ignore the dates outside of today through one month from today (which you are probably doing anyway later in your code when you use the data) or you could remove the date entries outside the date range using a function you write using array_map or array_walk to operate on the entire array at once.

Link to comment
Share on other sites

I query lines where end dates are greater than or equal to today (to ignore finished events) and where start dates are lower than today + 30 (to ignore events that start more than 30 days from today)

event_end_date >= CURDATE() and event_start_date<(DATE_ADD(CURDATE(), INTERVAL 1 MONTH))

This way I have all (but only) event Ids for events occurring during at least one day from today to today+30.

 

However I think the issue is not the number of IDs retreived.

 

Here is a code you can play with, that leads to the creation of an multidimensional array of more o less the same size as the one mentioned in my first message, but with different results in terms of processing time and memory consumption.  

$limit = 250000;

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

 
// we create a huge array
$arr = array();

for($i=0; $i<=$limit; $i++) {
    $arr['a'][]= 'blablabla';
$arr['b'][]= 'blablabla';
		
		}
 
 
echo '250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)<br />'; 
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));

echo '<br />COUNT_RECURSIVE :' .count($arr, COUNT_RECURSIVE);

Results :

 

250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)
Array ( [memory] => 51.591873168945 [microtime] => 0.54914712905884 )
COUNT_RECURSIVE :500004

 

Here is another example

$limit = 250000;

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

 
// we create a huge array
$arr = array();

for($i=0; $i<=$limit; $i++) {
    $arr['a'][]= 'blablabla';
    $arr['b'][][]= 'blablabla';
		}
 
 
echo '250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)<br />'; 
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));

echo '<br />COUNT_RECURSIVE :' .count($arr, COUNT_RECURSIVE);

Results :

 

250000 loop results (adapted from http://gonzalo123.com/2011/08/15/checking-the-performance-reading-arrays-with-php/)
Array ( [memory] => 91.647163391113 [microtime] => 0.49137902259827 )
COUNT_RECURSIVE :750005

 

As you can see that the mere fact of adding/removing subarrays In the for loop will change the results.

 

And if you place the for loop in another for loop, you will get a Fatal error: Allowed memory size of 134217728 bytes exhausted message.

 

This is why I think the issue is linked to the loops nesting and/or to the dimension of the arrays.

Link to comment
Share on other sites

i was able to produce your 19+ second runtime, using 8000 random events, only when the date strings were missing leading zeros and the comparison basically matched all dates. with correctly formatted dates in my random events, i typically got a 2 second runtime for 8000 events, which changed by only a fraction of a second between 8000 random events and filtering out events that didn't overlap or span the today to one month from today date range.

 

what does your code producing the $dates_event arrays look like or what is a sample of the data in your $dates_event array?

Link to comment
Share on other sites

The code producing the $dates_event arrays looks as follows :

$ids = array();
$date_b = array();
$date_f = array();
while($tab = mysql_fetch_assoc($rs))
{
array_push($ids, $tab['ids']);
array_push($date_b, $tab['event_start_date']);
array_push($date_f, $tab['event_end_date']);
  	}
$dates_event = array();
$dates_event = array(     
  'id_event' => $ids,
  'date_b' => $date_b,
  'date_f' => $date_f,
 	 );

And here is a sample of the data in the $dates_event array:



Array
(
    [id_event] => Array
        (
            [0] => 7
            [1] => 10
            [2] => 11
            [3] => 12
            [4] => 14
            [5] => 15
....
            [8280] => 8605
            [8281] => 8606
        )

    [date_b] => Array
        (
            [0] => 2013-02-13
            [1] => 2012-12-09
            [2] => 2012-12-09
            [3] => 2012-12-09
            [4] => 2012-12-11
            [5] => 2012-12-13
...
            [8280] => 2013-06-26
            [8281] => 2013-06-26
        )

    [date_f] => Array
        (
            [0] => 2014-01-02
            [1] => 2013-12-31
            [2] => 2013-12-31
            [3] => 2013-12-31
            [4] => 2014-01-12
            [5] => 2013-12-31
            [8280] => 2013-09-24
            [8281] => 2013-09-24
        )

)

Thanks again.

Link to comment
Share on other sites

as a continuation of my above reply, i generated 8000 in range random events and this gives a runtime for your posted code around 12 seconds, so if you mean that you have 8000 matching events in a 30 day range (which your post above confirms), i would say this is the base amount of time it is gong to take to expand and fill the dates between the start/end values (you might be able to fractionally reduce this time), but who wants to see or sift through 8000 items at one time. if each event was only one day (many are not) and they were equally distributed, that would be 266 events listed on each day (or for your data, there could be many more than 266 listed on any one day.)

 

it just occurred to me that you can probably use the range() function or a user written function that gets called using array_map() that will avoid the looping entirely to fill the dates between the start/end values.

Link to comment
Share on other sites

  • Solution

version without the while() loop (just the right amount of code and runs very fast) -

$time = microtime(TRUE);
$mem = memory_get_usage();

$global_array = array();
$day = 60*60*24;
$today=date('Y-m-d');
$todayonemonth = date("Y-m-d", strtotime('+1 month'));

function date_fm1($val){
    return date('d/m/Y', $val);
}

function date_fm2($val){
    return date('w', $val);
}

foreach ($dates_event['id_event'] as $k=>$v){
    $first=$dates_event['date_b'][$k];
    if ($first<$today){
        $first=$today;
    }

    $last=$dates_event['date_f'][$k];
    if ($last>$todayonemonth){
        $last=$todayonemonth;
    }
    $list = range(strtotime($first),strtotime($last),$day);
    $global_array[$v]['dates'] = array_map('date_fm1',$list);
    $global_array[$v]['days'] = array_map('date_fm2', $list);
}

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));

echo '<br />Number of values in the final global array :' .count($global_array, COUNT_RECURSIVE);

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.