Jump to content

How to use DATE_SUB() to perform count()'s over multiple datetime periods


jayarsee

Recommended Posts

I'm not very experienced with MySQL's date functions, so I've got what I think is a very incorrect/verbose and overcomplicated solution to my problem and I was wondering if someone with more experience in this area could at least point me in a better direction, if it exists.

 

This:

SELECT count(id)
FROM eventlog_events
WHERE event_id = 12
AND DATE_SUB(CURDATE(),INTERVAL 1 WEEK) <= datetime;

 

Tells me how many times a certain event in the eventlog_events table occurred within the last week, where datetime is the name of the field that records when it occurred and is also a "datetime" type, event_id is the type of event since the table records several types. However, I need to know not only how many times the event occurred within the last week, but how many times it occurred within each of the last 12 weeks. Further, I need to get this information for 3 different event_id's.

 

In natural language the information I want MySQL to tell me is "The event occurred 15 times during the last week (7-day period), 13 times the week before that, 19 times the week before that," etc

 

The only solution I have so far is to subtract two of these statements with different intervals to get the count in the period I want, and repeat this for each event_id, so for instance:

 

SELECT
(SELECT count(id)
FROM eventlog_events
WHERE event_id = 12
AND DATE_SUB(
CURDATE(),INTERVAL 4 WEEK
) <= datetime)
-
(SELECT count(id)
FROM eventlog_events
WHERE event_id = 12
AND DATE_SUB(
CURDATE(),INTERVAL 3 WEEK
) <= datetime)

 

However, my SQL spider sense tells me this is overly verbose and incorrect. It gets even more perverse when I use this method to capture the counts for each of the 12 weeks, where I execute 13 of these subtraction operations in one gigantic statement, 1 for each period.

 

Is there a better way? Is my objective clear?

Link to comment
Share on other sites

So what you really want to know is: For each of these events, how many times did it occur in each of the past several weeks

 

SELECT SE.event_id, FLOOR(DATEDIFF(CURDATE(), SE.event_datetime) / 7) AS WeekNo, COUNT(*)
FROM events as SE
WHERE SE.event_id IN (12, 13, 14)
GROUP BY SE.event_id, WeekNo

 

The WHERE clause specifies the events that you are interested in. You could add a clause here to limit the date range you want ( AND SE.event_datetime BETWEEN '$lowDate' AND '$highDate' ). Note that the BETWEEN phrase is inclusive (includes both the Low and High value).

 

The SELECT clause returns the event ID as well as a calculated week number (weeks ago) and a count of the rows found.

 

The GROUP BY clause causes the COUNT (in the SELECT) to be by event ID and week number.

 

You may have to make some adjustments. Since the WeekNo calculation is based on number of days since the event occured, the count will change from day to day (not just week to week). You may want to look at the mySql WEEKOFYEAR() and WEEK() functions to adjust that. I did not use it, because there will be issues to deal with when run during the first few weeks of any given year.

 

This query will return one row for each event and week. So with three events and a range that covers seven weeks, you could get 21 rows. Note, however, that if an event did not occur AT ALL in a given week (as calculated) you will NOT get a row for that event-week combination. You can combine these rows on the front-end and handle the "missing" row there.

 

If you want the weekly counts in a single row per event, you could do something like this (not as pretty, but just as effective):

SELECT ME.event_id, 
SUM(IF(DATEDIFF(CURDATE(), ME.event_datetime) < 7, 1, 0)) AS Week1,
SUM(IF(DATEDIFF(CURDATE(), ME.event_datetime) >= 7 AND DATEDIFF(CURDATE(), ME.event_datetime) < 14, 1, 0)) AS Week2,
SUM(IF(DATEDIFF(CURDATE(), ME.event_datetime) >= 14 AND DATEDIFF(CURDATE(), ME.event_datetime) < 21, 1, 0)) AS Week3
FROM events AS ME 
WHERE ME.event_id IN (12, 13, 14)
GROUP BY ME.event_id

 

This "SUMs" the value "1" for each row that fits a specific range; and "SUMs" the value "0" if the row does not fit the range.  I don't know if mySql will optimize the DATEDIFF() calculation performing it once per row, or if it will do the calculation (in the example above) 5 times per row. Depending on how much data you have, this could impact performance.  You could/should add a date range in the WHERE clause here, too.

 

You might be able to "optimize" the days calculation by using a psuedo-table (depends on your version of mySql):

SELECT ME.event_id, 
SUM(IF(DE.DaysAgo < 7, 1, 0)) AS Week1,
SUM(IF(DE.DaysAgo >= 7 AND DE.DaysAgo < 14, 1, 0)) AS Week2,
SUM(IF(DE.DaysAgo >= 14 AND DE.DaysAgo < 21, 1, 0)) AS Week3
FROM events AS ME JOIN 
(SELECT SE.event_id, DATEDIFF(CURDATE(), SE.event_datetime) AS DaysAgo 
	FROM events AS SE
	WHERE SE.event_id IN (12, 13, 14)
	AND SE.event_datetime BETWEEN '$lowDate' AND '$highDate') AS DE 
ON ME.eventID = DE.eventID
WHERE ME.event_id IN (12, 13, 14)
AND ME.event_datetime BETWEEN '$lowDate' AND '$highDate'
GROUP BY ME.event_id

 

In this one, we build a psuedo-table of event_ids and DaysAgo (that it occured). Then join that to the main query to get the weeks as columns. I think the event id and datetime limits need to be in both places for optimum performance.  You might also need an index on the datetime. You'd have to run EXPLAIN on the query to see what might improve it.

Link to comment
Share on other sites

DavidAM,

 

To me you are an SQL God and I have the words to express neither how thankful I am that you spared the time to write such such a helpful, detailed response, nor how inadequate I now feel about about my knowledge of SQL. Frankly, it could have taken me days of trial and error to produce queries as efficient as you have proposed here.

 

I searched Google high and low for help on this, and the closest I got was this: http://www.artfulsoftware.com/infotree/queries.php?&bw=1421#106

 

Which I was not able to adapt to my circumstances.

 

for($i=0;$i<infinity;$i++)
{
print("Thank you");
}

Link to comment
Share on other sites

Glad to help!  :thumb-up:  I remember the first time (years ago) I saw a query using SUM to do a COUNT and spent a couple of days trying to figure out what it meant.  It's actually pretty cool, and effecient.  When working on a query, you have to think in SETs and PATTERNs not PROCEDUREs.  Sometimes it's difficult to shift the mind-set back and forth between the front-end (procedural) and the back-end (sets), but it sure helps.

 

Mark the topic SOLVED (there's a link down at the bottom of the page). That way, someone else searching will know that there's an answer here, somewhere.

 

Happy Coding  :birthday:

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.