ajoo Posted November 27, 2020 Share Posted November 27, 2020 Hi, I have the following login and logout data of a student :- Quote Would it be possible to calculate the absents by subtracting consecutive times of logins or logouts using just Mysql ? From the example above the absents between 2019-04-01 and 2019-03-30 should be SELECT DATEDIFF("2019-04-01", "2019-03-30") = 2. So actual absents is 2 -1 = 1 and so on for all the consecutive dates and then their sum. I have no clue how to go about this. Gurus please help. Thanks ! Quote Link to comment Share on other sites More sharing options...
Barand Posted November 27, 2020 Share Posted November 27, 2020 (edited) Yes. Make sure your data is ordered by date Use an @variable to store the record's date compare the date in record n with the date stored from record n-1 store the cumalative difference in a second @variable [edit] PS as it isn't possible to load your data into a test table from an image, that's all the help I'll give. Edited November 27, 2020 by Barand Quote Link to comment Share on other sites More sharing options...
requinix Posted November 28, 2020 Share Posted November 28, 2020 My version would be to JOIN the table to itself once or twice in order to pair up consecutive records, then DATEDIFF()-1 the end of the first with the start of the second, then SUM the results. 1 Quote Link to comment Share on other sites More sharing options...
ajoo Posted November 28, 2020 Author Share Posted November 28, 2020 Thanks Guru Barand and Requinix for the replies. @Guru Barand : Please find the data attached as data. I have tried your method but I am stuck at the comparison part where I need to reference the next date. Quote +-------+---------------------+---------------------+ | RecNo | TimeOfLogin | TimeOfLogout | +-------+---------------------+---------------------+ | 3 | 2019-03-30 17:05:24 | 2019-03-30 17:09:47 | | 6 | 2019-04-01 15:13:32 | 2019-04-01 15:19:46 | | 7 | 2019-04-04 23:37:21 | 2019-04-04 23:50:51 | | 8 | 2019-04-18 15:28:35 | 2019-04-18 15:33:10 | | 9 | 2019-04-23 16:35:20 | 2019-04-23 16:42:35 | | 10 | 2019-04-24 12:03:07 | 2019-04-24 12:10:28 | | 11 | 2019-05-01 08:05:48 | 2019-05-01 08:20:28 | | 13 | 2019-05-08 18:04:04 | 2019-05-08 18:14:57 | | 14 | 2019-05-09 08:18:15 | 2019-05-09 08:29:38 | | 15 | 2019-06-18 12:49:01 | 2019-06-18 13:10:15 | | 17 | 2019-09-05 17:17:33 | 2019-09-13 15:24:28 | | 18 | 2019-09-28 07:05:03 | 2019-09-28 08:12:26 | | 19 | 2019-09-28 12:55:56 | 2019-09-28 13:21:15 | | 20 | 2019-09-28 16:47:52 | 2019-10-01 16:28:18 | | 22 | 2019-10-03 13:11:44 | 2019-12-10 17:56:25 | | 74 | 2020-05-22 12:08:32 | 2020-08-27 17:21:02 | +-------+---------------------+---------------------+ @ Requinix : HI, I get the idea but I am not sure how to go about it. Thank you. Quote Link to comment Share on other sites More sharing options...
Barand Posted November 28, 2020 Share Posted November 28, 2020 OK, I loaded your data into a test table INSERT INTO ajoo_login (datein, dateout) VALUES ('2019-03-30 17:05:24', '2019-03-30 17:09:47'), ('2019-04-01 15:13:32', '2019-04-01 15:19:46'), ('2019-04-04 23:37:21', '2019-04-04 23:50:51'), ('2019-04-18 15:28:35', '2019-04-18 15:33:10'), ('2019-04-23 16:35:20', '2019-04-23 16:42:35'), ('2019-04-24 12:03:07', '2019-04-24 12:10:28'), ('2019-05-01 08:05:48', '2019-05-01 08:20:28'), ('2019-05-08 18:04:04', '2019-05-08 18:14:57'), ('2019-05-09 08:18:15', '2019-05-09 08:29:38'), ('2019-06-18 12:49:01', '2019-06-18 13:10:15'), ('2019-09-05 17:17:33', '2019-09-13 15:24:28'), ('2019-09-28 07:05:03', '2019-09-28 08:12:26'), ('2019-09-28 12:55:56', '2019-09-28 13:21:15'), ('2019-09-28 16:47:52', '2019-10-01 16:28:18'), ('2019-10-03 13:11:44', '2019-12-10 17:56:25'), ('2020-05-22 12:08:32', '2020-08-27 17:21:02'); Running the query gives SELECT SUM(diff) AS tot_absent FROM ( SELECT CASE WHEN DATE(datein) > DATE(@prevout) THEN DATEDIFF(datein, @prevout) - 1 ELSE 0 END AS diff , datein , @prevout := dateout AS dateout -- store dateout in @prevout FROM ajoo_login JOIN (SELECT @prevout := NULL) init -- initialize @prevout ) logins; +------------+ | tot_absent | +------------+ | 327 | +------------+ Running just the subquery portion gives mysql> SELECT -> CASE WHEN DATE(datein) > DATE(@prevout) -> THEN DATEDIFF(datein, @prevout) - 1 -> ELSE 0 -> END AS diff -> , datein -> , @prevout := dateout AS dateout -> FROM ajoo_login -> JOIN (SELECT @prevout := NULL) init; +------+---------------------+---------------------+ | diff | datein | dateout | +------+---------------------+---------------------+ | 0 | 2019-03-30 17:05:24 | 2019-03-30 17:09:47 | | 1 | 2019-04-01 15:13:32 | 2019-04-01 15:19:46 | | 2 | 2019-04-04 23:37:21 | 2019-04-04 23:50:51 | | 13 | 2019-04-18 15:28:35 | 2019-04-18 15:33:10 | | 4 | 2019-04-23 16:35:20 | 2019-04-23 16:42:35 | | 0 | 2019-04-24 12:03:07 | 2019-04-24 12:10:28 | | 6 | 2019-05-01 08:05:48 | 2019-05-01 08:20:28 | | 6 | 2019-05-08 18:04:04 | 2019-05-08 18:14:57 | | 0 | 2019-05-09 08:18:15 | 2019-05-09 08:29:38 | | 39 | 2019-06-18 12:49:01 | 2019-06-18 13:10:15 | | 78 | 2019-09-05 17:17:33 | 2019-09-13 15:24:28 | | 14 | 2019-09-28 07:05:03 | 2019-09-28 08:12:26 | | 0 | 2019-09-28 12:55:56 | 2019-09-28 13:21:15 | | 0 | 2019-09-28 16:47:52 | 2019-10-01 16:28:18 | | 1 | 2019-10-03 13:11:44 | 2019-12-10 17:56:25 | | 163 | 2020-05-22 12:08:32 | 2020-08-27 17:21:02 | +------+---------------------+---------------------+ 1 1 Quote Link to comment Share on other sites More sharing options...
ajoo Posted November 28, 2020 Author Share Posted November 28, 2020 Hi Guru Barand, Thank you very much. I couldn't have figured it out this way since I have never really worked much with @variable in mysql. I am not sure I understand their working very well. However, this below is what I tried and it almost works except for the first and last bit of absents and resembles somewhat what requinix suggested. SELECT RecNo, DATEDIFF(( SELECT t.TimeOfLogin FROM india_sessdata t WHERE t.RecNo > t1.RecNo ORDER BY t.RecNo LIMIT 1), MIN(TimeOfLogin) ) diff FROM india_sessdata t1 WHERE t1.StudentLogin = 'mina1111' GROUP BY RecNo; which gives the following output +-------+------+ | RecNo | diff | +-------+------+ | 3 | 184 | | 6 | 3 | | 7 | 14 | | 8 | 5 | | 9 | 1 | | 10 | 7 | | 11 | 7 | | 13 | 1 | | 14 | 40 | | 15 | 79 | | 17 | 23 | | 18 | 0 | | 19 | 0 | | 20 | 3 | | 22 | 0 | | 74 | 21 | +-------+------+ The first is incorrect because I think I am using a subset of values related to StudentLogin = 'mina1111'. This bit -- MIN(TimeOfLogin) -- in the query needs to be tweaked to get the correct first value. I tried a few things but none worked. The last entry is not really wrong but is incomplete. The SUM( ---) function needs to use only TimeOfLogin or dateIn (as used by you). The differences of all dateIn values needs to be calculated and in the end to complete the absents , for the last row alone, it needs to calculate (dateOut - dateIn). I am trying to understand your solution. The value of Date(@prevout) CASE WHEN DATE(datein) > DATE(@prevout) is initialized further down here JOIN (SELECT @prevout := NULL) init -- initialize @prevout Shouldn't it be initailized someplace before it is used at DATE(@prevout) ? Also I cannot really visualize what this join is doing. Some explanation would really help. Thanks. Quote Link to comment Share on other sites More sharing options...
Barand Posted November 28, 2020 Share Posted November 28, 2020 1 hour ago, ajoo said: WHERE t1.StudentLogin = 'mina1111' Where did that suddenly spring from? There's no mention in your original post. Don't keep us in the dark and still expect help. 1 Quote Link to comment Share on other sites More sharing options...
Barand Posted November 28, 2020 Share Posted November 28, 2020 Meanwhile, here's an alternative solution to my previous one, this one without the SQL variables. SELECT SUM(CASE WHEN DATE(datein) > DATE(dateout) THEN DATEDIFF(datein, dateout) - 1 ELSE 0 END ) as tot_absent FROM ( SELECT a.dateout , MIN(b.datein) as datein FROM ajoo_login a LEFT JOIN ajoo_login b ON a.dateout < b.datein GROUP BY a.dateout ) logins; +------------+ | tot_absent | +------------+ | 327 | +------------+ 1 1 Quote Link to comment Share on other sites More sharing options...
ajoo Posted November 29, 2020 Author Share Posted November 29, 2020 (edited) 20 hours ago, Barand said: Don't keep us in the dark and still expect help. Sir that's not my intention at all. It's just that sometimes there are so may aspects to a problem that I try and request help for the core of it while trying to attempt the changes around it myself. It's only if I get stuck subsequently, like because of what i added, then I ask again. But I will keep it in mind and try and avoid that as far as possible. Thank you. Edited November 29, 2020 by ajoo Quote Link to comment Share on other sites More sharing options...
ajoo Posted November 29, 2020 Author Share Posted November 29, 2020 17 hours ago, Barand said: Meanwhile, here's an alternative solution to my previous one, this one without the SQL variables. SELECT SUM(CASE WHEN DATE(datein) > DATE(dateout) THEN DATEDIFF(datein, dateout) - 1 ELSE 0 END ) as tot_absent FROM ( SELECT a.dateout , MIN(b.datein) as datein FROM ajoo_login a LEFT JOIN ajoo_login b ON a.dateout < b.datein GROUP BY a.dateout ) logins; +------------+ | tot_absent | +------------+ | 327 | +------------+ Thank you Guru Barand. May I request you to clear some doubts that I expressed above in my earlier message. Thank you.🙏 Quote Link to comment Share on other sites More sharing options...
Barand Posted November 29, 2020 Share Posted November 29, 2020 22 hours ago, ajoo said: Shouldn't it be initailized someplace before it is used at DATE(@prevout) ? Also I cannot really visualize what this join is doing. Some explanation would really help. Because of the JOIN, it is initialized before it is used. As an alternative to the join you could have two queries ... SELECT @prevout := NULL; -- initialize @prevout SELECT SUM(diff) AS tot_absent FROM ( SELECT CASE WHEN DATE(datein) > DATE(@prevout) THEN DATEDIFF(datein, @prevout) - 1 ELSE 0 END AS diff , datein , @prevout := dateout AS dateout -- store dateout in @prevout FROM ajoo_login ) logins; @vars are just like variables in any other language - you assign values to them (and use those values) as the query processor loops through each of the records 1 Quote Link to comment Share on other sites More sharing options...
Barand Posted November 29, 2020 Share Posted November 29, 2020 Introducing your "studentlogin" column to the data ... +----+---------------------+---------------------+--------------+ | id | datein | dateout | studentlogin | +----+---------------------+---------------------+--------------+ | 1 | 2019-03-30 17:05:24 | 2019-03-30 17:09:47 | mina1111 | | 2 | 2019-04-01 15:13:32 | 2019-04-01 15:19:46 | abcd1234 | | 3 | 2019-04-04 23:37:21 | 2019-04-04 23:50:51 | xyz12345 | | 4 | 2019-04-18 15:28:35 | 2019-04-18 15:33:10 | mina1111 | | 5 | 2019-04-23 16:35:20 | 2019-04-23 16:42:35 | xyz12345 | | 6 | 2019-04-24 12:03:07 | 2019-04-24 12:10:28 | abcd1234 | | 7 | 2019-05-01 08:05:48 | 2019-05-01 08:20:28 | abcd1234 | | 8 | 2019-05-08 18:04:04 | 2019-05-08 18:14:57 | xyz12345 | | 9 | 2019-05-09 08:18:15 | 2019-05-09 08:29:38 | mina1111 | | 10 | 2019-06-18 12:49:01 | 2019-06-18 13:10:15 | xyz12345 | | 11 | 2019-09-05 17:17:33 | 2019-09-13 15:24:28 | abcd1234 | | 12 | 2019-09-28 07:05:03 | 2019-09-28 08:12:26 | mina1111 | | 13 | 2019-09-28 12:55:56 | 2019-09-28 13:21:15 | xyz12345 | | 14 | 2019-09-28 16:47:52 | 2019-10-01 16:28:18 | abcd1234 | | 15 | 2019-10-03 13:11:44 | 2019-12-10 17:56:25 | mina1111 | | 16 | 2020-05-22 12:08:32 | 2020-08-27 17:21:02 | xyz12345 | +----+---------------------+---------------------+--------------+ then SELECT studentlogin , SUM(CASE WHEN DATE(datein) > DATE(dateout) THEN DATEDIFF(datein, dateout) - 1 ELSE 0 END ) as tot_absent FROM ( SELECT a.dateout , MIN(b.datein) as datein , a.studentlogin FROM ajoo_login a LEFT JOIN ajoo_login b ON a.dateout < b.datein AND a.studentlogin = b.studentlogin GROUP BY a.studentlogin, a.dateout ) logins GROUP BY studentlogin; +--------------+------------+ | studentlogin | tot_absent | +--------------+------------+ | abcd1234 | 168 | | mina1111 | 183 | | xyz12345 | 409 | +--------------+------------+ Quote Link to comment Share on other sites More sharing options...
ajoo Posted November 29, 2020 Author Share Posted November 29, 2020 There are 2 gap difference that both these solutions have overlooked. A look at row 10 and 11 reveals that there a gap of 8 days between dateout on 10th row and datein on 11th row. which is unaccounted for. So on 11th row the diff should be 22 and not 14. Further on the last row there is a gap which could be calculated between dateout on 14th row and datein on 15th row i.e. DATEDIFF( '2020-08-27', '2020-05-22'). +-----+---------------+------------------------+ | Row | dateout | datein |diff | +-----+---------------+------------------------+ | 1 | '2019-03-30' | '2019-04-01' | 1 | | 2 | '2019-04-01' | '2019-04-04' | 2 | | 3 | '2019-04-04' | '2019-04-18' | 13 | | 4 | '2019-04-18' | '2019-04-23' | 4 | | 5 | '2019-04-23' | '2019-04-24' | 0 | | 6 | '2019-04-24' | '2019-05-01' | 6 | | 7 | '2019-05-01' | '2019-05-08' | 6 | | 8 | '2019-05-08' | '2019-05-09' | 0 | | 9 | '2019-05-09' | '2019-06-18' | 39 | | 10 | '2019-06-18' | '2019-09-05' | 78 | | 11 | '2019-09-13' | '2019-09-28' | 14 | | 12 | '2019-09-28' | '2019-10-03' | 4 | | 13 | '2019-10-01' | '2019-10-03' | 1 | | 14 | '2019-12-10' | '2020-05-22' | 163 | | 15 | '2020-08-27' | (null) 0 | | +-----+----------------------------------------+ Thanks ! Quote Link to comment Share on other sites More sharing options...
Barand Posted November 29, 2020 Share Posted November 29, 2020 As I have no idea what the query was that produced those results, or what the input data to the query was, what is it you expect? Quote Link to comment Share on other sites More sharing options...
ajoo Posted December 1, 2020 Author Share Posted December 1, 2020 Sir, I am referring to the examples above. If I may refer to the very first solution you proposed as below: An examination of output table produced on running the subquery in your example shows a total absents of 327 days. However inspecting row 11 showa that a gap of 8 days between the datein and dateout of record 11 is unaccounted for. This will happen for all cases where the datein and dateout of a given row are different. This can be rectified by using only the datein of each row and ignoring dateout Finally the absents of the last row (dateout - datein)-1 are not being accounted for in this and subsequent solutions. I hope I am clear in defining the issue this time ! 😅 Thanks loads !🙏 On 11/28/2020 at 4:11 PM, Barand said: OK, I loaded your data into a test table INSERT INTO ajoo_login (datein, dateout) VALUES ('2019-03-30 17:05:24', '2019-03-30 17:09:47'), ('2019-04-01 15:13:32', '2019-04-01 15:19:46'), ('2019-04-04 23:37:21', '2019-04-04 23:50:51'), ('2019-04-18 15:28:35', '2019-04-18 15:33:10'), ('2019-04-23 16:35:20', '2019-04-23 16:42:35'), ('2019-04-24 12:03:07', '2019-04-24 12:10:28'), ('2019-05-01 08:05:48', '2019-05-01 08:20:28'), ('2019-05-08 18:04:04', '2019-05-08 18:14:57'), ('2019-05-09 08:18:15', '2019-05-09 08:29:38'), ('2019-06-18 12:49:01', '2019-06-18 13:10:15'), ('2019-09-05 17:17:33', '2019-09-13 15:24:28'), ('2019-09-28 07:05:03', '2019-09-28 08:12:26'), ('2019-09-28 12:55:56', '2019-09-28 13:21:15'), ('2019-09-28 16:47:52', '2019-10-01 16:28:18'), ('2019-10-03 13:11:44', '2019-12-10 17:56:25'), ('2020-05-22 12:08:32', '2020-08-27 17:21:02'); Running the query gives SELECT SUM(diff) AS tot_absent FROM ( SELECT CASE WHEN DATE(datein) > DATE(@prevout) THEN DATEDIFF(datein, @prevout) - 1 ELSE 0 END AS diff , datein , @prevout := dateout AS dateout -- store dateout in @prevout FROM ajoo_login JOIN (SELECT @prevout := NULL) init -- initialize @prevout ) logins; +------------+ | tot_absent | +------------+ | 327 | +------------+ Running just the subquery portion gives mysql> SELECT -> CASE WHEN DATE(datein) > DATE(@prevout) -> THEN DATEDIFF(datein, @prevout) - 1 -> ELSE 0 -> END AS diff -> , datein -> , @prevout := dateout AS dateout -> FROM ajoo_login -> JOIN (SELECT @prevout := NULL) init; +------+---------------------+---------------------+ | diff | datein | dateout | +------+---------------------+---------------------+ | 0 | 2019-03-30 17:05:24 | 2019-03-30 17:09:47 | | 1 | 2019-04-01 15:13:32 | 2019-04-01 15:19:46 | | 2 | 2019-04-04 23:37:21 | 2019-04-04 23:50:51 | | 13 | 2019-04-18 15:28:35 | 2019-04-18 15:33:10 | | 4 | 2019-04-23 16:35:20 | 2019-04-23 16:42:35 | | 0 | 2019-04-24 12:03:07 | 2019-04-24 12:10:28 | | 6 | 2019-05-01 08:05:48 | 2019-05-01 08:20:28 | | 6 | 2019-05-08 18:04:04 | 2019-05-08 18:14:57 | | 0 | 2019-05-09 08:18:15 | 2019-05-09 08:29:38 | | 39 | 2019-06-18 12:49:01 | 2019-06-18 13:10:15 | | 78 | 2019-09-05 17:17:33 | 2019-09-13 15:24:28 | | 14 | 2019-09-28 07:05:03 | 2019-09-28 08:12:26 | | 0 | 2019-09-28 12:55:56 | 2019-09-28 13:21:15 | | 0 | 2019-09-28 16:47:52 | 2019-10-01 16:28:18 | | 1 | 2019-10-03 13:11:44 | 2019-12-10 17:56:25 | | 163 | 2020-05-22 12:08:32 | 2020-08-27 17:21:02 | +------+---------------------+---------------------+ Quote Link to comment Share on other sites More sharing options...
Barand Posted December 1, 2020 Share Posted December 1, 2020 The whole purpose of this exercise was to to find the number of days between the date the user logging out and the date of their next login 6 hours ago, ajoo said: However inspecting row 11 showa that a gap of 8 days between the datein and dateout of record 11 is unaccounted for. If you think that then you are nor understanding your own data or your initial description of the problem was wrong. 8 days is the number of days logged in in row 11, not the the absent time between logout/next login. As far as I am concerned, the query and results are correct. Quote Link to comment Share on other sites More sharing options...
ajoo Posted December 2, 2020 Author Share Posted December 2, 2020 16 hours ago, Barand said: The whole purpose of this exercise was to to find the number of days between the date the user logging out and the date of their next login If you think that then you are nor understanding your own data or your initial description of the problem was wrong. 8 days is the number of days logged in in row 11, not the the absent time between logout/next login. As far as I am concerned, the query and results are correct. Hello Sir, I am quite sure that you are correct. Possibly my description of the problem was flawed. Though I had remarked once or twice in my replies that we should use just the days between logins to find the correct number of gaps or absents and then for the edge condition, add (dateout - datein) of the last record for that user to the total count. Sir I am not saying that your solution is wrong anywhere. I was only trying to spell out what is desired by me. Thank you so much. Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2020 Share Posted December 2, 2020 That, according to Excel, would give a total absence of 503 days. Is that your expected result? Quote Link to comment Share on other sites More sharing options...
ajoo Posted December 2, 2020 Author Share Posted December 2, 2020 I am getting 502 as the sum below. The last date is the last logout date. datein diff 30-03-2019 0 01-04-2019 1 04-04-2019 2 18-04-2019 13 23-04-2019 4 24-04-2019 0 01-05-2019 6 08-05-2019 6 09-05-2019 0 18-06-2019 39 05-09-2019 78 28-09-2019 22 28-09-2019 0 28-09-2019 0 03-10-2019 4 22-05-2020 231 27-08-2020 96 total 502 Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2020 Share Posted December 2, 2020 I've given you a couple of alternative methods - the main change is to to use the previous log in instead of the log out time so it shouldn't be to difficult for you to alter the queries to your requirements. The tricky bit is your "edge condition", but give it a go. (my 503 was because I forgot to subtract the 1 to get the days inbetween when getting the last record's time difference, so I had 97) I still don't understand why, when you are calculating days between logins, you decide to add in the duration of the login for the last record (15 apples + 1 orange). If anything I would have the thought the last record's time would be from final login to the current date. Just my 0.02 worth Quote Link to comment Share on other sites More sharing options...
ajoo Posted December 3, 2020 Author Share Posted December 3, 2020 SIr, I think I will get the solution from all the solutions provided by you. In case I run into a roadblock. i'll ask again. 11 hours ago, Barand said: Just my 0.02 worth is worth 2K Guru Barand. You are absolutely correct . The current date should be used as the last date for the edge condition. Thank you.🙏 Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.