MegamanNL Posted February 12, 2009 Share Posted February 12, 2009 Hi guys! This is my first post on PHP Freaks, please be gentle.. I have a similar situation in a few applications so I've made sample tables and query to explain the issue I'm facing. Using MySQL 5.0.51a-3ubuntu5.4 Tables: CREATE TABLE `issues` ( `id` INTEGER(11) UNSIGNED NOT NULL AUTO_INCREMENT, `name` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', `status` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', PRIMARY KEY (`id`), UNIQUE KEY `id` (`id`) )ENGINE=MyISAM AUTO_INCREMENT=6 CHARACTER SET 'utf8' COLLATE 'utf8_general_ci'; COMMIT; /* Data for the `issues` table (Records 1 - 5) */ INSERT INTO `issues` (`id`, `name`, `status`) VALUES (1, 'IE Crash', 'closed'), (2, 'Mail problem', 'open'), (3, 'Dead laptop', 'closed'), (4, 'Blue screens', 'closed'), (5, 'Driver update', 'open'); CREATE TABLE `issues_log` ( `id` INTEGER(11) UNSIGNED NOT NULL AUTO_INCREMENT, `issue_id` INTEGER(11) UNSIGNED NOT NULL, `status_date` DATE NOT NULL, `status` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', PRIMARY KEY (`id`), UNIQUE KEY `id` (`id`) )ENGINE=MyISAM AUTO_INCREMENT=15 CHARACTER SET 'utf8' COLLATE 'utf8_general_ci'; COMMIT; /* Data for the `issues_log` table (Records 1 - 12) */ INSERT INTO `issues_log` (`id`, `issue_id`, `status_date`, `status`) VALUES (1, 1, '2009-02-09', 'open'), (2, 1, '2009-02-11', 'closed'), (3, 2, '2009-02-09', 'open'), (8, 4, '2009-02-09', 'open'), (6, 4, '2009-02-10', 'open'), (7, 4, '2009-02-10', 'closed'), (9, 4, '2009-02-09', 'closed'), (10, 3, '2009-02-08', 'open'), (11, 3, '2009-02-11', 'closed'), (12, 5, '2009-02-09', 'open'), (13, 5, '2009-02-10', 'closed'), (14, 5, '2009-02-11', 'open'); Now, what I'm trying to retrieve are statistics for the closed issues. So I'd like to know day-by-day how many issues are closed but without counting reopened issues. Query: SELECT il.status_date AS sdt, COUNT(i.id) AS cnt FROM issues AS i LEFT JOIN issues_log AS il ON il.issue_id = i.id WHERE i.status = 'closed' AND il.status='closed' GROUP BY sdt ORDER BY sdt ASC Explain output: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE il ALL 9 Using where; Using temporary; Using filesort 1 SIMPLE i ALL PRIMARY,id 3 Using where Query output: sdt cnt 09-02-2009 1 10-02-2009 1 11-02-2009 2 Desired output: sdt cnt 10-02-2009 1 11-02-2009 2 The record with id 9 for 09-02-2009 (dd-mm-yyy) should be omitted. I've tried selecting from just the issues_log table, but then I'm stuck when trying to sort the records so that I get only the newest one. Grouping/distinct/etc. will count record 9, in stead of 7 which is the most recent one. Is there any method other then looping that would give the desired result? Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/ Share on other sites More sharing options...
fenway Posted February 15, 2009 Share Posted February 15, 2009 A few things: 1) You don't need to have two duplicate indexes for "id" -- PRIMARY KEY will suffice, drop the other one. 2) You should really index the fields you're using for JOIN. 3) because it's a left join, you need to make any conditions from the where clause of the left join-ed table to the ON clause. Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-762691 Share on other sites More sharing options...
MegamanNL Posted February 15, 2009 Author Share Posted February 15, 2009 A few things: 1) You don't need to have two duplicate indexes for "id" -- PRIMARY KEY will suffice, drop the other one. Hi fenway, thanks for your reply. EMS MySQL manager always creates those double indexes automatically, not sure why. I usually remove them from production DBs. 2) You should really index the fields you're using for JOIN. These were just sample tables to simplify the issue I'm trying to resolve. The real tables have a lot more fields and are indexed properly. But thanks for the reminder. 3) because it's a left join, you need to make any conditions from the where clause of the left join-ed table to the ON clause. So you mean this would be better? SELECT il.status_date AS sdt, COUNT(i.id) AS cnt FROM issues AS i LEFT JOIN issues_log AS il ON i.id = il.issue_id WHERE i.status = 'closed' AND il.status='closed' GROUP BY sdt ORDER BY sdt ASC I still can't figure out how I can make the counting work the way I'd like. Or am I misunderstanding you completely? Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-762792 Share on other sites More sharing options...
fenway Posted February 16, 2009 Share Posted February 16, 2009 no, i mean: SELECT il.status_date AS sdt, COUNT(i.id) AS cnt FROM issues AS i LEFT JOIN issues_log AS il ON ( i.id = il.issue_id AND il.status='closed' ) WHERE i.status = 'closed' GROUP BY sdt ORDER BY sdt ASC Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763208 Share on other sites More sharing options...
MegamanNL Posted February 16, 2009 Author Share Posted February 16, 2009 Okay, I'll keep that in mind. Any ideas on how to get the desired output? Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763213 Share on other sites More sharing options...
fenway Posted February 16, 2009 Share Posted February 16, 2009 Wait a minute, that doesn'twork? Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763249 Share on other sites More sharing options...
MegamanNL Posted February 16, 2009 Author Share Posted February 16, 2009 Well, unless I'm missing something, your query works exactly like the first one I posted. When I run it I get: sdt cnt 09-02-2009 1 10-02-2009 1 11-02-2009 2 What I'd like to see is the result without the duplicate row for issue_id 4. In other words just 10-02 and 11-02. I might be able to group by issue_id, but then I won't have the most recent date... Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763305 Share on other sites More sharing options...
fenway Posted February 16, 2009 Share Posted February 16, 2009 I'm sorry... I'm quite lost now. Is there a counting issue or a grouping issue? And why should certain rows be excluded? Your sample data doesn't match your output, either. Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763317 Share on other sites More sharing options...
MegamanNL Posted February 16, 2009 Author Share Posted February 16, 2009 I'm sorry... I'm quite lost now. Is there a counting issue or a grouping issue? If only I knew... I think a bit of both. And why should certain rows be excluded? When I graph the number of closed issues by date, the total of my query exceeds the total of all closed issues. In the resultset I get 1+1+2 = 4 closed issues... But there are only 3 issues with closed status in my `issues` table. That's caused by the issue with id 4. That one has been opened, closed and then opened and closed again one day later. I only want to count the records from `issues_log` with the most recent date for each closed issue. Your sample data doesn't match your output, either. Ahum? It should, I actually created the sample table and copied everything from there... CREATE TABLE `issues` ( `id` INTEGER(11) UNSIGNED NOT NULL AUTO_INCREMENT, `name` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', `status` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `status` (`status`) )ENGINE=MyISAM AUTO_INCREMENT=6 CHARACTER SET 'utf8' COLLATE 'utf8_general_ci'; CREATE TABLE `issues_log` ( `id` INTEGER(11) UNSIGNED NOT NULL AUTO_INCREMENT, `issue_id` INTEGER(11) UNSIGNED NOT NULL, `status_date` DATE NOT NULL, `status` VARCHAR(255) COLLATE utf8_general_ci NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `issue_id` (`issue_id`), KEY `status` (`status`) )ENGINE=MyISAM AUTO_INCREMENT=15 CHARACTER SET 'utf8' COLLATE 'utf8_general_ci'; INSERT INTO `issues` (`id`, `name`, `status`) VALUES (1,'IE Crash','closed'), (2,'Mail problem','open'), (3,'Dead laptop','closed'), (4,'Blue screens','closed'), (5,'Driver update','open'); COMMIT; INSERT INTO `issues_log` (`id`, `issue_id`, `status_date`, `status`) VALUES (1,1,'2009-02-09','open'), (2,1,'2009-02-11','closed'), (3,2,'2009-02-09','open'), (6,4,'2009-02-10','open'), (7,4,'2009-02-10','closed'), (8,4,'2009-02-09','open'), (9,4,'2009-02-09','closed'), (10,3,'2009-02-08','open'), (11,3,'2009-02-11','closed'), (12,5,'2009-02-09','open'), (13,5,'2009-02-10','closed'), (14,5,'2009-02-11','open'); COMMIT; SELECT il.status_date AS sdt, COUNT(i.id) AS cnt FROM issues AS i LEFT JOIN issues_log AS il ON (i.id = il.issue_id AND il.status = 'closed') WHERE i.status = 'closed' GROUP BY sdt ORDER BY sdt ASC; Your query returns: sdt cnt 09-02-2009 1 10-02-2009 1 11-02-2009 2 The query counts record ids 2, 7, 9 and 11 from `issues_log`. And I want it to count only 2, 7 and 11. Because 9 is a duplicate issue on a different date (records 7 and 9 from `issues_log` both belong to the issue with id 4). I hope my description is clearer now. Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-763332 Share on other sites More sharing options...
fenway Posted February 17, 2009 Share Posted February 17, 2009 Oh, I see now -- we need to use the most recent closed status. BTW, why do you have two status flags? Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764046 Share on other sites More sharing options...
MegamanNL Posted February 17, 2009 Author Share Posted February 17, 2009 Oh, I see now -- we need to use the most recent closed status. That would be a correct assessment. BTW, why do you have two status flags? One of the tables in production has many thousands of records and select queries running continuously. It seems faster to me to add a small field compared to using a join. I'd rather not use join if I don't have to. But what do I know... Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764057 Share on other sites More sharing options...
fenway Posted February 17, 2009 Share Posted February 17, 2009 Well, this will retrieve the "desired" rows, correct? SELECT il.* FROM issues_log AS il INNER JOIN ( SELECT issue_id, MAX(status_date) AS status_date FROM issues_log WHERE status = 'closed' GROUP BY issue_id ) AS sub USING( issue_id, status_date ) Shouldn't be too hard to count and join it back to issues table. Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764118 Share on other sites More sharing options...
MegamanNL Posted February 17, 2009 Author Share Posted February 17, 2009 You Sir, are a hero! With some modification I now get the data I need. Thanks! Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764181 Share on other sites More sharing options...
fenway Posted February 17, 2009 Share Posted February 17, 2009 Glad to help... would you mind posting the final working query? You'd be amazed how often this question gets asked around here.. Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764206 Share on other sites More sharing options...
MegamanNL Posted February 17, 2009 Author Share Posted February 17, 2009 Working query for one of the applications (field names are different but similar table structure): SELECT COUNT(LinkId) AS InstallCount, DATE(InDateTime) AS InstallDate FROM ots_report_status AS ors INNER JOIN ( SELECT LinkId, MAX(InDateTime) AS InDateTime FROM ots_report_status WHERE LinkTable = 'ots_device' GROUP BY LinkId ) AS sub USING( LinkId, InDateTime ) WHERE ors.To_Ots_System_Status_Id = 1 GROUP BY DATE(InDateTime) Quote Link to comment https://forums.phpfreaks.com/topic/144916-solved-counting-records-from-joined-tables-without-duplicates/#findComment-764273 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.