lunarul Posted March 8, 2007 Share Posted March 8, 2007 I have the following query: SELECT `m`.*, COUNT(DISTINCT `d`.`id`) AS `num_a`, COUNT(DISTINCT `d1`.`id`) AS `num_s`, COUNT(DISTINCT `d2`.`id`) AS `num_o`, COUNT(DISTINCT `t`.`id_email`) AS `num_c`, COUNT(DISTINCT `t`.`id`) AS `num_l` FROM `mailout` AS `m` LEFT JOIN `mailout_details` AS `d` ON `m`.`id` = `d`.`id_mailout` LEFT JOIN `mailout_details` AS `d1` ON `m`.`id` = `d1`.`id_mailout` AND `d1`.`sent`='1' LEFT JOIN `mailout_details` AS `d2` ON `m`.`id` = `d2`.`id_mailout` AND `d2`.`opened`='1' LEFT JOIN `mail_tracking` AS `t` ON `d`.`id` = `t`.`id_email` GROUP BY `m`.`id` which reads details about mails sent and counts all kinds of statistics the problem is that now mailout_details has some 3000 records (which I don't think is a large number and shouldn't be a problem) and the query just takes forever to execute (on my local computer it takes 100% of my cpu). what is the problem? ??? how can I optimize this query? i thought about storing `num_*` as columns and update them in my scripts so I don't have to do the counts, but this won't work as I have other queries that do the counts but based on other `group by`'s. P.S. the DISTINCT's are not the problem. dropping them will lead to the same 100% CPU and never-ending query execution. Please help, I really need to solve this quick, as client is complaining (page not working anymore) Quote Link to comment Share on other sites More sharing options...
artacus Posted March 8, 2007 Share Posted March 8, 2007 How many results are you getting? When this happens, its usually because tables are incorrectly joined and you get a Cartesian product. So if you had 3000 records in each table, you could get 3000 * 3000 or worse yet 3000 * 3000 * 3000 ... that will stuff up your server in a hurry. Quote Link to comment Share on other sites More sharing options...
lunarul Posted March 8, 2007 Author Share Posted March 8, 2007 The query is correct. It worked when there were less rows in the mailout_details table. The number of rows returned is equal to the number of records in the mailout table, witch is small (less than 10, or maybe even less than 5). But for each row it needs to get about 300 rows from each join and count them. That would be (5 joins * 300 rows each) * 10 main rows = 15000 rows fetched (but only 10 actual rows displayed due to the GROUP BY clause). Quote Link to comment Share on other sites More sharing options...
fenway Posted March 8, 2007 Share Posted March 8, 2007 Please post the EXPLAIN output... Quote Link to comment Share on other sites More sharing options...
lunarul Posted March 9, 2007 Author Share Posted March 9, 2007 Here is the EXPLAIN output: +----+-------------+-------+------+---------------------------------------------+--------------+---------+--------------------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------------------------------------+--------------+---------+--------------------+------+----------------+ | 1 | SIMPLE | m | ALL | NULL | NULL | NULL | NULL | 15 | Using filesort | | 1 | SIMPLE | d | ref | id_mailout,id_mailout_2,id_mailout_3 | id_mailout | 4 | theatre.m.id | 305 | | | 1 | SIMPLE | d1 | ref | id_mailout,sent,id_mailout_2,id_mailout_3 | id_mailout_2 | 5 | theatre.m.id,const | 280 | | | 1 | SIMPLE | d2 | ref | id_mailout,opened,id_mailout_2,id_mailout_3 | id_mailout_3 | 5 | theatre.m.id,const | 280 | Using where | | 1 | SIMPLE | t | ref | id_email | id_email | 4 | theatre.d.id | 2 | | +----+-------------+-------+------+---------------------------------------------+--------------+---------+--------------------+------+----------------+ Quote Link to comment Share on other sites More sharing options...
fenway Posted March 9, 2007 Share Posted March 9, 2007 Yeah, it's just ugly because you can't take advantage of the indexes for the group by operator when you join tables that way... post the explain for a single table count(), and you'll see what i mean. Quote Link to comment Share on other sites More sharing options...
lunarul Posted March 9, 2007 Author Share Posted March 9, 2007 got help from another forum I rewrote the query as: SELECT m.*, det.* FROM mailout m LEFT JOIN ( SELECT d.id_mailout, COUNT(d.id_mailout) AS `num_a`, SUM( CASE WHEN d.sent = '1' THEN 1 ELSE 0 END ) AS `num_s`, SUM( CASE WHEN d.opened = '1' THEN 1 ELSE 0 END ) AS `num_o`, COUNT(tr.id_email) AS `num_c`, SUM(tr.num_c) AS `num_l` FROM mailout_details d LEFT JOIN (SELECT t.id_email, COUNT(t.id_email) AS `num_c` FROM mail_tracking `t` GROUP BY t.id_email) tr ON tr.id_email = d.id GROUP BY d.id_mailout) det ON det.id_mailout = m.id and now it works instantly. but I have another slow query that doesn't optimize the same way: SELECT `r`.*, COUNT(DISTINCT `d1`.`id`) AS `num_s`, COUNT(DISTINCT `d2`.`id`) AS `num_o`, COUNT(DISTINCT `t`.`id`) AS `num_c` FROM `recipients` AS `r` LEFT JOIN `mailout_details` AS `d` ON `r`.`email` = `d`.`email` LEFT JOIN `mail_tracking` AS `t` ON `d`.`id` = `t`.`id_email` LEFT JOIN `mailout_details` AS `d1` ON `r`.`email` = `d1`.`email` AND `d1`.`sent` = '1' LEFT JOIN `mailout_details` AS `d2` ON `r`.`email` = `d2`.`email` AND `d2`.`opened` = '1' $where GROUP BY `r`.`id` Quote Link to comment Share on other sites More sharing options...
fenway Posted March 9, 2007 Share Posted March 9, 2007 That's basically the same thing that i was saying... you have to break down each piece.... and I'm going to say the same for this 'new' query of yours. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.