DaShiVa Posted May 18, 2010 Share Posted May 18, 2010 Hi. My first post here, and it's one mother of a query. Well The original query is, however I've stripped it down to the core slowdown, so hopefully it's not too bad now. Running MySQL Server version: 5.0.87-community; Protocol version: 10 I have a query I need to optimize, it's a search results. The query is dynamically generated based on a fairly complex set of options, filters and sort orders, but I'll get a simple version of the query that is running slow, along with the explain, in the hopes someone can help me speed it up. I've stripped out most of the fields it collects, as this doesn't impact the speed (however bear in mind I still need to join all the tables shown, as I am actually collecting data from each and every one of them in the final query - and quite a few more tables, that I've removed as they don't impact the query speed) Query: SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, ( SELECT COUNT( product_deals.product_deal_id ) FROM product_deals WHERE product_deals.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id AND product_deals.product_deal_status IN ( 1, 3 ) ) AS num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' HAVING num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity ORDER BY websites.website_url LIMIT 0, 25 EXPLAIN for above: id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY product_webpages ALL PRIMARY,website_id NULL NULL NULL 269459 Using temporary; Using filesort 1 PRIMARY websites eq_ref PRIMARY PRIMARY 4 product_webpages.website_id 1 1 PRIMARY product_webpage_keywords ref product_webpage_id,product_keyword_id product_webpage_id 4 product_webpages.product_webpage_id 19 1 PRIMARY product_keywords eq_ref PRIMARY PRIMARY 4 product_webpage_keywords.product_keyword_id 1 Using where 2 DEPENDENT SUBQUERY product_deals ref product_webpage_keyword_id product_webpage_keyword_id 4 product_webpage_keywords.product_webpage_keyword_id 1 Using where #rows in relevant tables: websites has ~100,000 rows product_deals has ~10,000 rows product_webpage_keywords has ~5,000,000 rows (this is the big one) product_webpages has ~400,000 rows product_keywords has ~10,000 rows This takes over 30 seconds to run. So far I have tried restructuring the query a half a dozen different ways, and I've added and changed indexes on the tables. Nothing has made a noticable difference. Caching of the query is not helpful, as the query is run on a different set of filters and page (for many different users, which adds other filters) every time it is run. Table structure changing is a remote option - I did not design the original tables, and changing them would require rewriting a lot of someone elses code, which I don't really have the time for. If it is a small-ish set of structural changes however, I will definitly consider it. I've read all over about explain and how to read it, but I haven't had a lot of luck grokking this particular aspect of MYSQL, and I'm hoping that the explain shows something obvious to someone to help me out. I can provide more information if it is useful, just ask. Thanks, Dan. Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/ Share on other sites More sharing options...
andrewgauger Posted May 18, 2010 Share Posted May 18, 2010 This topic may be too advanced for me, but what I see is in this subquery: SELECT COUNT( product_deals.product_deal_id ) FROM product_deals WHERE product_deals.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id AND product_deals.product_deal_status IN ( 1, 3 ) You reference two tables without a join. This would mean many queries would need to be compiled. *Try* taking the first part of the where out and see if the query executes faster. If I am shooting in the wrong direction, sorry. Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060284 Share on other sites More sharing options...
DaShiVa Posted May 18, 2010 Author Share Posted May 18, 2010 The subquery links into the main query through the 'second table' in the where. removing the 'where' would just end up counting all entries in the product_deals for each and every website row, rather than the deals that relate to that website (and other related filters) only. Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060291 Share on other sites More sharing options...
andrewgauger Posted May 18, 2010 Share Posted May 18, 2010 Alright, I just wasn't sure if you could use the main query's tables in the subquery. I assumed this would cause an iteration of queries. Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060305 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 Anyone have any ideas? Or is able to "explain the EXPLAIN" to me (i.e. what parts of it I would want to change and how that would be accomplished)? Or does anyone know if I could somehow turn the subquery in the SELECT into a dynamically generated table in the FROM to somehow move the HAVING into a WHERE while maintaining the same GROUP BY (which is needed due to the other tables that are JOINed in the final query) Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060635 Share on other sites More sharing options...
andrewgauger Posted May 19, 2010 Share Posted May 19, 2010 mysqlcheck -Aa -u"user" -p"pass" Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060642 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 mysqlcheck -Aa -u"user" -p"pass" I don't have shell access, but I am assuming you want me to analyze all tables? I've done this with the relevant tables on phpMyAdmin and they're checking out. Also run repair and optimize. Do you want me to add the analyze reports here? or was that just a facetious post? Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060660 Share on other sites More sharing options...
Mchl Posted May 19, 2010 Share Posted May 19, 2010 Subquery in column list IS a bad idea. Try rewriting it as a JOIN SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, sq.num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id INNER JOIN ( SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_id ) AS sq ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' HAVING num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity ORDER BY websites.website_url LIMIT 0, 25 Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060740 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 Subquery in column list IS a bad idea. Try rewriting it as a JOIN SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, sq.num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id INNER JOIN ( SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_id ) AS sq ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' HAVING num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity ORDER BY websites.website_url LIMIT 0, 25 The good news: it's a hell of a lot faster: 0.07-odd seconds. The bad news: it returns 0 results :/ I haven't managed to figure out how to make it into a FROM without changing the results (And changing it from an INNER to a LEFT join still returns no results, but returns it back up to ~30 seconds to execute) Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060751 Share on other sites More sharing options...
Mchl Posted May 19, 2010 Share Posted May 19, 2010 Does this: SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_idp return any rows? Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060753 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 Does this: SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_idp return any rows? ~200 rows returned Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060756 Share on other sites More sharing options...
Mchl Posted May 19, 2010 Share Posted May 19, 2010 That would mean no rows match the join condition: ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060779 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 That would mean no rows match the join condition: ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id Which is acceptable, as a 0 value for #deals means there are deals available, thus a valid result. this subquery is used to exclude matches what have no available deals (the having clause: subtracts used deals (this subquery) from available deals (out of product_webpage_keywords)) The problem is when I run the query with the subquery in the FROM tableset, as suggested, I get no results at all, but when it is in the SELECT fields, I get the full and proper results. (As I am running this on test data, on a low instance keyword, it is not surprising no deals exist for this particular query's candidates, but in the live environment there almost always will be some, and they must be filtered out) Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060786 Share on other sites More sharing options...
Mchl Posted May 19, 2010 Share Posted May 19, 2010 I get it now I think. Try like this. SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, sq.num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id INNER JOIN ( SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_id ) AS sq ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' AND (sq.num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity OR sq.num_product_keyword_deals IS NULL) ORDER BY websites.website_url LIMIT 0, 25 Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060792 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 I get it now I think. Try like this. SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, sq.num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id INNER JOIN ( SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_id ) AS sq ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' AND (sq.num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity OR sq.num_product_keyword_deals IS NULL) ORDER BY websites.website_url LIMIT 0, 25 Change the INNER to a LEFT (as you had implied with your NULL check) and it brings back the exact same results, meaning it works, and thanks, I prefer it to the select/having. The query, however still runs at ~30 seconds (albeit about a second faster, but that's without doing extended testing) Give me 10 minutes or so to rework the updated EXPLAIN output and I'll post it here. Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060811 Share on other sites More sharing options...
DaShiVa Posted May 19, 2010 Author Share Posted May 19, 2010 Updated Query: SELECT websites.website_id, product_webpage_keywords.product_webpage_keyword_quantity, sq.num_product_keyword_deals FROM websites INNER JOIN product_webpages ON product_webpages.website_id = websites.website_id INNER JOIN product_webpage_keywords ON product_webpage_keywords.product_webpage_id = product_webpages.product_webpage_id INNER JOIN product_keywords ON product_webpage_keywords.product_keyword_id = product_keywords.product_keyword_id LEFT JOIN ( SELECT product_webpage_keyword_id, COUNT( product_deals.product_deal_id ) AS num_product_keyword_deals FROM product_deals WHERE product_deals.product_deal_status IN ( 1, 3 ) GROUP BY product_webpage_keyword_id ) AS sq ON sq.product_webpage_keyword_id = product_webpage_keywords.product_webpage_keyword_id WHERE product_keywords.product_keyword_name LIKE 'Search' AND (sq.num_product_keyword_deals < product_webpage_keywords.product_webpage_keyword_quantity OR sq.num_product_keyword_deals IS NULL) ORDER BY websites.website_url LIMIT 0, 25 EXPLAIN: id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY product_webpage_keywords ALL product_webpage_id,product_keyword_id NULL NULL NULL 4371444 Using temporary; Using filesort 1 PRIMARY product_webpages eq_ref PRIMARY,website_id PRIMARY 4 product_webpage_keywords.product_webpage_id 1 1 PRIMARY websites eq_ref PRIMARY PRIMARY 4 product_webpages.website_id 1 1 PRIMARY product_keywords eq_ref PRIMARY PRIMARY 4 product_webpage_keywords.product_keyword_id 1 Using where 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 120 Using where 2 DERIVED product_deals ALL NULL NULL NULL NULL 1286 Using where; Using temporary; Using filesort Results as good as the same as original (some NULLs in place of 0's, but that works fine) Execution time seems to be a hair faster, but still ~30 seconds Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1060820 Share on other sites More sharing options...
DaShiVa Posted May 20, 2010 Author Share Posted May 20, 2010 Well, after a couple days more digging and experimenting I've figured it out: It was the MySQL optomizer. So, rewriting teh query slightly (for optimum table order) and adding the STRAIGHT_JOIN keyword after SELECT (to override the optomizer) and it's gone from ~30 to ~1.5 seconds. And, as the original query inner joins in about a dozen tables and left joins a dozen more, and is dynamically generated, blah blah blah, I'm about to spend the next couple of days manually optimizing the whole query manually and rewriting the code that puts it together, which is ~5,000 lines (well, could probably be squished into ~2-3,000 but it's still a lot) Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1061247 Share on other sites More sharing options...
Mchl Posted May 20, 2010 Share Posted May 20, 2010 Good job on teaching yourself cool stuff Quote Link to comment https://forums.phpfreaks.com/topic/202198-query-optimization/#findComment-1061264 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.