ajlisowski Posted March 8, 2010 Share Posted March 8, 2010 Hi everyone, Ive been tasked with building a spreadsheet type application that allows users to modify columns and change filters on the fly. Its a pretty intense database with lots of different tables so in order to access any of the datas columns, there has to be joins. It worked great when there were maybe 20 or so rows per table, but now theres 10k-30k in most of them, and needless to say its running much slower. However I figured I could index any column that the user can filter by, to speed things up. This does not seem to have helped.... (note, I know there are un-needed joins in the query right now. I am testing for a worst case scenerio where the user is requesting data from all available tables) SELECT p.`sku` AS `id` , p.`luna_sku` , hp.`edges` , f.`fiber` , p.`style_name` FROM `lunapr_c1price` AS `p` LEFT JOIN `lunapr_c1hard` AS `hp` ON p.`luna_sku` = hp.`luna_sku` LEFT JOIN `lunapr_c1soft` AS `sp` ON p.`luna_sku` = sp.`luna_sku` LEFT JOIN `lunapr_c1vendor` AS `v` ON ( p.`manufacturer` = v.`manufacturer` AND p.`division_dist` = v.`division_dist` ) LEFT JOIN `lunapr_c1geoprice` AS `pr` ON ( p.`luna_sku` = pr.`luna_sku` AND pr.`current` = '1' ) LEFT JOIN `lunapr_c1geocost` AS `c` ON ( p.`luna_sku` = c.`luna_sku` AND c.`current` = '1' ) LEFT JOIN `lunapr_c1fiber` AS `f` ON sp.`fiber` = f.`id` WHERE ( p.`style_name` != 'Alaska Sg Maple2.5 ' ) ORDER BY f.`fiber` DESC LIMIT 0 , 10 This query takes like 24 seconds to complete...thats no good. SELECT p.`sku` AS `id` , p.`luna_sku` , hp.`edges` , f.`fiber` , p.`style_name` FROM `lunapr_c1price` AS `p` LEFT JOIN `lunapr_c1hard` AS `hp` ON p.`luna_sku` = hp.`luna_sku` LEFT JOIN `lunapr_c1soft` AS `sp` ON p.`luna_sku` = sp.`luna_sku` LEFT JOIN `lunapr_c1vendor` AS `v` ON ( p.`manufacturer` = v.`manufacturer` AND p.`division_dist` = v.`division_dist` ) LEFT JOIN `lunapr_c1geoprice` AS `pr` ON ( p.`luna_sku` = pr.`luna_sku` AND pr.`current` = '1' ) LEFT JOIN `lunapr_c1geocost` AS `c` ON ( p.`luna_sku` = c.`luna_sku` AND c.`current` = '1' ) LEFT JOIN `lunapr_c1fiber` AS `f` ON sp.`fiber` = f.`id` WHERE ( p.`style_name` != ' ' ) ORDER BY f.`fiber` DESC LIMIT 0 , 10 This query takes like 1.4 seconds. Still not great but bareable. Any idea why they would be so drastically different? SELECT p.`sku` AS `id` , p.`luna_sku` , hp.`edges` , f.`fiber` , p.`style_name` FROM `lunapr_c1price` AS `p` LEFT JOIN `lunapr_c1hard` AS `hp` ON p.`luna_sku` = hp.`luna_sku` LEFT JOIN `lunapr_c1soft` AS `sp` ON p.`luna_sku` = sp.`luna_sku` LEFT JOIN `lunapr_c1fiber` AS `f` ON sp.`fiber` = f.`id` WHERE ( p.`style_name` != 'Alaska Sg Maple2.5 ' ) ORDER BY f.`fiber` DESC LIMIT 0 , 10 This one only takes .8 seconds. So obviously removing the other joins helps. (while writing this post I unset and reset some indexes and made things, worse, the big query now takes 140+seconds...) This is the explain on the nightmare query id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE p range style_name style_name 27 NULL 13607 Using where; Using temporary; Using filesort 1 SIMPLE hp ref luna_sku luna_sku 9 luna_filepro.p.luna_sku 1 1 SIMPLE sp ref luna_sku luna_sku 9 luna_filepro.p.luna_sku 1 Using index 1 SIMPLE v ref division_dist,manufacturer division_dist 27 luna_filepro.p.division_dist 2 1 SIMPLE pr ref luna_sku luna_sku 9 luna_filepro.p.luna_sku 8 1 SIMPLE c ref luna_sku luna_sku 9 luna_filepro.p.luna_sku 2 1 SIMPLE f eq_ref PRIMARY PRIMARY 4 luna_filepro.sp.fiber 1 I assume it is because its only using division_dist as a ref for `v` instead of the combination of division_dist and manufacturer. Any ideas? If remove that one join (the vendor join) it is decent, once I include that its a nightmare. Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/ Share on other sites More sharing options...
Mchl Posted March 8, 2010 Share Posted March 8, 2010 Try creating indexes covering both cturmanufaer and division_dist columns ALTER TABLE tableName ADD INDEX cturmanufaer_division_dist(cturmanufaer,division_dist ); Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023189 Share on other sites More sharing options...
gizmola Posted March 8, 2010 Share Posted March 8, 2010 The only way you limit down the size of result set is to have where clauses that use indexes. If you want to join a bunch of tables that have thousands of rows and produce result sets of thousands of rows, things are going to be slow. In your explain plan, you can see that: 1 SIMPLE p range style_name style_name 27 NULL 13607 Using where; Using temporary; Using filesort Which is making a temporary table with 13607 rows, which it then sorts. Does the style_name column have an index on it? Even with an index it may not use it if this is a low cardinality column (only a few style_names). Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023213 Share on other sites More sharing options...
ajlisowski Posted March 8, 2010 Author Share Posted March 8, 2010 I put an index on style_name, and it is a pretty high cardinality (6804, so about a 1:2) However it still does not seem to be using it. Putting an index that is both manufactuer and dist worked. In fact I went back and made that combination the primary key. I also attemepted to make a foreign key to those fields, but I am not sure if that helped. The query time is drastically lower with those two fields. I still can not figure out why style_name is not using its index though. Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023236 Share on other sites More sharing options...
gizmola Posted March 8, 2010 Share Posted March 8, 2010 I'm not sure we're on the same page in regards to cardinality. When you do: select style_name, count(*) as countof FROM lunapr_c1price GROUP BY countof What do you get? Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023240 Share on other sites More sharing options...
ajlisowski Posted March 8, 2010 Author Share Posted March 8, 2010 I get an error..."Can't group on 'countof'" In case it helps, there are 7,935 distinct values of "style_name" and there are a total of 13,608 distinct rows in the table. if I run select style_name, count(*) as countof FROM lunapr_c1price GROUP BY style_name I get 7,935 results, most of which have a countof value of 1, some have more then that. One had 10...but a vast majority had 1. Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023243 Share on other sites More sharing options...
gizmola Posted March 8, 2010 Share Posted March 8, 2010 Yeah I air balled on the group by, but you figured it out. Ok, so what I would say is that there are so many values that are NOT EQUAL to the one you want, that mysql has decided to dump out the entire table. So at any rate we're back to the simple fact that you have a base result that is returning a lot of rows (select count(*) from lunapr_c1price where sytle_name != 'Alaska Sg Maple2.5'. This is being used to join to all the other tables, so there's a lot of rows being examined, unless I'm missing something. When you do an explain on just that simple select statement, what do you get? Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023264 Share on other sites More sharing options...
ajlisowski Posted March 9, 2010 Author Share Posted March 9, 2010 id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE lunapr_c1price range style_name style_name 27 NULL 13607 Using where; Using index Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023620 Share on other sites More sharing options...
gizmola Posted March 9, 2010 Share Posted March 9, 2010 So --- as you can see, you have a result set of 13607 rows being joined N times. This isn't going to be fast. You may be able to get it to stop using the temporary table and filesort by hinting the use of the appropriate index and see if that significantly improves the performance: http://dev.mysql.com/doc/refman/4.1/en/index-hints.html Quote Link to comment https://forums.phpfreaks.com/topic/194529-mysql-query-optimization/#findComment-1023725 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.