shamoon Posted October 31, 2007

So I have a query... the MySQL slow log caught it and logged:

# Time: 071031 11:31:51
# User@Host: bookswim[bookswim] @ D3350.servadmin.com [12.47.45.169]
# Query_time: 33  Lock_time: 0  Rows_sent: 100  Rows_examined: 163196
use bsdb;
SELECT b.title, b.title_html, b.authors, b.description, b.book_id, b.cover_id,
       b.publication_date, b.avg_rating AS rating,
       IF(b.total_rentals > 0, b.copies_owned / b.total_rentals, 0) AS usage_ratio
FROM books b, books_to_categories AS btc
WHERE b.total_rentals > 0
  AND btc.node_id = '17'
  AND btc.book_id = b.book_id
GROUP BY b.book_id
ORDER BY usage_ratio DESC, copies_effective DESC, publication_date DESC
LIMIT 0, 100;

It looks like it has to examine 163K rows. The odd thing is what happens when I run EXPLAIN on it: the estimate is far lower.

id  select_type  table  type  possible_keys    key      key_len  ref               rows   Extra
1   SIMPLE       btc    ref   book_id,node_id  node_id  32       const             26234  Using where; Using temporary; Using filesort
1   SIMPLE       b      ref   PRIMARY,book_id  PRIMARY  22       bsdb.btc.book_id  1      Using where

Only 26K rows. Why the difference?
fenway Posted October 31, 2007

Have you run OPTIMIZE TABLE lately?
shamoon Posted October 31, 2007 (Author)

Every night.
fenway Posted October 31, 2007

Sorry, I meant ANALYZE... OPTIMIZE is the expensive one, and rarely needs to be run (especially if you're not deleting records). Also, quoting a note on how the estimated number of rows can be very inaccurate:

"It is well known that this number is approximate, but I guess not everyone knows how wrong it can be. If the number of rows is estimated by performing a btree dive on a well-structured tree, you can still get a row count that is off by 3-10 times. It gets even worse if cardinality is used: in that case, if the data distribution is skewed and your statement looks at a portion of the data with a different distribution, results can be off by 10-100 times or even more."
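For reference, a minimal sketch of the two statements being discussed (ANALYZE TABLE and OPTIMIZE TABLE are standard MySQL syntax; the table names are taken from the query above):

-- Refreshes index key-distribution statistics; relatively cheap
ANALYZE TABLE books, books_to_categories;

-- Rebuilds the table and indexes to reclaim space and defragment;
-- much more expensive, mainly useful after heavy deletes
OPTIMIZE TABLE books, books_to_categories;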
shamoon Posted October 31, 2007 (Author)

Every night as well. I run 'em both at odd hours just to keep things sane. We do A LOT of deleting as well, so I figure it's better to be safe than sorry.
aschk Posted November 1, 2007

I've looked at this, and you're not using an aggregate function, yet you ARE using GROUP BY. Why? Perhaps the query you are doing is not the one you want to be doing.
fenway Posted November 1, 2007

"I've looked at this, and you're not using an aggregate function, yet you ARE using GROUP BY. Why? Perhaps the query you are doing is not the one you want to be doing."

Interesting... I didn't even notice that.
aschk Posted November 1, 2007

By the looks of it, after further examination, you want the top 100 books where total_rentals > 0 and books_to_categories.node_id is 17; in other words, the top 100 books for a particular category. Also, I'm assuming you have the following relationship: books -> books_to_categories -> book_categories, i.e. one -> many -> one, or a normalised many-to-many relationship. My questions are these:

1) How many books do you have?
2) How many categories do you have?
3) Can one book be part of many categories (and vice versa)?
aschk Posted November 1, 2007

Also, have you got a composite PRIMARY KEY on (book_id, node_id) in the btc table? One category cannot have the same book in it twice, and one book cannot be in the same category twice, so the key is valid and unique.
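For illustration, a minimal sketch of that constraint (hypothetical DDL; the actual column types and any existing keys on books_to_categories are not shown in this thread):

-- Enforces one row per (book, category) pair;
-- fails if duplicate pairs already exist in the table
ALTER TABLE books_to_categories
  ADD PRIMARY KEY (book_id, node_id);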
shamoon Posted November 1, 2007 (Author)

"1) How many books do you have?"
Over 200K.

"2) How many categories do you have?"
Each category can be a child of other categories, so after all is said and done there are 15K categories.

"3) Can one book be part of many categories (and vice versa)?"
Yes. One book can be part of SEVERAL categories.

The reason for the GROUP BY, by the way, is to ensure that I get unique book_ids. Is there another way?
fenway Posted November 1, 2007

"The reason for the GROUP BY, by the way, is to ensure that I get unique book_ids. Is there another way?"

But you're asking for a given category... how can one book be in a single category twice?
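To make that point concrete, here is a hedged sketch of the original query with the GROUP BY dropped. It assumes (book_id, node_id) is unique in books_to_categories (the composite key suggested above), in which case the join produces at most one row per book for a single node_id, so neither GROUP BY nor DISTINCT is needed:

SELECT b.title, b.title_html, b.authors, b.description, b.book_id, b.cover_id,
       b.publication_date, b.avg_rating AS rating,
       IF(b.total_rentals > 0, b.copies_owned / b.total_rentals, 0) AS usage_ratio
FROM books AS b
JOIN books_to_categories AS btc ON btc.book_id = b.book_id
WHERE b.total_rentals > 0
  AND btc.node_id = '17'
ORDER BY usage_ratio DESC, copies_effective DESC, publication_date DESC
LIMIT 0, 100;

Dropping the GROUP BY also lets MySQL skip the temporary table and, potentially, part of the filesort work flagged in the EXPLAIN.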
aschk Posted November 1, 2007

So if you do the following, what do you get?

SELECT COUNT(book_id)
FROM books_to_categories
WHERE node_id = 17
UNION
SELECT COUNT(book_id)
FROM books_to_categories
WHERE node_id = 17
GROUP BY book_id;

Note how I have put 17 in as an INT and NOT as a '17' VARCHAR.
aschk Posted November 1, 2007

Then do an EXPLAIN on each of those SELECT parts of the query and post it here, along with the EXPLAIN from:

SELECT book_id, node_id
FROM books_to_categories
WHERE node_id = 17;

I'm curious to see how it's working with your indexes.
fenway Posted November 1, 2007

"Note how I have put 17 in as an INT and NOT as a '17' VARCHAR."

The quoting comes from good practice against injection... it makes no difference at all.
aschk Posted November 1, 2007

That depends what slack sort of settings you have in MySQL. He's obviously not got STRICT_TRANS_TABLES set in his SQL mode. Port his PHP over to another database system and you might find he starts having errors.

Hmm, I stand corrected; I've just tested that and it's not the case. I could have sworn it throws an error... You re-learn something new every day.
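For anyone curious, a quick illustration of why it makes no difference here (assuming node_id is an integer column; MySQL converts the string constant to a number for the comparison, so both forms match the same rows and can still use an index on node_id):

-- These return the same count when node_id is an INT column
SELECT COUNT(*) FROM books_to_categories WHERE node_id = 17;
SELECT COUNT(*) FROM books_to_categories WHERE node_id = '17';

The risky direction is the reverse one: comparing a string column against a numeric constant forces a conversion on every row and defeats the index.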
shamoon Posted November 2, 2007 (Author)

"So if you do the following, what do you get? SELECT COUNT(book_id) FROM books_to_categories WHERE node_id = 17 UNION ... GROUP BY book_id"

The EXPLAIN is:

id    select_type   table                type  possible_keys  key   key_len  ref   rows     Extra
1     PRIMARY       books_to_categories  ALL   node_id        NULL  NULL     NULL  4310306  Using where
2     UNION         books_to_categories  ALL   node_id        NULL  NULL     NULL  4310306  Using where; Using temporary; Using filesort
NULL  UNION RESULT  <union1,2>           ALL   NULL           NULL  NULL     NULL  NULL
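One thing worth checking in that output (a hedged suggestion, not something verified in this thread): both parts show type ALL with key NULL, i.e. full scans of 4.3M rows, even though the node_id index appears in possible_keys. A quick way to test whether the optimizer is simply declining the index:

-- FORCE INDEX is standard MySQL syntax; the index name is taken
-- from the possible_keys column in the EXPLAIN above
SELECT COUNT(book_id)
FROM books_to_categories FORCE INDEX (node_id)
WHERE node_id = 17;

If the forced version is much faster, the index statistics (or the index definition itself) are worth a closer look.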