lostmouse Posted October 26, 2006 Share Posted October 26, 2006 I cannot seem to properly optimize this statement used to show topics across all forums that have been recently replied to:SELECT t.topic_id, t.topic_title, t.topic_sticky, t.topic_views, t.num_posts, f.forum_icon, f.forum_name, p.poster_name, p.post_timeFROM minibb_topics AS tLEFT JOIN minibb_forums AS f ON t.forum_id = f.forum_idINNER JOIN minibb_posts AS p ON p.post_id = t.topic_last_post_idWHERE 5 >= forum_level_to_view && ( forum_site =0 || forum_site =1 )ORDER BY t.topic_sticky DESC , t.topic_last_post_id DESCLIMIT 20I have isolated the issue to the ORDER BY clause. If I remove the ORDER BY clause, the query takes 1/10th of the time to execute. Both ORDER BY fields are indexed. Also, if I limit the query to one forum such as, "WHERE f.forum_id = 1", the statement executes in 1/10th of the time.This is the explain:id select_type table type possible_keys key key_len ref rows Extra1 SIMPLE f ALL PRIMARY,forum_level_to_view,forum_site NULL NULL NULL 12 Using where; Using temporary; Using filesort1 SIMPLE t ref forum_id,topic_last_post_id forum_id 2 aanime_aanimedb.f.forum_id 651 1 SIMPLE p eq_ref PRIMARY PRIMARY 4 aanime_aanimedb.t.topic_last_post_id 1 This is the table structure:-- -- Table structure for table 'minibb_forums'-- CREATE TABLE minibb_forums ( forum_id smallint(6) NOT NULL auto_increment, forum_name varchar(150) NOT NULL default '', forum_desc varchar(255) NOT NULL default '', forum_order tinyint(2) NOT NULL default '0', forum_icon varchar(255) NOT NULL default 'default.gif', forum_group varchar(30) NOT NULL default '', forum_site tinyint(1) NOT NULL default '0', forum_level_to_view tinyint(2) unsigned NOT NULL default '0', forum_level_to_post tinyint(2) NOT NULL default '0', forum_level_to_topic tinyint(2) NOT NULL default '0', num_topics mediumint(8) unsigned NOT NULL default '0', num_posts mediumint(8) unsigned NOT NULL default '0', is_spam tinyint(1) NOT NULL default '0', PRIMARY KEY (forum_id), KEY forum_order (forum_order), KEY forum_level_to_topic (forum_level_to_topic), KEY forum_level_to_view (forum_level_to_view), KEY forum_site (forum_site)) ENGINE=MyISAM DEFAULT CHARSET=latin1;-- ---------------------------------------------------------- -- Table structure for table 'minibb_posts'-- CREATE TABLE minibb_posts ( post_id int(11) NOT NULL auto_increment, topic_id int(10) NOT NULL default '1', poster_name varchar(16) NOT NULL default 'Anonymous', post_text text NOT NULL, post_time datetime NOT NULL default '0000-00-00 00:00:00', poster_ip varchar(15) NOT NULL default '', post_status tinyint(1) NOT NULL default '0', PRIMARY KEY (post_id), KEY topic_id (topic_id), KEY poster_name (poster_name)) ENGINE=MyISAM DEFAULT CHARSET=latin1;-- ---------------------------------------------------------- -- Table structure for table 'minibb_topics'-- CREATE TABLE minibb_topics ( topic_id int(10) NOT NULL auto_increment, topic_title varchar(100) NOT NULL default '', topic_sticky tinyint(1) NOT NULL default '0', topic_poster_name varchar(16) NOT NULL default 'Anonymous', topic_time datetime NOT NULL default '0000-00-00 00:00:00', topic_views mediumint(9) NOT NULL default '0', forum_id smallint(6) NOT NULL default '1', topic_status tinyint(1) NOT NULL default '0', topic_last_post_id int(10) NOT NULL default '1', num_posts mediumint(8) unsigned NOT NULL default '0', PRIMARY KEY (topic_id), KEY forum_id (forum_id), KEY topic_poster_name (topic_poster_name), KEY topic_sticky (topic_sticky), KEY topic_last_post_id (topic_last_post_id)) ENGINE=MyISAM DEFAULT CHARSET=latin1;Does anyone have any ideas about this?Thanks. Quote Link to comment Share on other sites More sharing options...
btherl Posted October 27, 2006 Share Posted October 27, 2006 Without an ORDER BY, the database can just fetch any 20 results and give them to you. But once you order the results, the database has to fetch ALL results, sort them, then give you the first 20. That's probably what's slowing you down.To fix it, you can try reducing the amount of data it needs to order. Instead of ordering by sticky, you can do a seperate query or subquery (using an index) to fetch all the sticky topics. That should be fast. But the volume of results for the rest could be quite high. One way to help is to partition your database, such as having one table for each month worth of topics. Then you query the first month, see if you have enough topics, if not then you query the previous month, and so on until you get 20 topic worth. Sticky topics would need their own table too, since stickiness is the most important criterion.Another option is to cache results. Caching per-forum may be a big win, especially if some forums are not updated often but others are. Then again, it's the most frequently updated forums that will be the largest.Maybe someone else has some other ideas? I use postgresql usually, but a lot of the same principles apply. Quote Link to comment Share on other sites More sharing options...
lostmouse Posted October 27, 2006 Author Share Posted October 27, 2006 Thanks for the reply.It's a creative idea, but if I start restructuring my database through separate tables, I will have to rewrite a lot of my existing sql statements. Also, it is indeed the most active forums that are shown.Is the size the reason why when I add "WHERE f.forum_id = 1" the statement executes quickly? Quote Link to comment Share on other sites More sharing options...
fenway Posted October 28, 2006 Share Posted October 28, 2006 I would say the real problem is that no index will be utilized if you use DESC, since the index is stored ASC. Yeah, it's dump that MySQL doesn't handle this automatically, but it doesn't. Quote Link to comment Share on other sites More sharing options...
lostmouse Posted October 28, 2006 Author Share Posted October 28, 2006 I've tried ordering by both, but there is no speed increase if I use ASC rather than DESC. Quote Link to comment Share on other sites More sharing options...
fenway Posted October 29, 2006 Share Posted October 29, 2006 Interesting... your primary table isn't using any index... and should be using "forum_level_to_view". Quote Link to comment Share on other sites More sharing options...
lostmouse Posted October 30, 2006 Author Share Posted October 30, 2006 That does seem to be the case.I know there are times when mysql will do a filesort when there are very few rows, but in this case using an index is necessary and I cannot force it to use an index. I feel like I need to restructure something, but I don't know where to start with this. Quote Link to comment Share on other sites More sharing options...
shoz Posted October 30, 2006 Share Posted October 30, 2006 First, what is the speed difference? 0.01 seconds compared to 0.1 seconds?[quote author=fenway link=topic=112852.msg459098#msg459098 date=1162051461]I would say the real problem is that no index will be utilized if you use DESC, since the index is stored ASC. Yeah, it's dump that MySQL doesn't handle this automatically, but it doesn't.[/quote]MYSQL apparently reads the index in reverse when you use DESC. Perhaps a multi-column index on (topic_sticky, topic_last_post_id) may be useful, although I'm not sure if it'll be used in the current query.[quote=mysql.com]To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable key (for example, ORDER BY key_part1, key_part2). If all key parts are followed by DESC, the key is read in reverse order. See Section 7.2.12, “ORDER BY Optimization”.[/quote]http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html[quote author=fenway link=topic=112852.msg459627#msg459627 date=1162149894]Interesting... your primary table isn't using any index... and should be using "forum_level_to_view".[/quote]I also think it should be using the index. Try running "ANALYZE TABLE minibb_forums" and see if that helps MYSQL use the index. Actually, try running "ANALYZE TABLE" on each table.http://dev.mysql.com/doc/refman/5.1/en/analyze-table.html[quote author=lostmouse link=topic=112852.msg459744#msg459744 date=1162169227]That does seem to be the case.I know there are times when mysql will do a filesort when there are very few rows, but in this case using an index is necessary and I cannot force it to use an index. I feel like I need to restructure something, but I don't know where to start with this.[/quote]You can force it but I wouldn't do that just yet or did you mean "FORCE INDEX" doesn't help?[code]SELECT ... FROM tablename FORCE INDEX(index_name) INNER JOIN ...[/code]Btw, to be clear. Is the purpose of the query to pull the 20 most recent "posts" and what version of MYSQL are you using?EDIT: Also, how is forum_level_view used. Is it possible for a topic be related to a forum_id with forum_level_view greater than 5? Quote Link to comment Share on other sites More sharing options...
fenway Posted October 30, 2006 Share Posted October 30, 2006 My bad about the DESC thing.. I was confusing that when you mix ASC and DESC, and I guess that's not MySQL's fault. Maybe it's not optimizing "5 >= forum_level_to_view" to "forum_level_to_view <= 5", and therefore not using the index? A force shouldn't be necessary here. As for the order by, there's no reason why you couldn't use a covering index, but it would be tricky with that OR in there. Quote Link to comment Share on other sites More sharing options...
lostmouse Posted October 30, 2006 Author Share Posted October 30, 2006 Mysql version 4.1.10The purpose of the query is to pull the 20 most recently replied to topics (in other words, the 20 most recently active topics). Any sticky topics have priority.forum_level_to_view is used to allow only certain members to view certain forums (for example, the administration forum has a permission "forum_level_to_view" of 4). It makes no difference in speed whether it's left in or out of the query. At this point, there is no forum with forum_level_to_view greater than 5 as there are no permission levels greater than 5.I've analyzed all of the tables, there is no improvement in this query's execution speed.I made a mistake about the improvement in speed difference when the ORDER BY clause is removed. With the clause in, the query takes about 0.2 - 0.4 seconds. Removed it takes 0.001 to 0.002 seconds.I tried using FORCE INDEX again with a very reasonable improvement in speed to 0.02 to 0.04 seconds. The query now looks like this:[code]SELECT t.topic_id, t.topic_title, t.topic_sticky, t.topic_views, t.num_posts, f.forum_icon, f.forum_name, p.poster_name, p.post_time FROM minibb_topics AS t FORCE INDEX(topic_last_post_id, topic_sticky) LEFT JOIN minibb_forums AS f on t.forum_id=f.forum_id INNER JOIN minibb_posts as p ON p.post_id = t.topic_last_post_id WHERE 5 >= forum_level_to_view && (forum_site = 0 || forum_site = 1) ORDER BY t.topic_sticky DESC, t.topic_last_post_id DESC LIMIT 20[/code]And the explain looks like this:[code]1 SIMPLE t ALL topic_last_post_id NULL NULL NULL 7819 Using filesort1 SIMPLE f eq_ref PRIMARY,forum_level_to_view,forum_site PRIMARY 2 aanime_aanimedb.t.forum_id 1 Using where1 SIMPLE p eq_ref PRIMARY PRIMARY 4 aanime_aanimedb.t.topic_last_post_id 1 [/code]From the look of the explain, it seems the query could still further be optimized. Still, a x10 improvement in execution speed is quite good.Note that if I remove either of the ORDER BY conditions, the speed is improved to about 0.0015 seconds. It does not matter if topic_sticky or topic_last_post_id is removed. Quote Link to comment Share on other sites More sharing options...
shoz Posted October 31, 2006 Share Posted October 31, 2006 Create the following multi-column index.[code]ALTER TABLE topics ADD INDEX(forum_id, topic_sticky, topic_last_post_id)[/code]Remove the FORCE INDEX from the query and run the query again. This index should be used if the "forums" table is chosen as the first table in the JOIN.If MYSQL chooses the topics table as the first table then make the index on (topic_sticky, topic_last_post_id) instead and you can DROP the previous INDEX.If you're still not getting any better results try the JOINs without the "posts" table and post if there is a significant difference.Looking back at the previous EXPLAIN, the forum_level_view index may not have been used because as you said all the forums fall under that number and even if they didn't the table is so small that a table scan is estimated as being faster.EDIT: You may also want to test with "SELECT STRAIGHT_JOIN * FROM ..." and keep both indexes mentioned earlier on the table to see what MYSQL does when you try different queries. Quote Link to comment Share on other sites More sharing options...
lostmouse Posted November 2, 2006 Author Share Posted November 2, 2006 Adding the composite index had no effect. It did not seem to matter how I structured the query or what composite indexes I had on the topics table.Removing the post table had a significant difference reducing the execution time to 1/10th of the original time (0.020). If I added the "STRAIGHT_JOIN" keyword instead, it had a very similar effect on the execution time (about 1/5th to 1/10th).This is the EXPLAIN for the straight_join:[code]1 SIMPLE t ALL forum_id,topic_last_post_id NULL NULL NULL 7823 Using filesort1 SIMPLE f eq_ref PRIMARY,forum_level_to_view,forum_site PRIMARY 2 aanime_aanimedb.t.forum_id 1 Using where1 SIMPLE p eq_ref PRIMARY PRIMARY 4 aanime_aanimedb.t.topic_last_post_id 1 [/code]The "forum" table is still being selected first. Quote Link to comment Share on other sites More sharing options...
shoz Posted November 3, 2006 Share Posted November 3, 2006 [quote author=lostmouse link=topic=112852.msg462050#msg462050 date=1162508690]This is the EXPLAIN for the straight_join:[code]1 SIMPLE t ALL forum_id,topic_last_post_id NULL NULL NULL 7823 Using filesort1 SIMPLE f eq_ref PRIMARY,forum_level_to_view,forum_site PRIMARY 2 aanime_aanimedb.t.forum_id 1 Using where1 SIMPLE p eq_ref PRIMARY PRIMARY 4 aanime_aanimedb.t.topic_last_post_id 1 [/code]The "forum" table is still being selected first.[/quote]The EXPLAIN shows the "topics" table as the first table.I can't say that you'll be able to see the difference using this query with the current size of the DB/topics table, but this should be faster and hold up as the table size grows. I've used a large test table and made a similar query which gives good results. You may want to consider setting up a large topics table of your own to query.Requires MYSQL >= 4.1.[code]SELECTt1.*, p.poster_name, p.post_time FROM( SELECT STRAIGHT JOIN t.topic_id, t.topic_title, t.topic_sticky, t.topic_views, t.num_posts, t.topic_last_post_id, f.forum_icon, f.forum_name FROM minibb_topics AS t INNER JOIN minibb_forums AS f ON t.forum_id=f.forum_id WHERE 5 >= forum_level_to_view && (forum_site = 0 || forum_site = 1) ORDER BY t.topic_sticky DESC, t.topic_last_post_id DESC LIMIT 20) AS t1INNER JOINminibb_posts AS pONp.post_id = t1.topic_last_post_id [/code]You should still have the multi-column index on (topic_sticky, topic_last_post_id). If it's still not being used then the query will most likely be slow. Assuming you don't think that 0.02 seconds is a slow query but are only concerned about the difference between ordering and not ordering the results, then revisting the query if or when it becomes slow is an option that you should consider. Quote Link to comment Share on other sites More sharing options...
lostmouse Posted November 3, 2006 Author Share Posted November 3, 2006 Wow, this query works very well! The execution speed is actually more around 0.002 for my server which is more than acceptable.Thank you so much for the help! Quote Link to comment Share on other sites More sharing options...
shoz Posted November 3, 2006 Share Posted November 3, 2006 There are 2 things I should mention. First, after thinking about what you're after note that the query will show sticky topics even if they have not been recently replied to. Meaning that if there are 20 or more sticky topics in total across all forums, then only those will be shown.If what you'd like is to have the 20 most recently replied to topics shown and then if any sticky topics are a part of that 20 that they show first, then you should change the query to something similar to the following.[code]SELECTt1.*, p.poster_name, p.post_timeFROM( SELECT STRAIGHT JOIN t.topic_id, t.topic_title, t.topic_sticky, t.topic_views, t.num_posts, t.topic_last_post_id, f.forum_icon, f.forum_name FROM minibb_topics AS t INNER JOIN minibb_forums AS f ON t.forum_id=f.forum_id WHERE 5 >= forum_level_to_view && (forum_site = 0 || forum_site = 1) ORDER BY t.topic_last_post_id DESC LIMIT 20) AS t1INNER JOINminibb_posts AS pONp.post_id = t1.topic_last_post_idORDER BYt1.topic_sticky DESC[/code]Note the ordering inside the subquery and then the ordering after the JOIN.Secondly, I've noticed that the OPTIMIZER in MYSQL 5 won't use the index on the ORDER BY column even after using STRAIGHT_JOIN. You should be able to have MYSQL use the index by changing the structure to the following.[code]SELECTt1.*, f.*, p.*FROM( SELECT * FROM topics WHERE forum_id IN (SELECT forum_id FROM forums WHERE 5 >= ....) ORDER BY topic_last.... LIMIT 20) AS t1INNER JOINforumsON forum_id = forum_idINNER JOINpostsON post_id = post_idORDER BYsticky .....[/code]The query should still be quick and it can be used in both versions. Note that an individual index should be put on the topic_last... column to accommodate the new ordering and that the multi-column index on (sticky , topic_last) can serve as an index on the "sticky" column. I'd guess that you'd use the multi-column index on sticky and topic_last when selecting topics from individual forums. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.