roopurt18 Posted May 22, 2007 Share Posted May 22, 2007 I've got this query that reports details on a group of users within our system I'd like to optimize. On average, the query takes 3.5 - 4.25s to run. SELECT you.user_wvid AS UserID, LOWER(you.login) AS Login, LOWER(you.password) AS `Password`, you.subname AS SubName, s.full_name AS FullName, @onsites:=SUM(IF(t.onsite=1, 1, 0)) AS HasOnSites, @offsites:=SUM(IF(t.onsite=0, 1, 0)) AS HasOffSites, CASE WHEN @onsites > 0 AND @offsites > 0 THEN 'both' WHEN @onsites > 0 AND @offsites = 0 THEN 'onsite' WHEN @offsites > 0 AND @onsites = 0 THEN 'offsite' END AS SubType, MAX(a.TimeRecord) AS LastActiveStamp, IFNULL(DATE_FORMAT(a.TimeRecord, '%a. %b. %D, %Y'), '-') AS LastActiveDisp FROM wv_user you CROSS JOIN wssubc s ON you.subname=s.subname CROSS JOIN SubProjectAccess pa ON you.subname=pa.SubName CROSS JOIN wstrades t ON pa.Trade=t.tcode LEFT JOIN UserActivity a ON you.login=a.UserLogin WHERE you.role_wvid=2 GROUP BY you.login ORDER BY you.login LIMIT 0,25 EXPLAIN gives... I'm trying to get rid of the using temporary in the EXPLAIN results but not having much luck. Any suggestions? Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/ Share on other sites More sharing options...
bubblegum.anarchy Posted May 22, 2007 Share Posted May 22, 2007 Do you have an index on wv_user.login and UserActivity.UserLogin ? Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259376 Share on other sites More sharing options...
roopurt18 Posted May 22, 2007 Author Share Posted May 22, 2007 wv_user UserActivity Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259428 Share on other sites More sharing options...
shoz Posted May 22, 2007 Share Posted May 22, 2007 Try putting a multi-column index on (role_wvid, login) and try the query again. Use an EXPLAIN and post the results. If MYSQL does not use the new index, try forcing it with "SELECT .... FROM wv_user AS you FORCE INDEX(indexname) CROSS JOIN ....". If the index is still not used then you can SELECT and LIMIT the rows from the wv_user table first in a subquery and then do the JOIN SELECT you.*, s.full_name .... etc etc FROM ( SELECT sub_you.user_wvid AS UserID, LOWER(sub_you.login) AS Login, LOWER(sub_you.password) AS `Password`, sub_you.subname AS SubName FROM wv_user AS sub_you WHERE sub_you.role_wvid = 2 ORDER BY sub_you.login LIMIT 0,25 ) AS you ....... CROSS JOIN ... CROSS JOIN .... ... GROUP BY you.login ORDER BY you.login Note that there is no second WHERE clause. The ORDER BY is repeated because it shouldn't be assumed that the JOIN will leave the result in the order created by the subquery. Could you also post each EXPLAIN after each attempt, even if one of the solution works. Btw, the following line should be. IFNULL(DATE_FORMAT(MAX(a.TimeRecord), '%a. %b. %D, %Y'), '-') AS LastActiveDisp I'd also recommend against using the variables and instead do. CASE WHEN SUM(IF(t.onsite=1, 1, 0)) > 0 AND SUM(IF(t.onsite=0, 1, 0)) > 0 THEN 'both' etc Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259453 Share on other sites More sharing options...
roopurt18 Posted May 23, 2007 Author Share Posted May 23, 2007 I'm not at work anymore so I'll have to try your suggestions tomorrow, but... MAX(a.TimeRecord) AS LastActiveStamp, IFNULL(DATE_FORMAT(a.TimeRecord, '%a. %b. %D, %Y'), '-') AS LastActiveDisp I can't be sure without the code in front of me, but I think the program itself needs the timestamp as well, which is why I'm returning two separate fields there. As for the variables, my original query wasn't using them and I had added them thinking it would probably be more optimal. Is there any reason why that's not the case? I'll report back on the other stuff some time tomorrow. Thanks for the responses. Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259465 Share on other sites More sharing options...
shoz Posted May 23, 2007 Share Posted May 23, 2007 I'm not at work anymore so I'll have to try your suggestions tomorrow, but... MAX(a.TimeRecord) AS LastActiveStamp, IFNULL(DATE_FORMAT(a.TimeRecord, '%a. %b. %D, %Y'), '-') AS LastActiveDisp I can't be sure without the code in front of me, but I think the program itself needs the timestamp as well, which is why I'm returning two separate fields there. I meant that the second line should have DATE_FORMAT(MAX(a.timeRecord), the MAX() is missing. As for the variables, my original query wasn't using them and I had added them thinking it would probably be more optimal. Is there any reason why that's not the case? My first reason for suggesting you not use the variables is The general rule is never to assign a value to a user variable in one part of a statement and use the same variable in some other part of the same statement. You might get the results you expect, but this is not guaranteed. I also believe that identical expressions are evaluated only once but I can't find the entry in the manual (assuming it exists) that corroborates that. Assume that I'm wrong for now. Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259472 Share on other sites More sharing options...
bubblegum.anarchy Posted May 23, 2007 Share Posted May 23, 2007 Have a look at reply # 7 from the following thread - something similar may be applicable. http://www.phpfreaks.com/forums/index.php/topic,141742.msg605260.html#new Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-259528 Share on other sites More sharing options...
roopurt18 Posted May 23, 2007 Author Share Posted May 23, 2007 I placed a multi-column index (role_wvid, login) on wv_user and the using temporary; using filesort has been eliminated. Would someone be willing to offer a brief explanation or link to an explanation of why the multi-column index made a difference in this case? I'm still a bit of a noob when it comes to DB optimization. Also, if the LIMIT is left off the query or if the second part of the limit extends the actual number of records (LIMIT 100,25 on a table with 102 records), I notice that using temporary; using filesort makes its way back into EXPLAIN. What would be the cause of this? Would it be more efficient to precede this query with another that counts the number of records in wv_user and stops the LIMIT from extending beyond the actual number of records? Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-260263 Share on other sites More sharing options...
roopurt18 Posted May 23, 2007 Author Share Posted May 23, 2007 I've now removed the multi-column index I created earlier on wv_user and restructured the query: SELECT u.user_wvid AS UserID, LOWER(u.login) AS Login, LOWER(u.password) AS `Password`, u.subname AS SubName, s.full_name AS FullName, SUM(IF(t.onsite=1, 1, 0)) AS HasOnSites, SUM(IF(t.onsite=0, 1, 0)) AS HasOffSites, CASE WHEN SUM(IF(t.onsite=1, 1, 0)) > 0 AND SUM(IF(t.onsite=0, 1, 0)) > 0 THEN 'both' WHEN SUM(IF(t.onsite=1, 1, 0)) > 0 AND SUM(IF(t.onsite=0, 1, 0)) = 0 THEN 'onsite' WHEN SUM(IF(t.onsite=0, 1, 0)) > 0 AND SUM(IF(t.onsite=1, 1, 0)) = 0 THEN 'offsite' END AS SubType, IFNULL( ( SELECT a.TimeRecord AS LastActiveStamp FROM UserActivity a WHERE a.UserLogin=u.login ORDER BY a.TimeRecord DESC LIMIT 1 ), '-') AS LastActiveStamp, IFNULL( DATE_FORMAT( ( SELECT a.TimeRecord AS LastActiveStamp FROM UserActivity a WHERE a.UserLogin=u.login ORDER BY a.TimeRecord DESC LIMIT 1 ), '%a. %b. %D, %Y' ), '-') AS LastActiveDisp FROM wv_user u CROSS JOIN wssubc s ON u.subname=s.subname CROSS JOIN SubProjectAccess pa ON u.subname=pa.SubName CROSS JOIN wstrades t ON pa.Trade=t.tcode WHERE u.role_wvid=2 GROUP BY u.login ORDERY BY _xyz_ LIMIT 0,25 I've structured the query in this manner because there is an external parameter that determines the ORDER BY, which can be any of the fields returned by the query (UserID, Login, LastActiveStamp, etc). Using any of the fields with ORDER BY will cause using temporary to show up in EXPLAIN, but it's consistently faster than 1s so I'm happy with the results for now. Also, I had forgotten about using sub-queries since our last host was MySQL 4.0.x, thanks for reminding me! Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-260314 Share on other sites More sharing options...
shoz Posted May 23, 2007 Share Posted May 23, 2007 I placed a multi-column index (role_wvid, login) on wv_user and the using temporary; using filesort has been eliminated. Would someone be willing to offer a brief explanation or link to an explanation of why the multi-column index made a difference in this case? I'm still a bit of a noob when it comes to DB optimization. These are the relevant parts of the query that make the multi-column index useful. WHERE you.role_wvid=2 ORDER BY you.login LIMIT 0,25 The basic explanation is that without the multi-column index MYSQL had to retrieve ALL the rows from the disk that had a role_wvid with a value of 2 and then ORDER/sort and LIMIT the results using a temporary table. With an index which can be thought of as being similar to the following role_wvid login ------------------------- 1 john 2 bill 2 jim 2 suzan 3 betty 3 mark you'll notice that MYSQL will be able to lookup the rows with a lot_wvid value of 2 and retrieve the rows in the order of the "login" without much effort using only the index. This also allows MYSQL to retrieve only the first 25 rows in the table that match. The EXPLAIN thinks that it will have to look at 101 rows but when the actual query is run it doesn't(shouldn't) because of the LIMIT. Many times I've put a multi-column index on the ORDER BY column before the WHERE clause column to achieve similar results as well. Keep in mind that this is my "understanding" of what happens and shouldn't be thought of as an authoritative answer. Also, if the LIMIT is left off the query or if the second part of the limit extends the actual number of records (LIMIT 100,25 on a table with 102 records), I notice that using temporary; using filesort makes its way back into EXPLAIN. What would be the cause of this? I don't know why that occurs. It could be a bug that doesn't allow MYSQL to recognize that it can use the same plan to execute the query when the LIMIT extends beyond the number of records in the table. If the EXPLAIN also shows that it doesn't use the multi-column index, then a FORCE INDEX will probably help. Would it be more efficient to precede this query with another that counts the number of records in wv_user and stops the LIMIT from extending beyond the actual number of records? Usually you'd be using the LIMIT to create some kind of pagination for the results. Generally you'd retrieve the COUNT before hand to determine whether or not a "next" button should be displayed. Even if that's not the case retrieving the COUNT of the records in the wv_user table with a role_wvid of the specified value should be relatively quick. In regard to your post about restructuring the query. Another approach in light of the variable ORDER BY column may be to create specific queries based on which column is used in the ORDER BY. SELECTing and LIMITing the results in a subquery using the relevant table. I haven't worked it out in my mind completely so it may or may not be practical in this instance. You should find the following site helpful btw Optimizing MYSQL Queries Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-260335 Share on other sites More sharing options...
roopurt18 Posted May 23, 2007 Author Share Posted May 23, 2007 I probably won't revisit this query for a while since I'm happy with the current results. The whole page used to take 7 or 8 seconds to load and now it's down to about 1.5s. Another approach in light of the variable ORDER BY column may be to create specific queries based on which column is used in the ORDER BY. SELECTing and LIMITing the results in a subquery using the relevant table. I'll keep this in the back of my mind in case I do have to come back to it or face a similar situation elsewhere. Again, thanks to everyone for all their help. Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-260341 Share on other sites More sharing options...
shoz Posted May 23, 2007 Share Posted May 23, 2007 Another approach in light of the variable ORDER BY column may be to create specific queries based on which column is used in the ORDER BY. SELECTing and LIMITing the results in a subquery using the relevant table. I'll keep this in the back of my mind in case I do have to come back to it or face a similar situation elsewhere. After putting more thought into it I don't believe the suggestion is usable. Other ideas of JOINing the wv_user and activity table in a subquery may be better to keep in mind. Quote Link to comment https://forums.phpfreaks.com/topic/52532-solved-query-optimization/#findComment-260378 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.