drbigfresh Posted March 12, 2008 Share Posted March 12, 2008 I have a large database, 5.5 Million + rows, and I am trying to select a few thousand random records every 10 minutes or so. I was having a TON of performance issues, so I added a field to the database called randomnum, and populated it with a random number.... so now I do the select and ordering it by the randomnum. Although it is still a performace boost, my queries are taking upwards of 30 seconds. Does anyone have any suggestions for optimizing these types of queries? (And yes, the field is indexed). When I run the query with and explain I get: 1 SIMPLE wedding_user range Active Active 1 NULL 3443380 Using where; Using filesort If I remove the "where active='Y'" the query does have a nice speed bump. Quote Link to comment Share on other sites More sharing options...
effigy Posted March 12, 2008 Share Posted March 12, 2008 Have you tried this? Quote Link to comment Share on other sites More sharing options...
drbigfresh Posted March 12, 2008 Author Share Posted March 12, 2008 I did actually see that, and it is working nicely for me. I think the real performance hit I am taking is in the where clause and the subquery. My query is like this: select id,fname,lname,address1,terms from site_user where id not in(select userid from send_log where campaignid=4) and active='Y' limit 5000 if I remove [where id not in(select userid from send_log where campaignid=4) and active=Y] it flies along pretty nice.... What can I do to make this part run better? Someone suggested trying a union.... Quote Link to comment Share on other sites More sharing options...
fenway Posted March 12, 2008 Share Posted March 12, 2008 Can I see the EXPLAIN for that query/ Quote Link to comment Share on other sites More sharing options...
drbigfresh Posted March 12, 2008 Author Share Posted March 12, 2008 id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY site_user ref Active Active 1 const 3443380 Using where 2 DEPENDENT SUBQUERY send_log index_subquery campaignid,userid userid 5 func 4 Using where Quote Link to comment Share on other sites More sharing options...
fenway Posted March 12, 2008 Share Posted March 12, 2008 That's what I thought. Even though that subquery really shouldn't be dependent on the outer one (since the where clause checks an inner table field, correct?), that NOT IN means it has to run that every time, AFAIK. That's probably why it's so slow. You'd be better off with a LEFT JOIN. Quote Link to comment Share on other sites More sharing options...
drbigfresh Posted March 12, 2008 Author Share Posted March 12, 2008 You're correct, the where is checking an inner-table field on site_user. If I remove the where all together, and just have the not in, would a left join still see a performance boost? Quote Link to comment Share on other sites More sharing options...
fenway Posted March 13, 2008 Share Posted March 13, 2008 NOT IN() is just plain bad... no index usage. But with a left join, you won't need either. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.