Jump to content

Performance Advice (Random Record on Large Dataset)


drbigfresh

Recommended Posts

I have a large database, 5.5 Million + rows, and I am trying to select a few thousand random records every 10 minutes or so. I was having a TON of performance issues, so I added a field to the database called randomnum, and populated it with a random number.... so now I do the select and ordering it by the randomnum. Although it is still a performace boost, my queries are taking upwards of 30 seconds. Does anyone have any suggestions for optimizing these types of queries? (And yes, the field is indexed).

 

When I run the query with and explain I get:

1  SIMPLE  wedding_user  range  Active  Active  1  NULL  3443380  Using where; Using filesort

 

If I remove the "where active='Y'" the query does have a nice speed bump.

 

 

Link to comment
Share on other sites

I did actually see that, and it is working nicely for me. I think the real performance hit I am taking is in the where clause and the subquery. My query is like this:

 

select  id,fname,lname,address1,terms from site_user where id not in(select userid from send_log where campaignid=4)  and active='Y'  limit 5000

 

if I remove [where id not in(select userid from send_log where campaignid=4)  and active=Y] it flies along pretty nice.... What can I do to make this part run better? Someone suggested trying a union....

Link to comment
Share on other sites

id

select_type

table

 

type

possible_keys

key

key_len

ref

rows

Extra

 

 

   

   

1

   

PRIMARY

 

   

site_user

   

ref

   

Active

   

Active

   

1

   

const

 

   

3443380

   

Using where

           

   

   

2

   

DEPENDENT SUBQUERY

   

send_log

 

   

index_subquery

   

campaignid,userid

   

userid

   

5

   

func

   

4

 

   

Using where

 

Link to comment
Share on other sites

That's what I thought.  Even though that subquery really shouldn't be dependent on the outer one (since the where clause checks an inner table field, correct?), that NOT IN means it has to run that every time, AFAIK.  That's probably why it's so slow.  You'd be better off with a LEFT JOIN.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.