Jump to content

SELECT ... NOT IN (...) issues


x55

Recommended Posts

Hey guys how's it going? I'm having trouble with speed issues, this is taking 1+ hours to complete. Here's my query:

 

SELECT DISTINCT bookmurls.url, subject FROM bookmurls WHERE bookmurls.url NOT IN (SELECT bookmused.url FROM bookmused);

 

 

i'm running "mysql 4.1.22-standard"

 

table structures:

 

CREATE TABLE `bookmused` (
  `url` varchar(255) NOT NULL default '',
  `subject` varchar(255) NOT NULL default ''
) ENGINE=MyISAM DEFAULT CHARSET=latin1


CREATE TABLE `bookmurls` (
  `url` varchar(255) NOT NULL default '',
  `subject` varchar(255) NOT NULL default ''
) ENGINE=MyISAM DEFAULT CHARSET=latin1

 

 

 

my bookmused table has 80k rows, and will grow by 1-2k every day. my bookmurls will always be around 36k, i've also tried:

 

SELECT DISTINCT bookmurls AS a LEFT JOIN bookmused AS c USING (url) where c.url IS NULL

 

taking cue from http://www.phpfreaks.com/forums/index.php/topic,234132.0.html

 

havnt seen results from that to see if the queries correct or not, but thats been running for well over an hour also... so that doesnt seem to fit my needs  :-\

 

Could this be my server specs or is there a way to speed this up?

 

 

 

Link to comment
Share on other sites

this is what I get with EXPLAIN+original query

 

"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","PRIMARY","bookmurls","ALL",NULL,NULL,NULL,NULL,"36106","Using where; Using temporary"
"2","DEPENDENT SUBQUERY","bookmused","ALL",NULL,NULL,NULL,NULL,"88335","Using where"

 

Link to comment
Share on other sites

Hi

 

Suspect MySQL will not appreciate trying to use a varchar 255 as a key field.

 

I would suggest putting the url var char into a seperate table keyed with a unique numeric, and then use that unique numeric instead of the full url in both of your abobe tables.

 

At the moment at best you are doing a join on a very long variable.

 

All the best

 

Keith

Link to comment
Share on other sites

That's not possible... show the explain output again.

 

"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","PRIMARY","bookmurls","index",NULL,"PRIMARY","514",NULL,"35910","Using where; Using index; Using temporary"
"2","DEPENDENT SUBQUERY","bookmused","index",NULL,"PRIMARY","314",NULL,"136841","Using where; Using index"

 

Query: 13522.79935 Seconds, my current table structure:

 

"Table","Create Table"
"bookmurls","CREATE TABLE `bookmurls` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `url` varchar(255) NOT NULL default '',
  `subject` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`id`,`url`)
) ENGINE=MyISAM AUTO_INCREMENT=35911 DEFAULT CHARSET=latin1"

"Table","Create Table"
"bookmused","CREATE TABLE `bookmused` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `url` varchar(255) NOT NULL default '',
  `subject` varchar(55) NOT NULL default '',
  PRIMARY KEY  (`id`,`url`)
) ENGINE=MyISAM AUTO_INCREMENT=136843 DEFAULT CHARSET=latin1"

 

Hi

 

Suspect MySQL will not appreciate trying to use a varchar 255 as a key field.

 

I would suggest putting the url var char into a seperate table keyed with a unique numeric, and then use that unique numeric instead of the full url in both of your abobe tables.

 

At the moment at best you are doing a join on a very long variable.

 

All the best

 

Keith

 

thanks, I wish I could do that, but bookmurls is truncated and refilled everytime its ran, so i'm not seeing a way to track that.

 

I may try to grab highest id#, insert where not exists,select id > grabbed id... not sure if that "insert where not exists" will speed things up but im hoping its better than this 3 1/2 hour query

Link to comment
Share on other sites

  • 2 weeks later...

Crikey, i'm surprised MySQL didn't complain about the auto_increment id being part of a composite key. Didn't realise you could do that...

You might want a unique index on just the URL field.

 

You also might want to consider modifying your query to something more like this:

 

SELECT DISTINCT bookmurls.url, subject FROM bookmurls WHERE bookmurls.url NOT IN (SELECT bookmused.url FROM bookmused);

SELECT url.url
      ,url.subject
FROM bookmurls url
LEFT JOIN bookmused used ON url.url = used.url
WHERE used.url IS NULL
GROUP BY url.url, url.subject

Link to comment
Share on other sites

Crikey, i'm surprised MySQL didn't complain about the auto_increment id being part of a composite key. Didn't realise you could do that...

No issues with this... it's actually helpful sometimes (though rarely).

 

I simply meant for index usage -- and I think I never saw the explain for the LEFT JOINed version.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.