Jump to content

[SOLVED] Need help optimizing join


binindex

Recommended Posts

I have 3 tables.  One is a collections table that gets searched.  It

contains collections of posts from another table.  Fulltext index on

the subject.  Second one is a lot bigger table that has the posts.

Then I have a third table that has each collection id matched with the

post id.  each collection id could have anywhere from 1-500 posts

connected to it.  In the posts table there is 2 columns with numbers.

I would like to do a fulltext search on the collections table and get

the results from it but also get the sum of the 2 number columns.

 

Here is the query I have come up with but it is very slow and very different each time on how long it takes.  I will also give the full create tables for the 3 tables.

 

If anyone can help me with this It would be much appreciated.

 

SELECT

`collections`.`id`,`collections`.`subject`,`collections`.`numfiles`,`collec­tions`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`da­te`,`collections`.`nfo`,sum(mainsubjects.num),sum(mainsubjects.`max`)

FROM temp.collections, temp.mainsubjects, collectionpartids WHERE

match (`collections`.`subject`) against ('dvdr' in

boolean mode) and collections.id = collectionpartids.cid and

collectionpartids.fid = mainsubjects.id group by collections.id order

by `collections`.`date` desc

 

CREATE TABLE `collections` (

  `id` mediumint(7) unsigned NOT NULL auto_increment,

  `subject` text NOT NULL,

  `filename` text NOT NULL,

  `numfiles` smallint(6) unsigned NOT NULL default '0',

  `groups` text NOT NULL,

  `partids` mediumtext NOT NULL,

  `size` bigint(20) unsigned NOT NULL default '0',

  `from` varchar(90) NOT NULL default '',

  `date` int(10) unsigned NOT NULL default '0',

  `group1` mediumint(8) unsigned NOT NULL,

  `group2` mediumint(8) unsigned default NULL,

  `group3` mediumint(8) unsigned default NULL,

  `group4` mediumint(8) unsigned default NULL,

  `group5` mediumint(8) unsigned default NULL,

  `group6` mediumint(8) unsigned default NULL,

  `group7` mediumint(8) unsigned default NULL,

  `group8` mediumint(8) unsigned default NULL,

  `group9` mediumint(8) unsigned default NULL,

  `group10` mediumint(8) unsigned default NULL,

  `nfo` int(10) unsigned NOT NULL default '0',

  PRIMARY KEY  (`id`),

  UNIQUE KEY `filename_U` (`filename`(255)),

  KEY `from` (`from`(50)),

  KEY `date` (`date`),

  KEY `groupindex` (`group1`,`group2`,`group3`,`group4`,`group5`,`group6`,`group7`,`group8`,`group9`,`group10`),

  KEY `Size` (`size`),

  KEY `groups` (`groups`(250)),

  KEY `nfo` (`nfo`),

  FULLTEXT KEY `Subject_full` (`subject`)

) ENGINE=MyISAM AUTO_INCREMENT=424366 DEFAULT CHARSET=latin1 MIN_ROWS=100000 MAX_ROWS=200000 AVG_ROW_LENGTH=650

 

CREATE TABLE `collectionpartids` (

  `cid` int(10) unsigned NOT NULL,

  `fid` int(10) unsigned NOT NULL,

  PRIMARY KEY  (`cid`,`fid`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1

 

CREATE TABLE `mainsubjects` (

  `id` mediumint(8) unsigned NOT NULL default '0',

  `subject` text NOT NULL,

  `totalbytes` int(10) unsigned NOT NULL,

  `date` int(10) unsigned NOT NULL default '0',

  `groups` text NOT NULL,

  `num` smallint(5) unsigned NOT NULL default '0',

  `max` smallint(5) unsigned NOT NULL default '0',

  `from` tinytext NOT NULL,

  PRIMARY KEY  (`id`),

  KEY `date` (`date`),

  KEY `numfiles` (`num`,`max`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=3000000 MAX_ROWS=10000000 AVG_ROW_LENGTH=195 PACK_KEYS=1

Link to comment
Share on other sites

It's kinda hard to read but here it is.

 

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

| id | select_type | table            | type    | possible_keys        | key

        | key_len | ref                        | rows | Extra

                    |

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

|  1 | SIMPLE      | collections      | fulltext | PRIMARY,Subject_full | Subje

ct_full | 0      |                            |    1 | Using where; Using tempo

rary; Using filesort |

|  1 | SIMPLE      | collectionpartids | ref      | PRIMARY              | PRIMA

RY      | 4      | temp.collections.id        |  10 | Using where; Using index

                    |

|  1 | SIMPLE      | mainsubjects      | eq_ref  | PRIMARY              | PRIMA

RY      | 3      | temp.collectionpartids.fid |    1 | Using where

                    |

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

3 rows in set (0.00 sec)

Link to comment
Share on other sites

the match wuery with order by is super fast.  There are just over 400k rows and it usually does the query in .1 secs or less.

 

Also the other part with the other 2 tables is lightning fast if I just use them and one collection id.  I have people searching for anywhere from 50-500 results and it takes forever anywhere from 5-60 secs per query.

 

Link to comment
Share on other sites

ok since there is a size limit I can only submit 400 collections and the corresponding data for the other tables.  Just for a fyi.  The collections table has around 400k+ rows.  The collectionpartids and mainsubjects tables both have around 4 million + rows.  Like takes a whole lot longer.  The match is for the fulltext index that is used on the collections table.

 

[attachment deleted by admin]

Link to comment
Share on other sites

The following appears to be quicker on a small record base:

 

SELECT collections.id
     , collections.subject
     , collections.numfiles
     , collections.groups
     , collections.size
     , collections.from
     , collections.date
     , collections.nfo
     , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num
     , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) 
GROUP BY collections.id 
ORDER BY collections.date desc

Link to comment
Share on other sites

4.1.21-community-nt

 

I copied and pasted the following query into SQLyog

 

SELECT collections.id
    , collections.subject
    , collections.numfiles
    , collections.groups
    , collections.size
    , collections.from
    , collections.date
    , collections.nfo
    , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num
    , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) 
GROUP BY collections.id 
ORDER BY collections.date desc;

 

... and got the following result:

 

id          numfiles    size                 from                                                                                       date        nfo         mainsubjects_num  mainsubjects_max  
269         107         5120108628           nosugar@tea.cup (Team TeaBag)                                                              1175864267  97566       13170             13181             
104         1           51733781             nosugar@tea.cup (Team TeaBag)                                                              1175857722  0           133               133               
265         2           163922               nosugar@tea.cup (Team TeaBag)                                                              1175857214  0           2                 2                 
12          1           51696885             nosugar@tea.cup (Team TeaBag)                                                              1175857049  0           133               133               
240         113         5336665215           nosugar@tea.cup (Team TeaBag)                                                              1175854563  97157       13740             14045             
172         113         5314060198           nosugar@tea.cup (Team TeaBag)                                                              1175853137  97072       13675             13993             
85          113         5386441836           nosugar@tea.cup (Team TeaBag)                                                              1175852973  96951       13860             14075             
338         2           19414                nvnvrlnd@ovrdarainbow.com (disorder)                                                       1175840232  96210       2                 2                 
90          106         5086235798           nosugar@tea.cup (Team TeaBag)                                                              1175827371  94315       13084             13084             
194         2           6035845              nosugar@tea.cup (Team TeaBag)                                                              1175820296  0           17                17                
210         1           38831745             nosugar@tea.cup (Team TeaBag)                                                              1175820239  0           100               100               
117         1           36125626             nosugar@tea.cup (Team TeaBag)                                                              1175820002  0           93                93                
5           106         5087442717           nosugar@tea.cup (Team TeaBag)                                                              1175783674  93924       13085             13085             
47          1           2978                 nosugar@tea.cup (Team TeaBag)                                                              1175682268  90545       1                 1                 
114         2           929738               nosugar@tea.cup (Team TeaBag)                                                              1175682173  0           3                 3                 
306         105         5014443546           nosugar@tea.cup (Team TeaBag)                                                              1175676395  90728       12901             12901             
152         1           185791               nosugar@tea.cup (Team TeaBag)                                                              1175671382  0           1                 1                 
323         1           42488051             nosugar@tea.cup (Team TeaBag)                                                              1175671305  0           109               109               
14          106         5090985562           nosugar@tea.cup (Team TeaBag)                                                              1175667861  88228       13098             13098             
396         1           51726714             nosugar@tea.cup (Team TeaBag)                                                              1175573600  0           133               133               
179         106         5078252211           nosugar@tea.cup (Team TeaBag)                                                              1175428830  75897       13091             13091             
290         76          3322073885           nosugar@tea.cup (Team TeaBag)                                                              1175419718  76055       8562              8562              
161         1           33289201             nosugar@tea.cup (Team TeaBag)                                                              1175408043  0           86                86                
287         107         5139767976           nosugar@tea.cup (Team TeaBag)                                                              1175405620  74049       13232             13232             
141         1           126979               nosugar@tea.cup (Team TeaBag)                                                              1175404705  0           1                 1                 

 

EDIT: for some reason the subject and groups columns did not copy across in this post but the columns appear in the result set.

Link to comment
Share on other sites

ok here are the 2 queries I have managed to get working.  They both take the same time to run and vary big time.

 

SELECT collections.id
, collections.subject
, collections.numfiles
, collections.groups
, collections.size
, collections.from
, collections.date
, collections.nfo
, ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS small
, ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS big
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE)
GROUP BY collections.id
order by `date` desc
limit 500

 

SELECT `collections`.`id`,`collections`.`subject`,`collections`.`numfiles`,
`collections`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`date`,
`collections`.`nfo`,sum(mainsubjects.num) as small,sum(mainsubjects.`max`) as big
FROM temp.collections, temp.mainsubjects, collectionpartids
WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode)
and collections.id = collectionpartids.cid
and collectionpartids.fid = mainsubjects.id
group by collections.id
order by `collections`.`date` desc
limit 500

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.