Jump to content

Recommended Posts

I have 3 tables.  One is a collections table that gets searched.  It

contains collections of posts from another table.  Fulltext index on

the subject.  Second one is a lot bigger table that has the posts.

Then I have a third table that has each collection id matched with the

post id.  each collection id could have anywhere from 1-500 posts

connected to it.  In the posts table there is 2 columns with numbers.

I would like to do a fulltext search on the collections table and get

the results from it but also get the sum of the 2 number columns.

 

Here is the query I have come up with but it is very slow and very different each time on how long it takes.  I will also give the full create tables for the 3 tables.

 

If anyone can help me with this It would be much appreciated.

 

SELECT

`collections`.`id`,`collections`.`subject`,`collections`.`numfiles`,`collec­tions`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`da­te`,`collections`.`nfo`,sum(mainsubjects.num),sum(mainsubjects.`max`)

FROM temp.collections, temp.mainsubjects, collectionpartids WHERE

match (`collections`.`subject`) against ('dvdr' in

boolean mode) and collections.id = collectionpartids.cid and

collectionpartids.fid = mainsubjects.id group by collections.id order

by `collections`.`date` desc

 

CREATE TABLE `collections` (

  `id` mediumint(7) unsigned NOT NULL auto_increment,

  `subject` text NOT NULL,

  `filename` text NOT NULL,

  `numfiles` smallint(6) unsigned NOT NULL default '0',

  `groups` text NOT NULL,

  `partids` mediumtext NOT NULL,

  `size` bigint(20) unsigned NOT NULL default '0',

  `from` varchar(90) NOT NULL default '',

  `date` int(10) unsigned NOT NULL default '0',

  `group1` mediumint(8) unsigned NOT NULL,

  `group2` mediumint(8) unsigned default NULL,

  `group3` mediumint(8) unsigned default NULL,

  `group4` mediumint(8) unsigned default NULL,

  `group5` mediumint(8) unsigned default NULL,

  `group6` mediumint(8) unsigned default NULL,

  `group7` mediumint(8) unsigned default NULL,

  `group8` mediumint(8) unsigned default NULL,

  `group9` mediumint(8) unsigned default NULL,

  `group10` mediumint(8) unsigned default NULL,

  `nfo` int(10) unsigned NOT NULL default '0',

  PRIMARY KEY  (`id`),

  UNIQUE KEY `filename_U` (`filename`(255)),

  KEY `from` (`from`(50)),

  KEY `date` (`date`),

  KEY `groupindex` (`group1`,`group2`,`group3`,`group4`,`group5`,`group6`,`group7`,`group8`,`group9`,`group10`),

  KEY `Size` (`size`),

  KEY `groups` (`groups`(250)),

  KEY `nfo` (`nfo`),

  FULLTEXT KEY `Subject_full` (`subject`)

) ENGINE=MyISAM AUTO_INCREMENT=424366 DEFAULT CHARSET=latin1 MIN_ROWS=100000 MAX_ROWS=200000 AVG_ROW_LENGTH=650

 

CREATE TABLE `collectionpartids` (

  `cid` int(10) unsigned NOT NULL,

  `fid` int(10) unsigned NOT NULL,

  PRIMARY KEY  (`cid`,`fid`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1

 

CREATE TABLE `mainsubjects` (

  `id` mediumint(8) unsigned NOT NULL default '0',

  `subject` text NOT NULL,

  `totalbytes` int(10) unsigned NOT NULL,

  `date` int(10) unsigned NOT NULL default '0',

  `groups` text NOT NULL,

  `num` smallint(5) unsigned NOT NULL default '0',

  `max` smallint(5) unsigned NOT NULL default '0',

  `from` tinytext NOT NULL,

  PRIMARY KEY  (`id`),

  KEY `date` (`date`),

  KEY `numfiles` (`num`,`max`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=3000000 MAX_ROWS=10000000 AVG_ROW_LENGTH=195 PACK_KEYS=1

Link to comment
https://forums.phpfreaks.com/topic/52415-solved-need-help-optimizing-join/
Share on other sites

It's kinda hard to read but here it is.

 

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

| id | select_type | table            | type    | possible_keys        | key

        | key_len | ref                        | rows | Extra

                    |

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

|  1 | SIMPLE      | collections      | fulltext | PRIMARY,Subject_full | Subje

ct_full | 0      |                            |    1 | Using where; Using tempo

rary; Using filesort |

|  1 | SIMPLE      | collectionpartids | ref      | PRIMARY              | PRIMA

RY      | 4      | temp.collections.id        |  10 | Using where; Using index

                    |

|  1 | SIMPLE      | mainsubjects      | eq_ref  | PRIMARY              | PRIMA

RY      | 3      | temp.collectionpartids.fid |    1 | Using where

                    |

+----+-------------+-------------------+----------+----------------------+------

--------+---------+----------------------------+------+-------------------------

---------------------+

3 rows in set (0.00 sec)

the match wuery with order by is super fast.  There are just over 400k rows and it usually does the query in .1 secs or less.

 

Also the other part with the other 2 tables is lightning fast if I just use them and one collection id.  I have people searching for anywhere from 50-500 results and it takes forever anywhere from 5-60 secs per query.

 

Please provide a small sample data set of each table (in code quotes)

 

What is there a difference between using:

 

WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode) 

 

and:

 

WHERE collections.subject LIKE '%dvdr%'

 

? ...I have very little knowledge regarding MATCH...AGAINST

ok since there is a size limit I can only submit 400 collections and the corresponding data for the other tables.  Just for a fyi.  The collections table has around 400k+ rows.  The collectionpartids and mainsubjects tables both have around 4 million + rows.  Like takes a whole lot longer.  The match is for the fulltext index that is used on the collections table.

 

[attachment deleted by admin]

The following appears to be quicker on a small record base:

 

SELECT collections.id
     , collections.subject
     , collections.numfiles
     , collections.groups
     , collections.size
     , collections.from
     , collections.date
     , collections.nfo
     , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num
     , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) 
GROUP BY collections.id 
ORDER BY collections.date desc

4.1.21-community-nt

 

I copied and pasted the following query into SQLyog

 

SELECT collections.id
    , collections.subject
    , collections.numfiles
    , collections.groups
    , collections.size
    , collections.from
    , collections.date
    , collections.nfo
    , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num
    , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) 
GROUP BY collections.id 
ORDER BY collections.date desc;

 

... and got the following result:

 

id          numfiles    size                 from                                                                                       date        nfo         mainsubjects_num  mainsubjects_max  
269         107         5120108628           [email protected] (Team TeaBag)                                                              1175864267  97566       13170             13181             
104         1           51733781             [email protected] (Team TeaBag)                                                              1175857722  0           133               133               
265         2           163922               [email protected] (Team TeaBag)                                                              1175857214  0           2                 2                 
12          1           51696885             [email protected] (Team TeaBag)                                                              1175857049  0           133               133               
240         113         5336665215           [email protected] (Team TeaBag)                                                              1175854563  97157       13740             14045             
172         113         5314060198           [email protected] (Team TeaBag)                                                              1175853137  97072       13675             13993             
85          113         5386441836           [email protected] (Team TeaBag)                                                              1175852973  96951       13860             14075             
338         2           19414                [email protected] (disorder)                                                       1175840232  96210       2                 2                 
90          106         5086235798           [email protected] (Team TeaBag)                                                              1175827371  94315       13084             13084             
194         2           6035845              [email protected] (Team TeaBag)                                                              1175820296  0           17                17                
210         1           38831745             [email protected] (Team TeaBag)                                                              1175820239  0           100               100               
117         1           36125626             [email protected] (Team TeaBag)                                                              1175820002  0           93                93                
5           106         5087442717           [email protected] (Team TeaBag)                                                              1175783674  93924       13085             13085             
47          1           2978                 [email protected] (Team TeaBag)                                                              1175682268  90545       1                 1                 
114         2           929738               [email protected] (Team TeaBag)                                                              1175682173  0           3                 3                 
306         105         5014443546           [email protected] (Team TeaBag)                                                              1175676395  90728       12901             12901             
152         1           185791               [email protected] (Team TeaBag)                                                              1175671382  0           1                 1                 
323         1           42488051             [email protected] (Team TeaBag)                                                              1175671305  0           109               109               
14          106         5090985562           [email protected] (Team TeaBag)                                                              1175667861  88228       13098             13098             
396         1           51726714             [email protected] (Team TeaBag)                                                              1175573600  0           133               133               
179         106         5078252211           [email protected] (Team TeaBag)                                                              1175428830  75897       13091             13091             
290         76          3322073885           [email protected] (Team TeaBag)                                                              1175419718  76055       8562              8562              
161         1           33289201             [email protected] (Team TeaBag)                                                              1175408043  0           86                86                
287         107         5139767976           [email protected] (Team TeaBag)                                                              1175405620  74049       13232             13232             
141         1           126979               [email protected] (Team TeaBag)                                                              1175404705  0           1                 1                 

 

EDIT: for some reason the subject and groups columns did not copy across in this post but the columns appear in the result set.

ok here are the 2 queries I have managed to get working.  They both take the same time to run and vary big time.

 

SELECT collections.id
, collections.subject
, collections.numfiles
, collections.groups
, collections.size
, collections.from
, collections.date
, collections.nfo
, ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS small
, ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS big
FROM collections
WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE)
GROUP BY collections.id
order by `date` desc
limit 500

 

SELECT `collections`.`id`,`collections`.`subject`,`collections`.`numfiles`,
`collections`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`date`,
`collections`.`nfo`,sum(mainsubjects.num) as small,sum(mainsubjects.`max`) as big
FROM temp.collections, temp.mainsubjects, collectionpartids
WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode)
and collections.id = collectionpartids.cid
and collectionpartids.fid = mainsubjects.id
group by collections.id
order by `collections`.`date` desc
limit 500

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.