binindex Posted May 22, 2007 Share Posted May 22, 2007 I have 3 tables. One is a collections table that gets searched. It contains collections of posts from another table. Fulltext index on the subject. Second one is a lot bigger table that has the posts. Then I have a third table that has each collection id matched with the post id. each collection id could have anywhere from 1-500 posts connected to it. In the posts table there is 2 columns with numbers. I would like to do a fulltext search on the collections table and get the results from it but also get the sum of the 2 number columns. Here is the query I have come up with but it is very slow and very different each time on how long it takes. I will also give the full create tables for the 3 tables. If anyone can help me with this It would be much appreciated. SELECT `collections`.`id`,`collections`.`subject`,`collections`.`numfiles`,`collections`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`date`,`collections`.`nfo`,sum(mainsubjects.num),sum(mainsubjects.`max`) FROM temp.collections, temp.mainsubjects, collectionpartids WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode) and collections.id = collectionpartids.cid and collectionpartids.fid = mainsubjects.id group by collections.id order by `collections`.`date` desc CREATE TABLE `collections` ( `id` mediumint(7) unsigned NOT NULL auto_increment, `subject` text NOT NULL, `filename` text NOT NULL, `numfiles` smallint(6) unsigned NOT NULL default '0', `groups` text NOT NULL, `partids` mediumtext NOT NULL, `size` bigint(20) unsigned NOT NULL default '0', `from` varchar(90) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', `group1` mediumint( unsigned NOT NULL, `group2` mediumint( unsigned default NULL, `group3` mediumint( unsigned default NULL, `group4` mediumint( unsigned default NULL, `group5` mediumint( unsigned default NULL, `group6` mediumint( unsigned default NULL, `group7` mediumint( unsigned default NULL, `group8` mediumint( unsigned default NULL, `group9` mediumint( unsigned default NULL, `group10` mediumint( unsigned default NULL, `nfo` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), UNIQUE KEY `filename_U` (`filename`(255)), KEY `from` (`from`(50)), KEY `date` (`date`), KEY `groupindex` (`group1`,`group2`,`group3`,`group4`,`group5`,`group6`,`group7`,`group8`,`group9`,`group10`), KEY `Size` (`size`), KEY `groups` (`groups`(250)), KEY `nfo` (`nfo`), FULLTEXT KEY `Subject_full` (`subject`) ) ENGINE=MyISAM AUTO_INCREMENT=424366 DEFAULT CHARSET=latin1 MIN_ROWS=100000 MAX_ROWS=200000 AVG_ROW_LENGTH=650 CREATE TABLE `collectionpartids` ( `cid` int(10) unsigned NOT NULL, `fid` int(10) unsigned NOT NULL, PRIMARY KEY (`cid`,`fid`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 CREATE TABLE `mainsubjects` ( `id` mediumint( unsigned NOT NULL default '0', `subject` text NOT NULL, `totalbytes` int(10) unsigned NOT NULL, `date` int(10) unsigned NOT NULL default '0', `groups` text NOT NULL, `num` smallint(5) unsigned NOT NULL default '0', `max` smallint(5) unsigned NOT NULL default '0', `from` tinytext NOT NULL, PRIMARY KEY (`id`), KEY `date` (`date`), KEY `numfiles` (`num`,`max`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=3000000 MAX_ROWS=10000000 AVG_ROW_LENGTH=195 PACK_KEYS=1 Quote Link to comment Share on other sites More sharing options...
btherl Posted May 22, 2007 Share Posted May 22, 2007 Can you show output for "explain select ..." for your query? Quote Link to comment Share on other sites More sharing options...
binindex Posted May 22, 2007 Author Share Posted May 22, 2007 It's kinda hard to read but here it is. +----+-------------+-------------------+----------+----------------------+------ --------+---------+----------------------------+------+------------------------- ---------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------------+----------+----------------------+------ --------+---------+----------------------------+------+------------------------- ---------------------+ | 1 | SIMPLE | collections | fulltext | PRIMARY,Subject_full | Subje ct_full | 0 | | 1 | Using where; Using tempo rary; Using filesort | | 1 | SIMPLE | collectionpartids | ref | PRIMARY | PRIMA RY | 4 | temp.collections.id | 10 | Using where; Using index | | 1 | SIMPLE | mainsubjects | eq_ref | PRIMARY | PRIMA RY | 3 | temp.collectionpartids.fid | 1 | Using where | +----+-------------+-------------------+----------+----------------------+------ --------+---------+----------------------------+------+------------------------- ---------------------+ 3 rows in set (0.00 sec) Quote Link to comment Share on other sites More sharing options...
btherl Posted May 22, 2007 Share Posted May 22, 2007 Hmm.. is the query fast if you don't join to those two other tables? Just the match? Quote Link to comment Share on other sites More sharing options...
binindex Posted May 22, 2007 Author Share Posted May 22, 2007 the match wuery with order by is super fast. There are just over 400k rows and it usually does the query in .1 secs or less. Also the other part with the other 2 tables is lightning fast if I just use them and one collection id. I have people searching for anywhere from 50-500 results and it takes forever anywhere from 5-60 secs per query. Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted May 22, 2007 Share Posted May 22, 2007 Please provide a small sample data set of each table (in code quotes) What is there a difference between using: WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode) and: WHERE collections.subject LIKE '%dvdr%' ? ...I have very little knowledge regarding MATCH...AGAINST Quote Link to comment Share on other sites More sharing options...
binindex Posted May 22, 2007 Author Share Posted May 22, 2007 ok since there is a size limit I can only submit 400 collections and the corresponding data for the other tables. Just for a fyi. The collections table has around 400k+ rows. The collectionpartids and mainsubjects tables both have around 4 million + rows. Like takes a whole lot longer. The match is for the fulltext index that is used on the collections table. [attachment deleted by admin] Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted May 23, 2007 Share Posted May 23, 2007 The following appears to be quicker on a small record base: SELECT collections.id , collections.subject , collections.numfiles , collections.groups , collections.size , collections.from , collections.date , collections.nfo , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max FROM collections WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) GROUP BY collections.id ORDER BY collections.date desc Quote Link to comment Share on other sites More sharing options...
fenway Posted May 23, 2007 Share Posted May 23, 2007 I was going to suggest something similar, basically finding the matches first in a subquery, then joining the derived table to the other two. Quote Link to comment Share on other sites More sharing options...
binindex Posted May 25, 2007 Author Share Posted May 25, 2007 ok I get an error of unknown column `collections.id` in `on clause` when I try this. Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted May 25, 2007 Share Posted May 25, 2007 I do not. Quote Link to comment Share on other sites More sharing options...
binindex Posted May 25, 2007 Author Share Posted May 25, 2007 what version of mysql do you have? I have 5.0.41. I have tried mysql query browser and the mysql.exe console. Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted May 25, 2007 Share Posted May 25, 2007 4.1.21-community-nt I copied and pasted the following query into SQLyog SELECT collections.id , collections.subject , collections.numfiles , collections.groups , collections.size , collections.from , collections.date , collections.nfo , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_num , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid AND collectionpartids.cid = collections.id ) AS mainsubjects_max FROM collections WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) GROUP BY collections.id ORDER BY collections.date desc; ... and got the following result: id numfiles size from date nfo mainsubjects_num mainsubjects_max 269 107 5120108628 nosugar@tea.cup (Team TeaBag) 1175864267 97566 13170 13181 104 1 51733781 nosugar@tea.cup (Team TeaBag) 1175857722 0 133 133 265 2 163922 nosugar@tea.cup (Team TeaBag) 1175857214 0 2 2 12 1 51696885 nosugar@tea.cup (Team TeaBag) 1175857049 0 133 133 240 113 5336665215 nosugar@tea.cup (Team TeaBag) 1175854563 97157 13740 14045 172 113 5314060198 nosugar@tea.cup (Team TeaBag) 1175853137 97072 13675 13993 85 113 5386441836 nosugar@tea.cup (Team TeaBag) 1175852973 96951 13860 14075 338 2 19414 nvnvrlnd@ovrdarainbow.com (disorder) 1175840232 96210 2 2 90 106 5086235798 nosugar@tea.cup (Team TeaBag) 1175827371 94315 13084 13084 194 2 6035845 nosugar@tea.cup (Team TeaBag) 1175820296 0 17 17 210 1 38831745 nosugar@tea.cup (Team TeaBag) 1175820239 0 100 100 117 1 36125626 nosugar@tea.cup (Team TeaBag) 1175820002 0 93 93 5 106 5087442717 nosugar@tea.cup (Team TeaBag) 1175783674 93924 13085 13085 47 1 2978 nosugar@tea.cup (Team TeaBag) 1175682268 90545 1 1 114 2 929738 nosugar@tea.cup (Team TeaBag) 1175682173 0 3 3 306 105 5014443546 nosugar@tea.cup (Team TeaBag) 1175676395 90728 12901 12901 152 1 185791 nosugar@tea.cup (Team TeaBag) 1175671382 0 1 1 323 1 42488051 nosugar@tea.cup (Team TeaBag) 1175671305 0 109 109 14 106 5090985562 nosugar@tea.cup (Team TeaBag) 1175667861 88228 13098 13098 396 1 51726714 nosugar@tea.cup (Team TeaBag) 1175573600 0 133 133 179 106 5078252211 nosugar@tea.cup (Team TeaBag) 1175428830 75897 13091 13091 290 76 3322073885 nosugar@tea.cup (Team TeaBag) 1175419718 76055 8562 8562 161 1 33289201 nosugar@tea.cup (Team TeaBag) 1175408043 0 86 86 287 107 5139767976 nosugar@tea.cup (Team TeaBag) 1175405620 74049 13232 13232 141 1 126979 nosugar@tea.cup (Team TeaBag) 1175404705 0 1 1 EDIT: for some reason the subject and groups columns did not copy across in this post but the columns appear in the result set. Quote Link to comment Share on other sites More sharing options...
binindex Posted May 25, 2007 Author Share Posted May 25, 2007 ok it must be the mysql version. Cause I just installed sqlyog and I get the same error in it. Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted May 25, 2007 Share Posted May 25, 2007 Can someone else please confirm mine or binindexs' results? Quote Link to comment Share on other sites More sharing options...
binindex Posted May 25, 2007 Author Share Posted May 25, 2007 that or point me to where I can download ver 5.0.21 of mysql. I will downgrade to get this thing working. Quote Link to comment Share on other sites More sharing options...
binindex Posted May 25, 2007 Author Share Posted May 25, 2007 ok I just installed 5.0.41 and 5.0.21 on another server and get the same error with both. SO not sure what is different with yours over mine. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 30, 2007 Share Posted May 30, 2007 Interesting... I don't see why that would be the case. Quote Link to comment Share on other sites More sharing options...
binindex Posted May 31, 2007 Author Share Posted May 31, 2007 that it don't work or that it works on his yet don't on mine. I ahve tried it on 2 different servers still could not get it to work. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 31, 2007 Share Posted May 31, 2007 It shouldn't vary between minor versions... Quote Link to comment Share on other sites More sharing options...
binindex Posted May 31, 2007 Author Share Posted May 31, 2007 I would just like to find a way to get this to work faster than the query I have now. The one I have now takes anywhere from 10-60 secs each time. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 31, 2007 Share Posted May 31, 2007 So what is the current query that is the issue? Quote Link to comment Share on other sites More sharing options...
binindex Posted June 1, 2007 Author Share Posted June 1, 2007 ok here are the 2 queries I have managed to get working. They both take the same time to run and vary big time. SELECT collections.id , collections.subject , collections.numfiles , collections.groups , collections.size , collections.from , collections.date , collections.nfo , ( SELECT sum(mainsubjects.num) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS small , ( SELECT sum(mainsubjects.max) FROM mainsubjects INNER JOIN collectionpartids ON mainsubjects.id = collectionpartids.fid where collectionpartids.cid = collections.id ) AS big FROM collections WHERE MATCH (collections.subject) AGAINST ('dvdr' IN BOOLEAN MODE) GROUP BY collections.id order by `date` desc limit 500 SELECT `collections`.`id`,`collections`.`subject`,`collections`.`numfiles`, `collections`.`groups`,`collections`.`size`,`collections`.`from`,`collections`.`date`, `collections`.`nfo`,sum(mainsubjects.num) as small,sum(mainsubjects.`max`) as big FROM temp.collections, temp.mainsubjects, collectionpartids WHERE match (`collections`.`subject`) against ('dvdr' in boolean mode) and collections.id = collectionpartids.cid and collectionpartids.fid = mainsubjects.id group by collections.id order by `collections`.`date` desc limit 500 Quote Link to comment Share on other sites More sharing options...
bubblegum.anarchy Posted June 1, 2007 Share Posted June 1, 2007 Are you still having trouble with that first query? Quote Link to comment Share on other sites More sharing options...
binindex Posted June 1, 2007 Author Share Posted June 1, 2007 no it works since I switched the and that you had with where. Seems to work pretty good now that I optimized the tables a little bit more. Thank you very much for the help. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.