query advice/discussion

bagnallc · December 7, 2006

Im trying to figure out the best way to run a certain type of query and would appreciate advise.

I have a main table with about a million records in.

the columns are reference, id, prev_id, 2ndprev_id

the reference column is a unique individual and the other three columns are id's containing numeric values which relate to a second table with further information. an id will appear up to 50 times in a column.

so for example three rows which have the same id could be -
anghi 6245 4921 3215
hibng 6245 1520 859
kiljh 6245 4895 4441

i have a primary key (unique) and indexes (id, prev_id , 2ndprev_id)
(prev_id, 2ndprev_id) (2ndprev_id, id)

the second table has twenty columns in it and about 100,000 records
it is extremely fast when running queries on it. the table consists of a unique id and then various details about that id. for the example i am using one column which is named type.

an example of a possible query and where i am having trouble optimizing is in english COUNT RECORDS WHERE ID TYPE WAS 1, PREV_ID TYPE was 2 & 2NDLAST_ID TYPE WAS 3

i have thought of five ways of doing this and none are fast and none are using the multiple indexes i set up on the maintable. they are all only picking up on the first part. in this example only 78 rows return and it is taking a long long time to give that answer.

OPTION 1

select count(*) from maintable where id in (select id from secondtable where type=1) and prev_id in (select id from secondtable where type=2)
and 2ndlast_id in (select id from secondtable where type=3)

OPTION 2

select count(*) from maintable
join secondtable as a on maintable.id=a.id
join secondtable as b on maintable.prev_id=b.id
join secondtable as c on maintable.2ndlast_id=c.id
where a.type=1 and b.type=2 and c.type=3

OPTION 3

select count(*) from maintable
join (select id from secondtable where type=1) as a on maintable.id=a.id
join (select id from secondtable where type=2) as b on maintable.prev_id=b.id
join (select id from secondtable where type=3) as c on maintable.2ndlast_id=c.id

OPTION 4

do three separate queries and then make IN statements in the where clause
this works best in theory but when there are 20000 ids matching a criteria it causes the select query to collapse. the query would look like

select count(*) from maintable where id in ( huge number-range entered )
and prev_id in ( another number range )
and 2ndlast_id in ( yet another large range )

OPTION 5

The only other option i can think of is to have further columns in the main table but if going down this route i would have a maintable with 63 columns (20 columns with id details, 20 with prev_id details and 20 with 2ndlast_id details). Surely this goes against every data normalization rule going?

I appreciate any feedback, advice or criticism.

Many thanks

Chris

drifter · December 8, 2006

in table 2 does your type have an index?

bagnallc · December 8, 2006

yes

fenway · December 9, 2006

Post the explain for the query that you find slow.

Sign In

query advice/discussion

Recommended Posts

bagnallc

Link to comment

Share on other sites

drifter

Link to comment

Share on other sites

bagnallc

Link to comment

Share on other sites

fenway

Link to comment

Share on other sites

Archived

Browse

Activity

Important Information