help with query

bagnallc · August 20, 2008

Hi,

I have a table which holds records of occurences, which is very simple

records

recordid varchar(20) not null primary key

memberid medium int unsigned not null

selection tinyint unsigned not null

characteristics tinyint unsigned not null

there are about 1.5 million records and indexes on each of the columns

querying this table is very fast but where i am having a little trouble is when i want to compare the table to itself.

an example of what i am trying to achieve is to list a memberid and if he made the same selection 2 times in a row

the way i have done this is to create a new table prevdata

recordid varchar(20) not null primary key

prevrecordid varchar(20)

prevrecordid has a unique index on it and does contain null values (as sometimes there is no previous information)

and then i run the following query

select count(*) from records join prevdata on records.recordid=prevdata.recordid

join records as records2 on prevdata.prevrecordid=records2.recordid

where records.selection=records2.selection

the query does run but it takes well over a minute which is no good for the environment it will be used in. am i making a design mistake here

fenway · August 20, 2008

Why do you need a new table? Why not just JOIN the table to itself?

bagnallc · August 20, 2008

Because there is no way of joining it to itself and ensuring accuracy that i can see.

How could i tell mysql which row of data is the previous record for any other row without the new table?

Its the same principle a lot of queires i now have to build. For example

select members with same selection on 4 subsequent occasions

select most common characteristics of members with selection 1 on two of last 3 occasions

what adds to the confusion is the same member could and should appear twice if for example he had made the same selection on 4 subsequent occasions more than once

there is another table called masterrecord which records general data and i used this to create the prevrecord table

masterrecordid mediumint unsigned not null primary key

masterdate date unsigned not null

daybatch tinyint unsigned not null

entrylevel tinyint unsigned not null

owner mediumint unsigned not null

this has about 150k rows and again this is indexed and performs very well individually and also when combined with records

the only other option i can see is to do self joins based on the two tables using a combination of the masterdate fields to get last occurences. i tried this previously though and it took forever

appreciate your time and help

fenway · August 20, 2008

How could i tell mysql which row of data is the previous record for any other row without the new table?

How can you do this currently?

bagnallc · August 20, 2008

By creating the additional table as described in first post

fenway · August 20, 2008

By creating the additional table as described in first post

So you're inserting records into this "additional table" as you go along? I'm confused... why not just timestamp them?

bagnallc · August 20, 2008

yes unfortunately this has been the only way i have found.

timestamp i did not think would work as the data was originally retrieved in batches and therefore not entered in the correct order as such. perhaps i am wrong?

fenway · August 21, 2008

Retrieved in batches? Somewhere the data was entered one at a time...

bagnallc · August 22, 2008

what i mean is there were several different people gathering data and as a result when the info was loaded there were files of say 2003 ready before 2001 as an example

anyway, do you feel the tables are worth rebuilding and if i did how would i write a self join on timestamp. appreciate your help

fenway · August 22, 2008

what i mean is there were several different people gathering data and as a result when the info was loaded there were files of say 2003 ready before 2001 as an example

anyway, do you feel the tables are worth rebuilding and if i did how would i write a self join on timestamp. appreciate your help

I guess I'm trying to figure out exactly what you're trying to query....

bagnallc · August 25, 2008

Hi fenway

there are many types of thing im trying to query

but an example is to select a count of all members who have made the same selection on 2 consecutive selections

fenway · August 25, 2008

If that's the case, I think a self-join is more appropriate... why duplicate the info?

Sign In

help with query

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Archived

Important Information