bagnallc Posted August 20, 2008 Share Posted August 20, 2008 Hi, I have a table which holds records of occurences, which is very simple records recordid varchar(20) not null primary key memberid medium int unsigned not null selection tinyint unsigned not null characteristics tinyint unsigned not null there are about 1.5 million records and indexes on each of the columns querying this table is very fast but where i am having a little trouble is when i want to compare the table to itself. an example of what i am trying to achieve is to list a memberid and if he made the same selection 2 times in a row the way i have done this is to create a new table prevdata recordid varchar(20) not null primary key prevrecordid varchar(20) prevrecordid has a unique index on it and does contain null values (as sometimes there is no previous information) and then i run the following query select count(*) from records join prevdata on records.recordid=prevdata.recordid join records as records2 on prevdata.prevrecordid=records2.recordid where records.selection=records2.selection the query does run but it takes well over a minute which is no good for the environment it will be used in. am i making a design mistake here Quote Link to comment Share on other sites More sharing options...
fenway Posted August 20, 2008 Share Posted August 20, 2008 Why do you need a new table? Why not just JOIN the table to itself? Quote Link to comment Share on other sites More sharing options...
bagnallc Posted August 20, 2008 Author Share Posted August 20, 2008 Because there is no way of joining it to itself and ensuring accuracy that i can see. How could i tell mysql which row of data is the previous record for any other row without the new table? Its the same principle a lot of queires i now have to build. For example select members with same selection on 4 subsequent occasions select most common characteristics of members with selection 1 on two of last 3 occasions what adds to the confusion is the same member could and should appear twice if for example he had made the same selection on 4 subsequent occasions more than once there is another table called masterrecord which records general data and i used this to create the prevrecord table masterrecordid mediumint unsigned not null primary key masterdate date unsigned not null daybatch tinyint unsigned not null entrylevel tinyint unsigned not null owner mediumint unsigned not null this has about 150k rows and again this is indexed and performs very well individually and also when combined with records the only other option i can see is to do self joins based on the two tables using a combination of the masterdate fields to get last occurences. i tried this previously though and it took forever appreciate your time and help Quote Link to comment Share on other sites More sharing options...
fenway Posted August 20, 2008 Share Posted August 20, 2008 How could i tell mysql which row of data is the previous record for any other row without the new table? How can you do this currently? Quote Link to comment Share on other sites More sharing options...
bagnallc Posted August 20, 2008 Author Share Posted August 20, 2008 By creating the additional table as described in first post Quote Link to comment Share on other sites More sharing options...
fenway Posted August 20, 2008 Share Posted August 20, 2008 By creating the additional table as described in first post So you're inserting records into this "additional table" as you go along? I'm confused... why not just timestamp them? Quote Link to comment Share on other sites More sharing options...
bagnallc Posted August 20, 2008 Author Share Posted August 20, 2008 yes unfortunately this has been the only way i have found. timestamp i did not think would work as the data was originally retrieved in batches and therefore not entered in the correct order as such. perhaps i am wrong? Quote Link to comment Share on other sites More sharing options...
fenway Posted August 21, 2008 Share Posted August 21, 2008 Retrieved in batches? Somewhere the data was entered one at a time... Quote Link to comment Share on other sites More sharing options...
bagnallc Posted August 22, 2008 Author Share Posted August 22, 2008 what i mean is there were several different people gathering data and as a result when the info was loaded there were files of say 2003 ready before 2001 as an example anyway, do you feel the tables are worth rebuilding and if i did how would i write a self join on timestamp. appreciate your help Quote Link to comment Share on other sites More sharing options...
fenway Posted August 22, 2008 Share Posted August 22, 2008 what i mean is there were several different people gathering data and as a result when the info was loaded there were files of say 2003 ready before 2001 as an example anyway, do you feel the tables are worth rebuilding and if i did how would i write a self join on timestamp. appreciate your help I guess I'm trying to figure out exactly what you're trying to query.... Quote Link to comment Share on other sites More sharing options...
bagnallc Posted August 25, 2008 Author Share Posted August 25, 2008 Hi fenway there are many types of thing im trying to query but an example is to select a count of all members who have made the same selection on 2 consecutive selections Quote Link to comment Share on other sites More sharing options...
fenway Posted August 25, 2008 Share Posted August 25, 2008 If that's the case, I think a self-join is more appropriate... why duplicate the info? Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.