Vikas Jayna Posted October 11, 2006 Share Posted October 11, 2006 Hi guys!This one's a tricky question - I am running a matrimonial site in which there a million users registered wherein each user specifies the kind of person he/she is looking for by choosing the relevant attributes in some 15 odd parameters. For e.g.Occupation, Place of residence, Language spoken, Religion etc.Now, most of these parameters are multiple select i.e. the person may be open for people of 3 different occupations and here lies the problem - the search that I want to devise is to display those people who are looking for a person with the attributes same as that of the user whose viewing the search results(This is what I mean by reverse search).So, if user A is working as a software engineer in New York, speaks english and is a Born again christian, then I need to find all those people looking for this person. Quite some time back I had developed the following mechanism for finding this out:-For each of the parameters I have a separate table:CREATE TABLE `OCCUPATION` ( `ID` mediumint(11) unsigned NOT NULL default '0', `OCCUPATION` tinyint(3) unsigned NOT NULL default '0', UNIQUE KEY `abc1` (`ID`,`OCCUPATION`)) TYPE=MyISAMCREATE TABLE `RESIDENCE` ( `ID` mediumint(11) unsigned NOT NULL default '0', `RESIDENCE` tinyint(3) unsigned NOT NULL default '0', UNIQUE KEY `abc1` (`ID`,`RESIDENCE`)) TYPE=MyISAMCREATE TABLE `LANGUAGE` ( `ID` mediumint(11) unsigned NOT NULL default '0', `LANGUAGE` tinyint(3) unsigned NOT NULL default '0', UNIQUE KEY `abc1` (`ID`,`LANGUAGE`)) TYPE=MyISAMCREATE TABLE `RELIGION` ( `ID` mediumint(11) unsigned NOT NULL default '0', `RELIGION` tinyint(3) unsigned NOT NULL default '0', UNIQUE KEY `abc1` (`ID`,`RELIGION`)) TYPE=MyISAMwhere ID is the unique identifier for the person. The data type of the second column is unsigned tinyint as the values have been mapped to inetegers.This is how the query looks like:-select ID from OCCUPATION LEFT JOIN RESIDENCE ON OCCUPATION.ID=RESIDENCE.ID LEFT JOIN LANGUAGE ON OCCUPATION.ID=LANGUAGE.ID LEFT JOIN RELIGION ON OCCUPATION.ID=RELIGION.ID WHERE (OCCUPATION=<VALUE OF OCCUPATION - SOFTWARE ENGINEER> OR OCCUPATION IS NULL) AND (RESIDENCE=<VALUE OF RESIDENCE - NEW YORK> OR RESIDENCE IS NULL) AND (LANGUAGE=<VALUE OF LANGUAGE - ENGLISH> OR LANGUAGE IS NULL) AND (RELIGION=<VALUE OF RELIGION - BORN AGAIN CHRISTIAN> OR RELIGION IS NULL)IS NULL has been used as the user is allowed to say that a particular parameter (say LANGUAGE) could be anything, in which case there is no record for that user in the table.This would give all the people whose required attributes are matched by the user whose logged in. This query worked really well, say about 6 months back but now with the number of records going up and the fact that there are 15 such parameters and not just 4, the queries have started taking a lot of time - 20 to 30 seconds on a quadprocessor with 6 GB RAM.Can somebody suggest a better way of doing this - probably a change in the database schema, i.e. how the same could be managed in a single table doing away with the cost of joins?Thanks in advance!Vikas Jayna Quote Link to comment Share on other sites More sharing options...
fenway Posted October 11, 2006 Share Posted October 11, 2006 Well, you can have a single table with all the attributes, like name/value pairs. Quote Link to comment Share on other sites More sharing options...
Vikas Jayna Posted October 13, 2006 Author Share Posted October 13, 2006 Could you elaborate how the table would look like and how it will be searched Quote Link to comment Share on other sites More sharing options...
fenway Posted October 13, 2006 Share Posted October 13, 2006 Instead of having a different table for each attribute, simply have a table of attribute, and have the type as one of the columns. Quote Link to comment Share on other sites More sharing options...
Vikas Jayna Posted October 14, 2006 Author Share Posted October 14, 2006 Then how do I store multiple values in a column against one id and still be able to search on the column? Quote Link to comment Share on other sites More sharing options...
fenway Posted October 14, 2006 Share Posted October 14, 2006 [quote author=Vikas Jayna link=topic=111181.msg451767#msg451767 date=1160803884]Then how do I store multiple values in a column against one id and still be able to search on the column?[/quote]I don't understand.. "multiple values"? Each attribute has only one value, but you can enter multiple records. Quote Link to comment Share on other sites More sharing options...
Vikas Jayna Posted October 23, 2006 Author Share Posted October 23, 2006 Ok! I'll try to explain with an example. Supose a user XYZ chooses the following values:-For Occupation - values A,B and CFor Religion - values D,E and FFor Language - values G,H and IFor Residence - values J,K and LNow there are various ways of creating the database structure:-1) [b]Having multiple tables[/b] ( one each for Occupation, Religion, Language and Residence) - all the four tables will have two columns, one containing the user id and the other the value. In this case, there will be three rows in each table for user XYZ as three values have been selected for each. [b]This is the database structure I have at present[/b] but to search this database for those users that have chosen say - A in Occupation and E in Religion and I in language and L in residence, will require a join between the four tables.2) Another way is to use a [b]single table[/b] and have the values A,B and C stored [b]comma delimited[/b] in a column Occupation. This will require the usage of 'like clause' for searching the user with the criteria given above and 'like clause' will not use indexes which will make the search slower3) Another way is to use a [b]single table[/b] that is [b]de-normalised[/b] such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.Regards,Vikas Jayna Quote Link to comment Share on other sites More sharing options...
fenway Posted October 24, 2006 Share Posted October 24, 2006 [quote author=Vikas Jayna link=topic=111181.msg456161#msg456161 date=1161588610]3) Another way is to use a [b]single table[/b] that is [b]de-normalised[/b] such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.[/quote]This _is_ normalized -- I don't know why you care about the table being huge, and why you think it is impractical. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.