Jump to content

Need a mysql query that'll do a reverse search in quick time


Vikas Jayna

Recommended Posts

Hi guys!

This one's a tricky question - I am running a matrimonial site in which there a million users registered wherein each user specifies the kind of person he/she is looking for by choosing the relevant attributes in some 15 odd parameters. For e.g.

Occupation, Place of residence, Language spoken, Religion etc.

Now, most of these parameters are multiple select i.e. the person may be open for people of 3 different occupations and here lies the problem - the search that I want to devise is to display those people who are looking for a person with the attributes same as that of the user whose viewing the search results(This is what I mean by reverse search).

So, if user A is working as a software engineer in New York, speaks english and is a Born again christian, then I need to find all those people looking for this person. Quite some time back I had developed the following mechanism for finding this out:-

For each of the parameters I have a separate table:

CREATE TABLE `OCCUPATION` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`OCCUPATION` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`OCCUPATION`)
) TYPE=MyISAM

CREATE TABLE `RESIDENCE` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`RESIDENCE` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`RESIDENCE`)
) TYPE=MyISAM

CREATE TABLE `LANGUAGE` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`LANGUAGE` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`LANGUAGE`)
) TYPE=MyISAM

CREATE TABLE `RELIGION` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`RELIGION` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`RELIGION`)
) TYPE=MyISAM

where ID is the unique identifier for the person. The data type of the second column is unsigned tinyint as the values have been mapped to inetegers.

This is how the query looks like:-

select ID from OCCUPATION LEFT JOIN RESIDENCE ON OCCUPATION.ID=RESIDENCE.ID LEFT JOIN LANGUAGE ON OCCUPATION.ID=LANGUAGE.ID LEFT JOIN RELIGION ON OCCUPATION.ID=RELIGION.ID WHERE (OCCUPATION=<VALUE OF OCCUPATION - SOFTWARE ENGINEER> OR OCCUPATION IS NULL) AND (RESIDENCE=<VALUE OF RESIDENCE - NEW YORK> OR RESIDENCE IS NULL) AND (LANGUAGE=<VALUE OF LANGUAGE - ENGLISH> OR LANGUAGE IS NULL) AND (RELIGION=<VALUE OF RELIGION - BORN AGAIN CHRISTIAN> OR RELIGION IS NULL)

IS NULL has been used as the user is allowed to say that a particular parameter (say LANGUAGE) could be anything, in which case there is no record for that user in the table.

This would give all the people whose required attributes are matched by the user whose logged in. This query worked really well, say about 6 months back but now with the number of records going up and the fact that there are 15 such parameters and not just 4, the queries have started taking a lot of time - 20 to 30 seconds on a quadprocessor with 6 GB RAM.

Can somebody suggest a better way of doing this - probably a change in the database schema, i.e. how the same could be managed in a single table doing away with the cost of joins?

Thanks in advance!
Vikas Jayna
Link to comment
Share on other sites

[quote author=Vikas Jayna link=topic=111181.msg451767#msg451767 date=1160803884]
Then how do I store multiple values in a column against one id and still be able to search on the column?
[/quote]
I don't understand.. "multiple values"?  Each attribute has only one value, but you can enter multiple records.
Link to comment
Share on other sites

  • 2 weeks later...
Ok! I'll try to explain with an example. Supose a user XYZ chooses the following values:-

For Occupation - values A,B and C
For Religion - values D,E and F
For Language - values G,H and I
For Residence - values J,K and L

Now there are various ways of creating the database structure:-
1) [b]Having multiple tables[/b] ( one each for Occupation, Religion, Language and Residence) - all the four tables will have two columns, one containing the user id and the other the value. In this case, there will be three rows in each table for user XYZ as three values have been selected for each. [b]This is the database structure I have at present[/b] but to search this database for those users that have chosen say - A in Occupation and E in Religion and I in language and L in residence, will require a join between the four tables.

2) Another way is to use a [b]single table[/b] and have the values A,B and C stored [b]comma delimited[/b] in a column Occupation. This will require the usage of 'like clause' for searching the user with the criteria given above and 'like clause' will not use indexes which will make the search slower

3) Another way is to use a [b]single table[/b] that is [b]de-normalised[/b] such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.

Regards,
Vikas Jayna
Link to comment
Share on other sites

[quote author=Vikas Jayna link=topic=111181.msg456161#msg456161 date=1161588610]
3) Another way is to use a [b]single table[/b] that is [b]de-normalised[/b] such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.
[/quote]
This _is_ normalized -- I don't know why you care about the table being huge, and why you think it is impractical.
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.