Jump to content


Photo

Need a mysql query that'll do a reverse search in quick time


  • Please log in to reply
7 replies to this topic

#1 Vikas Jayna

Vikas Jayna
  • Members
  • PipPipPip
  • Advanced Member
  • 121 posts
  • LocationNoida, Delhi, India

Posted 11 October 2006 - 10:19 AM

Hi guys!

This one's a tricky question - I am running a matrimonial site in which there a million users registered wherein each user specifies the kind of person he/she is looking for by choosing the relevant attributes in some 15 odd parameters. For e.g.

Occupation, Place of residence, Language spoken, Religion etc.

Now, most of these parameters are multiple select i.e. the person may be open for people of 3 different occupations and here lies the problem - the search that I want to devise is to display those people who are looking for a person with the attributes same as that of the user whose viewing the search results(This is what I mean by reverse search).

So, if user A is working as a software engineer in New York, speaks english and is a Born again christian, then I need to find all those people looking for this person. Quite some time back I had developed the following mechanism for finding this out:-

For each of the parameters I have a separate table:

CREATE TABLE `OCCUPATION` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`OCCUPATION` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`OCCUPATION`)
) TYPE=MyISAM

CREATE TABLE `RESIDENCE` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`RESIDENCE` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`RESIDENCE`)
) TYPE=MyISAM

CREATE TABLE `LANGUAGE` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`LANGUAGE` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`LANGUAGE`)
) TYPE=MyISAM

CREATE TABLE `RELIGION` (
`ID` mediumint(11) unsigned NOT NULL default '0',
`RELIGION` tinyint(3) unsigned NOT NULL default '0',
UNIQUE KEY `abc1` (`ID`,`RELIGION`)
) TYPE=MyISAM

where ID is the unique identifier for the person. The data type of the second column is unsigned tinyint as the values have been mapped to inetegers.

This is how the query looks like:-

select ID from OCCUPATION LEFT JOIN RESIDENCE ON OCCUPATION.ID=RESIDENCE.ID LEFT JOIN LANGUAGE ON OCCUPATION.ID=LANGUAGE.ID LEFT JOIN RELIGION ON OCCUPATION.ID=RELIGION.ID WHERE (OCCUPATION=<VALUE OF OCCUPATION - SOFTWARE ENGINEER> OR OCCUPATION IS NULL) AND (RESIDENCE=<VALUE OF RESIDENCE - NEW YORK> OR RESIDENCE IS NULL) AND (LANGUAGE=<VALUE OF LANGUAGE - ENGLISH> OR LANGUAGE IS NULL) AND (RELIGION=<VALUE OF RELIGION - BORN AGAIN CHRISTIAN> OR RELIGION IS NULL)

IS NULL has been used as the user is allowed to say that a particular parameter (say LANGUAGE) could be anything, in which case there is no record for that user in the table.

This would give all the people whose required attributes are matched by the user whose logged in. This query worked really well, say about 6 months back but now with the number of records going up and the fact that there are 15 such parameters and not just 4, the queries have started taking a lot of time - 20 to 30 seconds on a quadprocessor with 6 GB RAM.

Can somebody suggest a better way of doing this - probably a change in the database schema, i.e. how the same could be managed in a single table doing away with the cost of joins?

Thanks in advance!
Vikas Jayna
Vikas Jayna,
Project Manager - Jeevansathi.com
7 yrs. exp. in LAMP
Certified Mysql DBA
Zend Certified Engineer

#2 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 11 October 2006 - 07:54 PM

Well, you can have a single table with all the attributes, like name/value pairs.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#3 Vikas Jayna

Vikas Jayna
  • Members
  • PipPipPip
  • Advanced Member
  • 121 posts
  • LocationNoida, Delhi, India

Posted 13 October 2006 - 09:42 AM

Could you elaborate how the table would look like and how it will be searched
Vikas Jayna,
Project Manager - Jeevansathi.com
7 yrs. exp. in LAMP
Certified Mysql DBA
Zend Certified Engineer

#4 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 13 October 2006 - 03:35 PM

Instead of having a different table for each attribute, simply have a table of attribute, and have the type as one of the columns.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#5 Vikas Jayna

Vikas Jayna
  • Members
  • PipPipPip
  • Advanced Member
  • 121 posts
  • LocationNoida, Delhi, India

Posted 14 October 2006 - 05:31 AM

Then how do I store multiple values in a column against one id and still be able to search on the column?
Vikas Jayna,
Project Manager - Jeevansathi.com
7 yrs. exp. in LAMP
Certified Mysql DBA
Zend Certified Engineer

#6 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 14 October 2006 - 05:40 PM

Then how do I store multiple values in a column against one id and still be able to search on the column?

I don't understand.. "multiple values"?  Each attribute has only one value, but you can enter multiple records.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#7 Vikas Jayna

Vikas Jayna
  • Members
  • PipPipPip
  • Advanced Member
  • 121 posts
  • LocationNoida, Delhi, India

Posted 23 October 2006 - 07:30 AM

Ok! I'll try to explain with an example. Supose a user XYZ chooses the following values:-

For Occupation - values A,B and C
For Religion - values D,E and F
For Language - values G,H and I
For Residence - values J,K and L

Now there are various ways of creating the database structure:-
1) Having multiple tables ( one each for Occupation, Religion, Language and Residence) - all the four tables will have two columns, one containing the user id and the other the value. In this case, there will be three rows in each table for user XYZ as three values have been selected for each. This is the database structure I have at present but to search this database for those users that have chosen say - A in Occupation and E in Religion and I in language and L in residence, will require a join between the four tables.

2) Another way is to use a single table and have the values A,B and C stored comma delimited in a column Occupation. This will require the usage of 'like clause' for searching the user with the criteria given above and 'like clause' will not use indexes which will make the search slower

3) Another way is to use a single table that is de-normalised such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.

Regards,
Vikas Jayna
Vikas Jayna,
Project Manager - Jeevansathi.com
7 yrs. exp. in LAMP
Certified Mysql DBA
Zend Certified Engineer

#8 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 24 October 2006 - 02:36 AM

3) Another way is to use a single table that is de-normalised such that if the user chooses 3 values for each of the four columns, then the total No. of rows in the table will be 3*3*3*3 = 81 rows for the user and then a simple where condition like 'occupation=A and religion=E and language=I and residence=L' will give the result but the table will become huge and hence even this solution is impractical.

This _is_ normalized -- I don't know why you care about the table being huge, and why you think it is impractical.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users