Jump to content

[SOLVED] Unique records on a union


kickstart

Recommended Posts

Hi

 

Trying to get my head round this problem. Suspect the solution is something simple that I have missed.

 

I have a table with several fields which I want to search for a particular term, basically name, address and email address. I want a limit of 10 rows returned, I want the matches on name first, then address and then email address (if the total is still under 10). And I don't want any duplicates.

 

SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'A' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(Name) LIKE '$searchFieldName' 
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'B' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(Address) LIKE '$searchFieldName' 
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'C' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(EmailAddress) LIKE '$searchFieldName' 
ORDER BY MatchType, Name

 

Trouble with this is that the UNION brings back rows that are duplicated due to the MatchType field (which I want as I want the matches on name first, then the matches on address and then the matches on email address).

 

Only way I can see round this is to specifically exclude prior matches on the subsequent selects like this :-

 

SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'A' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(Name) LIKE '$searchFieldName' 
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'B' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(Address) LIKE '$searchFieldName' 
AND LoaneeId NOT IN (SELECT LoaneeId
FROM loanee_table
WHERE Active = 1 AND upper(Name) LIKE '$searchFieldName' )
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 'C' as MatchType
FROM loanee_table
WHERE Active = 1 AND upper(EmailAddress) LIKE '$searchFieldName' 
AND LoaneeId NOT IN (SELECT LoaneeId
FROM loanee_table
WHERE Active = 1 AND upper(Name) LIKE '$searchFieldName' 
UNION
SELECT LoaneeId
FROM loanee_table
WHERE Active = 1 AND upper(Address) LIKE '$searchFieldName' )
ORDER BY MatchType, Name

 

which looks likely to be hideously inefficient to me.

 

Any suggestions for an efficient way of doing this?

 

All the best

 

Keith

Link to comment
Share on other sites

Hi

 

Possible, but think I would have issues as the array items are not unique (due to the Matchtype), as I would have to pass an array of arrays. I could get rid of the matchtype so array_unique would work, but if I did that I could remove it from the SELECT statements and the UNION would take care of the duplicates. However doing that would make the order incorrect (as I want matches on the first field, then matches on the 2nd field, then matches on the 3rd field).

 

Cheers for the suggestion.

 

All the best

 

Keith

Link to comment
Share on other sites

I disagree... the UNION will return the rows in the "order" the select statements are provided;

 

While it may work, the MySQL online pages tell you to use an extra field to force this order, which suggests that this is not certain (and it is heavily recommended against relying on any default order in SQL in general). To cause rows in a UNION result to consist of the sets of rows retrieved by each SELECT one after the other, select an additional column in each SELECT to use as a sort column and add an ORDER BY following the last SELECT:.

 

All the best

 

Keith

Link to comment
Share on other sites

Hi

 

Yes, but that relies on each SELECT in the UNION coming back in order, and the MySQL documentation suggests that to get this to happen you need to add a field to each SELECT and sort on that extra column (which is what I have done and which prevents the elimination of duplicates). While it might work it seems far from certain it working is a design consideration in MySQL.

 

All the best

 

Keith

Link to comment
Share on other sites

While it might work it seems far from certain it working is a design consideration in MySQL.

 

I assume you're referring to this quote from the manual?

However, use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result because UNION by default produces an unordered set of rows

I've never actually had this be a problem... the only "workaround" is to use a GROUP BY and "filter" the rows, taking advantage of mysql's loose group by restrictions.  But then you might get a random A/B/C... do you care?

Link to comment
Share on other sites

Hi

 

Unfortunatly with this I do care. It is to provide a drop down list of people for an Ajax call for the user to select when they type, hence I don't want Mr Smith with the email address fred@smith.com, and I want the matches on name to appear before matches on other fields.

 

There are various ways round it (excluding previous matches but using a NOT IN on subselects that are the same as the previous selects, or manually processing it in php), but all appear to me to be inefficient fudges.

 

All the best

 

Keith

Link to comment
Share on other sites

Hi

 

In my original post you can see I have 3 SELECT statements unioned. I sort the results of this to make sure that the results or the first SELECT come 1st, second SELECT 2nd, etc, and have a constant on each set of results so this sort works (basically what the MySQL pages advise).

 

Imagine searching for "smith" and there is a row for "Mr Smith", of "Smith Street" with an email address of "John@Smith.com". Each one of the SELECTs would bring back that same row.

 

With a normal set of UNIONs they would be treated as duplicates and thus would only appear once. However because I have added the constant to use for the sort to ensure the results come back in the right order the rows are no longer exactly the same so are not eliminated as duplicates.

 

While I can see ways round it none of them are elegant.

 

All the best

 

Keith

Link to comment
Share on other sites

I only now realize that you're UNIONing the *same* table!  This is the wrong approach --- what you want is weighted matching.  Then you don't need any magical union.  Simply assign a higher "score" for a match on name than on e-mail, and sort on this column.

Link to comment
Share on other sites

Hi

 

Inspired by you suggestion I came up with this.

 

SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, SUM( MatchType ) 
FROM (
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 4 AS MatchType
FROM loanee_table
WHERE Active =1
AND upper( Name ) LIKE '%TEST%'
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 2 AS MatchType
FROM loanee_table
WHERE Active =1
AND upper( Address ) LIKE '%TEST%'
UNION 
SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress, 1 AS MatchType
FROM loanee_table
WHERE Active =1
AND upper( EmailAddress ) LIKE '%TEST%'
) AS Deriv1
GROUP BY LoaneeId
ORDER BY MatchType DESC , Name

 

Not perfect but basically gives me what I want.

 

All the best

 

Keith

Link to comment
Share on other sites

BTW, if you have a case-insensitive collation, you're just killing the index that way.

 

I actually meant this:

 

SELECT LoaneeId, Name, Address, PhoneNumber, EmailAddress,
CASE 
WHEN upper( Name ) LIKE '%TEST%' THEN 4 
WHEN upper( Address ) LIKE '%TEST%' THEN 2 
WHEN upper( EmailAddress ) LIKE '%TEST%' THEN 1 
ELSE 0
END ) AS score
FROM loanee_table
WHERE Active =1
AND ( 
upper( Name ) LIKE '%TEST%'
OR upper( Address ) LIKE '%TEST%'
OR  upper( EmailAddress ) LIKE '%TEST%'
)
ORDER BY score DESC

Link to comment
Share on other sites

Hi

 

Does seem to do it nicely.

 

Only thing I wanted to have a play with was using "HAVING score > 0" (bit of abuse of the HAVING statement) instead of repeating the LIKE clauses in the WHERE. However that seems to slow things down.

 

All the best

 

Keith

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.