Jump to content

UNION DISTINCT not so distinct


dustinnoe

Recommended Posts

I have the following query

 

SELECT qt.*, tag.id AS tag_id, users.username, pg.*
FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg
WHERE tag_map.tag_id = tag.id
AND (tag.tag_name IN ( 'life'))
AND qt.id = tag_map.this_id
GROUP BY qt.id
HAVING COUNT( qt.id )= 1
UNION DISTINCT
SELECT qt.*, tag.id AS tag_id, users.username, pg.*
FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg
WHERE tag_map.tag_id = tag.id
AND (tag.tag_name IN ( 'death'))
AND qt.id = tag_map.this_id
GROUP BY qt.id

 

This query searches for tagged items that include the word life or death.  The problem I am running into is when an item has been tagged with both words (life and death) the same row is returned twice.  I know that by default UNIONs are supposed to drop duplicates but it is not so I tried UNION DISTINCT but get the same result.  Where has my logic gone wrong?

Link to comment
Share on other sites

From what I can tell of what you're trying to do, there's no need for a UNION at all. I am however confused by the 'having' clause.

 

Anyway, try simply this:

 

SELECT qt.*, tag.id AS tag_id, users.username, pg.*

FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg

WHERE tag_map.tag_id = tag.id

AND (tag.tag_name IN ( 'life', 'death'))

AND qt.id = tag_map.this_id

GROUP BY qt.id

 

 

The group by will remove the duplicates.

 

Link to comment
Share on other sites

From what I can tell of what you're trying to do, there's no need for a UNION at all. I am however confused by the 'having' clause.

 

Anyway, try simply this:

 

SELECT qt.*, tag.id AS tag_id, users.username, pg.*

FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg

WHERE tag_map.tag_id = tag.id

AND (tag.tag_name IN ( 'life', 'death'))

AND qt.id = tag_map.this_id

GROUP BY qt.id

 

 

The group by will remove the duplicates.

 

This query will return items tagged with both 'life' AND 'death' I need to return items tagged with  'life' OR 'death'

Link to comment
Share on other sites

dustinnoe, did you try the query toplay suggested?

 

The only thing I see it doesn't do is check your "HAVING" condition.  That's fine if you don't mind applying the condition to both queries, in which case you can just add it to toplay's query.

Link to comment
Share on other sites

I have tried toplay's query it only returns life OR death (I said it backwards in the previous post)

 

My original query works great other than the fact that it returns duplicate rows in certain situations. I just need to know what I am doing wrong with the UNION.

 

The HAVING condition is required to meet the needs of my tag search.  <a href="http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html">Read this article</a> under the head "'Toxi' Solution" to see why.

 

Thanks.

Link to comment
Share on other sites

Ok, I read the article.  I'm very confused about what you are trying to do though.

 

1.  Why are you using UNION with Toxi's queries?  They already handle OR and AND.

2.  Do you want life AND death or life OR death?  Simple IN works for OR.  IN combined with HAVING works for AND.  What is it that you need to do that cannot be expressed using Toxi's queries?

 

Regardless of that, I don't understand why the union would return duplicate rows in the result.  Are the rows that are duplicated exact duplicates?

Link to comment
Share on other sites

I am anticipating more complex search strings like this:

 

life & run & jump | death | sleep

 

The only way I can see to accomplish this according to the Toxi Solution is by using a UNION like so:

 

SELECT qt.*, tag.id AS tag_id, users.username, pg.*
FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg
WHERE tag_map.tag_id = tag.id
AND (tag.tag_name IN ( 'life', 'run', 'jump'))
AND qt.id = tag_map.this_id
GROUP BY qt.id
HAVING COUNT( qt.id )= 3
UNION DISTINCT
SELECT qt.*, tag.id AS tag_id, users.username, pg.*
FROM tag, tag_map, tag_type, qt, users, privacy_global AS pg
WHERE tag_map.tag_id = tag.id
AND (tag.tag_name IN ( 'death', 'sleep'))
AND qt.id = tag_map.this_id
GROUP BY qt.id

Link to comment
Share on other sites

  • 2 weeks later...

You can't just drop them.. the grouping and having in each clause is how Toxi implements "AND".  If you dropped the inner group bys, you would lose the meaning of "life & run & jump" for example.  That requires the group by and "having count(qt.id) = 3".

 

What fenway is suggesting is just to do the query as it is, accept that it's going to produce duplicates, but then remove the duplicates by wrapping EVERYTHING inside another query that only does duplicate removal.  GROUP BY and DISTINCT are equivalent for straight duplicate removal.  And mysql lets you get away with only specifiying qt.id.  Something like:

 

SELECT * FROM ( ... your entire current query in here ... ) GROUP BY qt.id

 

So we will still have no idea why the union distinct doesn't work, but at least you'll be getting the results you want :)

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.