Jump to content

Help with complex SELECT statement requiring GROUP BY on 2 columns from 2 tables


Spiz

Recommended Posts

Hello all,

 

My SQL knowledge is moderate and this is the most complex SELECT statement I've yet to attempt and any help with it is appreciated in advace.

 

For the record, it looks like the MySQL DB I'm using is version 4.

 

I have 2 tables I need to select from at the same time. Their structures with the relevant columns are as follows:

 

MERCH_CATEGORIES

===============

id (INT, primary key)

name (VARCHAR)

...others

 

MERCHANDISE

===========

item_id (VARCHAR, not unique)

size (VARCHAR)

categories (VARCHAR, for example could be '3,5,12')

...others

 

My goal is to select the different categories from MERCH_CATEGORIES and count the number of rows from MERCHANDISE that contain the MERCH_CATEGORIES.id in MERCHANDISE.categories

 

Some sample data from each table:

 

MERCH_CATEGORIES

===============

3 (id)

Pants (name)

 

MERCHANDISE

==========

71, S, 3 (item_id, size, categories)

71, M, 3

72, S, 3

...etc

 

My existing SELECT statment is valid and is one step from what I need it to be:

 

SELECT merch_categories.id, merch_categories.name, COUNT(merch_categories.name) AS num

FROM merch_categories

LEFT JOIN merchandise

ON merchandise.categories = merch_categories.id OR merchandise.categories LIKE 'merch_categories.id,%' OR merchandise.categories LIKE '%,merch_categories.id,%' OR merchandise.categories LIKE '%,merch_categories.id'

GROUP BY merch_categories.id

 

From the above statement, I get the following result:

merch_categories.id = 3

merch_categories.name = Pants

num = 275

 

This is great, since there are indeed 275 rows in MERCHANDISE that contain a '3' in its categories coulum.

 

The issue: Rows in MERCHANDISE may have duplicate item_id values, because although they are the same item, they may come in multiple sizes, colors etc.

 

What I'd like to do is perform a second GROUP BY (or utilize the UNIQUE identifier) on merchandise.item_id to weed out the duplicates. Although there are 275 rows (styles) in the pants category, there are really only 73 different types of pants.

 

I realize additional table normalization may make this easier to acheive, however I don't have control of the tables and data themselves.

 

Any help on how to solve this with a second GROUP BY, UNIQUE keyword, nested SELECT or other means would be appreciated.

 

Hopefully I've managed to explain the situation well enough  :-\

Link to comment
Share on other sites

Hi

 

Something like this maybe:-

 

SELECT merch_categories.id, merch_categories.name, COUNT(DISTINCT merchandise.item_id) AS num 
FROM merch_categories 
LEFT JOIN merchandise 
ON merchandise.categories = merch_categories.id OR merchandise.categories LIKE 'merch_categories.id,%' OR merchandise.categories LIKE '%,merch_categories.id,%' OR merchandise.categories LIKE '%,merch_categories.id' 
GROUP BY merch_categories.id , merch_categories.name

 

However using LIKE with a comma separated list of keys to do the table join on is likely to be hideously inefficient.

 

All the best

 

Keith

Link to comment
Share on other sites

Keith - brilliant, thank you.

 

I figured a veteran would not only know what neeeded to be added, but also where which can trip me up when these querys start getting large.

 

It seems so obvious now that I can see it.

 

An inefficient statement indeed with all the LIKEs. I'm hoping to have a say in how the categories can be better managed down the road. For what I've been given however, this seems like a viable solution.

 

Thanks again!

Jason

Link to comment
Share on other sites

This is not the method that should be used to store category listings. It should be a simple joining table. The only way to do this now is to explode the value in php after you select the data and count the values then"

 

Jay, thanks for this.

 

Were Keith's SQL solution not viable this is something I would have pursued. My PHP knowledge itself is mediocre, as I'm used to ASP where I would have taken a not-so-optimal approach of using two queries and two recordsets to battle through it.

 

Indeed I realize the true solution is in the table design itself.

 

Thank you for introducing 'explode' to my PHP vocabulary...I will be putting it to good use.

Jason

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.