Jump to content

Hierarchical Categories and Counting Items in One Call


Recommended Posts

I've been searching for a few days and have found a lot of answers very similar to what I'm looking for, but nothing that helps me accomplish what I'm trying to do.  I'm trying to approach in the most efficient way possible, so please let me know if I'm way off base here.

 

I'm trying to pull of a menu system similar to what Newegg has.  For example, check out the menu on the left in the 'Sound Cards' section (http://www.newegg.com/Store/Category.aspx?Category=36&name=Sound-Cards).  I want something like this with more levels, but it would also show the item count next to the subcategory names.

 

I've got a single table of categories ('categories') that goes no deeper than 4 levels in hierarchical structure.  I've got a second table of business listings ('listings') which are basically user accounts.  Finally, I've got a third table ('category_connections') that connects a business listing to as many different categories as I'd like (one category per entry in this table).

 

Here's a simplified version of the 'categories' table:

id 

parent_id 

category_name

1

0

Hunting

2

0

Fishing

3

1

Deer

Child of 'Hunting'

4

1

Duck

Child of 'Hunting'

5

2

Bass

Child of 'Fishing'

6

2

Crappie

Child of 'Fishing'

 

Here's the 'listings' table:

id 

listing_name

1

Joe's Duck Hunting and Bass Fishing

2

Gary's Deer Hunting

3

Bob's Fishing

 

And here's the 'category_connections' table:

id 

listing_id 

category_id 

[/td]

[td]1

1

5

Joe's in 'Bass'

2

1

4

Joe's in 'Duck'

3

2

3

Gary's in 'Deer'

4

3

2

Bob's in the general category 'Fishing.'  You can see here that listings won't always be in the end leaf nodes of the hierarchical structure.

 

What I'm trying to do is come up with a single call that can walk through each category in 'categories' relevant to my search as well as each of their child categories.  It should also join 'category_connections' so that it can return the count for all listings classified under that category (including subcategories), taking into account the fact that a single business listing can be in more than one category (so we can't count them more than once).

 

Below is a flawed demonstration to give you an idea of what I'm trying to achieve.  For simplicity's sake, I'll only join the 'categories' table on itself once here, but normally it'd need to be done 4 times.

SELECT cat1.*, COUNT(category_connections.id) AS count FROM categories AS cat1
     LEFT JOIN categories AS cat2 ON cat2.parent_id = cat1.id
     RIGHT OUTER JOIN category_connections ON (category_connections.category_id = cat1.id OR category_connections.category_id = cat2.id)
     WHERE cat1.parent_id = '0'
     GROUP BY cat1.id ORDER BY cat1.category_name ASC

 

Theoretically, I would like for this to return all the details for the 'Hunting' and 'Fishing' categories as well as a 'count' variable for each indicating the number of listing connections that are classified within that category or at any level below that category.  For example:

  • Hunting (2) - 1 for Joe's and 1 for Gary's
  • Fishing (2) - 1 for Joe's and 1 for Bob's

However, in reality I am getting something like this:

  • Hunting (2)
  • Fishing (3) - 1 more than it should be counting

Joining the 'categories' table on itself causes the number of connections returned for the parent category (in this case, Bob's in 'Fishing') to be multiplied by the number of subcategories within that parent category ('Bass' and 'Crappie').  Thus, I am getting 3 for 'Fishing' in this example rather than 2.  The problem compounds when I join 'categories' on itself more than once.  Looking at the query, I understand why it's doing this.  I'm just not sure what my query should be to eliminate this problem.  Any ideas here?

 

Additionally, I need to add something else to this to make sure that the same business listing is not counted more than once if it is in multiple subcategories of the same category (like if Joe's were to also have a connection to 'Deer').  Naturally, I would go to something like 'GROUP BY category_connections.id', but I'm not sure I can do that since I'm already grouping by 'cat1.id'.

 

I get the feeling that I'm going to have to scrap this and use a bunch of nested SELECT queries, but I figured I'd ask the pros first.  I appreciate the help!

One troubleshooting step I do if I get an incorrect count in an aggregate query like that is to remove the group and do a select * in phpMyAdmin.  Then I look to see what is causing the additional row(s) that the count is returning.  Once you get the select * query to return the correct number of rows if you count them manually, then add the 'group by' and the count() column.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.