Jump to content


Photo

Multitable query syntax


  • Please log in to reply
15 replies to this topic

#1 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 25 December 2005 - 11:00 PM

CREATE TEMPORARY TABLE Results SELECT
(attractiveness + originality + variety + video_quality)/4 as rating,
sites. * , categories.site_id, count( categories.category ) as relev
FROM categories, sites, ratings
WHERE categories.site_id = sites.id AND sites.id = ratings.site_id and ($includes)
GROUP BY site_id,id";


This query takes a site, gets the number of categories it matches (to determine relevance) and then goes to the ratings table and averages the fields, giving a number (to determing ranking).

Problem: Sites that haven't been ranked are not appearing. Yet I must say where sites.id = ratings.site_id or else I get very strange results, such as all ratings appearing the same for all sites.

#2 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 26 December 2005 - 01:37 AM

You need to use a LEFT JOIN -- with the multitable select above, you're only including records that have matching rows in both tables. For the non-matching rows, you'll get NULLs, which should be fine for the COUNT() function.

Try the following (UNTESTED):

CREATE TEMPORARY TABLE Results SELECT
(attractiveness + originality + variety + video_quality)/4 as rating,
sites. * , categories.site_id, count( categories.category ) as relev
FROM sites 
LEFT JOIN categories ON ( categories.site_id = sites.id ) 
LEFT JOIN ratings ON ( ratings.site_id = sites.id )
WHERE $includes 
GROUP BY site_id,id";

Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#3 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 03 January 2006 - 07:03 AM

Perfect except for one thing (which I didn't account for when I posted).

I can't simply take the ratings and divide them by 4 because a site may have been rated by more than one person.

How can I take the ratings (attractiveness, originality, etc...) and divide them by the number of entries which a given id appears?

Instead of:

(a + b + c)/4

I want

(a+b+c)/4/*number of times site has been reviewed*

#4 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 03 January 2006 - 09:39 AM

Sounds like you want the average rating; try AVG(....)/4 instead. BTW, you don't need to GROUP BY id at the end, since by definition, it's unique.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#5 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 03 January 2006 - 05:52 PM

[!--quoteo(post=332698:date=Jan 3 2006, 09:39 AM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jan 3 2006, 09:39 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Sounds like you want the average rating; try AVG(....)/4 instead. BTW, you don't need to GROUP BY id at the end, since by definition, it's unique.
[/quote]

I tried that....
AVG(attractiveness + watchability + originality + variety)/4 as rating

Sites that have this rating:

site id blah blah blah blah blah
6 3 3 3 3 3

Return rating=6!



#6 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 03 January 2006 - 06:28 PM

Now that doesn't make any sense... try running the AVG() function on a few rows, and you'll see how it's supposed to work. I don't know specifically why you're getting a different output. Remember, this is an average across all of the sites' rating, so I don't understand why you're posting just a single site rating record in your example.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#7 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 04 January 2006 - 03:00 AM

[!--quoteo(post=332804:date=Jan 3 2006, 06:28 PM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jan 3 2006, 06:28 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Now that doesn't make any sense... try running the AVG() function on a few rows, and you'll see how it's supposed to work. I don't know specifically why you're getting a different output. Remember, this is an average across all of the sites' rating, so I don't understand why you're posting just a single site rating record in your example.
[/quote]

I need this:

Site name, url, description, relevancy (number of times it appears in 'categories') rating (average rating divided by the number of times it appears in 'ratings')

#8 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 04 January 2006 - 06:44 AM

First, I meant to write AVG( ( .... / 4 ) ), with the division _inside_ the average function. Second, if that doesn't produce the desired output, just post an example of a site_id with multiple ratings -- just those rows -- and I will show you what I mean.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#9 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 04 January 2006 - 04:56 PM

[!--quoteo(post=333046:date=Jan 4 2006, 06:44 AM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jan 4 2006, 06:44 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
First, I meant to write AVG( ( .... / 4 ) ), with the division _inside_ the average function. Second, if that doesn't produce the desired output, just post an example of a site_id with multiple ratings -- just those rows -- and I will show you what I mean.
[/quote]

Didn't change anything.

example:

site_id,usability,features, etc

4, 5, 3, 2
4, 4, 1, 2
4, 5, 2, 2

Should be

(site_id) 4, 4.75, 2, 2

#10 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 04 January 2006 - 06:51 PM

That's different than before -- you were combining all of the score categories and average over them. Now you want an average _within_ each category. The following would general the desired output given the rows you posted:

SELECT AVG(usability), AVG(features), AVG(etc) FROM some_table GROUP BY site_id

Does this help?
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#11 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 06 January 2006 - 04:05 AM

I have changed the way I'm going to do it. I'm just going to have users rate sites by 1-5, no different categories.

I have a problem, however.

I find that sites that have been rated more than once are receiving double 'relev' from count('category.categories')

Also, I still need to do all this within one query (because I am using MySQL to sort them) so I need to get the average rating in one shot, and then call it 'rating' within my temp table.

#12 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 06 January 2006 - 05:39 AM

Now that you've changed the question, I have no idea what the problem is anymore. Post your current query -- because I don't see how your COUNT() could be wrong if you're GROUPing BY site_id.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#13 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 06 January 2006 - 11:14 PM

[!--quoteo(post=333812:date=Jan 6 2006, 05:39 AM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jan 6 2006, 05:39 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Now that you've changed the question, I have no idea what the problem is anymore. Post your current query -- because I don't see how your COUNT() could be wrong if you're GROUPing BY site_id.
[/quote]

Ok here goes:

Table 'sites' (id, url, description)
Table 'categories' (site_id, category) <-- site_id will appear twice if site has been placed into two diff categories
Table 'ratings' (site_id, rating) <-- site_id will appear twice if the site has been rated twice

Main condition: I must get this done in one fell swoop using a temporary table because results must be sortable and I'm choosing MySQL rather than php Array_sort() to sort them. Comment on wisdom of this?

What I need:

I need a table that looks like

id, url, description, count(categories) as relev, avg(rating) as rating
                CREATE TEMPORARY TABLE Results select
                s.*,avg(r.rating) as rating
                from sites as s
                LEFT JOIN ratings as r on (r.site_id = s.id)
                where
                r.site_id = s.id group by site_id

Problem with this:

Sites that have been rated twice are simply having their ratings added together, not averaged. Sites that have been rated once are having their ratings doubled.



#14 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 06 January 2006 - 11:22 PM

Where is the part of the query that deal with the categories table?
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#15 yeago

yeago
  • Members
  • PipPip
  • Member
  • 20 posts

Posted 06 January 2006 - 11:41 PM

[!--quoteo(post=334148:date=Jan 6 2006, 11:22 PM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jan 6 2006, 11:22 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Where is the part of the query that deal with the categories table?
[/quote]

Well, actually there are two queries, one that deals with them and one that doesn't, considering that the user may not have specified categories to add or exclude from search.

Here it is anyway:

I removed reference to the rating for now since I can't get it right.


CREATE TEMPORARY TABLE Results SELECT
s.*, count( c.category ) as relev
FROM sites as s
LEFT JOIN categories as c on (c.site_id = s.id)
WHERE ($includes) and c.site_id = s.id
GROUP BY s.id,c.site_id";


#16 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 07 January 2006 - 01:16 AM

[!--quoteo--][div class=\'quotetop\']QUOTE[/div][div class=\'quotemain\'][!--quotec--]Problem with this:

Sites that have been rated twice are simply having their ratings added together, not averaged. Sites that have been rated once are having their ratings doubled.[/quote]

That's impossible. Something else is going on... there's no way that this query:

SELECT s.id,AVG(r.rating) FROM sites AS s LEFT JOIN ratings AS r ON( r.site_id = s.id ) WHERE r.site_id = s.id GROUP BY s.id


could possibly be doing what you've described. Try this query on two new tables, one with a few site IDs, and the other with 3 ratings for one of them and just 1 for the other. You'll see that in this simple scenario, everything works as expected.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users