# Standard Deviance and Outlier help

3 replies to this topic

### #1 kfreak

kfreak
• New Members
• Newbie
• 2 posts

Posted 13 October 2003 - 01:40 AM

Basically I\'m just trying to switch my current sql statement which finds averages to do the same, but omit outliers. Here\'s my first one:

`"SELECT a.*, AVG(b.rating) as rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id GROUP BY b.rid ORDER BY rating DESC"`

Works fine, but say you have the set of votes (from 1-10)
7
7
7
8
6
1

The outlier would obviously be one, and it would throw the voting. So I remembered finding the IQR in math class (I was so shocked that I was using something I actually learned, I\'ll have to tell my math teacher) to calculate the outliers after you\'ve found the 25th and 75th quartiles. But the problem is with the sql statement, this is what I\'ve come up with.

`"SELECT a.id, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id && (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675)) GROUP BY b.rid"`

I\'m pretty sure the logic works, but my problem is it gives me an error. This is the error from mysql_error():

`Invalid use of group function`

I\'ve searched google and all the programming forums that I know of, that\'s part of how I initially came up with finding the standard deviance to check for outliers.

Anyway, any help or suggestions would be great.

Thanks.
~ www.devrand.com

### #2 Barand

Barand
• Moderators
• Sen . ( ile || sei )
• 18,017 posts

Posted 13 October 2003 - 06:47 PM

You cannot group by something you haven\'t selected.

Try ... GROUP BY a.id ...
If you are still using mysql_ functions, STOP! Use mysqli_ or PDO. The longer you leave it the more you will have to rewrite.

 |baaGrid| easy data tables - and more|baaChart| easy line, column and pie charts

### #3 kfreak

kfreak
• New Members
• Newbie
• 2 posts

Posted 14 October 2003 - 12:51 AM

I fixed what you suggested. I think I just did that while trying all the different ways trying to get it to work. I\'ve managed to get this to work in 2 seperate queries but I really hate to do that if I have to. This is what I\'m working with now.

`"SELECT a.id, b.rid, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id && (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675)) GROUP BY b.rid"`

I still get the same error.
~ www.devrand.com

### #4 Barand

Barand
• Moderators
• Sen . ( ile || sei )
• 18,017 posts

Posted 14 October 2003 - 06:33 AM

You\'re selecting where a.id = b.rid therefore, if they are always the same, why select both?

SELECT a.id, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id
GROUP BY a.id, b.rating
HAVING (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675))
If you are still using mysql_ functions, STOP! Use mysqli_ or PDO. The longer you leave it the more you will have to rewrite.