kfreak Posted October 13, 2003 Share Posted October 13, 2003 Basically I\'m just trying to switch my current sql statement which finds averages to do the same, but omit outliers. Here\'s my first one: "SELECT a.*, AVG(b.rating) as rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id GROUP BY b.rid ORDER BY rating DESC" Works fine, but say you have the set of votes (from 1-10) 7 7 7 8 6 1 The outlier would obviously be one, and it would throw the voting. So I remembered finding the IQR in math class (I was so shocked that I was using something I actually learned, I\'ll have to tell my math teacher) to calculate the outliers after you\'ve found the 25th and 75th quartiles. But the problem is with the sql statement, this is what I\'ve come up with. "SELECT a.id, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id && (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675)) GROUP BY b.rid" I\'m pretty sure the logic works, but my problem is it gives me an error. This is the error from mysql_error(): Invalid use of group function I\'ve searched google and all the programming forums that I know of, that\'s part of how I initially came up with finding the standard deviance to check for outliers. Anyway, any help or suggestions would be great. Thanks. Quote Link to comment Share on other sites More sharing options...
Barand Posted October 13, 2003 Share Posted October 13, 2003 You cannot group by something you haven\'t selected. Try ... GROUP BY a.id ... Quote Link to comment Share on other sites More sharing options...
kfreak Posted October 14, 2003 Author Share Posted October 14, 2003 Thanks for the reply. I fixed what you suggested. I think I just did that while trying all the different ways trying to get it to work. I\'ve managed to get this to work in 2 seperate queries but I really hate to do that if I have to. This is what I\'m working with now. "SELECT a.id, b.rid, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id && (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675)) GROUP BY b.rid" I still get the same error. Quote Link to comment Share on other sites More sharing options...
Barand Posted October 14, 2003 Share Posted October 14, 2003 You\'re selecting where a.id = b.rid therefore, if they are always the same, why select both? SELECT a.id, b.rating FROM rand_thoughts a, ratings b WHERE b.rid=a.id GROUP BY a.id, b.rating HAVING (b.rating > (AVG(b.rating)-STD(b.rating))) && (b.rating < (AVG(b.rating)+STD(b.rating) * .675)) Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.