SchweppesAle Posted May 27, 2010 Share Posted May 27, 2010 Trying to figure out what's wrong with the following query. doesn't work(returns an empty set) SELECT id, email, COUNT(email) AS NumOccurrences FROM huge_email_list GROUP BY id HAVING ( COUNT(email) > 1 ) works just fine SELECT id, email, COUNT(email) AS NumOccurrences FROM huge_email_list GROUP BY email HAVING ( COUNT(email) > 1 ) Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/ Share on other sites More sharing options...
PFMaBiSmAd Posted May 27, 2010 Share Posted May 27, 2010 For the first one, if id is a unique index, your groups will only have one row in each of them and the count() will never be greater-than one. Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/#findComment-1064173 Share on other sites More sharing options...
SchweppesAle Posted May 27, 2010 Author Share Posted May 27, 2010 For the first one, if id is a unique index, your groups will only have one row in each of them and the count() will never be greater-than one. hmmm..I actually thought GROUP BY was used to return only unique entries for the specified column. So "GROUP BY email" would only return one entry which contains that specific email. That's why I had assumed GROUP BY id was unnecessary since it's an index. Is this not the case? Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/#findComment-1064185 Share on other sites More sharing options...
SchweppesAle Posted May 27, 2010 Author Share Posted May 27, 2010 It might be more useful if I describe exactly what I'm trying to do. We have a huge table containing email addresses(>5million) and I'd like to write a query which singles out duplicate email address entries. The second query(appears?) to work but due to the GROUP BY email clause it won't display all duplicate entries for a specific email address and their associated ids. I'd like to modify it slightly so that it displays all duplicate entries per email address. What would be the correct way of doing this? Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/#findComment-1064186 Share on other sites More sharing options...
PFMaBiSmAd Posted May 27, 2010 Share Posted May 27, 2010 You would need to use the HAVING query as a subquery - SELECT * from huge_email_list WHERE id IN (SELECT id FROM huge_email_list GROUP BY email HAVING ( COUNT(email) > 1 )) ORDER BY email Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/#findComment-1064192 Share on other sites More sharing options...
PFMaBiSmAd Posted May 27, 2010 Share Posted May 27, 2010 Actually, here is a query that will work (the above one won't) - SELECT * from huge_email_list WHERE email IN (SELECT email FROM huge_email_list GROUP BY email HAVING ( COUNT(email) > 1 )) Quote Link to comment https://forums.phpfreaks.com/topic/203097-group-by/#findComment-1064216 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.