Jump to content

distinct select multiple columns


webguy262

Recommended Posts

Simple table with two cols: date & email.

 

I want to select unique (distinct ?) emails so I don't get dupes, but I also want the date column for the unique emails.

 

If I do this in phpmyadmin...

 

SELECT DISTINCT email, date
FROM `voter_emails`

 

... I get duplicate emails.

 

So I guess I need a subselect or temp table?

 

Can anyone help with syntax for that?

Link to comment
Share on other sites

"select distinct email, date and group by email"

 

really should be

 

"select email, date and group by email, date".

 

I recommend otherwise. Use

 

SELECT DISTINCT date, email from voter_emails

But if you are "getting duplicate emails", then what is exactly that you want?

 

Do you want the latest date of a given email? Then use:

SELECT max(date), email from voter_emails GROUP BY email

 

I suggest you review what you really intend to have, then, and only then, write the query. You have it backwards if you write the query, then determine what you want!

 

Don't use DISTINCT -- it's evil.

 

I disagree. What is evil is the fact that in SQL, tables can have duplicate records. That is what is evil. Why? Because SQL is intended to be an implementation of the Relational Model. The Relational Model is based upon the mathematical idea of relations. And relations are just sets.

And sets have no duplicates.

 

How did SQL  miss that very fundamental fact, escapes me.

 

That is the evil that is ought to be conquered.

 

DISTINCT is a workaround by the SQL committee so that SQL is a "truly relational language" (i..e, so that tables are indeed relations, no duplicate rows).

 

Are you suggesting that using DISTINCT is evil since it will achieve what the relational model wanted to achieve? I hope you do not intend that.

 

Of course, performance-wise, DISTINCT can be expensive. But: I'd rather have a slow program, rather than have a program that is not as precise as mathematics. I'd sacrifice efficiency for accuracy, as a principle.

Link to comment
Share on other sites

DISTINCT has nothing to do with sets -- you can ask for a subset of fields from "unique" rows that have duplicate tuples -- nothing violates SQL.

 

DISTINCT is evil because it's a very poor workaround to GROUP BY.  It doesn't do anything differently, but it can easily be misconstrued as a function call - I've seen it many times on this forum.  You're allowed to write "SELECT DISTINCT *" with a GROUP BY clause -- that's just evil.

 

As for tables allowing duplicates, that up to you, the DBA, when you inserted the record -- if you choose a truly UNIQUE value, with a unique constraint, you won't get so-called duplicates.

Link to comment
Share on other sites

Thanks to all who replied.

 

What I want is best captured by ...

 

SELECT max(date),email FROM ... GROUP BY email

 

It omits duplicate email address and give the latest date an email was entered.

 

Thanks again for helping me learn!

Link to comment
Share on other sites

DISTINCT has nothing to do with sets

What is you basis for saying that DISTINCT "has nothing to do with sets"?

 

Tables are sets. SELECT is a set operation. Obviously, since DISTINCT is used with SELECT, then it has "something to do" with sets.

 

 

 

-- you can ask for a subset of fields from "unique" rows that have duplicate tuples -- nothing violates SQL.

 

I don't even understand this.

 

DISTINCT is evil because it's a very poor workaround to GROUP BY.

 

In the first place, DISTINCT is NOT a work around for GROUP BY. (Where have you gotten that info?)

 

I am arguing on the line of correctness, and not on current implementation of some dbms of the DISTINCT directive.

 

Now, what if some good DBMS actually implemented efficiently DISTINCT? Obviously, your argument would not anymore be true.

 

 

It doesn't do anything differently, but it can easily be misconstrued as a function call - I've seen it many times on this forum. 

Right. That is because in SQL a given requirement can be achieved in many ways. I don't see anyhow why this is relevant to the discussion that "DISTINCT is evil".

 

You're allowed to write "SELECT DISTINCT *" with a GROUP BY clause -- that's just evil.

Then shame on the current implementation of MySQL of SQL.

 

Again, if syntactically, DISTINCT is allowed with GROUP BY, would that make necessarily make DISTINCT "evil"?

 

GROUP BY could easily be the culprit for that matter.

 

Further, you use GROUP BY when aggregating data, and not when removing duplicate rows. DISTINCT is for that purpose.

You are misusing GROUP BY if you are using it to remove duplicate rows (w/c in the first place should not be existing).

 

We could also say that MySQL's implementation of SQL is incomplete/incorrect. THus, DISTINCT is not "evil." It is the implementation.

 

 

As for tables allowing duplicates, that up to you, the DBA, when you inserted the record -- if you choose a truly UNIQUE value, with a unique constraint, you won't get so-called duplicates.

Yeah I know. You have to explicitly define your keys. That's what sucks.

 

Relational Model explicitly states that by default, all attributes of a given table are members of a composite key. Thus, you don't even have to explicitly define your keys to avoid duplicate rows! Imagine the time saved on that.

 

Further, it is just not about tables where duplicates wreak havoc.

 

SELECT 'A' as 'attr' UNION ALL SELECT 'A' as 'attr'

 

Now you have duplicates. And why would SQL support the "ALL" directive? To introduce duplicates. Why would we want duplicates, where in set theory there are no duplicates? The answer to that question eludes me.

 

For 40 years, the Relational Model has not been completely implemented.

 

-----------

Please, I do not want to engage in this.

 

Just stop telling people that "x is good", "x is bad", "x is evil", etc, etc. Computing is a scientific enterprise.

Such categorical claims, and somewhat dogmatic, should be avoided where possible.

 

If you have to have to claim such things, provide a rigorous treatise on the subject matter. The discussion should be logical and scientific.

That is the only way. Otherwise, you'd just be feeding misconceptions to the general public.

 

-----------

Hope it helps.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.