Jump to content

Trying to join tables and select by one distinct max value


plastik77

Recommended Posts

Hi, I am trying to pull multiple columns from 3 joined tables, but I want the result set to contain only one distinct "data entry" per p.id (the pet id which is a foreign key in the data entry table). The issue i have is that there could be two data entries, numbered 1 and 2, belonging to a pet - the query has to just pick the data entry with the highest number - that's why i was trying to use max and group by but it doesn't quite work. Can anyone see where i'm going wrong? Many thanks!

SELECT `p`.`id`, `o`.`id`, `o`.`email`, MAX(d.number), `d`.`number`
FROM (`pets` AS `p`, `owners` AS `o`, `data_entries` AS `d`) 
WHERE `p`.`owner_id` = `o`.`id` 
AND `p`.`id` = `d`.`pet_id` 
GROUP BY `p`.`id` 
ORDER BY `d`.`number` DESC

 

Link to comment
Share on other sites

Hi

 

Think something like this is what you want

 

SELECT p.id, o.id, o.email, d.MaxPetNumber
FROM pets p
INNER JOIN owners o ON p.owner_id = o.id
INNER JOIN (SELECT pet_id, MAX(number) AS MaxPetNumber FROM data_entries GROUP BY pet_id) d ON p.id = d.pet_id

 

If you want more info from the data_entries table then add an extra INNER JOIN to connect the subselect to the full data_entries table.

 

All the best

 

Keith

Link to comment
Share on other sites

Hi Keith, thanks for the reply, almost there I think. but I can't quite work out how to integrate this extra clause below with the date comparison. As soon as I add in the date comparison clause, the results no longer contain the mx data entry number - any ideas?

 

SELECT p.id, o.id, o.email, d.MaxPetNumber
FROM pets AS p
INNER JOIN owners o ON p.owner_id = o.id
INNER JOIN (SELECT pet_id, MAX(number) AS MaxPetNumber FROM data_entries AS d WHERE FROM_UNIXTIME(d.updated, '%Y-%m-%d') <= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 25 day), '%Y-%m-%d')   GROUP BY pet_id) d ON p.id = d.pet_id

Link to comment
Share on other sites

Hi

 

Not quite sure. You have used the same alias name for the subselect as you have for the table within it (ie, called them both d), but that shouldn't cause an issue.

 

Do you get no row returned when you are expecting one, or a blank max pet id?

 

All the best

 

Keith

Link to comment
Share on other sites

The query does return a set of rows, but the date check is not being done with the max data entry number - I know there are a few pets with data_entries numbered 1, 2 and 3 - these shouldn't appear in the result set as data entry number 3 is within the time period being checked, but because the comparison is done against the date of data entry number 1, these are showing up in the results. Hope that makes sense - and thanks for your help so far!

Link to comment
Share on other sites

Hi

 

I think it should work.

 

However not sure you need to convert the dates to yyyymmdd format for the comparison. I would use the current date minus 25 days and convert that to a unix time stamp to compare directly with each row (should be a lot more efficient).

 

For a bit of debugging, knock up some sql to select the 2 date comparison fields you use in the subselect and see what comes out.

 

All the best

 

Keith

Link to comment
Share on other sites

Hi - ah, the problem is that the subquery to select max number has the date comparison as a clause, but in fact I need to extract the max number first, then do the date comparison on that record - if that makes sense! The date comparison must have to be moved out of that subquery, does that mean an extra join?

Link to comment
Share on other sites

Hi

 

Ah, I think I see now what you want. You want to find the records where the record with the maxpetid for that pet has a date in that range.

 

If so then yes you do need to do another JOIN

 

SELECT p.id, o.id, o.email, d.MaxPetNumber

FROM pets p

INNER JOIN owners o ON p.owner_id = o.id

INNER JOIN (SELECT pet_id, MAX(number) AS MaxPetNumber FROM data_entries GROUP BY pet_id) d ON p.id = d.pet_id

INNER JOIN data_entries d2 ON p.id = d2.pet_id AND  d.MaxPetNumber = d2.number

WHERE FROM_UNIXTIME(d2.updated, '%Y-%m-%d') <= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 25 day), '%Y-%m-%d')

 

As above, although I would suggest you clean up the date check itself (ie, just keep it in time stamp format rather than convert to ymd).

 

All the best

 

Keith

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.