Jump to content

MariaDB Ignoring ORDER BY clause before or after GROUP BY clause


Recommended Posts

Good Evening Team,

I have a simple client list for our schools and sometimes my schools change names but more or less remain the same entity. I sometimes need to refer to the previous name and sometimes the current. I accomplish this with the following 2 tables...

  • campus_table.campus_id, campus_table(other static fields that are not relevant here)
  • campus_name_assignment_table.campus_name_assignment_id
    • campus_name_assignment_table.campus_id
    • campus_name_assignment_table.edited_timestamp

My goal is to query the latest campus_name_assignment unique to each campus_id. My latest attempt at a query looks like so...

SELECT
campus_table.campus_id,
campus_name_assignment_table.campus_name_assignment,
campus_name_assignment_table.edited_timestamp
FROM campus_table
INNER JOIN campus_name_assignment_table
ON campus_table.campus_id=campus_name_assignment_table.campus_id

 

I would like to ORDER these results by edited_timestamp DESC and then to GROUP them by campus_id. This does not seem to be working in any capacity. There may be some nuance tied to the fact that my version of XAMPP uses MariaDB and I found the following documentation on this, but either I don't understand it or it is incorrect. Any/all help is very much appreciated. I'm kinda frustrated that I cannot get the query itself to work correctly as it seems that the order piece is being ignored - I've tried workarounds such as the HAVING MAX() extension included after GROUP BY but this similarly does not seem to work. I've also tried to order the campus_name_assignment_table in an alias'd subquery but as soon as it gets into the main query it completely re-sorts.

https://mariadb.com/kb/en/library/why-is-order-by-in-a-from-subquery-ignored/

Only halfway understanding why but I got the following to work...if anyone has any input it would be appreciated.

SELECT * FROM(
    SELECT
    grouped_campus_name_assignments.campus_name_assignment,
    grouped_campus_name_assignments.campus_id
    FROM
        (SELECT
        campus_name_assignment_table.campus_name_assignment,
        campus_name_assignment_table.campus_id,
        campus_name_assignment_table.edited_timestamp
        FROM
        campus_name_assignment_table
        GROUP BY
        campus_name_assignment_table.campus_id,
        campus_name_assignment_table.edited_timestamp DESC)
        AS grouped_campus_name_assignments
    GROUP BY grouped_campus_name_assignments.campus_id)
    AS latest_campus_name_assignments
    INNER JOIN campus_table
    ON latest_campus_name_assignments.campus_id=campus_table.campus_id

 

39 minutes ago, gizmola said:

Does this provide the ordering you are trying to achieve?  

 

I honestly am not sure the answer this question because tbh I'm still not 100% clear/confident on the semantics but I am currently making progress so I'm optimistic! Essentially what I wanted to do was take the most-recent version of the campus_name_assignment for each entity campus_id....My initial inquiry understood this to be accomplished by the process of 'ordering' followed by 'grouping'.

As shown in my solution above, the 'grouping' phase can include a sort/order parameter(new to me!), which then allows for me to sort/order by the timestamp within the query, thereby rendering a second 'grouping' phase in the encompassing query to eliminate the older timestamps from the new table! I imagine there's a smoother way to accomplish this but for my current mastery of sql this will work for now - any input is appreciated!

When you GROUP BY you will get one row per group in the result.  

I like to think of this as compacting all the rows that were part of the group into one row, along with the property that summary/group operators can be used relative to the initial underlying grouping, to compute things like SUM, AVG, COUNT etc.

A classic example is a sales table where you GROUP BY YEAR(sales_date) and have SUM(order_total) AS total_sales to get a report of gross sales by year. 

If I understand your question correctly you want to ORDER the entire result set in a way that is incompatible with the natural property of ORDER by.  

What you would like is:

  • ALL results ORDER BY campus_name_assignment_table.edited_timestamp DESC
  • Except that once you have a particular campus_id displayed, you want the other campus_name_assignment_table rows for that same campus_id to follow.

This can't be done without employing some form of manipulation.

The simple solution that uses the ORDER BY, is to order results using your join and ORDER_BY campus_name_assignment_table.campus_id, campus_name_assignement_table.edited_timestamp DESC.

This gets you the order of campus_name_assignments to be most recent first for any one campus_id, but it is going to be ordered by campus_id first.

Am I right in assuming that this is the problem you are having?  

Thank you for your input. Your description of the GROUP BY function is very helpful.

ALL results ORDER BY campus_name_assignment_table.edited_timestamp DESC

Except that once you have a particular campus_id displayed, you want the other campus_name_assignment_table rows for that same campus_id to follow.

Am I right in assuming that this is the problem you are having?

Not exactly and I apologize if the inquiry was not clear.

My goal is to have a table display the most recent campus name assignments ONLY. So therefore, it would only show one campus name assignments per campus id. This name assignment would be selected by looking at each campus id and then selecting the name assignment from the record that contained the latest edited_timestamp for that respective campus id. If that makes sense?

So campus_id 1 is named "School of Technology" but then renames itself to "School of Technology and Mathemetics". A query of this table would NOT show previous name assignments for each campus_id, only the latest.

 

 

 

It sounds like you need a subquery to establish the latest date for each campus

Join your name_assignment to this subquery on compus id and date to find the name that matches that date.

SELECT campus_id
     , campus_name_assignment
FROM campus_name_assignment_table na
JOIN (
        SELECT campus_id
             , MAX(edited_timestamp) as edited_timestamp
        FROM campus_name_assignment_table
        GROUP BY campus_id
     ) latest
            USING (campus_id, edited_timestamp);

 

7 hours ago, Barand said:

It sounds like you need a subquery to establish the latest date for each campus

Join your name_assignment to this subquery on compus id and date to find the name that matches that date.


SELECT campus_id
     , campus_name_assignment
FROM campus_name_assignment_table na
JOIN (
        SELECT campus_id
             , MAX(edited_timestamp) as edited_timestamp
        FROM campus_name_assignment_table
        GROUP BY campus_id
     ) latest
            USING (campus_id, edited_timestamp);

 

Thank you for this...a couple of questions.

Can you explain the "na" prior to the first JOIN command? Also the 'latest' piece I can't find any documentation for this,

This finally helps me to understand how to get the max piece to work - I was only able to get it to extract one row but the way you join these make perfect sense. This is very helpful thank you.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.