Jump to content

Need a better way to write a select query that repeats subqueries...?


Balagi

Recommended Posts

With the following example I am dealing with 2 tables:

 

students, projects

SELECT student_name, count(p.id) AS project_count
FROM students AS s
LEFT JOIN projects AS p ON p.student_id = s.id
GROUP BY s.id

 

So, if I have a class of 20 students, and each student has done 4 projects, each row will return the following:

[student] => John Doe, [project_count] => 20

 

What I want to do is figure out how to get more specific data for each student (row) from my students table, such as:

[student] => John Doe, [project_count] => 20, [last_project_name] => American History, [last_project_date] => 2005-05-01

 

So in addition to the first 2 columns, I want to grab the project from the projects table that was last added to the database, in the above results example the last project from John Doe would be titled American History and turned in on the said date.

 

Granted I'm still a novice when it comes to MySQL, I want to know if there is a more efficient and organized way to go about this rather than what I came up with by using 2 long sub queries:

 

SELECT student_name, count(p.id) AS project_count, 
    (SELECT project_name FROM projects p2 
     LEFT JOIN students s2 ON s2.id = p2.student_id 
     WHERE p2.student_id = s.id ORDER BY project_time 
     DESC LIMIT 0,1) AS last_project_name,
    (SELECT project_time FROM projects p2 
     LEFT JOIN students s2 ON s2.id = p2.student_id 
     WHERE p2.student_id = s.id ORDER BY project_time 
     DESC LIMIT 0,1) AS last_project_date,
FROM students s
LEFT JOIN projects p ON p.student_id = s.id
GROUP BY s.id

 

I want to emphasize that the above query DOES work, I'm hoping to find a better and quicker way to achieve the same results.

 

Thank you in advance.

Link to comment
Share on other sites

At first glance I thought your query was fairly optimal but the following is better and quicker:

 

SELECT students.id
     , students.student_name
     , count(projects.id) AS project_count
     , latest_project.project_name
     , latest_project.project_time
FROM students
     LEFT JOIN projects ON students.id = projects.student_id
     LEFT JOIN ( 
          SELECT projects.* 
          FROM projects 
               INNER JOIN (
                    SELECT student_id, max(project_time) AS latest_time
                    FROM projects GROUP BY projects.student_id
               ) AS latest_project ON projects.student_id = latest_project.student_id AND projects.project_time = latest_project.latest_time
          GROUP BY projects.student_id
     ) AS latest_project ON projects.student_id = latest_project.student_id
GROUP BY students.id

Link to comment
Share on other sites

Queries like "most recent" or "biggest" are usually messy in SQL, when combined with straightforward queries.  If I have to do such queries very often, AND it is causing performance trouble, I usually cache the values I'm interested in.

 

For your case, you could add a "most recent project for each student" table, which stores the most recent project.  That can be joined back with the projects table to fetch the date.  That table will need to be updated each time a project is created, deleted, or has its date altered.

Link to comment
Share on other sites

Thanks very much, bubblegum.anarchy.  Your idea worked perfectly, although it took me like an hour to fully comprehend.  I would have never figured that out, that really gives me some insight into what MySQL can do that I had no idea of.

 

From my testing, adding that 2nd LEFT JOIN (the long one) only added about .0014 seconds to my query runtime (granted it's only being tested on localhost) for 10 records. 

 

I had thought about just going the easy route and updating the "students" table after every insert/update/delete query, but I just wanted to make my results more concrete, in case I ever decide to change the way I display my data.

 

Thanks again.

Link to comment
Share on other sites

While I'm still working with these tables I have another set of information I want to grab, but I'm not sure if this is even possible or how to do it if it is.  I've researched MySQL.com and can't find anything that helps with what I want to do.  If you have an idea I'd greatly appreciate to hear it.

 

Basically I want to return a "rank #" for each student based on the projects they have turned in.  Say the projects table has the following fields:

 

id, student_id, project_name, project_time, project_order

 

I want to grab the rank based on the order ASC and then the time ASC.  So for example, if this was my projects table, the ranks would be as follows:

 

1 | 1 | cheese | 2222 | 5 = rank 4

2 | 1 | water | 4444 | 4 = rank 2

3 | 2 | cheese | 1111 | 5 = rank 3

4 | 2 | water | 3333 | 4 = rank 1

5 | 3 | pizza | 0111 | 6 = rank 5

 

Here, the ORDER BY of the query would be (project_order, project_time), but I will be using the query posted by bubblegum.anarchy earlier, so I don't care about the ranks of the projects, I care about the ranks of students.  Using the above example User 2 would be ranked #1, User 2 would be ranked #2, and User 3 as #3.

 

Using this query, is it possible to get each students rank using another JOIN?

 

SELECT students.id
     , students.student_name
     , count(projects.id) AS project_count
     , latest_project.project_name
     , latest_project.project_time
     , something.something AS student_rank
FROM students
     LEFT JOIN projects ON students.id = projects.student_id
     LEFT JOIN ( 
          SELECT projects.* 
          FROM projects 
               INNER JOIN (
                    SELECT student_id, max(project_time) AS latest_time
                    FROM projects GROUP BY projects.student_id
               ) AS latest_project ON projects.student_id = latest_project.student_id AND projects.project_time = latest_project.latest_time
          GROUP BY projects.student_id
     ) AS latest_project ON projects.student_id = latest_project.student_id
     LEFT JOIN something
GROUP BY students.id

 

I haven't even made an attempt at figuring this out as I have no idea where to start. If you have an idea what I'm talking about (I have the feeling my explanation was rather vague), please toss some advice my way.

 

P.S. I am shying away from adding a "rank" field to the students table because that will eventually require updating thousands of records for every insert, update, and delete done to the projects table.

Link to comment
Share on other sites

Add the following to the end of query:

 

ORDER BY concat(project_order, time_to_sec(project_time));

 

assuming that project_time is actually stored as a time data type

 

EDIT: ...and to add an actual rank, use a php variable or the following concept:

 

mysql_query("SET @rank = 0");
mysql_query("SELECT @rank := @rank + 1 AS rank, students.id, ... FROM students ....");

Link to comment
Share on other sites

I'm sorry I should have specified that I don't want to have to order the results by each students rank (though it will be an option for the guests).  So, lets assume that the guest wants to order the results by student_name, I still want one of the fields to show the student's rank, for example:

 

student_name | overall_rank | gender_rank

Adam | 3 | 3

Alex | 2 | 1

Alice | 5 | 2

Bob | 4 | 2

Tiffany | 1 | 1

 

edit: project_time is a timestamp BTW.

Link to comment
Share on other sites

That's a tough one.  Yes, adding a rank field will drive you bonkers trying to keep it up to date.

 

You might want to do it in two stages.  First fetch it in rank order, then have php add in the ranks, then sort it in php in your display order.  You could do something similar in mysql using a temporary table I suppose.. but I'm not sure how you would do that, being a postgres user.

Link to comment
Share on other sites

I'm not sure how much overhead this will add, but you could try adding a rank column using the session variable method (SET @var=0 and @var:=@var+1) and using an ORDER BY clause.  Then, make all that a subquery to a SELECT wrapped around the outside of it that reorders based on whatever criteria you've decided to use.  That way you can generate a rank column, yet still order the results however you want.

 

SET @rank = 0;
SELECT * FROM ( SELECT /* your other columns */,@rank:=@rank+1 AS Rank FROM /* ...etc,etc... */ ORDER BY rankcol1,rankcol2) AS inside ORDER BY name,time DESC,whatever;

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.