Jump to content

sql JOIN - almost fixed


dweb

Recommended Posts

Hi

 

I have these 2 tables;

student_points

id    member_id    class    stage
16     4           math     23
25     15          math     34
42     21          math     56
43     21          math     524
46     21          math     251
student_times

id     member_id     class     total
16     4             math      111
25     15            math      44
42     21            math      22
43     21            math      1
45     21            math      2
46     21            math      1

what i am trying to do, is get the following result

member_id    class      total     stage
4            math       111       23
15           math       44        34
21           math       26        251

I have this query

SELECT st.member_id, sp.member_id, SUM(sp.total) as total, st.stage

FROM student_points sp

LEFT JOIN

 student_times st

ON st.member_id = sp.member_id AND st.class = 'math'

WHERE sp.class = 'math'

AND st.stage != 0

GROUP BY st.member_id

ORDER BY st.stage, sp.total DESC

but it returns

member_id     class     total      stage
4             math      111        23
15            math      44         34
21            math      104        251

the value for `total` should be 26, not 104

 

why would it return 104?

 

thanks

 

Link to comment
Share on other sites

First,

Your listed table contents and query are not accurate. For example, your query says the stage field is in the student_times table, but your tables show it as in the student_points table. Your total field is similarlly reversed. Even if you fix that, the total comes out to 78, not 104. Be sure when you as for help that the information you post is accurate or you are just going to end up confusing people and wasting time.

 

the value for `total` should be 26, not 104

 

why would it return 104?

As for why the total is 104 (or 78 as your sample shows) rather than 26 as you'd expect is because your join ends up duplicating rows in the second table. For each row in the first table, it matches all the rows in the second table, eg:

student_points
42     21          math     56
is joined with

student_times
42     21            math      22
43     21            math      1
45     21            math      2
46     21            math      1
Then

student_points
43     21          math     524
is again, joined with:

student_times
42     21            math      22
43     21            math      1
45     21            math      2
46     21            math      1
And so on, so your ultimate result prior to grouping and summing is:

MEMBER_ID       TOTAL   STAGE
4               111     23
15              44      34
21              22      56
21              2       56
21              1       56
21              1       56
21              22      251
21              2       251
21              1       251
21              1       251
21              22      524
21              2       524
21              1       524
21              1       524
When you apply the group and sum then, you get 22+1+2+1+22+1+2+1+22+1+2+1 = 78

 

You could resolve the duplicating of the second tables rows by doing a sub query first, but you'd still end up with the total duplicated due to the original table and a result set like this:

MEMBER_ID       TOTAL   STAGE
4               111     23
15              44      34
21              26      56
21              26      251
21              26      524
What you probably need to do is introduce another table, however without knowing more about what your application does I'm not sure what to recommend. If you want to explain your overall app goals and table design then perhaps someone can recommend a better approach. Edited by kicken
Link to comment
Share on other sites

Hi

 

Sorry if my example was not clear.

 

My current data looks like

INSERT INTO `student_points` (`id`, `member_id`, `class`, `total`) VALUES
(16, 4, 'math', '111'),
(25, 15, 'math', '44'),
(42, 21, 'math', '22'),
(43, 21, 'math', '1'),
(45, 21, 'math', '2'),
(46, 21, 'math', '1');

INSERT INTO `student_times` (`id`, `member_id`, `class`, `stage`) VALUES
(16, 4, 'math', '23'),
(25, 15, 'math', '34'),
(42, 21, 'math', '56'),
(43, 21, 'math', '67'),
(45, 21, 'math', '22'),
(46, 21, 'math', '5');

if I run the following query then the results are correct

SELECT sp.member_id, SUM(sp.total) as total
FROM student_points sp
WHERE sp.class = 'math'
GROUP BY sp.member_id
and I get this

member_id    total
4            111
15           44
21           26
so the above is correct

 

all i want to do is add onto the results, the latest `stage` record that is stored in the `student_points` table

 

so it would look like

member_id    total     stage (from the `student_times` table)
4            111       23 (from ID 16)
15           44        34(from ID 25)
21           26        5(from ID 45)
does that make more sense?

 

thanks

Edited by Zane
Link to comment
Share on other sites

at last, this seems to work, does it look correct?

SELECT sp.member_id, st.stage, total_count FROM student_points sp

LEFT JOIN (SELECT st.* FROM student_times st WHERE st.class = 'math'  ORDER BY st.id DESC)

AS st ON st.member_id = sp.member_id

LEFT JOIN (SELECT sp1.*, SUM(sp1.total) AS total_count FROM student_points sp1 WHERE sp1.class = 'math'  GROUP BY sp1.member_id ORDER BY sp1.id DESC)

AS sp1 ON sp1.member_id = sp.member_id

WHERE sp.class = 'math'

GROUP BY sp.member_id

Link to comment
Share on other sites

 

at last, this seems to work, does it look correct?

 

It could be much more effective using inner join.

select 
    sp.member_id, sum(sp.total) as Total, st.MinStage
from
    student_points sp
        inner join
    (select 
        id, member_id, min(stage) as MinStage
    from
        student_times
    where
        class = 'math'
    group by member_id) as st using (id)
where
    sp.class = 'math'
group by sp.member_id


+-----------+-------+----------+
| member_id | Total | MinStage |
+-----------+-------+----------+
|         4 |   111 |       23 |
|        15 |    44 |       34 |
|        21 |    22 |        5 |
+-----------+-------+----------+
3 rows in set (0.00 sec)

Edited by jazzman1
Link to comment
Share on other sites

@jazzman1, I think he wants the most recent stage value, not the minimum stage value.

 

 

all i want to do is add onto the results, the latest `stage` record that is stored in the `student_points` table

 

Maybe it is just me, but something about this seems to hint at a bad schema design.

 

For one, I'm not understanding why you have two separate tables. Both seem to be a mirror image of each other except one has a total column and the other has a stage column. Is there a reason you should not have one table that just includes those two columns? At the very least you shouldn't need to duplicate the id & member_id values and should use the primary key from one table as a foreign key in the other. But, I don't know enough about the data to be able to give an intelligent response on that.

 

Second, you apparently want the 'last' or most recent stage value, but you have no accurate way to get that info. The only thing you have to rely on is the numerical value of the ID column. The value of the ID field should not be used for that type of purpose. You should instead have a column for date_added or something similar to use for that purpose. You could even create the field to be auto-populated when the record is created and/or updated.

 

Anyway, with the current structure you have (which may be the real issue), here is my solution

SELECT sp.member_id, sp.total, st.stage
FROM (
  SELECT MAX(id) as id, member_id, SUM(total) as total
  FROM student_points sp
  WHERE class = 'math'
  GROUP BY member_id
) as sp

JOIN student_times as st USING(id)
Edited by Psycho
Link to comment
Share on other sites

 

@jazzman1, I think he wants the most recent stage value, not the minimum stage value.

 

 

 

Maybe it is just me, but something about this seems to hint at a bad schema design.

 

For one, I'm not understanding why you have two separate tables. Both seem to be a mirror image of each other except one has a total column and the other has a stage column. Is there a reason you should not have one table that just includes those two columns? At the very least you shouldn't need to duplicate the id & member_id values and should use the primary key from one table as a foreign key in the other. But, I don't know enough about the data to be able to give an intelligent response on that.

 

Second, you apparently want the 'last' or most recent stage value, but you have no accurate way to get that info. The only thing you have to rely on is the numerical value of the ID column. The value of the ID field should not be used for that type of purpose. You should instead have a column for date_added or something similar to use for that purpose. You could even create the field to be auto-populated when the record is created and/or updated.

 

Anyway, with the current structure you have (which may be the real issue), here is my solution

SELECT sp.member_id, sp.total, st.stage
FROM (
  SELECT MAX(id) as id, member_id, SUM(total) as total
  FROM student_points sp
  WHERE class = 'math'
  GROUP BY member_id
) as sp

JOIN student_times as st USING(id)

 

 

Thanks for your suggestion, but I tried your QUERY and it only returned 2 of the users, not sure why?

 

The reason there are two separate tables, is because they contain no real related data, apart from what I am trying to do here.

 

I also think, maybe it adds a little bit extra security, rather than storing lots of data in one single table

Link to comment
Share on other sites

Thanks for your suggestion, but I tried your QUERY and it only returned 2 of the users, not sure why?

 

The reason there are two separate tables, is because they contain no real related data, apart from what I am trying to do here.

 

I also think, maybe it adds a little bit extra security, rather than storing lots of data in one single table

 

Hmm, your data shows a one-to-one relationship between the data in the two tables. As for more security, no it doesn't add any security. How you configure your tables should be completely dependent upon the nature and structure of the data, not based upon any preconceived notions.

 

As for why the query I provided didn't work for you is an unknown. I actually created two tabled with the exact same configuration and data you supplied and I received the correct results.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.