Jump to content

Show Last N records per group


MargateSteve

Recommended Posts

I know that this subject has been done to death but I really cannot implement any of the suggestions in the context I need it.

 

I currently have a query that show a League Standings table

$lgetable = mysql_query("SELECT *
,COUNT(*) AS ct
, t.team_name as Tm
, @rownum := @rownum+1 AS rank
, (sum(CASE WHEN (".$ht." AND ".$hw.")OR(".$at." AND ".$aw.") THEN 3 ELSE 0 END)+ sum(CASE WHEN (".$ht." OR ".$at.") AND ".$d." THEN 1 ELSE 0 END)) AS P
, count(CASE WHEN (".$ht." OR ".$at.") THEN 1 ELSE 0 END) as GP
, sum(CASE WHEN (".$ht." AND ".$hw.") OR (".$at." AND ".$aw.") THEN 1 ELSE 0 END) AS W
, sum(CASE WHEN (".$ht." AND ".$d.") OR (".$at." AND ".$d.") THEN 1 ELSE 0 END) AS D
, sum(CASE WHEN (".$ht." AND ".$aw.") OR (".$at." AND ".$hw.") THEN 1 ELSE 0 END) AS L
, SUM(CASE WHEN (".$ht.") THEN ".$hg." WHEN (".$at.") THEN ".$ag." END) as GF
, SUM(CASE WHEN (".$ht.") THEN ".$ag." WHEN (".$at.") THEN ".$hg." END) as GA
, (SUM(CASE WHEN (".$ht.") THEN ".$hg." WHEN (".$at.") THEN ".$ag." END)- SUM(CASE WHEN (".$ht.") THEN ".$ag." WHEN (".$at.") THEN ".$hg." END)) as GD

from teams t
join all_games g 
on t.team_id in (g.home_team,g.away_team)

WHERE comp = '1' 
AND home_goals IS NOT NULL 
AND date BETWEEN '2010-07-01' AND '2011-06-31' 
GROUP BY t.team_id
ORDER BY P desc, GD desc, GF desc 

(The alias's used are
$ht = "g.home_team = t.team_id";
$at = "g.away_team = t.team_id";
$hw = "g.home_goals > g.away_goals";
$aw = "g.home_goals < g.away_goals";
$d = "g.home_goals = g.away_goals";
$hg ="g.home_goals";
$ag ="g.away_goals"

 

I am trying to use that same query to show only the most recent five games for each team (t.team_id) which I thought would be possible just by removing the current ORDER BY statement and replacing it with

HAVING ct <= 5
ORDER BY date DESC

But this causes no rows to be returned. Strangely, as a test, I grouped it by date instead of t.team_id, and it returned results, though obviously not the right ones.

 

Am I simply missing something with the grouping, or am I miles from the solution?

Thanks

Steve

Link to comment
Share on other sites

  • 3 months later...
  • 3 weeks later...

If you could provide me sample data and table def in the form of SQL scripts, maybe I can dig into this.

 

Also specify the attributes that you want in the result, and the explanation of the requirement/specification of the result.

Link to comment
Share on other sites

Thanks ebmigue.

 

I have attached an sql dump of the relevant tables and data, a text file with the the query and bits I have put into strings and the html of the bit that is being output. All have been the sql and html files have been given a .txt extension as it wouldn't let me attach them otherwise!

An online test version (ignore the formatting) of the page can be found at http://follyball.co.uk/oss/standings.php.

 

Basically, the page currently shows the standings of a soccer league with three sections, games where the team was the home side, games where the team was the visiting (away) side and all of the teams games. The home and away parts are just for reference and it is the 'All' games part where any sorting is done (firstly by points descending).

What I am trying to achieve is to show the standings for just the last 6 (or any number) games for each team.

It cannot be filtered by date as the games are not always played on the same day so some teams may have their 6 games go back over the last 6 weeks and some may have played 6 games in the last 3 weeks. In the table the 'P' column is games played and as you can see some teams by this point had played 20+ games while others had only played 15.

 

In a nutshell I need to get each team, find the 6 most recent games for each of those teams, and then sort the standings based on those results.

 

The nearest I ever got to making it work was for it to select the most recent 6 games from the database and not the most recent for each of the teams. My guess is that it would need to get the last 6 games for each team before doing any grouping but I am probably miles out!

 

Thanks in advance for any suggestions.

 

Steve

 

 

 

 

[attachment deleted by admin]

Link to comment
Share on other sites

Apologies for the double quick post. I realised that I had put more code in the html file than was needed and also thought that it may be easier if code for just the bit that shows all games may be easier to work with so I have attached another version if that is better.

 

Steve

 

[attachment deleted by admin]

Link to comment
Share on other sites

Thanks for the info.

 

No guarantees though that I might solve this, as I am busy, and admittedly, your problem is a bit nontrivial.

 

I can say this though: Your problem involves ordered sets, and unfortunately SQL (and MySQL for that matter) does not have built-in facilities to deal with ordered sets. So whatever solution, the sql way, is proposed, it might involve a hack and might not be DBMS cross-compatible.

 

This might be a good start, though: http://explainextended.com/2009/03/05/row-sampling/, as posted by fenway.

 

I'll get back to this as soon as I can.

Link to comment
Share on other sites

Thanks also Fenway.

 

I think the problem I am having is that all the instructions that have been offered, or i have found, have dealt with just one table but this one involves three.

 

Below is a trimmed version of the query to make it easier to work with and I have attached a dump of the tables and data.

 

Thanks again

Steve

 

$i = 1;

/*ADD th, st, rd TO THE POSITION NUMBERS*/
/*
* @return number with ordinal suffix
* @param int $number
* @param int $ss Turn super script on/off
* @return string
*/
function ordinalSuffix($i, $ss=0)
{
    /*** check for 11, 12, 13 ***/
    if ($i % 100 > 10 && $i %100 < 14)
    {
        $os = 'th';
    }
    /*** check if number is zero ***/
    elseif($i == 0)
    {
        $os = '';
    }
    else
    {
        /*** get the last digit ***/
        $last = substr($i, -1, 1);

        switch($last)
        {
            case "1": 
            $os = 'st';
            break;

            case "2":
            $os = 'nd';
            break;

            case "3":
            $os = 'rd';
            break;

            default:
            $os = 'th';
        }
    }

    /*** add super script ***/
    $os = $ss==0 ? $os : '<sup>'.$os.'</sup>';

    /*** return ***/
    return $i.$os;
};

/*GET THE CORRECT SEASON*/
if (isset($_GET['season_id']) ? $_GET['season_id'] : 1);
$season_id = $_GET['season_id'];

/*COMMON STANDINGS QUERY STRINGS*/
$ht = "g.home_team = t.team_id";
$at = "g.away_team = t.team_id";
$hw = "g.home_goals > g.away_goals";
$aw = "g.home_goals < g.away_goals";
$d = "g.home_goals = g.away_goals";
$hg ="g.home_goals";
$ag ="g.away_goals";

/*THE STANDINGS QUERY*/ 
$table = mysql_query("SELECT *,
t.team_name as Tm
, @rownum := @rownum+1 AS rank
, ((sum(CASE WHEN (".$ht." AND ".$hw.")OR(".$at." AND ".$aw.") THEN 3 ELSE 0 END)
+ sum(CASE WHEN (".$ht." OR ".$at.") AND ".$d." THEN 1 ELSE 0 END)))  AS P
, count(CASE WHEN (".$ht." OR ".$at.") THEN 1 ELSE 0 END) as GP
, sum(CASE WHEN (".$ht." AND ".$hw.") OR (".$at." AND ".$aw.") THEN 1 ELSE 0 END) AS W
, sum(CASE WHEN (".$ht." AND ".$d.") OR (".$at." AND ".$d.") THEN 1 ELSE 0 END) AS D
, sum(CASE WHEN (".$ht." AND ".$aw.") OR (".$at." AND ".$hw.") THEN 1 ELSE 0 END) AS L
, SUM(CASE WHEN (".$ht.") THEN ".$hg." WHEN (".$at.") THEN ".$ag." END) as GF
, SUM(CASE WHEN (".$ht.") THEN ".$ag." WHEN (".$at.") THEN ".$hg." END) as GA
, (SUM(CASE WHEN (".$ht.") THEN ".$hg." WHEN (".$at.") THEN ".$ag." END)
- SUM(CASE WHEN (".$ht.") THEN ".$ag." WHEN (".$at.") THEN ".$hg." END)) as GD
from teams t
left join all_games g 
on t.team_id in (g.home_team,g.away_team)
LEFT JOIN seasons as S ON g.date BETWEEN S.season_start AND S.season_end
LEFT JOIN deductions d ON ( d.team = t.team_id) AND (d.season = S.season_id)
WHERE comp = '1' AND home_goals IS NOT NULL 
AND S.season_id = $season_id 
GROUP BY t.team_id
ORDER BY P desc, GD desc, GF desc 
");

 

[attachment deleted by admin]

Link to comment
Share on other sites

To clarify once more:

 

You  need a list of the teams.

 

Then you wanted to aggregate/summarise only the last 6 games of each team.

 

The aggregated/summarised columns are the "ALL" section.

 

Is that correct?

Link to comment
Share on other sites


SELECT
  last_games.team_id
,last_games.team_name
,last_games_ids
,all_games.*
FROM
  all_games
INNER JOIN (

SELECT
  DISTINCT
  b.`team_id`
,b.`team_name`
,SUBSTRING_INDEX(
    GROUP_CONCAT(a.`all_games_id` ORDER BY a.`all_games_id` DESC)
   ,','
   ,6 --change this as applicable. this represents the latest N games, w/c in this case is 6.
  ) AS 'last_games_ids'
FROM
  all_games a
INNER JOIN teams b ON
  a.`away_team` = b.`team_id`
OR a.`home_team` = b.team_id
GROUP BY b.team_id
)last_games ON
   FIND_IN_SET(all_games.`all_games_id`, last_games.last_games_ids) >= 1
WHERE last_games.`team_name` LIKE '%SUTTON%' --comment out if you want to list all the last 6 of all teams.
ORDER BY last_games.team_id, all_games_id DESC;

 

The query above only lists the last 6 games for team Sutton United; comment out the WHERE clause, and it will list (at most) all last 6 games

of all teams.

 

I have not aggregated this relation.

 

But I think it will be trivial from thereon, just use GROUP BY per team_id, then perform your usual SUM aggregate functions.

 

I hope this is correct and it helps.

Link to comment
Share on other sites

In answer to both of you, in no particular order..........

 

ebmigue - It is getting close but I am still having a problem!

 

If I use your query leaving out WHERE last_games.`team_name` LIKE '%SUTTON%' I get a table that shows each teams name 6 times so that part of it works (http://www.margate-fc.com/content/test/cform2.php).

 

If I add GROUP BY last_games.team_id just before the ORDER BY statement, it only shows each teams name once so the grouping is fine (http://www.margate-fc.com/content/test/cform3.php).

 

However, no matter where I put the aggregate statement, it results in an empty table but I think that the problem is just my confusion and maybe I am using the wrong aliases.

 

If I change my strings to

$ht = "a.home_team = b.team_id";
$at = "a.away_team = b.team_id";

 

And add the aggregate to find the number of games played

 ) AS 'last_games_ids'
  , count(CASE WHEN (".$ht." OR ".$at.") THEN 1 ELSE 0 END) as GP
FROM
  all_games a

the theory should be that the 'P' column on the page should have '6' in it for every team but the column is showing up blank. If I can work out how to get that part working correctly at least, then I can move on with all the other aggregates and sorting by date descending (which all should be fairly simple once I have it working).

 

fenway -  I revisited that link and have tried to implement it but I have got myself confused again.

As I understand it, the grouper would be t.team_id as it is the teams that need to be grouped. The t_limiter would be all_games as that is where the dates are contained and the date is vital (IMO) to get the last 6 games by team.

In the WHERE statement, I have translated

l.grouper = dl.grouper

to

WHERE  l.team_id IN  (dl.home_team,dl.away_team)

and

l.id = mid

to

l.date= dl.date

 

But once again, it just returns an empty table. I even stripped the whole thing bare and started again with just the all_games table and still couldn't get that to work.

It is frustrating as looking at what I am trying to do seems fairly simple and is used on soccer sites everywhere and looking at that example you posted I can understand the logic and how it works, but for the life of me, I just cannot get it to work the way I need.

 

 

Thanks to both of you for your continuing patience and help.

 

Steve

Link to comment
Share on other sites

Ok, try this first:

 

SELECT
  last_games.team_id
,last_games.team_name

,COUNT(DISTINCT `all_games`.`all_games_id`) AS 'GP'
,SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN 1 ELSE 0 END) AS 'AWAY_GAMES'
,SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN 1 ELSE 0 END) AS 'HOME_GAMES'
,SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN COALESCE(`all_games`.`away_goals`, 0) ELSE 0 END) AS 'GOALS_WHEN_AWAY'
,SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN COALESCE(`all_games`.`home_goals`,0) ELSE 0 END) AS 'GOALS_WHEN_HOME'

--,etc., etc..
--,last_games_ids
--,all_games.*
FROM
  all_games
INNER JOIN (

SELECT
  DISTINCT
  b.`team_id`
,b.`team_name`
,SUBSTRING_INDEX(
    GROUP_CONCAT(a.`all_games_id` ORDER BY a.`all_games_id` DESC)
   ,','
   ,6 --change the as applicable. this represents the latest N games
  ) AS 'last_games_ids'
FROM
  all_games a
INNER JOIN teams b ON
  a.`away_team` = b.`team_id`
OR a.`home_team` = b.team_id
GROUP BY b.team_id
)last_games ON
   FIND_IN_SET(all_games.`all_games_id`, last_games.last_games_ids) >= 1
--WHERE last_games.`team_name` LIKE '%SUTTON%'
GROUP BY last_games.team_id
ORDER BY last_games.team_id, all_games_id DESC;

 

I took the liberty of aggregating some attributes for you. You can do the rest, I suppose.

 

--------------

 

However, no matter where I put the aggregate statement, it results in an empty table...

 

Are you sure you are getting an empty table? Maybe you mean an empty column?

 

To be sure, aggregate functions will NOT cause you getting an empty table, but some of the relational operators will cause you to have an empty relation/table (e.g, JOIN, WHERE, MINUS, etc..).

 

In any case, check first your query, as it might have a syntax error. And PHP (or the library you are using) might be returning an empty array when it encounters exceptions during db calls. I suggest that you first run the query I gave in an external program (say, phpmyadmin), and investigate the matter there first.

 

Further, DO NOT at first "process" your query with variables. I suggest you copy/paste the code i've given "as is" to your existing PHP script, and check if it is working. If it is, then perform you modifications, add your PHP variables. Just a suggestion.

 

----------------

 

Regarding this:

count(CASE WHEN (".$ht." OR ".$at.") THEN 1 ELSE 0 END) 

 

When you need to COUNT the number of games, that will not work. Why not COUNT(DISTINCT all_games_id) instead?

 

-------------

If you are getting an empty column, that is because you have NULLs in the home_goals and away_goals column.

 

If you use SUM on those fields, the result will always be NULL, even if the same attribute(s) of other rows are not NULLs. You might want to use COALESCE on such cases. (Please see query above)

 

 

Hope it helps.

 

P.S. If the answer to my question at post http://www.phpfreaks.com/forums/index.php?topic=325737.msg1591582#msg1591582 are all "yes", I strongly suspect that  fenway's methods are not anymore necessary and that your problem can be solved by the simpler query given above.

 

Of course, I might be wrong.

Link to comment
Share on other sites

I forgot to mention, that the comments in the query (i.e., the "--") might need to be removed when actually used in PHP. In my experience, they are not properly handled by mysql_*, and results to syntax errors, where it should not.

Link to comment
Share on other sites

I quick test of this, via my phone, seems to suggest this is going to work! If you look at http://www.margate-fc.com/content/test/cform3.php most of the teams have a 6 in the 'P' column (the ones that don't have a 6 have got less than 6 games in the database).

I will have a proper look when I get home but it is looking very promising.

Thanks as always

Steve

Link to comment
Share on other sites

Absolutely 100% perfect!!

 

I made a couple of slight amendments in that I changed the ORDER BY in GROUP_CONCAT to date DESC to get the last 6 games for each team, and the overall ORDER BY is now on Pts then GD. I also joined the seasons table so I can show it season by season and my test run works exactly how it should - http://www.margate-fc.com/content/test/cform3.php.

 

All that is left to do is to turn the case statements into reusable strings just to tidy the code up but massive, massive thanks to you. I would not like to think how many hours have been spent on this!

 

I have put my (current) final query below for other peoples reference.

 

Thanks again

Steve

 

SELECT
   last_games.team_id
, last_games.team_name
, COUNT(DISTINCT `all_games`.`all_games_id`) AS 'GP'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id AND all_games.`home_goals` > all_games.`away_goals` THEN 1 ELSE 0 END) +  SUM(CASE WHEN all_games.`away_team` = last_games.team_id AND all_games.`home_goals` < all_games.`away_goals` THEN 1 ELSE 0 END) AS 'W'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id AND all_games.`home_goals` < all_games.`away_goals` THEN 1 ELSE 0 END) +  SUM(CASE WHEN all_games.`away_team` = last_games.team_id AND all_games.`home_goals` > all_games.`away_goals` THEN 1 ELSE 0 END) AS 'L'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id AND all_games.`home_goals` = all_games.`away_goals` THEN 1 ELSE 0 END) +  SUM(CASE WHEN all_games.`away_team` = last_games.team_id AND all_games.`home_goals` = all_games.`away_goals` THEN 1 ELSE 0 END) AS 'D'
, (SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN `all_games`.`home_goals` ELSE 0 END) 
   + SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN `all_games`.`away_goals` ELSE 0 END)) 
   - (SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN `all_games`.`away_goals` ELSE 0 END) 
   + SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN `all_games`.`home_goals` ELSE 0 END)) AS 'GD'
, ((SUM(CASE WHEN (all_games.`home_team` = last_games.team_id AND all_games.`home_goals` > all_games.`away_goals`)OR(all_games.`away_team` = last_games.team_id AND all_games.`home_goals` < all_games.`away_goals`) THEN 3 ELSE 0 END)
   + SUM(CASE WHEN (all_games.`home_team` = last_games.team_id OR all_games.`away_team` = last_games.team_id) AND all_games.`home_goals` = all_games.`away_goals` THEN 1 ELSE 0 END)))  AS P
, SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN 1 ELSE 0 END) AS 'AWAY_GAMES'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN 1 ELSE 0 END) AS 'HOME_GAMES'
, SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN COALESCE(`all_games`.`away_goals`, 0) ELSE 0 END) AS 'GOALS_WHEN_AWAY'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN COALESCE(`all_games`.`home_goals`,0) ELSE 0 END) +  SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN COALESCE(`all_games`.`away_goals`, 0) ELSE 0 END) AS 'GF'
, SUM(CASE WHEN all_games.`home_team` = last_games.team_id THEN COALESCE(`all_games`.`away_goals`,0) ELSE 0 END) +  SUM(CASE WHEN all_games.`away_team` = last_games.team_id THEN COALESCE(`all_games`.`home_goals`, 0) ELSE 0 END) AS 'GA'

FROM
  all_games
INNER JOIN 
(
SELECT
  	  DISTINCT
  	  b.`team_id`
	, b.`team_name`
	, SUBSTRING_INDEX(
      GROUP_CONCAT(a.`all_games_id` ORDER BY a.`date` DESC)
    ,',',6
  	) AS 'last_games_ids'
	FROM
  		all_games a
	 INNER JOIN teams b ON
  		  a.`away_team` = b.`team_id`
	  OR a.`home_team` = b.team_id
	 LEFT JOIN seasons as S ON a.date BETWEEN S.season_start AND S.season_end
  		WHERE a.comp = '1' 
   		AND S.season_id = 104
    	GROUP BY b.team_id
)
last_games ON
   	FIND_IN_SET(all_games.`all_games_id`, last_games.last_games_ids) >= 1
GROUP BY last_games.team_id
ORDER BY P DESC, GD DESC , GF DESC;

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.