Jump to content

Simple averageing problem


Go to solution Solved by Barand,

Recommended Posts

Hi I have this very simple averaging problem. Since I am almost a newbie with MySql, any help would be appreciated. 

 

I have a table which stores the results of a student's test conducted on a daily basis for 3 months.  I want to record the average score after each 5 days. So the first average score should be calculated after 5 days have passed and then after every 5 days. So I should have 2 average scores after 10 days and 3 after 15 days and so on. How can I achieve this in  Mysql. I want to use these average scores to display a graph. Ofcourse in general I would like it to calculate it for any n number of days.

 

Thank you 

Link to comment
https://forums.phpfreaks.com/topic/284153-simple-averageing-problem/
Share on other sites

Something like this

$db = new mysqli(HOST, USERNAME, PASSWORD, DATABASE);

$startDate = '2013-06-03';

/****** UNCOMMENT BLOCK TO CREATE TEST DATA ***************************************************

$db->query("DROP TABLE IF EXISTS testresult");

$sql = "CREATE TABLE testresult (
    name VARCHAR(20) NOT NULL,
    testdate DATE NOT NULL DEFAULT 0,
    score INT,
    PRIMARY KEY (name, testdate)
    )";
$db->query($sql);
    //
    // generate test data
    //
$students = array ('Peter', 'Paul', 'Mary');
$data = array();
$di = dateinterval::createFromDateString('next weekday');
$dt = new DateTime($startDate);
$dp = new DatePeriod($dt, $di, 14); // 15 day deriod
foreach ($dp as $day) {
    foreach ($students as $name) {
        $data[] = sprintf("('%s', '%s', %d)", $name, $day->format('Y-m-d'), rand(20,100));
    }
}
$sql = "INSERT INTO testresult VALUES " . join(',', $data);
$db->query($sql);
*******************************************************************************************

mysql> SELECT * FROM testresult;
+-------+------------+-------+
| name  | testdate   | score |
+-------+------------+-------+
| Mary  | 2013-06-03 |    58 |
| Mary  | 2013-06-04 |    99 |
| Mary  | 2013-06-05 |    34 |
| Mary  | 2013-06-06 |    56 |
| Mary  | 2013-06-07 |    37 |
| Mary  | 2013-06-10 |    22 |
| Mary  | 2013-06-11 |    45 |
| Mary  | 2013-06-12 |    30 |
| Mary  | 2013-06-13 |    58 |
| Mary  | 2013-06-14 |    31 |
| Mary  | 2013-06-17 |    98 |
| Mary  | 2013-06-18 |    42 |
| Mary  | 2013-06-19 |    55 |
| Mary  | 2013-06-20 |    47 |
| Mary  | 2013-06-21 |   100 |
| Paul  | 2013-06-03 |    60 |
| Paul  | 2013-06-04 |    98 |
| Paul  | 2013-06-05 |    51 |
| Paul  | 2013-06-06 |    61 |
| Paul  | 2013-06-07 |    63 |
| Paul  | 2013-06-10 |    71 |
| Paul  | 2013-06-11 |    21 |
| Paul  | 2013-06-12 |    85 |
| Paul  | 2013-06-13 |    75 |
| Paul  | 2013-06-14 |    65 |
| Paul  | 2013-06-17 |    87 |
| Paul  | 2013-06-18 |    85 |
| Paul  | 2013-06-19 |    81 |
| Paul  | 2013-06-20 |    46 |
| Paul  | 2013-06-21 |    93 |
| Peter | 2013-06-03 |    79 |
| Peter | 2013-06-04 |    70 |
| Peter | 2013-06-05 |    21 |
| Peter | 2013-06-06 |    75 |
| Peter | 2013-06-07 |    54 |
| Peter | 2013-06-10 |    46 |
| Peter | 2013-06-11 |    57 |
| Peter | 2013-06-12 |    66 |
| Peter | 2013-06-13 |    48 |
| Peter | 2013-06-14 |    29 |
| Peter | 2013-06-17 |    24 |
| Peter | 2013-06-18 |    69 |
| Peter | 2013-06-19 |    73 |
| Peter | 2013-06-20 |    92 |
| Peter | 2013-06-21 |    87 |
+-------+------------+-------+

**********************************************************************************************/
$sql = "SELECT name, MAX(testdate) as WeekEnding, AVG(score) as average
    FROM testresult
    GROUP BY name, WEEK(testdate)";
$db->query($sql);

/* RESULTS

+-------+------------+---------+
| name  | WeekEnding | average |
+-------+------------+---------+
| Mary  | 2013-06-07 | 56.8000 |
| Mary  | 2013-06-14 | 37.2000 |
| Mary  | 2013-06-21 | 68.4000 |
| Paul  | 2013-06-07 | 66.6000 |
| Paul  | 2013-06-14 | 63.4000 |
| Paul  | 2013-06-21 | 78.4000 |
| Peter | 2013-06-07 | 59.8000 |
| Peter | 2013-06-14 | 49.2000 |
| Peter | 2013-06-21 | 69.0000 |
+-------+------------+---------+ */

HI Barand, Hope you are doing good. Nice to be interacting and learning from you once again. Thanks for the effort above , however my problem is slightly different.  The table contains data for only one student to begin with and instead has one or more subjects that he / she is tested for like maybe English and Maths. The student takes a test for each subject everyday and we have to average averages for each 5 day slots. Not moving averages but simple 5 or n day slot averages. So for eg if we have the following table then averages for 5 days ( which may or may not be consecutive ) have to be taken out as indicated by their color slots. So the output will give 3 average values for each color block. Of-course their may be additional columns for scores of other subjects. Thanks and looking forward to your response.

 

+-------+------------+-------+
| name | testdate | score |
+-------+------------+-------+

| Mary | 2013-06-03 | 58 |
| Mary | 2013-06-04 | 99 |
| Mary | 2013-06-05 | 34 |
| Mary | 2013-06-06 | 56 |
| Mary | 2013-06-07 | 37 |

| Mary | 2013-06-10 | 22 |
| Mary | 2013-06-11 | 45 |
| Mary | 2013-06-12 | 30 |
| Mary | 2013-06-13 | 58 |
| Mary | 2013-06-14 | 31 |
|
Mary | 2013-06-17 | 98 |
| Mary | 2013-06-18 | 42 |
| Mary | 2013-06-19 | 55 |
| Mary | 2013-06-20 | 47 |
| Mary | 2013-06-21 | 100
|

Hey guys, while this seems simple I am sure that Its complex. I have had a few ideas that I have tried without success. Pls I request the gurus to take a look at this and help me out of this fix.

I tried this example below but it does not give the desired result though it does seem to be the way. I am sure one of you gurus can fix it just by looking at it. Please help me out. Thanks loads.

 

SET @i := 0;
SELECT
    @i:=@i+1 as rownum,
    FLOOR(@i/5) AS `datagrp`,
    AVG(`tmptbl`.`score`)
FROM (
    SELECT `score`
    FROM testresult 
) as `tmptbl`
GROUP BY `tmptbl`.`score`
  • Solution

The query I gave you before does give a simple average of of each week (five day group) however if they won't be consecutive then I adopted an approach similar to you.

 

For future proofing, don't add columns for subjects and no need for separate table for each student. Your setup would look something like this

+-------------+          +---------------+           +-------------+
|  student    |          | testresult    |           | subject     |
+-------------+          +---------------+           +-------------+
| studentId   |----+     | resultId      |     +-----| subjectId   |
| studentName |    +----<| studentId     |     |     | subjectName |
| etc         |          | subjectId     |>----+     +-------------+
+-------------+          | testdate      |
                         | score         |
                         +---------------+

As before this is the code I used to create the revised test data

$db->query("DROP TABLE IF EXISTS testresult");
$db->query("DROP TABLE IF EXISTS subject");

$sql = "CREATE TABLE testresult (
    resultId INT NOT NULL AUTO_INCREMENT,
    studentId INT,
    subjectId INT,
    testdate DATE NOT NULL DEFAULT 0,
    score INT,
    PRIMARY KEY (resultId)
    )";
$db->query($sql);

$sql = "CREATE TABLE subject (
    subjectId INT NOT NULL ,
    subjectName VARCHAR(20),
    PRIMARY KEY (subjectId)
    )";
$db->query($sql);

$sql = "INSERT INTO subject VALUES (1,'English'),(2,'Maths'), (3,'Science')";
$db->query($sql);
    //
    // generate test result data
    //
$subjects = array (1,2,3);
$data = array();
$di = dateinterval::createFromDateString('next weekday');
$startDate = '2013-06-03';

$dt = new DateTime($startDate);
$dp = new DatePeriod($dt, $di, 14); // 15 day deriod
foreach ($dp as $day) {
    foreach ($subjects as $sub) {
        $data[] = sprintf("(null, 1, %d, '%s', %d)", $sub, $day->format('Y-m-d'), rand(20,100));
    }
}
$sql = "INSERT INTO testresult VALUES " . join(',', $data);
$db->query($sql);

DATA

+----------+-----------+-----------+------------+-------+
| resultId | studentId | subjectId | testdate   | score |
+----------+-----------+-----------+------------+-------+
|        1 |         1 |         1 | 2013-06-03 |    78 |
|        2 |         1 |         2 | 2013-06-03 |    22 |
|        3 |         1 |         3 | 2013-06-03 |    77 |
|        4 |         1 |         1 | 2013-06-04 |    53 |
|        5 |         1 |         2 | 2013-06-04 |    22 |
|        6 |         1 |         3 | 2013-06-04 |    81 |
|        7 |         1 |         1 | 2013-06-05 |    23 |
|        8 |         1 |         2 | 2013-06-05 |    80 |
|        9 |         1 |         3 | 2013-06-05 |    63 |
|       10 |         1 |         1 | 2013-06-06 |    50 |
|       11 |         1 |         2 | 2013-06-06 |    95 |
|       12 |         1 |         3 | 2013-06-06 |    43 |
|       13 |         1 |         1 | 2013-06-07 |    72 |
|       14 |         1 |         2 | 2013-06-07 |    42 |
|       15 |         1 |         3 | 2013-06-07 |    99 |
|       16 |         1 |         1 | 2013-06-10 |    68 |
|       17 |         1 |         2 | 2013-06-10 |    58 |
|       18 |         1 |         3 | 2013-06-10 |    55 |
|       19 |         1 |         1 | 2013-06-11 |    76 |
|       20 |         1 |         2 | 2013-06-11 |    28 |
|       21 |         1 |         3 | 2013-06-11 |    96 |
|       22 |         1 |         1 | 2013-06-12 |    91 |
|       23 |         1 |         2 | 2013-06-12 |    51 |
|       24 |         1 |         3 | 2013-06-12 |    62 |
|       25 |         1 |         1 | 2013-06-13 |    89 |
|       26 |         1 |         2 | 2013-06-13 |    89 |
|       27 |         1 |         3 | 2013-06-13 |    62 |
|       28 |         1 |         1 | 2013-06-14 |    98 |
|       29 |         1 |         2 | 2013-06-14 |    68 |
|       30 |         1 |         3 | 2013-06-14 |    74 |
|       31 |         1 |         1 | 2013-06-17 |    76 |
|       32 |         1 |         2 | 2013-06-17 |   100 |
|       33 |         1 |         3 | 2013-06-17 |    24 |
|       34 |         1 |         1 | 2013-06-18 |    32 |
|       35 |         1 |         2 | 2013-06-18 |    96 |
|       36 |         1 |         3 | 2013-06-18 |    25 |
|       37 |         1 |         1 | 2013-06-19 |    64 |
|       38 |         1 |         2 | 2013-06-19 |    96 |
|       39 |         1 |         3 | 2013-06-19 |    50 |
|       40 |         1 |         1 | 2013-06-20 |    53 |
|       41 |         1 |         2 | 2013-06-20 |    41 |
|       42 |         1 |         3 | 2013-06-20 |    59 |
|       43 |         1 |         1 | 2013-06-21 |    30 |
|       44 |         1 |         2 | 2013-06-21 |    85 |
|       45 |         1 |         3 | 2013-06-21 |    93 |
+----------+-----------+-----------+------------+-------+

Query

SELECT MAX(testdate) as DateEnding, subjectName,  AVG(score) as average
    FROM (
        SELECT  testdate, score,
            @row:=IF(@prev=subjectID, @row+1, 1) as row,
            @prev:=subjectId as subjectId
        FROM testresult
            JOIN (SELECT @row:=0, @prev:='') as init
        ORDER BY subjectId, testdate
    ) as rows
        INNER JOIN subject USING (subjectId)
    GROUP BY FLOOR((row-1)/5), subjectName
    ORDER BY DateEnding, subjectName

RESULTS

+------------+-------------+---------+
| DateEnding | subjectName | average |
+------------+-------------+---------+
| 2013-06-07 | English     | 55.2000 |
| 2013-06-07 | Maths       | 52.2000 |
| 2013-06-07 | Science     | 72.6000 |
| 2013-06-14 | English     | 84.4000 |
| 2013-06-14 | Maths       | 58.8000 |
| 2013-06-14 | Science     | 69.8000 |
| 2013-06-21 | English     | 51.0000 |
| 2013-06-21 | Maths       | 83.6000 |
| 2013-06-21 | Science     | 50.2000 |
+------------+-------------+---------+
Edited by Barand

Hi barand,  I have been sitting and trying to dissect the query piecemeal to understand it. I'll be grateful if you can explain it so that it's easy for me to understand.

Since my database has only one subject which is default, all I need is the score averaged over testdate, I tried to do away with the following lines 

 

            @row:=IF(@prev=subjectID, @row+1, 1) as row,
            @prev:=subjectId as subjectId

   

as well as the ensuing JOIN command but obviously that's a mistake since GROUP BY FLOOR((row-1)/5) uses row to get the block of 5 values to be averaged. I am therefore not getting how the above two or I think just the 1st line should be modified to get the correct row value. This is what I tried and it's faulty.  But I think if i can calculate the rows value correctly, it would most probably work. Your help is sought again and thanks loads.

 

SELECT MAX(testdate) as DateEnding, AVG(score) as average

    FROM (
        SELECT  testdate, score,
 
        FROM testresult
 
        ORDER BY testdate
    ) as rows
 
    GROUP BY FLOOR((row-1)/5),
    ORDER BY DateEnding

I am attempting to future-proof it for you so you won't have to rewrite the query when you come to add extra subjects as you said you would be doing. So long as you have a subject column it won't matter if there are 1 or 1001 different subjects. If there is only 1 subject just put 1, say, in every records' subject field. If you add more students to the table then you will need to change it however. (The simplest change it that case would be to add "WHERE studentId = $id").

Hi Barand, Thanks for this. Ofcourse I see your point and I am sure i'll be able to utilize this when it comes to more than one student in a the table but i'ld be glad if you'ld be able to explain these, if only from a totally academic & learning point of view, the working of the query especially these two lines

 

            @row:=IF(@prev=subjectID, @row+1, 1) as row,
            @prev:=subjectId as subjectId

 

Also in the model that I am pursuing, ( again only from an academic point of view) I don't need to have a subject column because since there is only one student, I can have the subject as the field label instead and thereby save an entire field. In that case it would be feasible to come up with a solution that I have been trying in my previous reply. I am putting it down again below for convenience. Would it then be possible to get this working with the correct calculations of the row variable and if so how. Thanks. I hope you'll  appreciate that I wish to learn and not just pick the code from one place and put it in another. Thank you for all the help. 

 

SELECT MAX(testdate) as DateEnding, AVG(score) as average

    FROM (
        SELECT  testdate, score,
 
        FROM testresult
 
        ORDER BY testdate
    ) as rows
 
    GROUP BY FLOOR((row-1)/5),
    ORDER BY DateEnding
 

 

Edited by ajoo

 

 

@row:=IF(@prev=subjectID, @row+1, 1) as row,
            @prev:=subjectId as subjectId

 

My data was ordered by subject, date in the subquery so it would look like

subject     date
-------     ---------
    1       day 1
    1       day 2
    1       day 3
    1       day 4
    1       day 5
    2       day 1
    2       day 2
    2       day 3
    2       day 4
    2       day 5

When calculating the @row value I want to start again at 1 when the subject changes so that the same days fall into the same groups of five days

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.