Jump to content

Filtering data on the dataset using the same dataset for subquery filtering


mbaric

Recommended Posts

I am having trouble writing exactly 4 queries in mysql.

 

Here is the view I am using for queries: http://pastie.org/8403299

 

Here is the Raw Data you can check 


 

which you can check.

 

What is needed:

 

1) Gained students are those who exist in the selected / filtered month and don't exist in the previous month.

 

2) Lost students are those who exist in the previous month of the selected / filtered month and don't exist in the selected / filtered month.

 

3) Get a list of all lost students since the start of the center till now().

 

4) Get a list of all gained students since the start of center till now.

 

I am currently stuck with query 1) (Gained students are those who exist in the selected / filtered month and don't exist in the previous month.)

 

I have written it like that (see just below), but I get the same number of rows back if I haven't filtered anything... uff.

 



  1. SELECT *



  2. FROM lessons_master_001 my1



  3. WHERE NOT EXISTS



  4. (



  5. SELECT 0



  6. FROM lessons_master_001 my2



  7. WHERE



  8. my1.student_id = my2.`student_id`



  9. AND



  10. my1.teacher_id = my2.teacher_id



  11. AND



  12. my1.`student_payment_id`= my2.`student_payment_id`AND



  13. CONCAT(CAST(MONTH(DATE_ADD(my1.Lesson_Booking_Date, INTERVAL -1 MONTH))AS CHAR(20)),CAST(YEAR(DATE_ADD(my1.Lesson_Booking_Date, INTERVAL -1 MONTH))AS CHAR(20)))



  14. =



  15. CONCAT(CAST(MONTH(my2.`Lesson_Booking_Date`)AS CHAR(20)),CAST(YEAR(my2.`Lesson_Booking_Date`)AS CHAR(20)))



  16. )



My logic is like this so I take everything from the view lessons_master_001 and give alias my1 and in the NOT EXISTS clause I am giving to the same dataset alias my2, then join those ID's that are also used in view lessons_master_001 and then try to connect previous month from data set my1 with current month from data set my2.

 

I am using DATE_ADD function to subtract month from current date and practically get date - 1 month.

 

I would really appreciate any help you will provide, cause I lost 2 days already on it with no progress in front.

 

Thank you in advance.

Link to comment
Share on other sites

In database you must learn to think set-oriented. You are looking for a set of records where it is true that the value of each student appears in the subset for this month, but does not appear in the subset for last month.

So the first thing you'd do is create the two subsets and link them; select once from the table for this month, and select a second time, for last month. Linkthose together and see which students don't appear in both sets.

 

Lookup the LEFT JOIN.

 

also, ear about dates in SQL, your whole concat stuff is very messy and boils down to "AND my1.lesson_booking_date - interval 1 month = my.lesson_booking_date"

Never manually transform dates to do comparisons. You pretty much never have to transform dates anyway, but certainly *never* in comparisons because it messes up any optimizations that the database can do to speed things up.

Link to comment
Share on other sites

Hi Vinny,

 

Thank you for your answer and pointers. Can you please give me an example of what you said in

 

So the first thing you'd do is create the two subsets and link them; select once from the table for this month, and select a second time, for last month. Linkthose together and see which students don't appear in both sets.

 

 

Can you write in mySQL syntax if that is not a problem, please. Thank you very much in advance, the rest of it I will try to handle myself.

Link to comment
Share on other sites

Instead of all the CONCAT of date items it's easier just to

EXTRACT(YEAR_MONTH FROM my1.date - INTERVAL 1 MONTH) = EXTRACT(YEAR_MONTH FROM my2.date)

so try something like

SELECT DISTINCT my1.*
FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2
    ON my1.student_id = my2.student_id
    AND
    EXTRACT(YEAR_MONTH FROM my1.date - INTERVAL 1 MONTH) = EXTRACT(YEAR_MONTH FROM my2.date)
WHERE my2.student_id IS NULL
Link to comment
Share on other sites

Thank you Barand and Vinny.

 

So here is what I again don't understand.

 

If the view lessons_master_001 returns 20676 rows


CREATE VIEW `lessons_master_001` AS 
SELECT
  `lessons`.`id`                                      AS `id`,
  `lessons`.`created_at`                              AS `Lesson_Booking_Date`,
  MONTH(`lessons`.`created_at`)                       AS `Month_Id`,
  CAST(MONTHNAME(`lessons`.`created_at`) AS CHAR(20) CHARSET utf8) AS `Month_Name`,
  YEAR(`lessons`.`created_at`)                        AS `Year`,
  TIMEDIFF(`lessons`.`to_time`,`lessons`.`from_time`) AS `Lesson_Hours`,
  (CAST(TIMEDIFF(`lessons`.`to_time`,`lessons`.`from_time`) AS DECIMAL(10,0)) / 10000) AS `Lesson_Hours_Num`,
  `lessons`.`from_time`                               AS `from_time`,
  `lessons`.`to_time`                                 AS `to_time`,
  `lessons`.`status`                                  AS `Lesson_Status`,
  `lessons`.`trial`                                   AS `trial`,
  `lessons`.`student_id`                              AS `student_id`,
  `users`.`first_name`                                AS `Student_Firstname`,
  `users`.`last_name`                                 AS `Student_Lastname`,
  CONCAT_WS(_utf8' ',`users`.`first_name`,`users`.`last_name`) AS `Student_Fullname`,
  `lessons`.`teacher_id`                              AS `teacher_id`,
  `users_1`.`first_name`                              AS `Teacher_Firstname`,
  `users_1`.`last_name`                               AS `Teacher_Lastname`,
  CONCAT_WS(_utf8' ',`users_1`.`first_name`,`users_1`.`last_name`) AS `Teacher_Fullname`,
  `lessons`.`rating_foreign_instruction`              AS `rating_foreign_instruction`,
  `lessons`.`rating_arabic_tuition`                   AS `rating_arabic_tuition`,
  `lessons`.`rating_call_quality`                     AS `rating_call_quality`,
  `lessons`.`rating_punctuality`                      AS `rating_punctuality`,
  `lessons`.`rating_comment`                          AS `rating_comment`,
  `lessons`.`feedback_flag`                           AS `feedback_flag`,
  `student_payments`.`id`                             AS `Stdpym_Id`,
  `student_payments`.`gateway`                        AS `Pym_Gateway`,
  `student_payments`.`status`                         AS `Pym_Status`,
  `student_payments`.`amount`                         AS `amount`,
  `student_payments`.`discount`                       AS `discount`,
  `student_payment_transactions`.`student_payment_id` AS `student_payment_id`,
  `student_payment_transactions`.`success`            AS `Stdpym_Success`,
  CONCAT(CAST(MONTH(`lessons`.`created_at`) AS CHAR(20)),CAST(YEAR(`lessons`.`created_at`) AS CHAR(20))) AS `month_year_string`
FROM ((((`student_payments`
      INNER JOIN `student_payment_transactions`
        ON ((`student_payments`.`id` = `student_payment_transactions`.`student_payment_id`)))
     INNER JOIN `lessons`
       ON ((`lessons`.`student_payment_id` = `student_payments`.`id`)))
    INNER JOIN `users`
      ON ((`lessons`.`student_id` = `users`.`id`)))
   INNER  JOIN `users` `users_1`
     ON ((`lessons`.`teacher_id` = `users_1`.`id`)))
WHERE ((`lessons`.`status` <> _utf8'canceled')
       AND (`student_payment_transactions`.`success` = 1)
       AND (`users`.`email` NOT IN(_utf8'hennanra@nildram.co.uk',_utf8'hennanra@gmail.com',_utf8'madinaharabic1@gmail.com',_utf8'madinaharabic2@gmail.com',_utf8'_zz_student_0001@madinaharabic.com',_utf8'samar.madinah@yahoo.com',_utf8'saso.vera@hotmail.com',_utf8'samar.work@hotmail.com',_utf8'madinaharabictuition@gmail.com',_utf8'samar.d1@hotmail.com',_utf8'samar.abdelmonem@gmail.com',_utf8'samar.d1@gmail.com',_utf8'omneya90@madinaharabic.com',_utf8'omneya@madinaharabic.com',_utf8'zzzz@test.com',_utf8'mama_omnia@hotmail.com'))
       AND (`users_1`.`email` NOT IN(_utf8'hennanra@nildram.co.uk',_utf8'hennanra@gmail.com',_utf8'madinaharabic1@gmail.com',_utf8'madinaharabic2@gmail.com',_utf8'_zz_student_0001@madinaharabic.com',_utf8'samar.madinah@yahoo.com',_utf8'saso.vera@hotmail.com',_utf8'samar.work@hotmail.com',_utf8'madinaharabictuition@gmail.com',_utf8'samar.d1@hotmail.com',_utf8'samar.abdelmonem@gmail.com',_utf8'samar.d1@gmail.com',_utf8'omneya90@madinaharabic.com',_utf8'omneya@madinaharabic.com',_utf8'zzzz@test.com',_utf8'mama_omnia@hotmail.com')))

Barand, when I execute your query like this:

SELECT DISTINCT my1.*
FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2
    ON my1.student_id = my2.student_id
    AND
    my1.teacher_id = my2.teacher_id
    AND
    my1.`student_payment_id` = my2.`student_payment_id`
    AND
    EXTRACT(YEAR_MONTH FROM my1.Lesson_Booking_Date - INTERVAL 1 MONTH) = EXTRACT(YEAR_MONTH FROM my2.Lesson_Booking_Date)
WHERE my2.student_id IS NULL

I got again 20676 rows, how can this be if we are checking just the students that DO NOT EXIST in previous month but exist in current month. Maybe for you will be stupid question but I really don't see the point here.... Can you guys explain...

 

From my perspective we should get LESS rows back....

Link to comment
Share on other sites

Barand,

 

thank you for your reply.

 

I believe that the last where clause statment 

WHERE my2.student_id IS NULL

is not needed at all. Or why do you believe cause you wrote it, above line is needed.

 

Haven't said before but I really liked your use of EXTRACT mysql function, was not aware on that at all.

 

Now as we went so far can you help me with other queries too, please.

 

Query 2) Lost students are those who exist in the previous month of the selected / filtered month and don't exist in the selected / filtered month.
Query 3) Get a list of all lost students since the start of the center till now().
Query 4) Get a list of all gained students since the start of center till now.
 
Would it be something in reverse from Query No. 1, cause now we have that it has to exist in previous and doesn't exist in current one. Can you please post it also as a solution here.
 
And what about Query 3 and 4. Should we do it with count or do we have some more elegant solution to it?
 
Again sorry for bothering you and troubling you so much. And thank you again for your help and sticking with me.
Link to comment
Share on other sites

I believe that the last where clause statment 

WHERE my2.student_id IS NULL

is not needed at all. Or why do you believe cause you wrote it, above line is needed.

 

 

You believe wrong.

 

When you do a table join it joins data from one table to the matching rows in the other.

 

With A INNER JOIN B only data from rows which match in both tables are returned.

 

With A LEFT JOIN B all rows from A are returned with data from matching B rows where there is a match. If there is no match, data from the B table contains NULL values.

 

Therefore those in my1 with no match in my2 contain NULL in the columns that come from the my2 table. As those are the ones we are looking for we have the WHERE clause.

 

Your second query is the reverse of that one. Give the others a go based on what I have given you so far. I won't just spoonfeed all the answers to you

Link to comment
Share on other sites

Hi Barand,

 

Thank you for nice explanation in joins.

 

I have two more questions for you. If you can please take a look at the lessons_master_001 view. When we try to join those two vies as we do here

FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2

and again check the lessons_master_001 view and the joins there is it enough to join just by student.id like you wrote:

FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2
    ON my1.student_id = my2.student_id

or it has to be like this?!


FROM lessons_master_001 my1
LEFT JOIN lessons_master_001 my2
ON my1.student_id = my2.student_id
AND
my1.teacher_id = my2.teacher_id
AND
my1.`student_payment_id` = my2.`student_payment_id

I gave those additional statements cause I wasn't getting any results with just the join on student.id.

 

Question 2)

 

Can you please provide me the logic on how would you wrote Query 3 and Query 4, should you COUNT if the student exists in previous time or would you take completely different approach?

 

Again thank you for your kind words and time.

Link to comment
Share on other sites

If you can please take a look at the lessons_master_001 view. When we try to join those two vies as we do here

FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2

and again check the lessons_master_001 view and the joins there is it enough to join just by student.id like you wrote:

FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2
    ON my1.student_id = my2.student_id

or it has to be like this?!


FROM lessons_master_001 my1
LEFT JOIN lessons_master_001 my2
ON my1.student_id = my2.student_id
AND
my1.teacher_id = my2.teacher_id
AND
my1.`student_payment_id` = my2.`student_payment_id

I gave those additional statements cause I wasn't getting any results with just the join on student.id.

 

 

 

If that's what it needs. Your data and application. I did say "try something like this" to demonstrate the method.

 

Why are you running all your queries on that bloated view?

Link to comment
Share on other sites

Just a thought. You will need to change the WHERE clause to restrict the search to the current month otherwise it's finding them for every month and not present in previous month. (Is that question 3?)

WHERE my2.student_id IS NULL
AND EXTRACT(YEAR_MONTH FROM my1.Lesson_Booking_Date) = EXTRACT(YEAR_MONTH FROM CURDATE())
Link to comment
Share on other sites

Hi Barand.

 

Our first query should look like this:

SELECT distinct
my1.id,
my1.from_time,
my1.to_time,
my1.Student_Fullname,
my1.Teacher_Fullname,
my1.Lesson_Booking_Date,
my1.trial,
my1.Lesson_Status

FROM lessons_master_001 my1
    LEFT JOIN lessons_master_001 my2
    ON my1.student_id = my2.student_id
    AND
    my1.teacher_id = my2.teacher_id
    AND
    my1.`student_payment_id` = my2.`student_payment_id`
    AND
    EXTRACT(YEAR_MONTH FROM my1.from_time - INTERVAL 1 MONTH) = EXTRACT(YEAR_MONTH FROM my2.from_time)
WHERE my2.student_id IS NULL

I believe this query should do the trick for query #1.

 

Can you please one more time explain the logic on query #3 or #4, would you add another subquery and then COUNT them out or? Sorry for asking the same thing....

Edited by mbaric
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.