Jump to content

Recommended Posts

Hello All -

 

I'm new to creating somewhat complicated queries (at least for me they are). I have several tables that have data in them and criteria that needs met in order to complete the query. I was going about this looping through the first Query:

 

SELECT *
		 FROM stores AS store
			 INNER JOIN invoice_details AS invDetails
			 ON store.sCustNum = invDetails.sCustNum
			 WHERE store.sSalesman = 508 AND store.BillToId = 7161
			 AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31'
			 GROUP BY store.sCustNum

 

Then using the following query to get the purchases from each customer based on the output of the above query. So I was looping through this query to get the SUMS of purchases for each customer. This wasn't working as it was just "stuck" with no output. Damn, I'm really envious of you pros.

 

SELECT SUM(invItems.merchantPrice) AS purchases
		 FROM invoice_items AS invItems
		 INNER JOIN invoice_details AS invDetails
			 ON invDetails.invoiceNumber = invItems.invNumber
		 WHERE invDetails.sCustNum = '{$store->sCustNum}' GROUP BY invDetails.sCustNum

 

It then dawned on me that I may be able to accomplish this using one query. I don't like others to do my work for me but I'm not sure how to approach this? I was looking into multiple SELECTs but again I'm not sure if that's the right direction. Really, thanks for any help on this matter.

Link to comment
https://forums.phpfreaks.com/topic/271451-sum-in-multiple-tables/
Share on other sites

Okay - I have to following queries I'm using to get the data I want form the database. I do get what I'm after but don't think it's the most practical approach and it takes some time to query the results. I would be very grateful if someone could help me combine these into one statement. Maybe it can't be done and I'm going about it the right way?

 

Query One:

 

$query = "SELECT sCustNum, BillToId, sNumber, sSalesman, sName, sAddress
		 FROM stores AS store
			 WHERE store.sSalesman = 508 AND store.BillToId = 7161
			 GROUP BY store.sCustNum
	  ";

 

Query Two:

 

   $query = "SELECT SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail
		    FROM invoice_items AS invItems
		    JOIN invoice_details AS invDetails
			    ON invDetails.invoiceNumber = invItems.invNumber
		    WHERE invDetails.sCustNum = '" . $store->sCustNum . "'
		    AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDate}' AND '{$toDate}'
		  ";

 

Query Three - Same as Query Two but with different dates:

   $query = "SELECT SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail
		    FROM invoice_items AS invItems
		    JOIN invoice_details AS invDetails
			    ON invDetails.invoiceNumber = invItems.invNumber
		    WHERE invDetails.sCustNum = '" . $store->sCustNum . "'
		    AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDateLastYear}' AND '{$toDateLastYear}'
		  ";

So I'm working through my problem. If I comment out the third JOIN and the SELECT SUMs I get results as expected. But, if I include the third JOIN it doesn't produce results? You'll notice the "invoice_items AS lastYearItems" JOIN has a different date. I would expect to see NULL in the LastYearPurchases and LastYearRetail IF the dates didn't exist but they do? What could I be doing wrong?

 

SELECT
   stores.sName,
   stores.BillToId,
   stores.sCustNum,
   SUM(invoice_items.merchantPrice) AS currYearPurchases,
   SUM(invoice_items.retailPrice) AS currYearRetail,
   SUM(lastYearItems.merchantPrice) AS LastYearPurchases,
   SUM(lastYearItems.retailPrice) AS LastYearRetail
    FROM invoice_details
		 JOIN stores ON (invoice_details.sCustNum = stores.sCustNum)
		 JOIN invoice_items ON (invoice_details.invoiceNumber = invoice_items.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31' )
		 JOIN invoice_items AS lastYearItems ON (invoice_details.invoiceNumber = lastYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2011-01-01' AND '2011-12-31' )
    WHERE stores.BillToId = 7161 AND stores.sSalesman = 508
	    GROUP BY stores.sCustNum

I've changed the query a bit but still not able to get results?

 

SELECT
   stores.sName,
   stores.BillToId,
   stores.sCustNum,
   invoice_details.invoiceDate,
   ROUND(SUM(currYearItems.merchantPrice),2) AS currYearPurchases,
   ROUND(SUM(currYearItems.retailPrice),2) AS currYearRetail,
   ROUND(SUM(lastYearItems.merchantPrice),2) AS LastYearPurchases,
   ROUND(SUM(lastYearItems.retailPrice),2) AS LastYearRetail
    FROM stores
		 LEFT JOIN invoice_details ON (invoice_details.sCustNum = stores.sCustNum)
		 JOIN invoice_items AS lastYearItems ON (invoice_details.invoiceNumber = lastYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2011-01-01' AND '2011-12-31' )
		 JOIN invoice_items AS currYearItems ON (invoice_details.invoiceNumber = currYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31' )
    WHERE stores.BillToId = 7161 AND stores.sSalesman = 508
	    GROUP BY stores.sCustNum

When joining and summing at the same time you get a multiplying effect on the totals. A way round is to group then join.

 

Not tested (no data) but try

 

SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress,
   totals1.purchases as purchases1,
   totals1.retail as retail1,
   totals2.purchases as purchases2,
   totals2.retail as retail2,

FROM stores AS store

LEFT JOIN 
   (
   SELECT invDetails.sCustNum,
       SUM(invItems.merchantPrice) AS purchases, 
       SUM(invItems.retailPrice) AS retail
   FROM invoice_items AS invItems
   JOIN invoice_details AS invDetails
       ON invDetails.invoiceNumber = invItems.invNumber
   WHERE STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') 
           BETWEEN '{$fromDate}' AND '{$toDate}'
   GROUP BY invDetails.sCustNum    
   ) as totals1
   ON store.sCustnum = totals1.sCustNum

LEFT JOIN 
   (
   SELECT invDetails.sCustNum,
       SUM(invItems.merchantPrice) AS purchases, 
       SUM(invItems.retailPrice) AS retail
   FROM invoice_items AS invItems
   JOIN invoice_details AS invDetails
       ON invDetails.invoiceNumber = invItems.invNumber
   WHERE STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') 
           BETWEEN '{$fromDateLastYear}' AND '{$toDateLastYear}'
   GROUP BY invDetails.sCustNum    
   ) as totals2
   ON store.sCustnum = totals2.sCustNum


WHERE store.sSalesman = 508 AND store.BillToId = 7161

Edited by Barand

Hi Barand -

 

Thanks for helping on this issue. I'm learning which is good. I've played around with the solution you included and it seems logical, howerver, when I put actual dates in the BETWEEN clause it just crunches (forever) and doesn't complete. Or I'm not patient enough to wait but should it really take that long?

 

Note: If I leave the date fields blank I get all columns with expected results but NULL for Purchase1, Retail1, Purchase2 and Retail2

 

I don't know of any other way of testing this issues but sending you the data. I've been using MySQL Workbench to do my testing. If you'd prefer, send me a message with your email address and I'll send you a dump of the database. I would attach the data to this post but the database is very large. 6 MB compressed.

 

Thanks again!

Since you are doing a conversion (STR_TO_DATE) on the date column, I'm pretty sure the index would be useless. You need to store the dates as a DATE datatype. Then an index could be used for the BETWEEN. As it is, the server is going to do a table scan (read through every row in the table) for both of the derived tables (the LEFT JOIN sub-queries), and convert every invoice date in the database (twice since you have two joins).

Since you are doing a conversion (STR_TO_DATE) on the date column, I'm pretty sure the index would be useless. You need to store the dates as a DATE datatype. Then an index could be used for the BETWEEN. As it is, the server is going to do a table scan (read through every row in the table) for both of the derived tables (the LEFT JOIN sub-queries), and convert every invoice date in the database (twice since you have two joins).

 

I understand... I'm using data that is imported from another system (Access) which the dates are stored as m/d/y. I'm wondering if I could write something to convert the date to a datetime format when importing? Currently the data is brought in using a simple CSV import with no "clean-up". The data I'm importing is historical... moving forward the data is saved as DATE datatype.

 

Thank you for your input. I would welcome any suggestions on how I could import the historical data converting the invoiceDate field using MySQL? My guess would be that I'd have to create an import method of sorts?

 

Thanks again!

Edited by viviosoft

Without more detail, it is hard to be specific. One of the following may be possible:

 

1) If you have any influence over the export (from Access) ask them to export as "Y-M-D", then change the datatype on your column and import.

 

2) If the import (to mySql) is a PHP script, put the STR_TO_DATE call in the INSERT statement and change the datatype.

 

3) If the import is a mySql LOAD DATA statement, change the column name to invoiceDateMDY (or something) then add another column at the end of the row as invoiceDate DATE DEFAULT NULL. After each import run an update UPDATE table SET invoiceDate = STR_TO_DATE(invoiceDateMDY, "%m/%d/%y") WHERE invoiceDate IS NULL. Eventually, you can drop the invoiceDateMDY column from the table. (I have not used LOAD DATA much, and I don't know if it will import fewer columns than exist in the table, but it is worth a try)

 

Be sure to add an INDEX on the true invoiceDate column.

So I've added a column invoiceDateYMD and I'm GETTING a result BUT it takes 129 secs to complete and I'm only querying a month? Here's what I have: This is Barand indexing the new column name invoiceDateYMD. Is there a way to speed this up? If the client has a range of a year they could be siting there until Christmas.

 

This makes sense why it's taking so long to get results back... The JOINS are querying the entire database and not WHERE store.sSalesman = 508 AND store.BillToId = 7161 Right?

 

SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress,
 totals1.purchases as purchases1,
 totals1.retail as retail1,
 totals2.purchases as purchases2,
 totals2.retail as retail2
FROM stores AS store
LEFT JOIN
 (
 SELECT invDetails.sCustNum,
		 SUM(invItems.merchantPrice) AS purchases,
		 SUM(invItems.retailPrice) AS retail
 FROM invoice_items AS invItems
 JOIN invoice_details AS invDetails
		 ON invDetails.invoiceNumber = invItems.invNumber
 WHERE invDetails.invoiceDateYMD BETWEEN '2012-01-01' AND '2012-01-31'
 GROUP BY invDetails.sCustNum
 ) as totals1
 ON store.sCustnum = totals1.sCustNum
LEFT JOIN
 (
 SELECT invDetails.sCustNum,
		 SUM(invItems.merchantPrice) AS purchases,
		 SUM(invItems.retailPrice) AS retail
 FROM invoice_items AS invItems
 JOIN invoice_details AS invDetails
		 ON invDetails.invoiceNumber = invItems.invNumber
 WHERE invDetails.invoiceDateYMD BETWEEN '2011-01-01' AND '2011-01-31'
 GROUP BY invDetails.sCustNum
 ) as totals2
 ON store.sCustnum = totals2.sCustNum

WHERE store.sSalesman = 508 AND store.BillToId = 7161
GROUP BY store.sCustNum

Edited by viviosoft

Thanks for the data dump.

 

I have run the query for date ranges of three months this year and three months last year. Query time 0.3 secs.

 

SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress,
    totals1.purchases as purchases1,
    totals1.retail as retail1,
    totals2.purchases as purchases2,
    totals2.retail as retail2

FROM stores AS store

LEFT JOIN
    (
    SELECT invDetails.sCustNum,
		    SUM(invItems.merchantPrice) AS purchases,
		    SUM(invItems.retailPrice) AS retail
    FROM invoice_items AS invItems
    JOIN invoice_details AS invDetails
		    ON invDetails.invoiceNumber = invItems.invNumber
    WHERE invdate BETWEEN '2012-08-26' AND '2012-11-25'
    GROUP BY invDetails.sCustNum    
    ) as totals1
    ON store.sCustnum = totals1.sCustNum

LEFT JOIN
    (
    SELECT invDetails.sCustNum,
		    SUM(invItems.merchantPrice) AS purchases,
		    SUM(invItems.retailPrice) AS retail
    FROM invoice_items AS invItems
    JOIN invoice_details AS invDetails
		    ON invDetails.invoiceNumber = invItems.invNumber
    WHERE invdate BETWEEN '2011-08-26' AND '2011-11-25'
    GROUP BY invDetails.sCustNum    
    ) as totals2
    ON store.sCustnum = totals2.sCustNum


WHERE store.sSalesman = 508 AND store.BillToId = 7161

 

The biggest time difference was after I added indexes on invoice number on both your details and items tables. It should be primary key on the invoice details table (except you have duplicate values ???) and a foreign key in the items table. Your tables appear to have no indexes!

wow... didn't realize Indexes made that impact on querying data? Thank you. I guess this would have worked MUCH better if I have indexed those in the first place. I'll be sure to clean up the table structure. Sorry. It's working as expected and MUCH faster... thanks again!

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.