viviosoft Posted December 1, 2012 Share Posted December 1, 2012 Hello All - I'm new to creating somewhat complicated queries (at least for me they are). I have several tables that have data in them and criteria that needs met in order to complete the query. I was going about this looping through the first Query: SELECT * FROM stores AS store INNER JOIN invoice_details AS invDetails ON store.sCustNum = invDetails.sCustNum WHERE store.sSalesman = 508 AND store.BillToId = 7161 AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31' GROUP BY store.sCustNum Then using the following query to get the purchases from each customer based on the output of the above query. So I was looping through this query to get the SUMS of purchases for each customer. This wasn't working as it was just "stuck" with no output. Damn, I'm really envious of you pros. SELECT SUM(invItems.merchantPrice) AS purchases FROM invoice_items AS invItems INNER JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invDetails.sCustNum = '{$store->sCustNum}' GROUP BY invDetails.sCustNum It then dawned on me that I may be able to accomplish this using one query. I don't like others to do my work for me but I'm not sure how to approach this? I was looking into multiple SELECTs but again I'm not sure if that's the right direction. Really, thanks for any help on this matter. Quote Link to comment Share on other sites More sharing options...
Barand Posted December 1, 2012 Share Posted December 1, 2012 Two main problems with using "SELECT * " 1) it's inefficient to select more than you need 2) people like me have no idea what your query is really doing in order to help you to rewrite it Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 1, 2012 Author Share Posted December 1, 2012 Barand, fare enough ... I will work to come up with a good query to explain my problem. Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 Okay - I have to following queries I'm using to get the data I want form the database. I do get what I'm after but don't think it's the most practical approach and it takes some time to query the results. I would be very grateful if someone could help me combine these into one statement. Maybe it can't be done and I'm going about it the right way? Query One: $query = "SELECT sCustNum, BillToId, sNumber, sSalesman, sName, sAddress FROM stores AS store WHERE store.sSalesman = 508 AND store.BillToId = 7161 GROUP BY store.sCustNum "; Query Two: $query = "SELECT SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invDetails.sCustNum = '" . $store->sCustNum . "' AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDate}' AND '{$toDate}' "; Query Three - Same as Query Two but with different dates: $query = "SELECT SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invDetails.sCustNum = '" . $store->sCustNum . "' AND STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDateLastYear}' AND '{$toDateLastYear}' "; Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 So I'm working through my problem. If I comment out the third JOIN and the SELECT SUMs I get results as expected. But, if I include the third JOIN it doesn't produce results? You'll notice the "invoice_items AS lastYearItems" JOIN has a different date. I would expect to see NULL in the LastYearPurchases and LastYearRetail IF the dates didn't exist but they do? What could I be doing wrong? SELECT stores.sName, stores.BillToId, stores.sCustNum, SUM(invoice_items.merchantPrice) AS currYearPurchases, SUM(invoice_items.retailPrice) AS currYearRetail, SUM(lastYearItems.merchantPrice) AS LastYearPurchases, SUM(lastYearItems.retailPrice) AS LastYearRetail FROM invoice_details JOIN stores ON (invoice_details.sCustNum = stores.sCustNum) JOIN invoice_items ON (invoice_details.invoiceNumber = invoice_items.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31' ) JOIN invoice_items AS lastYearItems ON (invoice_details.invoiceNumber = lastYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2011-01-01' AND '2011-12-31' ) WHERE stores.BillToId = 7161 AND stores.sSalesman = 508 GROUP BY stores.sCustNum Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 I've changed the query a bit but still not able to get results? SELECT stores.sName, stores.BillToId, stores.sCustNum, invoice_details.invoiceDate, ROUND(SUM(currYearItems.merchantPrice),2) AS currYearPurchases, ROUND(SUM(currYearItems.retailPrice),2) AS currYearRetail, ROUND(SUM(lastYearItems.merchantPrice),2) AS LastYearPurchases, ROUND(SUM(lastYearItems.retailPrice),2) AS LastYearRetail FROM stores LEFT JOIN invoice_details ON (invoice_details.sCustNum = stores.sCustNum) JOIN invoice_items AS lastYearItems ON (invoice_details.invoiceNumber = lastYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2011-01-01' AND '2011-12-31' ) JOIN invoice_items AS currYearItems ON (invoice_details.invoiceNumber = currYearItems.invNumber && STR_TO_DATE(invoice_details.invoiceDate,'%m/%d/%y') BETWEEN '2012-01-01' AND '2012-12-31' ) WHERE stores.BillToId = 7161 AND stores.sSalesman = 508 GROUP BY stores.sCustNum Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2012 Share Posted December 2, 2012 (edited) When joining and summing at the same time you get a multiplying effect on the totals. A way round is to group then join. Not tested (no data) but try SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress, totals1.purchases as purchases1, totals1.retail as retail1, totals2.purchases as purchases2, totals2.retail as retail2, FROM stores AS store LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDate}' AND '{$toDate}' GROUP BY invDetails.sCustNum ) as totals1 ON store.sCustnum = totals1.sCustNum LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE STR_TO_DATE(invDetails.invoiceDate,'%m/%d/%y') BETWEEN '{$fromDateLastYear}' AND '{$toDateLastYear}' GROUP BY invDetails.sCustNum ) as totals2 ON store.sCustnum = totals2.sCustNum WHERE store.sSalesman = 508 AND store.BillToId = 7161 Edited December 2, 2012 by Barand Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 Hi Barand - Thanks for helping on this issue. I'm learning which is good. I've played around with the solution you included and it seems logical, howerver, when I put actual dates in the BETWEEN clause it just crunches (forever) and doesn't complete. Or I'm not patient enough to wait but should it really take that long? Note: If I leave the date fields blank I get all columns with expected results but NULL for Purchase1, Retail1, Purchase2 and Retail2 I don't know of any other way of testing this issues but sending you the data. I've been using MySQL Workbench to do my testing. If you'd prefer, send me a message with your email address and I'll send you a dump of the database. I would attach the data to this post but the database is very large. 6 MB compressed. Thanks again! Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2012 Share Posted December 2, 2012 Sounds like you need an index on the date - but with that unsortable format I'm not sure if that would be a help or a hindrance. Quote Link to comment Share on other sites More sharing options...
DavidAM Posted December 2, 2012 Share Posted December 2, 2012 Since you are doing a conversion (STR_TO_DATE) on the date column, I'm pretty sure the index would be useless. You need to store the dates as a DATE datatype. Then an index could be used for the BETWEEN. As it is, the server is going to do a table scan (read through every row in the table) for both of the derived tables (the LEFT JOIN sub-queries), and convert every invoice date in the database (twice since you have two joins). Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 (edited) Since you are doing a conversion (STR_TO_DATE) on the date column, I'm pretty sure the index would be useless. You need to store the dates as a DATE datatype. Then an index could be used for the BETWEEN. As it is, the server is going to do a table scan (read through every row in the table) for both of the derived tables (the LEFT JOIN sub-queries), and convert every invoice date in the database (twice since you have two joins). I understand... I'm using data that is imported from another system (Access) which the dates are stored as m/d/y. I'm wondering if I could write something to convert the date to a datetime format when importing? Currently the data is brought in using a simple CSV import with no "clean-up". The data I'm importing is historical... moving forward the data is saved as DATE datatype. Thank you for your input. I would welcome any suggestions on how I could import the historical data converting the invoiceDate field using MySQL? My guess would be that I'd have to create an import method of sorts? Thanks again! Edited December 2, 2012 by viviosoft Quote Link to comment Share on other sites More sharing options...
DavidAM Posted December 2, 2012 Share Posted December 2, 2012 Without more detail, it is hard to be specific. One of the following may be possible: 1) If you have any influence over the export (from Access) ask them to export as "Y-M-D", then change the datatype on your column and import. 2) If the import (to mySql) is a PHP script, put the STR_TO_DATE call in the INSERT statement and change the datatype. 3) If the import is a mySql LOAD DATA statement, change the column name to invoiceDateMDY (or something) then add another column at the end of the row as invoiceDate DATE DEFAULT NULL. After each import run an update UPDATE table SET invoiceDate = STR_TO_DATE(invoiceDateMDY, "%m/%d/%y") WHERE invoiceDate IS NULL. Eventually, you can drop the invoiceDateMDY column from the table. (I have not used LOAD DATA much, and I don't know if it will import fewer columns than exist in the table, but it is worth a try) Be sure to add an INDEX on the true invoiceDate column. Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2012 Share Posted December 2, 2012 LOAD DATA lets you specify which columns are being imported Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 Thanks guys... so I'm wondering if this will solve my initial problem? I think I'm going to try option 3 and update the NEW field column with the new date and see where that gets me. Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 (edited) So I've added a column invoiceDateYMD and I'm GETTING a result BUT it takes 129 secs to complete and I'm only querying a month? Here's what I have: This is Barand indexing the new column name invoiceDateYMD. Is there a way to speed this up? If the client has a range of a year they could be siting there until Christmas. This makes sense why it's taking so long to get results back... The JOINS are querying the entire database and not WHERE store.sSalesman = 508 AND store.BillToId = 7161 Right? SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress, totals1.purchases as purchases1, totals1.retail as retail1, totals2.purchases as purchases2, totals2.retail as retail2 FROM stores AS store LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invDetails.invoiceDateYMD BETWEEN '2012-01-01' AND '2012-01-31' GROUP BY invDetails.sCustNum ) as totals1 ON store.sCustnum = totals1.sCustNum LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invDetails.invoiceDateYMD BETWEEN '2011-01-01' AND '2011-01-31' GROUP BY invDetails.sCustNum ) as totals2 ON store.sCustnum = totals2.sCustNum WHERE store.sSalesman = 508 AND store.BillToId = 7161 GROUP BY store.sCustNum Edited December 2, 2012 by viviosoft Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2012 Share Posted December 2, 2012 Thanks for the data dump. I have run the query for date ranges of three months this year and three months last year. Query time 0.3 secs. SELECT store.sCustNum, BillToId, sNumber, sSalesman, sName, sAddress, totals1.purchases as purchases1, totals1.retail as retail1, totals2.purchases as purchases2, totals2.retail as retail2 FROM stores AS store LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invdate BETWEEN '2012-08-26' AND '2012-11-25' GROUP BY invDetails.sCustNum ) as totals1 ON store.sCustnum = totals1.sCustNum LEFT JOIN ( SELECT invDetails.sCustNum, SUM(invItems.merchantPrice) AS purchases, SUM(invItems.retailPrice) AS retail FROM invoice_items AS invItems JOIN invoice_details AS invDetails ON invDetails.invoiceNumber = invItems.invNumber WHERE invdate BETWEEN '2011-08-26' AND '2011-11-25' GROUP BY invDetails.sCustNum ) as totals2 ON store.sCustnum = totals2.sCustNum WHERE store.sSalesman = 508 AND store.BillToId = 7161 The biggest time difference was after I added indexes on invoice number on both your details and items tables. It should be primary key on the invoice details table (except you have duplicate values ???) and a foreign key in the items table. Your tables appear to have no indexes! Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 wow... didn't realize Indexes made that impact on querying data? Thank you. I guess this would have worked MUCH better if I have indexed those in the first place. I'll be sure to clean up the table structure. Sorry. It's working as expected and MUCH faster... thanks again! Quote Link to comment Share on other sites More sharing options...
Barand Posted December 2, 2012 Share Posted December 2, 2012 PS My invDate is your invoiceDateYMD (added to the table) Quote Link to comment Share on other sites More sharing options...
viviosoft Posted December 2, 2012 Author Share Posted December 2, 2012 Got it, Thanks again, it's all working as expected! Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.