OM2 Posted August 22, 2017

I'm creating a service that generates sales data for sellers, and I need some advice on handling a very large number of database lookups.

Let's say a seller has 100,000 past sales, and today they have 200 new sales. For each of today's 200 sales, I want to compare against the past 100,000 and produce a mini report telling the seller whether the customer is a repeat customer: a loyal customer who has spent a lot, or a bad customer who has caused trouble. Great idea. (I think!)

The problem: is that 200 x 100,000 database lookups? Is there a way to lessen the load on the server? On its own, 200 x 100,000 probably isn't much of a problem. But now assume I have 150 customers (sellers using my service). I doubt they will all run the report simultaneously, but there's a good chance they will all run it within the same 5 minutes (say, if it's standard practice to start sorting orders at 2pm). Those 5 minutes could bring down my server!

What would be the best way to store my data and cope with the CPU requirement to go through that amount of it? Cloud storage + processing? I have no experience with this at all. Just looking for advice. Thanks.
Psycho Posted August 22, 2017 (edited)

There is absolutely no reason to run multiple queries. You can get all the data (for all the customers) with a single query. I don't know your table structure or what specific data you want, but I can give you an example. Let's assume we have these three tables:

Salespeople: sales_id, sales_name, etc.
Customers: cust_id, cust_name, etc.
Orders: order_id, cust_id, sales_id, order_date, total, etc. (note: I'll assume order_date is a timestamp)

I can pull a report for a particular salesperson showing the average historical order total for every customer he had today. Let me walk you through a process for getting there.

Start by defining a simple query to get the average order total for all customers:

SELECT o.cust_id, AVG(o.total) as avg_order
FROM orders o
GROUP BY o.cust_id

OK, that gets a list of cust_ids and their average order totals. But we probably want the customer's name, and we want to exclude any orders placed today. So, let's make a couple of changes:

SELECT o.cust_id, c.cust_name, AVG(o.total) as avg_order
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE o.order_date < CURDATE()
GROUP BY o.cust_id

Now we have results that include the average historical order total for all customers (along with their names). All we need now is to limit that result to the customers that placed an order today with a specific salesperson. So, let's create a separate query to find just those customers:

SELECT cust_id
FROM orders
WHERE DATE(order_date) = CURDATE()
AND sales_id = [salespersons-id]

That gets us every unique customer ID that placed an order with the provided sales_id today. We can use this new query to limit the results of the previous query by adding a WHERE condition with an IN clause, using this new query as a subquery for the values.
It would look like this:

SELECT o.cust_id, c.cust_name, AVG(o.total) as avg_order
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE DATE(o.order_date) < CURDATE()
AND o.cust_id IN (
    SELECT DISTINCT cust_id
    FROM orders
    WHERE DATE(order_date) = CURDATE()
    AND sales_id = [salespersons-id]
)
GROUP BY o.cust_id

So, that's just a single example. What you need to accomplish may be very different or require a different approach. But, typically, I would suggest a similar process: break the problem down into manageable parts and "build up" the query to get your end result.

Edited August 22, 2017 by Psycho
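To see the final query in action, here is a minimal sketch using SQLite in memory. The table and column names follow the example schema above; the sample rows, the salesperson ID of 1, and SQLite's date('now') standing in for MySQL's CURDATE() are all assumptions made for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    cust_id INTEGER,
    sales_id INTEGER,
    order_date TEXT,
    total REAL
);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
-- Historical orders (yesterday) plus one order placed today by Alice
INSERT INTO orders VALUES
    (1, 1, 1, date('now', '-1 day'), 100.0),
    (2, 1, 1, date('now', '-1 day'), 200.0),
    (3, 2, 1, date('now', '-1 day'), 50.0),
    (4, 1, 1, date('now'), 75.0);
""")

# The combined query: average historical order total, limited to
# customers who placed an order today with salesperson 1
rows = cur.execute("""
SELECT o.cust_id, c.cust_name, AVG(o.total) AS avg_order
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE DATE(o.order_date) < date('now')
AND o.cust_id IN (
    SELECT DISTINCT cust_id
    FROM orders
    WHERE DATE(order_date) = date('now')
    AND sales_id = 1
)
GROUP BY o.cust_id
""").fetchall()
print(rows)  # [(1, 'Alice', 150.0)] -- Bob had no order today, so he's excluded
```

Note that today's order (75.0) does not count toward Alice's historical average; it only qualifies her for inclusion in the report.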
OM2 Posted August 22, 2017 (Author)

@Psycho That's just awesome, THANKS! If you say it's OK, then that's cool with me. I was just worried about my server being bogged down and serving no one.
Psycho Posted August 22, 2017

Also, if you plan on creating the reports as a one-time activity each day (e.g. saved to a file), as opposed to creating them on demand whenever the salesperson wants to run one, you still only need to run a single query to get the data for all salespeople. It adds a little more complexity, but not terribly so.

In any case, it should not take 5 minutes to run! You definitely need to ensure you are using good practices when creating your database schema and, most importantly, set the appropriate fields to be indexed. A good rule of thumb is that any "id" fields (e.g. primary and foreign keys) should be indexed.