Search the Community
Showing results for tags 'database provisioning'.
Hi, I have some questions based on a MS-SQL project I'm, now, apart of to see if the current db structure the most efficient - basically, asking about db provisioning 101 / theory as I'm getting push back on recommended edits to improve the current db design based on 'expected' future growth. Here's the current db structure for the application: There is one MS-SQL database server, running one instance, to which the application is (currently) creating a new database for each (new) client that signs up to the service - all on that one db server. The current devs stated that they designed the db process in this manner so that each client's data would be, completely, separate from the other clients' data & this way db administrative efforts would be less hectic & more efficient My view: I think there is a better way to address the db / data requirements - see below to which I would like feedback which approach is more efficient, short & long-term based on resource requirements & load / latency. Issue / Concern: (Giving a scenario of having - i.e 1k or even 10k clients at some point to which will accumulate, total, of around 30G of data over time) Q: Won't having that number of (individual) db's - all, literately, an exact replicate of the other db - actually create more (unnecessary) latency / load on the db server / resources then just having ONE 'master' database & housing ALL the clients' data within that (optimized) database (of course Indexing crucial to keep that db optimized to handle the total load for all client data; it's same as having multiple db's in my view based on proper indexing - just a more efficient design)? OR (even better approach based on the expected growth of data, overall) Distribute the, pertinent, data sets based on their role / service - individually - across separate database servers - so that the load / latency issues minimized for each db server (itself).For example: all email marketing (results) data would be housed on db1 server whereas all client specific data would be housed on db2 server. Reason: the bulk email marketing application demands a tremendous amount of resources & disk space so by isolating those services / operations removes contention between the two db's (Yes / No?) To hook the separate sets of data when required (client or email data)? Answer: - the SQL queries would just JOIN / UNION data sets from the (separate) specified MS-SQL db servers based on the required data to produce SQL results (JOIN / UNION already the method with current BI queries using to date; I'm just suggesting the data be segmented to different db servers based on their role / service to the application). Note: what I'm wanting to get other's views on is this: Q: does it make sense to have multiple copies of the same (exact) db schema for client data vs one 'master' db schema which can house ALL the data so there is less db administrative work (1st off) & (2ndly) reducing the actual load on the 'one server' itself. Note: I know storage pools can be implemented to expand disk space for the MS-SQL service but that's another topic regarding 'limited' disk space issues (per server) as the data grows, collectively. I would appreciate feedback on those questions to get others opinion(s) based on their experience in working with a 'global' / distributed cloud-based applications & MS-SQL. I have some ideas - 1. compiling all the (current), individual, client dbs to the one 'global' clients db then implementing a data warehouse (DW) to pull all data that's over seven days into the DW to which I'll then be able to manage said data using a variety of 'external' sources - MySQL Storage Area Network (SAN) solution & even Excel spreadsheets <for very old data that will be older than five years - to free disk space on the DW, etc...> 2. Implement Redis cache / cluster to offload (default) queries from both the 'source' db servers & DW for the majority of the queries - being the more requested data set / queries - will (now) be in-memory for each client. Note: Redis cache will NOT load those client's data sets until said client (any member of that group) 'successfully' authenticated into the system. 3. (still speaking on Redis) - data sets that have been sitting idle for a specified period of time (short # of days) would be flushed from cache to free of memory. Note: the cluster will be defined so that there is ample redundancy aside from persistence of the DW which will have 50G of disk space along with the use of an external SAN solution which will house another 50G of disk space - acting as source / destination to house Excel files along with the oldest sets of data within MySQL. Using data mgt services, the MS-SQL (DW) will pull any 'requested' data back into the DW from the SAN (destination) to serve the application, dynamically. Thus, this solution would make the current db solution more efficient & robust to expand both - horizontally & vertically - based on biz / data demands. Ok, those are my thoughts to make the current db design more efficient with the objective of thinking 'short & long-term' based on expected growth of data. Let me know your thoughts - suggestions / comments - all welcome. Thx in advance for responses!