Posted 02 October 2006 - 06:58 PM
I am new to PHP and MYSQL, but I have a basic understanding of DB concept and some programming experience with other languages.
I am currently working on a project and I need your input as to the best way to design my website.
Here is the project description: a vendor rating system.
The process: There are 9,000 vendors who sell to customers in multiple states. A vendor may or may not operate in all States.
-Each customer has different characteristics for instance age, marital status, gender, range, etc.
-Each vendor sells a combination of different products (there are about 6 different types of product on the market and these products are mutually exclusive).
-A customer can use only one product type at a time. There are 8 different ways each product can be used. For instance, a body soap can be used to take bath and wash hand. It's not frequent to find people who collect different brand of body soap.
- Each vendor can have relationship (on a different matter: for instance exchanging information about their customers) with other vendors.
- I have the raw data about all the products sold from all the vendors along with all the characteristics previously described.
Goals: I would like to assign a score (based on customer experience with the product) to each vendor. Specifically, I want to accomplish the following:
0- Define the scoring system: for instance, the percentage of specific group of satisfied customers for a vendor, after taking into account whether or not the vendor sold at least 2% of the product to the specified group.
1. The customer comes to the website, inputs the characteristics and the zipcode, then the web application will show the top ranked vendors for the customer.
2. The customer can decide not to give all the characteristics, and I still should be able to provide top vendors.
3. The customer may want to see the top vendors in the county, state or nationwide instead and the web application should be able to give it as well.
4. I also want to show the customer the list of vendors that do business with a particular vendor.
Now my questions are:
How many tables do I need in this situation?
How do I go about designing a consistent database for this?
Any help will be appreciated.
Posted 02 October 2006 - 09:46 PM
vendor_id | vname | ... >> vendor table
state_id | name | abbreviation | ... >> state table
prod_id | name | cat_id | ... >> product table
cat_id | name | ... >> product category table
The ... represents any extra information required for each row in that table, such as an address for the customer table. Given those 4 tables we can start relating your data.
rel_id | vendor_id | cust_id >> vendor / customer relationship table
Each row in the table is a unique combination of vendor_id and cust_id.
rel_id | vendor_id | state_id >> vendor / state relationship table
Each row in the table is a unique combination of which state a particular vendor is active in.
rel_id | vendor_id | prod_id >> vendor / product relationship table
Each row in the table is a unique combination of a product carried by a vendor.
purchase_id | vendor_id | cust_id | prod_id | date | cost | ... >> purchase history table
This table will allow you to generate a purchase history of all customers across all vendors
for all products.
That's just one possible DB setup and possible relating tables. Remember to keep in mind when creating your DB how the data relates to itself. Define clearly what a product is, what a customer is, what a vendor is, etc. The number of tables required will depend on the types of relationships the data has. Is the relationship between a product and vendor one to one (only that vendor carries that product) or one to many (several vendors carry that product)? Does each vendor operate in only one state (one to one) or can they operate in several states (one to many)?
Posted 03 October 2006 - 02:12 PM
I did not think of creating the relationships as you did. This helps alot.
I am not sure how big the purchase history table is going to be because the customer table alone has more than 3 millions records. Is there a potential to slow down mysql if I get this many records in one table?
For the querries, should I just do all the background calculations (goals 1 thru 4) and then post the results in one simple table or just let customers to querry the database from the raw data? I think the web application will be faster if I store only the results in a table. But again, the number of reports will be some what limited and the application will not be that flexible.
As to your questions, the relationship between a product and a vendor is many to many (a vendor can carry many products and several vendors can carry the same product).
Each vendor can operate in many states.
Posted 03 October 2006 - 04:40 PM
You say the customer table has 3 million records, is that across all vendors or per vendor?
If it's the average per vendor and you have 9k vendors, you might want to give each vendor a separate DB, all of which have the same tables. Otherwise you'd be looking at billions of records.
But seeing as how you want to run queries against all vendors, your life will probably be simpler if everything is in the same DB.
Here's my recommendation:
Put everything into a single DB, leave indexes off your tables (with the exception of the primary key), and write your site. Set your site up with a debugging mode that will dump all of your SQL queries as sent to MySQL. For any sluggish portion of the site that you then come across, take note of the queries and the columns searched against (i.e. Look at the WHERE clauses). Take note of the most common fields (product name, vendor name, etc.) and go back and use ALTER TABLE statements to add indexes on those columns. You should see some performance gain that way.
As for the particular reports you want to display, such as vendor rating, I'd try to do them in real time first and see what kind of performance you get. If the performance is tolerable, leave it. If there's just too much data for a particular report to be done on the fly, consider setting up a cron job to run daily or weekly to precompile the results into a table and then you can just SELECT from that table.
I've only been working with PHP and MySQL for a little more than a year and only 6 months professionally; if I'm totally off-base hopefully someone with more experience will chime in.
Posted 03 October 2006 - 05:53 PM
As for the database size, I have about 12 table with more than 3 millions records. So, I have a minimum of 36 millions records in total. I am not sure if MYSQL will handle it, but I think that with a better database design, it can be done. The whole question is to find the right design.
This is what I was previously thinking:
Have all 36 millions records in one table. Then create all the possible combinations of customers' profiles by concatenating the fields.
For instance, I will concatenate Cust-Id, State, County, zipcode, vendor-Id and product to identify the products sold in a particular zip code by a particular vendor. That way, I can run a select querry from the DB. But if this has to run through 36 millions records each time someone uses the web application, I can anticipate trouble down the road.
Thanks for your input.
Posted 03 October 2006 - 07:08 PM
Posted 03 October 2006 - 07:30 PM
How does the vendor data get updated? Do the vendors import it? Do you import it?
Posted 03 October 2006 - 07:54 PM
Who is going to be inputing the data? Are you going to have a user management system? That may be something else to think about too if the vendors login and add what ever data.. Just a thought.
Posted 04 October 2006 - 02:03 PM
The vendor data get updated by an employee of the vendor.
Like, whenever there is a sale, an employee will go online, input the information relating to the sale.
Yes, it looks like I will need a user management system.
Also, each vendor may need a full report on its activity at the end of the month. This is bunch of table. I am not considering graphs at this point, that will be over my head. The format of the report will be the same (about 50 pages) and should generate specifically on the activity of each vendor. There is a need for tracking these reports over time. Will I need a different database to store the reports in pdf format?
I need help breaking down this project into doable pieces.
Posted 04 October 2006 - 06:26 PM
I would also start gathering information from your vendors as to how they'd like to use the site. Do they want static reports that are built every day? Or do they want up-to-date reports generated on the fly? Some may want both.
It doesn't matter how you program the site if it doesn't work for them.
(EDIT) Forgot to mention something. It might be beneficial to allow super-users to tell the site to treat them as a particular user. For example, you're the admin, you're logged in, but you need to switch to one of your regular users to test something. Rather than logging out and back in, just add a feature for admins to switch to a specific user and then switch back when they're done. It's something I wish I had done.
0 user(s) are reading this topic
0 members, 0 guests, 0 anonymous users