Jump to content

Archived

This topic is now archived and is closed to further replies.

SecretAgentRege

DB Setup/Size

Recommended Posts

Hello,

I'm planning out a database, however, it has the potential to be quite large, and I don't know how large is too large. While I've worked with MYSQL quite alot, I have never really come across the best database planning, so my plan below is up for criticism.

Essentially, I will have one PARENT database containing general data about a particular object.

There will then be another CHILD database that will have data about objects linked to the parent object. [i][You can think of it like the PARENT is a forum post, and the CHILD are the replies, though I'm not designing a forum, but you get the idea].[/i]

My situation is that the PARENT database could have thousands of objects/rows. Each parent object could have 5-100+ child objects. So, if the PARENT database holds 6000 objects, and each object has 60 CHILD objects, that would make the PARENT db with 6,000 rows, and the CHILD database with 360,000 rows; all with the potential to grow 2, 3, 5x the size.

I have never dealt with a database quite of this size. Is this too much and should I think about another method to break up the CHILDren db's?

Thanks,
Rege

Share this post


Link to post
Share on other sites
First, I hope you mean "table" when you say "database" -- there's no need to have separate databases for this design (or almost any design). Second, 360K is not an issue for MySQL -- you'll just need to make sure you choose your data types carefully, and be smart about which indexes to add to keep queries running smoothly.

Share this post


Link to post
Share on other sites
D'oh! Yes, table. Not seperate databases.

Thanks so much for your reply! Well, when I think of how to index the table, I usually think of how I will be searching for the data.

Since I will be primarily searching by the link to the parent, and then sorting those linked to the parent by a process of one or two fields.. so the link and sorting fields would be the ones to index.

Does that make sense? (Sorry, I'm completely self-taught, so I sometimes second-guess my process because no one has told me one way is better than another).

Rege

Share this post


Link to post
Share on other sites
The easiest way for a database to balloon up to unmanageable size is with a bad database design that involves duplicate and redundant data. As long as you avoid that, you will be fine going with the simplest solution.

For instance, are these child objects ever linked to more than one parent? When that happens, do you have 2 rows in the child table for the same object, just because it has two parents? That would be bad.

It sounds like you have a pretty good handle on indexing, so you should be ok there. Just be sure to watch all your queries closely. Every time you write one, time it and make sure it only takes a few tens of milliseconds. If it takes a lot longer than your other queries, you may need to rethink it or add indexes.

Share this post


Link to post
Share on other sites
hi wickning1,

The child objects will never be linked to more than one parent, but they could contain similar, though not exact, data. However, now I've run into a situation, as to the best method to go about the following situation:

The CHILD object could contain its own CHILD objects; which could be between 0-10+ avg per original CHILD object, which would be another table up to 10 times the size of the original CHILD table.

These new child objects wouldn't be too complex, with probably 2 or 3 data fields, and so I must ask, would it be more beneficial to stick these perhaps as text/XML in the original CHILDs table? How big can a text field be if I was to go this route?

Or is this something that is neglegant?

Thanks,
Rege

Share this post


Link to post
Share on other sites
TEXT fields can hold up to 2^16 bytes (65K), while MEDIUMTEXT supports 1.5 MB. LONGTEXT fields are _really_ long. So you won't have a problem. Second, there shouldn't be a problem adding another table anyway.

Share this post


Link to post
Share on other sites
Thanks for the help fenway,

I think I might go for the big text field, for a fe reasons. One of which, users might be able to edit the contents of this field all at once, even if it appears to be different entries... the best I can think of is like a wiki of sorts.

Another thing to theorize: Say I have a table that holds contact information. Each row consists of a person's Name, Telephone, Address, etc.. But say users want to add a unlimited multitude of Telephone Numbers, cells, home, office, vacation, etc. etc. (bad example, but you get the picture).

Would it make more sense to have a seperate table of phone numbers that links back to each person in the original table, or just have each telephone number they add listed in the 'telephone' field with some common seperator so any middlewhere and tell when one phone number ends and the next begins?

Hmm.. i guess that example might lend itself to the seperate table... which would contradict my original thought process... darn.

Share this post


Link to post
Share on other sites
How are these objects being created? Database size should generally increase linearly with human activity, so are there humans making all these objects? Under what circumstances is a new object created?

Share this post


Link to post
Share on other sites
I will be building the initial database to start from XML files: Just over 2,500 PARENT objects with each having anywhere from 10-100 CHILD objects, with the average being probably 25-60 child objects.

After that, most creation will be human input. Probably starting off steeper (especially child objects) and then hitting a plateau at about 1:20 new parent:child creation spread across the objects; at the flatest part, probably an average 1-10 children per week; at the steepest probably 50-100 children a week (before plateau).

As far as these new CHILD table I spoke of (UNDER the ORIGINAL CHILD), I will probably not do this for the editing reason I spoke of.

Say each PARENT is a book, and each CHILD is a chapter. Why break each chapter into another CHILD of each paragraphs linked to the chapters? When you want to edit your chapter, you probably do not want to have it seperated into paragraphs.

While the actual information will be unrelated objects (probably more like the chapters in the example above), when that field is edited, I would like all information within that section to be edited much like a wiki would. This seems to lend itself to have all this information in one field, rather than have them as seperate objects in a different table.

Does that make more sense.

Also, I can't help but continue to thank you guys for your help. Thanks!

Share this post


Link to post
Share on other sites
IMHO, the ultimate choice between storing a field in a particular table or in a new table via a foriegn key comes down to atomicity, and in the real world, how the data will be used. Obviously, for phone numbers, it makes sense to have another table, so you can change each one invidiually (=atomically), and easily add/delete telephone entries. However, for something like a bunch of paragraphs, it often is better to store them together, for obvious reasons, some of which you have mentioned.

Share this post


Link to post
Share on other sites

×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.