Jump to content


Photo

DB Setup/Size


  • Please log in to reply
9 replies to this topic

#1 SecretAgentRege

SecretAgentRege
  • Members
  • Pip
  • Newbie
  • 8 posts

Posted 28 March 2006 - 08:46 PM

Hello,

I'm planning out a database, however, it has the potential to be quite large, and I don't know how large is too large. While I've worked with MYSQL quite alot, I have never really come across the best database planning, so my plan below is up for criticism.

Essentially, I will have one PARENT database containing general data about a particular object.

There will then be another CHILD database that will have data about objects linked to the parent object. [You can think of it like the PARENT is a forum post, and the CHILD are the replies, though I'm not designing a forum, but you get the idea].

My situation is that the PARENT database could have thousands of objects/rows. Each parent object could have 5-100+ child objects. So, if the PARENT database holds 6000 objects, and each object has 60 CHILD objects, that would make the PARENT db with 6,000 rows, and the CHILD database with 360,000 rows; all with the potential to grow 2, 3, 5x the size.

I have never dealt with a database quite of this size. Is this too much and should I think about another method to break up the CHILDren db's?

Thanks,
Rege

#2 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 28 March 2006 - 09:17 PM

First, I hope you mean "table" when you say "database" -- there's no need to have separate databases for this design (or almost any design). Second, 360K is not an issue for MySQL -- you'll just need to make sure you choose your data types carefully, and be smart about which indexes to add to keep queries running smoothly.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#3 SecretAgentRege

SecretAgentRege
  • Members
  • Pip
  • Newbie
  • 8 posts

Posted 28 March 2006 - 09:30 PM

D'oh! Yes, table. Not seperate databases.

Thanks so much for your reply! Well, when I think of how to index the table, I usually think of how I will be searching for the data.

Since I will be primarily searching by the link to the parent, and then sorting those linked to the parent by a process of one or two fields.. so the link and sorting fields would be the ones to index.

Does that make sense? (Sorry, I'm completely self-taught, so I sometimes second-guess my process because no one has told me one way is better than another).

Rege

#4 wickning1

wickning1
  • Members
  • PipPipPip
  • Advanced Member
  • 405 posts

Posted 28 March 2006 - 10:49 PM

The easiest way for a database to balloon up to unmanageable size is with a bad database design that involves duplicate and redundant data. As long as you avoid that, you will be fine going with the simplest solution.

For instance, are these child objects ever linked to more than one parent? When that happens, do you have 2 rows in the child table for the same object, just because it has two parents? That would be bad.

It sounds like you have a pretty good handle on indexing, so you should be ok there. Just be sure to watch all your queries closely. Every time you write one, time it and make sure it only takes a few tens of milliseconds. If it takes a lot longer than your other queries, you may need to rethink it or add indexes.

#5 SecretAgentRege

SecretAgentRege
  • Members
  • Pip
  • Newbie
  • 8 posts

Posted 29 March 2006 - 06:22 PM

hi wickning1,

The child objects will never be linked to more than one parent, but they could contain similar, though not exact, data. However, now I've run into a situation, as to the best method to go about the following situation:

The CHILD object could contain its own CHILD objects; which could be between 0-10+ avg per original CHILD object, which would be another table up to 10 times the size of the original CHILD table.

These new child objects wouldn't be too complex, with probably 2 or 3 data fields, and so I must ask, would it be more beneficial to stick these perhaps as text/XML in the original CHILDs table? How big can a text field be if I was to go this route?

Or is this something that is neglegant?

Thanks,
Rege

#6 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 29 March 2006 - 07:12 PM

TEXT fields can hold up to 2^16 bytes (65K), while MEDIUMTEXT supports 1.5 MB. LONGTEXT fields are _really_ long. So you won't have a problem. Second, there shouldn't be a problem adding another table anyway.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#7 SecretAgentRege

SecretAgentRege
  • Members
  • Pip
  • Newbie
  • 8 posts

Posted 29 March 2006 - 09:01 PM

Thanks for the help fenway,

I think I might go for the big text field, for a fe reasons. One of which, users might be able to edit the contents of this field all at once, even if it appears to be different entries... the best I can think of is like a wiki of sorts.

Another thing to theorize: Say I have a table that holds contact information. Each row consists of a person's Name, Telephone, Address, etc.. But say users want to add a unlimited multitude of Telephone Numbers, cells, home, office, vacation, etc. etc. (bad example, but you get the picture).

Would it make more sense to have a seperate table of phone numbers that links back to each person in the original table, or just have each telephone number they add listed in the 'telephone' field with some common seperator so any middlewhere and tell when one phone number ends and the next begins?

Hmm.. i guess that example might lend itself to the seperate table... which would contradict my original thought process... darn.

#8 wickning1

wickning1
  • Members
  • PipPipPip
  • Advanced Member
  • 405 posts

Posted 29 March 2006 - 09:01 PM

How are these objects being created? Database size should generally increase linearly with human activity, so are there humans making all these objects? Under what circumstances is a new object created?

#9 SecretAgentRege

SecretAgentRege
  • Members
  • Pip
  • Newbie
  • 8 posts

Posted 29 March 2006 - 09:33 PM

I will be building the initial database to start from XML files: Just over 2,500 PARENT objects with each having anywhere from 10-100 CHILD objects, with the average being probably 25-60 child objects.

After that, most creation will be human input. Probably starting off steeper (especially child objects) and then hitting a plateau at about 1:20 new parent:child creation spread across the objects; at the flatest part, probably an average 1-10 children per week; at the steepest probably 50-100 children a week (before plateau).

As far as these new CHILD table I spoke of (UNDER the ORIGINAL CHILD), I will probably not do this for the editing reason I spoke of.

Say each PARENT is a book, and each CHILD is a chapter. Why break each chapter into another CHILD of each paragraphs linked to the chapters? When you want to edit your chapter, you probably do not want to have it seperated into paragraphs.

While the actual information will be unrelated objects (probably more like the chapters in the example above), when that field is edited, I would like all information within that section to be edited much like a wiki would. This seems to lend itself to have all this information in one field, rather than have them as seperate objects in a different table.

Does that make more sense.

Also, I can't help but continue to thank you guys for your help. Thanks!

#10 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 30 March 2006 - 07:26 AM

IMHO, the ultimate choice between storing a field in a particular table or in a new table via a foriegn key comes down to atomicity, and in the real world, how the data will be used. Obviously, for phone numbers, it makes sense to have another table, so you can change each one invidiually (=atomically), and easily add/delete telephone entries. However, for something like a bunch of paragraphs, it often is better to store them together, for obvious reasons, some of which you have mentioned.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users