Implications of database schema on PHP implementation

NotionCommotion · February 13, 2016

From purely a PHP coding perspective, what are the pros and cons of the two below database designs?

element121 · February 13, 2016

Can you describe your proposed use case for the schema?

When I see multiple tables that appear to have the same definition, they look redundant to me.

If you need a table4, table5 you will just add them in the future?

You can combine all these in one table for example: table_combined with a column to indicate a different type (values 1, 2, 3) and 4, 5, 6... etc if required in the future.

NotionCommotion · February 13, 2016

Can you describe your proposed use case for the schema?

table1, table2, and table3 are People, Companies, and Projects. There are actually more as you suggested.

Currently, I've been using the first schema, but it is making my PHP complicated. It seems like the second schema I showed might result in much simpler PHP script.

You can combine all these in one table for example: table_combined with a column to indicate a different type (values 1, 2, 3) and 4, 5, 6... etc

You mean as I show in the second schema?

JonnoTheDev · February 13, 2016

I can only see the need for 2 tables. A documents table and a document_types table.

The documents table references the documents_type table by type_id I.e.

documents

----------

doc_id

type_id

date_added

doc_name

document_types

----------

type_id

type_title

document_types would contain people, companies, and projects, and any other types. Each has an id and is used in a join from documents when you want to say fetch all company documents.

You do not want to be creating separate tables for each type as there is no flexibility in the design and as you have already found out it is hard to create the business logic.

NotionCommotion · February 13, 2016

I can only see the need for 2 tables. A documents table and a document_types table.

Each document needs to be associated with a given person, company, or project, and not just be of one of the people, companies, and projects types. Given this requirement, do you still feel it could be accomplished with two tables?

You do not want to be creating separate tables for each type as ... it is hard to create the business logic.

You are absolutely correct! It is amazing how the database schema could complicate other issues.

benanamen · February 13, 2016

As pointed out neither of your designs are any good. What you need to do is learn about "Database Normalization'. It is not a complicated subject and you should be able to master the understanding of it very quickly. It is an important thing to know and will get you on your way to solving your problem.

NotionCommotion · February 13, 2016

As pointed out neither of your designs are any good. What you need to do is learn about "Database Normalization'. It is not a complicated subject and you should be able to master the understanding of it very quickly. It is an important thing to know and will get you on your way to solving your problem.

While I agree that normalization is very important, I don't believe either of these designs are not normalized up to the 3rd normal form. Where do you feel they aren't?

Why do you believe second design is no good?

benanamen · February 14, 2016

While I agree that normalization is very important, I don't believe either of these designs are not normalized up to the 3rd normal form. Where do you feel they aren't?

The fact that you are asking this question tells me you do not understand Database Normalization. It isn't something I feel, it is just a fact your DB is not normalized and neither is the second version. The first obvious sign is that you are duplicating data. You were already given correct responses from the other forum members.

Edited February 14, 2016 by benanamen

ignace · February 14, 2016

It's never a good idea to simply write table1, table2, table3 as it becomes more difficult to understand what each thing is and developers then start to abstract it to document/type.

If the tables represent person, company, and project. Then write that. Each of these entities have more columns then just id and data. So they should have their own table. As such they should also have their own m2m tables as in:

person_has_document

company_has_document

project_has_document

This means your first design is correct. In the second design you assume there is a shared PK and columns between project, company, and person. In a natural PK DB design there is no shared PK, for a company this is the vat number, for a person his ssn number, and for a project the internal company code. If you use a surrogate PK like a GUID this is possible but it would have no shared columns (and no name is not a shared column for a person this is the first_name/last_name, a company is registered_name, and for project this is the title). So the first design remains the way to go.

added_by, deleted_by, added_at, deleted_at should not be in the m2m table but instead be on the document table and be a FK to the person table. I would also add a is_deleted field instead of checking deleted_by IS NOT NULL. That way you can constrain the deleted_by field to ON DELETE SET NULL when you remove a person from the system and your document remains deleted.

Edited February 14, 2016 by ignace

NotionCommotion · February 14, 2016

Thanks ignace,

The same document can be attached to different entities, so added_by, deleted_by, added_at, deleted_at needs to remain in the m2m table. I am also using a surrogate for people, companies, and projects, and have a little extra data located in superTable such as when the entity was added, etc.

My hopes were not as much to debate the database design, but discuss how the schema has implications on the associated PHP code. As JonnoTheDev pointed out "You do not want to be creating separate tables for each type as there is no flexibility in the design and as you have already found out it is hard to create the business logic." By using a super table, my PHP code to add/view/delete documents becomes almost the same and the application becomes much simpler. I did not expect that the database schema would have such a great impact on the PHP code.

ignace · February 14, 2016

A database design has never implications on your PHP code. A good database design ensures the data and it's constraints are met no matter the programming language used. Otherwise use NoSQL, this is the easiest way to store your array's. No more worries about join's or SQL for that matter.

Also if added_by, deleted_by should be on the m2m table then person_has_document does not exist. Because then a stranger can add documents to you? Or even delete your documents? Also in my example above this should have been person_owns_document. Which is basically the same as added_by. So the table is useless.

Edited February 14, 2016 by ignace

NotionCommotion · February 14, 2016

A database design has never implications on your PHP code.

I recognize that you have much more expertise than me regarding programming, but respectfully disagree.

I have dealt with some database designs which complicate the PHP code and some which make it very simple. I don't know for sure, but highly suspect that the database design often has the same impact regardless of the programming language used.

maxxd · February 15, 2016

I have dealt with some database designs which complicate the PHP code and some which make it very simple. I don't know for sure, but highly suspect that the database design often has the same impact regardless of the programming language used.

To me, the major thing to remember is that the only way the database affects the programming in these situations is by being badly designed. The overall methodology of interacting with said database is going to be the same no matter what - for instance, "Grab an ID from this table, link it to a FK in this table, update the record in the third table that's defined by the FKs from table 1 and 2". However, when suddenly the FK from table 2 is actually an FK from table 7 that you have to get to by joining tables 4, 5, and 6; or the FK is actually a column in the current table that only has meaning when defined by magic joins to table 4, 5, and 6 - that's where the issues begin to happen. Again, the business logic is the same - it's the amount of time and cursing from the programmer that changes.

NotionCommotion · February 16, 2016

To me, the major thing to remember is that the only way the database affects the programming in these situations is by being badly designed. The overall methodology of interacting with said database is going to be the same no matter what - for instance, "Grab an ID from this table, link it to a FK in this table, update the record in the third table that's defined by the FKs from table 1 and 2". However, when suddenly the FK from table 2 is actually an FK from table 7 that you have to get to by joining tables 4, 5, and 6; or the FK is actually a column in the current table that only has meaning when defined by magic joins to table 4, 5, and 6 - that's where the issues begin to happen. Again, the business logic is the same - it's the amount of time and cursing from the programmer that changes.

Thanks Maxxd.

Yea, I agree those are the biggies which a bad design could bring about. But is it okay to "just do things right", are there things a real good database designer could do which would make the programmer's life better?

I only bring up the super/sub table design pattern because that is where I have witnessed it, but I expect there are more opportunities.

Yes, I know this is a database/SQL question, but it has PHP implications so please stay with me. Say you have three tables: teachers, students, and parents, and each have a first name, last name, username, email, bla, bla, bla, and also a couple of other unique to each fields. Would you create a separate table for each? I think most PHP developers would. The design is normalized so we did our duty, right? No duplicated data (feel free to pip in benanamen), right? So, all is good...

But now we want to link all these type of people (i.e. teachers, students, parents) to their favorite movie. Well, that is three separate queries for each. Do we embed different queries in the application for each, and be forced to maintain them? Do we get fancy (i.e. complicated) and have the application create the query?

Or, should the database developer have the foresight to create a super table for us so we could just JOIN to that?

Have these questions not been asked before, or am I just asking things which aren't important?

ignace · February 16, 2016

But is it okay to "just do things right", are there things a real good database designer could do which would make the programmer's life better?

Usually the DB admin will create VIEWs and PROCEDUREs. It makes it easier to query data and as a bonus adds a layer of security.

Say you have three tables: teachers, students, and parents, and each have a first name, last name, username, email, bla, bla, bla, and also a couple of other unique to each fields. Would you create a separate table for each?

A teacher, a student, and a parent all would be in the same table. It's possible to create additional dependent tables for each if they have information specific to them. For the parent to signify the type of parent/family relation: dad, mom, ..

For the teacher this would be for additional relations to courses he teaches. For the student to which courses he is enrolled etc.. So in this case the extra dependent tables are required as you need it to create additional relations.

Do we get fancy (i.e. complicated) and have the application create the query?

Keep it simple. Start with something that could possibly work. Is one table sufficient? Then start there. Do you have extra use-cases, then keep refactoring your DB design until it satisfies all use-cases. Never go creating tables because maybe in the future you may need them. You Aint Gonna Need It (YAGNI).

Edited February 16, 2016 by ignace

maxxd · February 16, 2016

I agree with ignace regarding using only one table in this example. Parents, teachers, and students are all users. So, create the table Users and a secondary UserType table. This allows future expansion - for instance, let's say the higher-ups suddenly want to track grad students and undergrad students separately. Instead of having to create two more tables (Grad and UnderGrad), then figure out the SQL to separate out the current contents of Student, you'd simply add 'Graduate' and 'Undergradute' to the UserType table and update the foreign key in Users. And in a few months when it becomes imperative to track tutors as well as teachers, you add that key to the UserType table and start entering data.

'Doing things right' - to me - means trying your best to plan for future business logic changes during the design phase of your project. You're not always going to succeed; there are always going to be edge-case scenarios that pop up and make you rethink the design, and there are going to be situations where you over-plan. However, normalizing a database as much as possible and making sure that the data contained in a table is consistent and usable is going to mitigate refactoring both your SQL and PHP. I always say that the overall goal is to design and implement a system in such a way as to allow the maintainer of that system to be as lazy as possible.

Hopefully that makes sense - I'm only on my second cup of coffee yet this morning...

Implications of database schema on PHP implementation

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Important Information