Jump to content

Querying across multiple tables / Normalization


Recommended Posts

MySQL version - 4.1.21-max-log

 

Problem - generating SQL for normalized database to "search" across multiple tables.

 

 

 

I'm not new to databases, but am new to normalization. In the past I've built databases with primary key/foreign key relationships, but on a new project I'm starting I've decided to take it a step further, and am running into trouble with my queries.

 

The structure is setup like this (this is a simplified version):

 

table name - columns
tbl_resource - Pk_resource, resource_name, resource_status, date_added
tbl_contact - P_contact, contact_name
tbl_descriptor - Pk_descriptor, descriptor_name
tbl_resource_contact - PK, Fk_resource, Fk_contact
tbl_resource_descriptor - PK, Fk_resource, Fk_descriptor

 

One resource might have 3 contacts and two descriptors.

 

 

 

In the past, in order to have multiple contacts attached to one resource, I would normally store the primary keys of the contacts in one column separated by commas. The structure would look like this:

 

tbl_resource - Pk_resource, resource_name, resource_status, date_added, Fk_contact, Fk_descriptor
tbl_contact - Pk_contact, contact_name
tbl_descriptor - Pk_descriptor, descriptor_name

 

where Fk_contact might contains the IDs: "1, 3, 4"

 

Reading online the benefits of normalization, I want to take this a step further by creating tables which link the contact and table, according to this page:

 

http://www.datamodel.org/NormalizationRules.html#two

 

I therefore add two tables to link resources to contacts and descriptors:

 

tbl_resource_contact - PK, Fk_resource, Fk_contact
tbl_resource_descriptor - PK, Fk_resource, Fk_descriptor

 

However, I'm having trouble visualizing the steps/SQL necessary to do a *complex* query searching for multiple bits of info across multiple tables. Is there a way to write a SQL statement where a single statement searches for the resource that has a certain contact_name, and a certain descriptor_name? If I were writing this I might do:

 

SELECT Pk_contact FROM tbl_contact WHERE contact_name = 'search'
SELECT Pk_descriptor FROM tbl_descriptor WHERE descriptor_name = 'search'

 

Then I would take the results from the previous selects, and do another select on the linking table:

 

SELECT Fk_resource FROM tbl_resource_contact WHERE Fk_contact = Pk_contact
SELECT Fk_resource FROM tbl_resource_descriptor WHERE Fk_descriptor = Pk_descriptor

 

Then finally, I would do a final SELECT:

 

SELECT * FROM tbl_resource WHERE Pk_recource = ResourceID(s)

 

 

 

 

My problem is in the complexity of the selects and what IDs are searched on and how they are used. Searching for all resources with one contact - not a problem. But if it gets any more complicated (specific contact, two specific resources; specific contact, one of two specific resources)... and so on I'm having trouble figuring out the chain of logic.

 

Thanks.

am not sure if am getting right to your problem (since I didn't read all of it, just scanned - am too lazy, lol)... but i guess for that, what you were looking for is about JOIN (or maybe UNION). You might want to check that out, it might help. :)

 

Jay,

am not sure if am getting right to your problem (since I didn't read all of it, just scanned - am too lazy, lol)... but i guess for that, what you were looking for is about JOIN (or maybe UNION). You might want to check that out, it might help. :)

 

Jay,

 

Perhaps I was a bit too wordy :D

 

Let me restate:

 

I don't know if a "massive" join would degrade system performance.

 

I have a DB that stores information about a resource, it contains the following "tables":

 

Table (columns)

Resource (name, description, visible, website)

Term (name)

Contact (name, phone, email, address)

Sponsor (name)

Unit (name)

Location (name, city, state)

Audience (name)

Category (name)

 

 

The primary goal for the DB is to hold information about each resource. So, for Resource #1, it might relate to 1 contact, 2 sponsors, 3 locations, a unit, a category, and no audience. If I were to join the tables together such that a resource was always present, I'd get a total of:

 

1 resource * 1 contact * 2 sponsors * 3 locations * 1 unit * 1 category = 6 rows

 

but another case (which is possible):

 

1 resource * 3 contacts * 4 sponsors * 3 locations * 4 units * 6 categories = 864 rows

 

with there being 500 resources, I'm concerned about the size of the join.

 

 

 

Also, I'm unsure if I could effectively query this table. If someone want to search for, lets say:

 

Location 1 AND Location 2

 

Locations will be stored in separate rows (the common element between the rows will be the Primary Key of the resource) - how can I write SQL that will search the two rows but ONLY where the primary key of the resource is the same between the two rows? I won't know what the primary key is.

 

Perhaps I was a bit too wordy  :D

lol... not really, am just not the type to read long posts, just that :). Sorry bout that anyway.

 

Going back, I do not know why would you pair all rows from one table to all rows of another table. Even more, why would you multiply it by then?

 

Looking at your structure, i assumed these are all tables without any relation - as how i understood it though:

 

Table (columns)

Resource (re_id, name, description, visible, website)

Term (te_id, name)

Contact (co_id, name, phone, email, address)

Sponsor (sp_id, name)

Unit (un_id, name)

Location (lo_id, name, city, state)

Audience (au_id, name)

Category (ca_id, name)

 

with JOIN, you must have something that binds them (FKs) which is not present in anyway. Firstly, try to specify PKs - which is in italic, i do not really recommend having string-type as a PK. These PKs (Primary Keys) are usually in interger-type or better, try auto-increment. After then, create a relationship table that would mirror the interconnection of a row from one table to another row(s) of a different table - for complex relationship; or just simply add Foreign Key(s) to an existing table whenever possible.

 

To understand bout this more (just in case), read some references bout Relational Databases and Normalization, a lot of articles can be found on the web.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.