Question about FK and Indexes

SaranacLake · May 21, 2020

Hello. I have a question about the correct way to do a FK and Index(es)...

I have the following table...

	PRODUCT
	- id (pk)(uk)
	- product_type_code (fk)(uk)

Here is what I know...

1.) In MySQL, you *must* have an index on a column in order to create a FK constraint.

2.) The "product_type_code" is a FK and points to a lookup table.

3.) The "id" + "product_type_code" need to form a UK because I will be using that as a join to another table.

Questions:

a.) Do I need to create a regular Index on "product_type_code" so that I can in turn assign a FK to that single column?

b.) Or is it possible to double up and use my composite UK on the "id" + "product_type_code" both as a way to join those two fields to another table, PLUS as a way to serve as an index for my FK on "product_type_code"?

If I had to guess, I would say that i should first create an IDX on "product_type_code", then create my FK on 'product_type_code' and then I could create the composite UK on the "id" + "product_type_code" columns...

So which is the correct way??

requinix · May 21, 2020

53 minutes ago, SaranacLake said:

1.) In MySQL, you *must* have an index on a column in order to create a FK constraint.

Yes.

53 minutes ago, SaranacLake said:

2.) The "product_type_code" is a FK and points to a lookup table.

Is it an actual foreign key as in you told MySQL it's a foreign key and there's now a constraint and index for it?

53 minutes ago, SaranacLake said:

3.) The "id" + "product_type_code" need to form a UK because I will be using that as a join to another table.

No. Stop that.

The ID is a primary key. It is unique by itself. Do not combine it with some other column to create a composite key. That's stupid and wasteful and did I mention stupid.

53 minutes ago, SaranacLake said:

a.) Do I need to create a regular Index on "product_type_code" so that I can in turn assign a FK to that single column?

Technically no, because if you don't then MySQL will do it for you. Which is what the documentation says, so either you didn't read the documentation or... well, that's it.

53 minutes ago, SaranacLake said:

b.) Or is it possible to double up and use my composite UK on the "id" + "product_type_code" both as a way to join those two fields to another table, PLUS as a way to serve as an index for my FK on "product_type_code"?

I stopped reading your question when you mentioned the dumb composite key thing again, but the answer is probably either "no" or "don't do that because it's stupid".

kicken · May 21, 2020

The foreign key constraint needs a suitable index on the column, but it doesn't need to be an index dedicated to that column.

So yes, you can double up your unique index if you create it properly. In order to re-use it, the foreign key column must be the first column in the index, so when you create it you want to do CONSTRAINT UQ_blah UNIQUE (product_type_code, id) and not CONSTRAINT UQ_blah UNIQUE (id, product_type_code).

As mentioned though, in the scenario you laid out, the unique constraint is entirely unnecessary as your ID column is already unique by virtue of being the primary key so you'd just make a simple index on the product_type_code column.

In a scenario where Id wasn't a primary key and a unique constraint was necessary then you could double-dip like that.

SaranacLake · May 21, 2020

9 hours ago, requinix said:

No. Stop that.

The ID is a primary key. It is unique by itself. Do not combine it with some other column to create a composite key. That's stupid and wasteful and did I mention stupid.

Technically no, because if you don't then MySQL will do it for you. Which is what the documentation says, so either you didn't read the documentation or... well, that's it.

I stopped reading your question when you mentioned the dumb composite key thing again, but the answer is probably either "no" or "don't do that because it's stupid".

And I stopped reading when you chose to go down the "That's stupid" path...

SaranacLake · May 21, 2020

1 hour ago, kicken said:

The foreign key constraint needs a suitable index on the column, but it doesn't need to be an index dedicated to that column.

Okay, glad I asked.

1 hour ago, kicken said:

So yes, you can double up your unique index if you create it properly.

Okay, there is the nugget of wisdom I was looking for!

1 hour ago, kicken said:

In order to re-use it, the foreign key column must be the first column in the index, so when you create it you want to do CONSTRAINT UQ_blah UNIQUE (product_type_code, id) and not CONSTRAINT UQ_blah UNIQUE (id, product_type_code).

Why exactly do you have to do it in that order?

Is that because if the UK starts off with the ID, then when MySQL starts scanning the composite key and only sees ID, it sorta ignores the rest?

1 hour ago, kicken said:

As mentioned though, in the scenario you laid out, the unique constraint is entirely unnecessary as your ID column is already unique by virtue of being the primary key so you'd just make a simple index on the product_type_code column.

In a scenario where Id wasn't a primary key and a unique constraint was necessary then you could double-dip like that.

Aha, but that isn't so in my case...

I only showed a snippet of the larger design, which includes me implementing a rally neat implementation of Super/SubTypes in MySQL.

So, in the simplified example above, my PRODUCT table has an ID that is the PK. So if the PRODUCT table was in isolation, then yes, i agree that is all I need to uniquely identify records.

However, I am creating a composite UK so that I can join my Supertype PRODUCT table with the SubType SUBSCRIPTION table, and those two tables are linked by using "id" and "product_type_code".

Using this approach - and this is the cool part - ensures that there can only be ONE SubType that maps back up to the parent SuperType.

See this link for the details...

stackoverflow.com/questions/17405702/mysql-supertype-subtype-design

kicken · May 22, 2020

4 hours ago, SaranacLake said:

Aha, but that isn't so in my case...

It is in the case you setup in your original post. I know you had some other thread ended up talking about this whole object-oriented sub/super type thing but I didn't really pay attention to it so am unfamiliar with the details. Looking at just this thread in it's own context the answer that your unique constraint is unnecessary is factual.

In your original post you show that ID is a primary key (signaled by the (pk) bit). By definition, a primary key is unique across the table which means it's the only thing needed to find any particular row in the table when doing a join or a search. As a result, including it in another unique index is mostly pointless as discussed before.

4 hours ago, SaranacLake said:

See this link for the details...

Your link describes a somewhat different situation from what you originally posted. The unique constraint is there because it's necessary for a foreign key in another table, and that foreign key has more to do with enforcing row types than to support joining of tables (as implied in your question "..using that as a join to another table").

In all the tables in that stack exchange example, the veh_id column is the only thing strictly necessary for the DB to handle all the joins and link the tables together. The rest of the columns have to do with your type enforcement. I'm not sure why they used separate primary/unique keys, I think the whole thing could have been done by making a single multi-column primary key which might make the whole thing easier to understand.

5 hours ago, SaranacLake said:

Why exactly do you have to do it in that order?

Because indexes work essentially left-to-right, so if you want to take advantage of a column being part of an index, it either has to be the left-most column of the index, or you also have to specify all the columns prior to it. If you define the order as (id, product_type_code) then you can't use that index to search for a product_type_code unless you also specify an ID (and since ID is unique, that makes the product_type_code mostly useless).

However if you define the order as (product_type_code, id) then you can search for a product_type_code using that index and get all the matching ID's as a result.

Applying this to your stack exchange example, it also means that you in fact cannot double-dip on your indexes and will need to create extra ones to support your foreign key definitions. Because veh_id is the value that ties all the tables together, you want it to be the first value in your unique / primary key constraint. As a result the veh_type_code will need it's own index to support that foreign key definition.

SaranacLake · May 22, 2020

1 hour ago, kicken said:

It is in the case you setup in your original post. I know you had some other thread ended up talking about this whole object-oriented sub/super type thing but I didn't really pay attention to it so am unfamiliar with the details. Looking at just this thread in it's own context the answer that your unique constraint is unnecessary is factual.

As my site grows it isn't practical to post everything, and it would be nice if some people - not you - would give me the benefit of the doubt before saying I'm stupid...

1 hour ago, kicken said:

In your original post you show that ID is a primary key (signaled by the (pk) bit). By definition, a primary key is unique across the table which means it's the only thing needed to find any particular row in the table when doing a join or a search. As a result, including it in another unique index is mostly pointless as discussed before.

Right, I know that.

1 hour ago, kicken said:

Your link describes a somewhat different situation from what you originally posted. The unique constraint is there because it's necessary for a foreign key in another table, and that foreign key has more to do with enforcing row types than to support joining of tables (as implied in your question "..using that as a join to another table").

Well it is a join plain and simple. But, yes, the reason for the 2nd part of the composite key is to enforce that only ONE subtype can be chosen for a given record.

1 hour ago, kicken said:

In all the tables in that stack exchange example, the veh_id column is the only thing strictly necessary for the DB to handle all the joins and link the tables together.

Right, but the reason they did that is this..

In my case, the PRODUCT could have an ID=1 and then I could have a SUBSCRIPTION with an ID=1, a BOOK with an ID=1 and a TSHIRT with an ID=1.

By having that additional key/join, a given Product record can ONLY map to ONE subtype - it's a pretty brilliant workaround when you can't use a Check constraint!!

1 hour ago, kicken said:

The rest of the columns have to do with your type enforcement.

Right.

1 hour ago, kicken said:

I'm not sure why they used separate primary/unique keys, I think the whole thing could have been done by making a single multi-column primary key which might make the whole thing easier to understand.

Probably for efficiency. There are joins that ONLY is the PK, and the UK exists solely to enforce type.

In my ERD, I know that I used just the PK in some place to do joins for efficiencies sake.

But it probably would be better to lose the ID and create a composite PK as you mention.

Design is subjective!

1 hour ago, kicken said:

Because indexes work essentially left-to-right, so if you want to take advantage of a column being part of an index, it either has to be the left-most column of the index, or you also have to specify all the columns prior to it. If you define the order as (id, product_type_code) then you can't use that index to search for a product_type_code unless you also specify an ID (and since ID is unique, that makes the product_type_code mostly useless).

However if you define the order as (product_type_code, id) then you can search for a product_type_code using that index and get all the matching ID's as a result.

Okay.

1 hour ago, kicken said:

Applying this to your stack exchange example, it also means that you in fact cannot double-dip on your indexes and will need to create extra ones to support your foreign key definitions. Because veh_id is the value that ties all the tables together, you want it to be the first value in your unique / primary key constraint. As a result the veh_type_code will need it's own index to support that foreign key definition.

Sound doubling up on Indexes is okay, right?

Actually while I was waiting for your response, I had another question that relates to this last point...

Let's say you have a textbook order system modeled out, and you have our favorite ORDER -||-----|<- ORDER_DETAILS ->|-----||- PRODUCT

And the way I prefer to do things, I would have this for my junction table...

	ORDER_DETAILS
	- id (PK)
	- order_id (FK1)(UK1)
	- product_id (FK2)(UK1)
	- purchase_price
	- quantity

Now, if I took what you were saying above, my UK1 would be correct one-way, but not the other way.

So, in that case, I would have an index for FK1, an index for FK2, and then a composite index for UK1, right?

And hopefully that would break anything in MySQL, nor would it slow things down having the extra index, right?

kicken · May 22, 2020

22 minutes ago, SaranacLake said:

Now, if I took what you were saying above, my UK1 would be correct one-way, but not the other way.

The order you want depends mostly on how you'll be using the index in your queries.

I'd venture that in an order system you're most likely going to be joining orders to the details then to the products to generate invoices and such, which would have queries like:

select *
from orders o
inner join order_details od on od.order_id=o.id
inner join products p on p.id=od.product_id

A query such as that would want your UK defined as (order_id,product_id). With that order the UK can be used to enforce your foreign key relationship on the order_id column and be used to speed up the join between the orders and order_details tables.

If on the other hand you were doing a query based on products, such as to find popular products you might do queries like:

select *
from products p
inner join order_details od on od.product_id=p.id

For that kind of query you might consider the (product_id,order_id) order so you could use the UK to handle your product_id foreign key and the join to the products table.

In either case, assuming both order_id and product_id are foreign keys then your UK can satisfy the index requirement for one of them and the other would need it's own index. Neither order is particularly bad or good in this situation as you need at least two indexes anyway. In such a table design you don't necessarily even need your ID column so you could just make your PRIMARY KEY be (order_id, product_id). Some people prefer always having a auto_increment primary key but in the scenario proposed it's not necessary and could be removed.

34 minutes ago, SaranacLake said:

nor would it slow things down having the extra index, right?

More indexes mean more index management and disk space usage. That can potentially lead to slower performance when inserting/updating/deleting data, but wouldn't have any significant impact on selects. As such, you should try and limit your indexes to only those that you absolutely need to make your system function well. Until your system grows to a very large scale (millions of rows), it's unlikely you'd notice any problems from extra indexes though.

39 minutes ago, SaranacLake said:

Probably for efficiency.

There's no efficiency to be gained from having the separate PK and UK there as far as I am aware. A single multi-column primary key would be just as effective and save an index.

SaranacLake · May 22, 2020

@kicken,

Thanks for the insightful response!

10 minutes ago, kicken said:
The order you want depends mostly on how you'll be using the index in your queries.

I'd venture that in an order system you're most likely going to be joining orders to the details then to the products to generate invoices and such, which would have queries like:
select *
from orders o
inner join order_details od on od.order_id=o.id
inner join products p on p.id=od.product_id
A query such as that would want your UK defined as (order_id,product_id). With that order the UK can be used to enforce your foreign key relationship on the order_id column and be used to speed up the join between the orders and order_details tables.

Okay, I follow you.

10 minutes ago, kicken said:
If on the other hand you were doing a query based on products, such as to find popular products you might do queries like:
select *
from products p
inner join order_details od on od.product_id=p.id
For that kind of query you might consider the (product_id,order_id) order so you could use the UK to handle your product_id foreign key and the join to the products table.

Okay.

10 minutes ago, kicken said:

In either case, assuming both order_id and product_id are foreign keys then your UK can satisfy the index requirement for one of them and the other would need it's own index. Neither order is particularly bad or good in this situation as you need at least two indexes anyway.

So create the UK the way I think I'd be querying the most - in my case UK(order_id, product_id) and then create a 2nd index for the other FK. (But i won't need a 3rd IDX.)

10 minutes ago, kicken said:

In such a table design you don't necessarily even need your ID column so you could just make your PRIMARY KEY be (order_id, product_id). Some people prefer always having a auto_increment primary key but in the scenario proposed it's not necessary and could be removed.

Yeah, I'm of that school of thought... Every table has an ID as the PK, with the exception being lookup tables.

10 minutes ago, kicken said:

More indexes mean more index management and disk space usage. That can potentially lead to slower performance when inserting/updating/deleting data, but wouldn't have any significant impact on selects.

Okay.

You can always redefine your indexes as things evolve, right?

10 minutes ago, kicken said:

As such, you should try and limit your indexes to only those that you absolutely need to make your system function well.

Okay.

10 minutes ago, kicken said:

Until your system grows to a very large scale (millions of rows), it's unlikely you'd notice any problems from extra indexes though.

Yeah.

I noticed that in the beginning, as well, and was wondering the reasoning behind it.

10 minutes ago, kicken said:

There's no efficiency to be gained from having the separate PK and UK there as far as I am aware. A single multi-column primary key would be just as effective and save an index.

Yeah, it probably would have been better to have one PK and be done with it, but I'm not expert in that area, so who knows?!

Either way, I found that write VERY informative, and definitely helpful in what I wanted to do. (Even though Gizmola thinks I'm nuts?!) 😉

gizmola · May 22, 2020

19 hours ago, SaranacLake said:

3.) The "id" + "product_type_code" need to form a UK because I will be using that as a join to another table.

This is your subtype table, correct? You do not need id+product_type. You simply need id in the subtype table.

You need the product_type_code (why code btw?) to determine which subtype table, but there is no value in bringing along the product_type_code.

This is called a "defining relationship". The best way to think of it, is that the primary key of product (id) will be the primary key of product_{product_type}

Here is a basic ERD of the relationships in MySQL Workbench

Here's generated SQL:


-- -----------------------------------------------------
-- Table `mydb`.`product_type`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`product_type` (
  `code` CHAR(1) NOT NULL,
  `name` VARCHAR(30) NULL,
  PRIMARY KEY (`code`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`product`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`product` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `product_type_code` CHAR(1) NOT NULL,
  `name` VARCHAR(60) NULL,
  PRIMARY KEY (`id`),
  INDEX `fk_product_product_type_idx` (`product_type_code` ASC),
  CONSTRAINT `fk_product_product_type`
    FOREIGN KEY (`product_type_code`)
    REFERENCES `mydb`.`product_type` (`code`)
    ON DELETE RESTRICT
    ON UPDATE CASCADE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`product_subscription`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`product_subscription` (
  `product_id` INT NOT NULL,
  PRIMARY KEY (`product_id`),
  CONSTRAINT `fk_product_subscription_product1`
    FOREIGN KEY (`product_id`)
    REFERENCES `mydb`.`product` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`product_article`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`product_article` (
  `product_id` INT NOT NULL,
  PRIMARY KEY (`product_id`),
  CONSTRAINT `fk_product_article_product1`
    FOREIGN KEY (`product_id`)
    REFERENCES `mydb`.`product` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`product_promotional`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`product_promotional` (
  `product_id` INT NOT NULL,
  PRIMARY KEY (`product_id`),
  CONSTRAINT `fk_product_promotional_product1`
    FOREIGN KEY (`product_id`)
    REFERENCES `mydb`.`product` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

Notice what there is and isn't. There is no code that creates indexes, because INDEX and KEY are the same in MySQL/InnoDB DDL. Making the key makes the index.

SaranacLake · May 23, 2020

23 hours ago, gizmola said:

This is your subtype table, correct?

Yes.

Quote

You do not need id+product_type. You simply need id in the subtype table.

You need the product_type_code (why code btw?) to determine which subtype table, but there is no value in bringing along the product_type_code.

This is called a "defining relationship". The best way to think of it, is that the primary key of product (id) will be the primary key of product_{product_type}

Here is a basic ERD of the relationships in MySQL Workbench

Based on my understanding of what the author published on Stack Overflow, I disagree.

Let me refer to what was said by the author on SO...

https://stackoverflow.com/questions/17405702/mysql-supertype-subtype-design

Quote
What you want to do is fairly simple, but not necessarily easy. The important point in this kind of supertype/subtype design (also known as an exclusive arc) is to make it impossible to have rows about sedans referencing rows about semi-trucks, etc..

MySQL makes the code more verbose, because it doesn't enforce CHECK constraints. You're lucky; in your application, the CHECK constraints can be replaced by additional tables and foreign key constraints.

The UNIQUE constraint lets the pair of columns {veh_id, veh_type_code} be the target of a foreign key reference. That means a "car" row can't possibly reference a "semi" row, even by mistake.
	    insert into vehicles (veh_id, veh_type_code) values (1, 's'), (2, 'c'), (3, 'c'), (4, 'c'), (5, 'c'), (6, 'c'), (7, 'c');
 
The default value for veh_type_code, along with the foreign key reference to veh_type_is_car, guarantees that this rows in this table can be only about cars, and can only reference vehicles that are cars. On other platforms, I'd just declare the column veh_type_code as veh_type_code char(1) not null default 'c' check (veh_type_code = 'c').
xxx

So that is why I use ID + product_type_code.

Without leveraging the "product_type_code", you can end up with this...

Scenario #1:

[code]

PRODUCT

ID=1

SUBSCRIPTION

product_id=1

ONLINE_BOOK

product_id=1

TSHIRT

product_id=1

[/code]

Whereas the author's approach gives you this...

Scenario #2:

[code]

PRODUCT

ID=1S

ID=2B

ID=3T

SUBSCRIPTION

~~product_id=1~~

product_id=1S

ONLINE_BOOK

~~product_id=1~~

product_id=2B

TSHIRT

~~product_id=1~~

product_id=3T

[/code]

(And THIS is what I was saying a while back about, "I won't need procedural code to enforce subtype relationships...")

Edited May 23, 2020 by SaranacLake

SaranacLake · May 23, 2020

P.S. The way this website converts [ quote ], and [ code ], tags to an object is a real FUBAR...

I had to completely re-type y last post because this is this site trashed my post. nd as you can see, the [ code ] tags are still not working?! *rolls eyes*

kicken · May 23, 2020

12 minutes ago, SaranacLake said:

P.S. The way this website converts [ quote ], and [ code ], tags to an object is a real FUBAR...

I've more or less stopped typing tags in at this point. It's inconvenient to not be able to but what ya gonna do.

For longer posts I usually compose them in notepad then when done move them over to the text area and insert all the quotes/code blocks/formatting using the appropriate buttons. More work, but make the post look right.

I miss the old text-only editor still.

gizmola · May 24, 2020

On 5/22/2020 at 7:48 PM, SaranacLake said:

Based on my understanding of what the author published on Stack Overflow, I disagree.

The comments and concerns expressed in that SO question are in no way applicable to what you are talking about doing. You might also note that the author of the comment, never demonstrates the supposed danger or where the section you quoted in any way matters. The most comical thing about it, is that there is no semi-truck table. The original requirement was completely lost.

The ultimate model added to the original question does nothing other than to add tables only relevant to the car tree hierarchy. Adding a unique index and referencing that as a foreign key only matters in that scenario because there is a cars table where lots of different car subtypes could go. Are you implementing a hierarchy with multiple subclasses and sub-subclasses? No you aren't.

Did you look at the other answer from Walter Mitty? It literally calls out the solution I provided you with the tag shared-primary-key

I made you a model and generated the DDL. You could have run it in a test db and played with it, and created a few queries.

Could you create a row in a subtype table for the wrong product type? Yes. But considering the missing semi-truck table, the same issue existed in the non-solution presented and accepted in the question you referenced. The main focus of the solution was the hierarchy of car types, which again isn't relevant to your product subtypes.

It's not a concern for you because the procedural code you need to write, which uses the product_type attribute to determine which child table the subtype row needs to be created in, must do that properly for your system to work. If it doesn't do that properly 100% of the time, then your system doesn't work at all, and we are talking very simple logic. If you are super paranoid, then a simple trigger could be written to prevent it happening in any of the child tables. You could also, as an alternative, write a trigger to create the row in the correct subtype table, when you insert the product row. Ordinarily I am not a big fan of MySQL triggers or sprocs, as they reduce concurrency and insert performance, but your system will not have a lot of insert/update activity against the tables in question, and will primarily be selecting the data, so it's worth considering.

SaranacLake · May 26, 2020

On 5/22/2020 at 11:20 PM, kicken said:

I've more or less stopped typing tags in at this point. It's inconvenient to not be able to but what ya gonna do.

For longer posts I usually compose them in notepad then when done move them over to the text area and insert all the quotes/code blocks/formatting using the appropriate buttons. More work, but make the post look right.

And it prevents you from losing precious thoughts!

On 5/22/2020 at 11:20 PM, kicken said:

I miss the old text-only editor still.

What I have learned - the hard way - is that a [ quote ] tag near the end will often slurp up uncommented text at the very end - and code too - and once that happens you cannot edit things, which is stupid.

When I have a quote near the end, with text following it, I try now to cut the quote after it is in, add a few carriage returns before the ending text, then paste the quote back it, and usually that works, but not always, plus sometimes I forget and then get burned.

This is a pretty big bug in the forum software if you ask me.

Sign In

Question about FK and Indexes

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Important Information