Jump to content

Recommended Posts

Hi,

 

Iv got a mysql database with 70,000 rows, and in the column "teamID" iv set it to a mediumint(4) col type.  Values in the colum is a teams ID ranging from 1-4 characters so could be team id "1" .. or up to "9999" for example.

 

But the index is only indexing 214 rows out of about 1200 rows with teamIDs other than 0.  iv tried just INT as the coltype, also varchar etc etc... doesnt seem to matter it still only indexed 214 rows...but theres 1200 rows with teamIDs... ?

 

Iv no idea why... ? any pointers?

Link to comment
https://forums.phpfreaks.com/topic/205445-indexnot-indexing-all-rows/
Share on other sites

I tihnk i now know what it is... its only indexing the unique (DISTINCT) teamID values. So only one indexed row per teamID, but many rows can have the same teamID ...?

 

Im confused, why is it indexing like this? I want it to index all rows with a teamID surley? no matter if example - 10 rows all have the same teamID or not..

 

more confused lolol.

Index cardinality tells you how many different values are there in the indexed column. It is indexing ALL rows however. Nothing to worry about.

 

If you have for example data like this

ID, teamID

1,  1

2,  1

3,  2

4,  1

5,  2

6,  3

 

Then the index created on column `teamID` will have cardinality equal to 3 (because there are three distinct teamIDs). However, the index will have indexed all rows. You can visualise it like this.

 

teamID -> IDs

1 -> 1,2,4

2 -> 3,5

3 -> 6

I want to clarify one other thing from your original message. 

 

A mediumint in mysql refers to the number of bytes used.  For a mediumint that is 3 bytes.  So the values allowable is the range of 3 bytes, which can either be signed or unsigned.  The column would have to be declared UNSIGNED to get the unsigned behavior.  So the range of numbers that can be stored would be 24 bits, or -8388608 to +8388607. 

 

If your value truly fits in the range of 0-9999 then you could downsize that column to a SMALLINT and save a byte per row.

 

In regards to the mediumint(4) people are often confused as to the function of the (4).  It has nothing to do with storage.  Per the mysql manual:

 

This optional display width may be used by applications to display integer values having a width less than the width specified for the column by left-padding them with spaces. (That is, this width is present in the metadata returned with result sets. Whether it is used or not is up to the application.)

 

In other words, those numbers do absolutely nothing, unless one is to also declare the column with the unusual 'ZEROFILL' attribute, and in that case it will pad numbers with leading zeroes up to that length.  I typically omit those specifiers, since people so often mistakenly believe that they constrain the values in some way.

 

 

My accountID primary key is set to INT auto increment and values range from 1 - 164512. Iv always just used INT for IDs if I'm been honest. Would you suggest I could use smallINT or mediumINT on this accountID aswell?

 

I did read through the numerical types for mysql Colums but I just didn't understand it lol.

 

Thanks for the help and advice, it's very helpful.

It depends how many records you expect to have in this table. With INTEGER you can get to 4 billions, which is more than enough for pretty much every table (if you need more, you probably also need to hire a professional DBA).

 

MEDIUMINT can keep track ID values up to 16 million

SMALLINT - 65 thousand, which seems to be to little for you.

 

You have 160000 records in your table now, which happens to be 1% of MEDIUMINT capacity. Now you need to answer this question: do you anticipate that number of rows in this table will rise 100 times? If yes, then how quickly? Over a week, month or a year?

 

One final thing that you should know, is that it is not problematic to move from MEDIUMINT to INTEGER if need arises. It might be a slow process, but one time only.

For AUTO_INCREMENT columns you should always set them to be UNSIGNED because you'll never have a negative value, and you're wasting half your range on negative values you'll never use.  Otherwise, you can alter the table and in most cases, you won't lose any data in the process.

 

For the table that is a mediumint, you stated that the range was 0-9999 so I suggested the downgrade to a smallint (which can go 0-65767 for UNSIGNED) would be overkill.  Even if not UNSIGNED, you still have plenty of room. 

 

I tend to look at the use case for the column and pick the smallest datatype that supports it.  Often for "lookup" tables like departments, account numbers, or small lists, I'll use either a  tinyint (0-255 unsigned) or a CHAR[1], or a smallint.  For your main data tables which you expect to grow over time, it makes sense to use an INT. 

 

I've also worked on a number of "needs to scale" applications, where there was a design goal that included some sort of sharding algorithm, so the primary keys of those tables were often GUIDs or large hashes.  Without a doubt, however, the smaller the table the better performance and scalability.  Every byte counts when you consider that caching is used, smaller means more data that will fit in cache rather than having to be read from disk.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.