Jump to content

How does Mysql 5.6 perform vs Mysql 5.5 regarding select operations against a partitioned table ?


Recommended Posts

Hi All,

 

I have been doing some select tests on a partitioned table and on a non partitioned table, both having same data and around 4 millions records, and size of around 400 Mb each. Mysql version is 5.5.8.

 

The partition targeted in the partitioned table contains around 300000 records, and the select statement is of type

SELECT DISTINCT field1, field2 FROM table where partition_field3 =text_value


Even if a select operation against the partitioned table is indeed much faster than against the non-partitioned table, I noticed that in 5-10% of the cases it takes a bit too long from the user's perspective.

I was just wondering whether there is a significant difference in partition select performance between Mysql 5.5.8 and Mysql 5.6 since select operations against partitioned tables are not implemented the same way in both versions. In Mysql 5.6, the statement mentions the partition name rather than the "partitioned_column=value" criteria.  Does it mean that a select statement against a Mysql 5.6 partitioned table would be somewhat as fast as the "classic" form of the same select statement against a non-partitioned table holding the same data as the partition?

 

I would appreciate your opinion on that. I found no related benchmarks on the net.

 

Thanks
 

From what I gather reading the documentation, the new partition select syntax would only improve your performance if your partitioning is setup in such a way that you can ignore safely ignore some partitions. For instance if you partition by year, but are only interested in this years data, you could limit mysql to only searching this years partition.

 

In addition to asking/searching around for other peoples opinions, you would probably be best served by downloading and installing mysql 5.6, importing your data, and playing around with it to see how it performs. Given the newness of mysql 5.6, and the specific nature of your inquiry, finding existing relevant opinions/benchmarks may be a bit challenging.

Actually docs say partitions are not compatible with innob tables that contain foreign keys. Since the table I want to partition will have a foreign key, I am afraid I will have to find a workaround.

 


For instance if you partition by year, but are only interested in this years data, you could limit mysql to only searching this years partition

 

PostgerSQL also does this through "Constraint Exclusion". Each partition has a CHECK constraint that controls the range of data that can be inserted into each partition, and the query planner can use that constraint to work out which partitions it can ignore.

 

However, I think the problem here may be the DISTINCT, which may result in a SEQSCAN, which is slow no matter how you do it.

You can get the same result using SELECT field1, field2 FROM table WHERE ... GROUP BY field1, field2; which may convince MySQL to use an index. (provided that MySQL can use separate indexes per partition...)

Hi All,

 

FYI I installed mysql 5.6 on an old 32-bit laptop and I am impressed by the power of partitioning in mysql 5.6, compared to partitioning in mysql 5.5. Despite the less powerful architecture (32 bits instead of 64 bits), select time has been divided by up to 3 (same table, same records).

 

Now I've got a question to specialists : I noticed that performance is even greater on page reload (as fast as just 2/100s). It seems the results are cached somewhere. Let's assume that User1 connects to the page containing the php code that runs the query against the database. If a few seconds later User2 connects to the same page, thus running the same query, will User2 benefit from the results having been cached ?

 

Thanks.

 

BTW:I tried the  "GROUP BY field1, field2" suggestion, but unfortunately without noticeable effects.

It seems the results are cached somewhere. Let's assume that User1 connects to the page containing the php code that runs the query against the database. If a few seconds later User2 connects to the same page, thus running the same query, will User2 benefit from the results having been cached ?

 

 

yes, mysql has a (result) query cache. if the exact same select query statement is executed, and there have been no changes (insert/update/delte) to the database tables in the query, the result will be returned from the cache. at the database, it doesn't know which of your web site users caused the query. the database only sees the database connection details and the query statements.

. If a few seconds later User2 connects to the same page, thus running the same query, will User2 benefit from the results having been cached ?

 

if it is exactly the same query, and none of the used table have changed in any way, and there is sufficient space allocated for caching, then yes.

 

 

 

BTW:I tried the  "GROUP BY field1, field2" suggestion, but unfortunately without noticeable effects.

 

 

 
 
 

Did you use EXPLAIN to see what the query does? You'd need an index on both columns together to make this work.

Edited by vinny42
This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.