InnoDB Primary Key versus Secondary Index: An Interesting Lesson from EXPLAIN

I ran into an interesting issue today, while examining some EXPLAIN outputs, and wanted to share the findings, as some of this is undocumented.

Basically, you can start with a very simple InnoDB table – 2 INT columns, Primary Key (PK) on the 1st column, regular index on the 2nd:

CREATE TABLE `t1` (
  `id1` int(10) unsigned NOT NULL,
  `id2` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB;

The query is:

SELECT id1 FROM t1;

This is a straight-forward query with no WHERE clause.

Given no WHERE clause, we know there will be a full table or index scan. Let’s look at EXPLAIN:

mysql> EXPLAIN SELECT id1 FROM t1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: id2
      key_len: 5
          ref: NULL
         rows: 1
        Extra: Using index

EXPLAIN returns NULL for “possible_keys”. This means there was no relevant index (though we see one was used at some point):

“It is possible that key will name an index that is not present in the possible_keys value. This can happen if none of the possible_keys indexes are suitable for looking up rows, but all the columns selected by the query are columns of some other index. That is, the named index covers the selected columns, so although it is not used to determine which rows to retrieve, an index scan is more efficient than a data row scan.”

http://dev.mysql.com/doc/refman/5.6/en/explain-output.html

So “possible_keys” returns NULL, but key returns `id2`.

Per the above, that means `id2` is a covering index.

This part is a bit strange, as it is not readily intuitive why this is the case.

I mean, the query asks for “id1”, yet EXPLAIN says “id2” is the covering index.

How is this possible and why isn’t just the Primary Key chosen anyway?

This is possible because of the way InnoDB’s secondary indexes are stored. In InnoDB, secondary indexes are stored along with their corresponding primary key values, so by looking at the secondary index, you can see the primary key values.

As for why this is chosen instead of the primary key, it took some digging, but I found the exact comments as to why in the source code (sql/sql_select.cc) in the find_shortest_key() function:

"As far as 
1) clustered primary key entry data set is a set of all record
	fields (key fields and not key fields) and
2) secondary index entry data is a union of its key fields and
	primary key fields (at least InnoDB and its derivatives don't
	duplicate primary key fields there, even if the primary and
	the secondary keys have a common subset of key fields),
then secondary index entry data is always a subset of primary key entry.
Unfortunately, key_info[nr].key_length doesn't show the length
of key/pointer pair but a sum of key field lengths only, thus
we can't estimate index IO volume comparing only this key_length
value of secondary keys and clustered PK.
So, try secondary keys first, and choose PK only if there are no
usable secondary covering keys or found best secondary key include
all table fields (i.e. same as PK):"

So since secondary index entry data is always a subset of primary key entry, scanning the secondary index should generate slightly less IO than scanning the primary key.

Fwiw, this also explains why key_len is less if you specify “USE INDEX (Primary)” to force it to use the PK over the secondary index:

mysql> EXPLAIN SELECT id1 FROM t1 USE INDEX (Primary)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 1
        Extra: Using index

Note this reports key_len of 4, whereas the former reported 5.

Hope this helps.

 

MariaDB 10.0.9 Overview and Highlights

MariaDB 10.0.9 was recently released, and is available for download here:

https://downloads.mariadb.org/mariadb/10.0.9/

This is the second RC (“Release Candidate”) release of MariaDB 10.0, and 10th overall release of 10.0. All features planned for MariaDB 10.0 GA are included in this release.

There were 6 notable changes in MariaDB 10.0.9:

  1. InnoDB upgraded to version 5.6.15
  2. Extended-keys optimization is now enabled by default.
  3. MariaDB can be compiled to use the system’s PCRE library.
  4. Added MASTER_GTID_WAIT() and @@last_gtid.
  5. When a TIME value is casted to a DATETIME, the date part will be the CURRENT_DATE, not 0000-00-00. This is compatible with the SQL standard and MySQL-5.6. One can use @@old_mode=ZERO_DATE_TIME_CAST to revert to the old behavior.
  6. XtraDB is now the default InnoDB implementation, Oracle InnoDB is a plugin that can be dynamically loaded if desired.
  7. Builds for Debian Sid and Ubuntu Trusty are being made available for the first time in the MariaDB repositories. For this release the Trusty packages are considered as alpha releases. The Sid packages will likely always be considered as such. Both were made as part of normal MariaDB development, and we’re making them available for those that want to test or try them out.

Also, if you read my MariaDB 10.0.8 Overview post, you will have noticed there was a build issue with Percona’s XtraDB that was preventing it from being the default in MariaDB 10.0. I’m glad I posted about it, as the MariaDB and Percona Devs began working together immediately, and resolved the issue in days, and now XtraDB+ is back the default InnoDB in MariaDB 10.0. (This is mentioned in the bullet point #6 above, but I wanted to explicitly mention it as well, so it did not go unnoticed.)

You can read more about the 10.0.9 release here:

https://mariadb.com/kb/en/mariadb-1009-release-notes/

And if interested, you can review the full list of changes in 10.0.9 (changelogs) here:

https://mariadb.com/kb/en/mariadb-1009-changelog/

Hope this helps.

 

MariaDB 5.5.36 Overview and Highlights

MariaDB 5.5.36 was recently released (it is the latest MariaDB 5.5), and is available for download here:

https://downloads.mariadb.org/mariadb/5.5.36/

This is a maintenance release, and so there are not too many big changes of note, just a number of normal bug fixes. However, there are a couple items worth mentioning:

  1. Includes all bugfixes and updates from: MySQL 5.5.36
  2. TokuDB is now included in RPM packages for CentOS 6 on x86-64

Note this release did not contain the bugfixes from XtraDB 5.5.36 yet. There is one bug fix in XtraDB 5.5.36 that is fairly serious, so if you are a *heavy* XtraDB+ user, I might wait for MariaDB 5.5.37 to be released, so you get this fix. I should note the bug was partially fixed in 5.5.35, and I did not see reports of the same issues in 5.5.35, so perhaps that covered it, or at least the majority. 5.5.34 had the worst of this bug, so if you are running 5.5.34, then I *would* definitely recommend upgrading.

Additionally, if you are using MySQL’s InnoDB in MariaDB, then I would definitely recommend upgrading to MariaDB 5.5.36 as well.

If interested, there is more about the 5.5.36 release here:

https://mariadb.com/kb/en/mariadb-5536-release-notes/

And the full list of fixed bugs and changes in MariaDB 5.5.36 can be found here:

https://mariadb.com/kb/en/mariadb-5536-changelog/

Hope this helps.