This is the User’s Guide for InnoDB Plugin 1.0.6 for MySQL 5.1, generated on November 27, 2009 (rev 477).
Copyright © 2009 Innobase Oy. All rights reserved.
Copyright © 2009 Oracle. All rights reserved.
Provided that you reproduce the above copyright notice, this list of conditions and the following disclaimer on all copies of this documentation (the “Documentation”) for the InnoDB Plugin for MySQL (the “Program”): (a) permission is hereby given to make a reasonable number of complete copies of the Documentation in formats of your choice, solely for your use with the Program, and (b) you may copy and distribute unmodified and complete versions of the content of this Documentation in format(s) of your choice, but only in conjunction with your otherwise-authorized distribution of the Program.
THIS DOCUMENTATION IS PROVIDED BY INNOBASE OY AND ITS CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INNOBASE OY OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Abstract
Starting with version 5.1, MySQL AB has promoted the idea of a “pluggable” storage engine architecture , which permits multiple storage engines to be added to MySQL. Beginning with MySQL version 5.1, it is possible for users to swap out one version of InnoDB and use another. The pluggable storage engine architecture also permits Innobase Oy to release new versions of InnoDB containing bug fixes and new features independently of the release cycle for MySQL.
This User’s Guide documents the installation and removal procedures and the additional features of the InnoDB Plugin 1.0.6 for MySQL 5.1.
WARNING: Because the InnoDB Plugin
introduces a new file format, restrictions apply to the use of
a database created with the InnoDB Plugin with earlier
versions of InnoDB, when using
mysqldump or MySQL replication and if you
use the InnoDB Hot Backup utility. See
Section 1.5, “Operational Restrictions”.
Table of Contents
INFORMATION_SCHEMA tablesTRUNCATE TABLE Reclaims SpaceSHOW ENGINE INNODB MUTEXList of Tables
CREATE TABLE and ALTER TABLE options
CREATE/ALTER TABLE Warnings and Errors
when InnoDB Strict Mode is OFFINNODB_CMP and
INNODB_CMP_RESET
INNODB_CMPMEM and INNODB_CMPMEM_RESETINFORMATION_SCHEMA.INNODB_CMP under light load
INFORMATION_SCHEMA.INNODB_CMPMEM under light load
INNODB_TRX columnsINNODB_LOCKS columnsINNODB_LOCK_WAITS columnsINFORMATION_SCHEMA.INNODB_TRXINFORMATION_SCHEMA.INNODB_LOCKSINFORMATION_SCHEMA.INNODB_LOCK_WAITSINFORMATION_SCHEMA.PROCESSLIST in a loaded
system
INFORMATION_SCHEMA.INNODB_TRX in a loaded
system
INFORMATION_SCHEMA.INNODB_LOCK_WAITS in a loaded
system
INFORMATION_SCHEMA.INNODB_LOCKS in a loaded
system
innodb_thread_concurrencyList of Examples
Table of Contents
The unique architecture of MySQL permits multiple storage engines with different capabilities to be accessed via the same SQL language and APIs. Starting with version 5.1, MySQL AB has promoted the idea of a “pluggable” storage engine architecture, which permits multiple storage engines to be added to MySQL. Currently, however, most users have accessed only those storage engines that are distributed by MySQL AB, and are linked into the binary (executable) releases.
Since 2001, MySQL AB has distributed the InnoDB transactional storage engine with its releases (both source and binary). Beginning with MySQL version 5.1, it is possible for users to swap out one version of InnoDB and use another. The pluggable storage engine architecture also permits Innobase Oy to release new versions of InnoDB containing bug fixes and new features independently of the release cycle for MySQL. Users can thus take advantage of these new versions of InnoDB in the context of their deployed MySQL installations.
The InnoDB Plugin for MySQL contains several important new features:
BLOB,
TEXT, and VARCHAR
columnsINFORMATION_SCHEMA tables: information about
compression and lockingNote that the ability to use data compression and the new row format require the use of a new InnoDB file format called “Barracuda”. The previous file format, used by the built-in InnoDB in MySQL versions 5.0 and 5.1 is now called “Antelope” and does not support these features, but does support the other features introduced with the InnoDB Plugin.
The InnoDB Plugin is upward compatible from standard InnoDB as
built in to, and distributed with, MySQL.
Existing databases can be used with the InnoDB Plugin for MySQL.
As described in
Section 9.5, “Configuring the InnoDB Plugin”, the
new parameter innodb_file_format can help protect
upward and downward compatibility between InnoDB versions and
database files, allowing users to enable or disable use of new
features that can only be used with certain versions of InnoDB.
The built-in InnoDB in MySQL since version 5.0.21 has a safety feature that prevents it from opening tables that are in an unknown format. However, as noted in Section 11.2, “The Built-in InnoDB, the Plugin and File Formats”, the system tablespace may contain references to new-format tables that will confuse the built-in InnoDB in MySQL. These references will be cleared in a “slow” shutdown of the InnoDB Plugin.
With previous versions of InnoDB, no error would be returned until you try to access a table that is in a format “too new” for the software. Beginning with version 1.0.1 of the InnoDB Plugin, however, to provide early feedback, InnoDB will check the system tablespace to ensure that the file format used in the database is enabled for use before it will start. See Section 4.4.1, “Startup File Format Compatibility Checking” for the details.
From the InnoDB web site, you can download source and binary versions of an upgraded version of InnoDB with additional features beyond the standard InnoDB provided by MySQL in Version 5.1. The InnoDB Plugin for MySQL is licensed under the same license that MySQL uses (GPLv2). It is available at no charge for use and can be freely redistributed, subject to the same requirements of the GPL as is MySQL itself (see the GNU General Public License, version 2). Limited support is available for the InnoDB Plugin for MySQL via http://forums.innodb.com. You may also report bugs in the InnoDB Plugin using the MySQL bug database.
In many environments, the InnoDB plugin can dynamically be
added to a MySQL instance without relinking the MySQL server.
The plugin version of InnoDB then “takes over” from
the statically linked InnoDB that is part of the
mysqld binary. In other environments, it is
currently necessary to build the entire MySQL server, including
the InnoDB plugin, from source code.
On Linux, Unix and Windows, it is a simple matter of
installing the InnoDB Plugin for MySQL using the
MySQL command INSTALL PLUGIN. When
the InnoDB Plugin for MySQL is installed, it replaces the
statically-linked version of InnoDB that is incorporated in the
MySQL binary as distributed by MySQL AB.
On platforms where the dynamic plugin is not available,
users must download the source code for the
InnoDB Plugin for MySQL and build MySQL from source. Building
from source also facilitates the distribution of a MySQL server
where the InnoDB Plugin is already “installed”, so
all users of an organization can use the new capabilities without
the INSTALL step. The procedure for building
from source code is documented in Section 9.4, “Building the InnoDB Plugin from Source Code”.
Full instructions are provided in Chapter 9, Installing the InnoDB Plugin.
InnoDB Plugin releases are numbered with version numbers independent of MySQL release numbers. The initial release of the InnoDB Plugin is version 1.0, and it is designed to work with MySQL 5.1.
Once you have installed the InnoDB Plugin, you can check its version number in three ways:
SELECT * FROM information_schema.plugins;SELECT @@innodb_version;The InnoDB Plugin writes its version number to the error log, which can be helpful in diagnosis of errors:
091105 12:28:06 InnoDB Plugin 1.0.5 started; log sequence number 46509
Note that the PLUGIN_VERSION column
in the table INFORMATION_SCHEMA.PLUGINS does not
display the third component of the version number, only
the first and second components, as in 1.0.
Because the InnoDB Plugin introduces a new file format, with new on-disk data structures within both the database and log files, there are important restrictions on the use of the plugin in typical user environments. Specifically, you should pay special attention to the information presented here about file format compatibility with respect to the following scenarios:
mysqldump
WARNING: Once you use the
InnoDB Plugin on a set of database files, care must be taken to avoid
crashes and corruptions when using those files with an earlier
version of InnoDB, as might happen by opening the database with
MySQL when the plugin is not installed. It is
strongly
recommended that you use a “slow shutdown”
(SET GLOBAL innodb_fast_shutdown=0) when
stopping the MySQL server when the InnoDB Plugin is enabled. This
will ensure log files and other system information written by the
plugin will not cause problems when using a prior version of
InnoDB. See Section 11.3, “How to Downgrade”.
Because of these considerations, and although it may be useful in certain circumstances to use the plugin in a temporary way as just described, many users will find it preferable to test their application with the plugin and use it on an on-going basis, without reverting back to the standard, built-in InnoDB.
WARNING: If you dump a
database containing compressed tables with
mysqldump, the dump file may contain
CREATE TABLE commands that attempt to create
compressed tables, or those using ROW_FORMAT=DYNAMIC
in the new database. Therefore, you should be
sure the new database is running the InnoDB Plugin, with the proper
settings for innodb_file_format and
innodb_file_per_table, if you want to have the
tables re-created as they exist in the original database.
Typically, however, when the mysqldump file is
loaded, MySQL and InnoDB will ignore
CREATE TABLE options they do not recognize, and
the table(s) will be created in a format used by the running
server.
WARNING: If you use
MySQL replication, you should be careful to ensure all slaves
are configured with the InnoDB Plugin, with the same settings for
innodb_file_format and
innodb_file_per_table. If you do not do so,
and you create tables that require the new “Barracuda” file
format, replication errors may occur. If a slave MySQL server is
running the built-in InnoDB, it will ignore the
CREATE TABLE
options to create a compressed table or one with
ROW_FORMAT=DYNAMIC, and create the table
uncompressed, with ROW_FORMAT=COMPACT.
WARNING: The current
version of InnoDB Hot Backup does not support the new “Barracuda” file
format. Using InnoDB Hot Backup Version 3 to backup databases in this
format will cause unpredictable behavior. A future version of
InnoDB Hot Backup will support databases used with the InnoDB Plugin. As an
alternative, you may back up such databases with
mysqldump.
Table of Contents
In MySQL versions up to 5.0, adding or dropping an index
on a table with existing data can be very slow if the table has
many rows. The CREATE INDEX and
DROP INDEX commands work by creating a new, empty
table defined with the requested set of indexes. It then copies the
existing rows to the new table one-by-one, updating the indexes as it
goes. Inserting entries into the indexes in this fashion, where
the key values are not sorted, requires random access to the index
nodes, and is far from optimal. After all rows from the original
table are copied, the old table is dropped and the copy is renamed
with the name of the original table.
Beginning with version 5.1, MySQL allows a storage engine to create or drop indexes without copying the contents of the entire table. The standard built-in InnoDB in MySQL version 5.1, however, does not take advantage of this capability. With the InnoDB Plugin, however, users can in most cases add and drop indexes much more efficiently than with prior releases.
In InnoDB, the rows of a table are stored in a clustered (or primary key) index, forming what some database systems call an “index-organized table”. Changing the clustered index requires copying the data, even with the InnoDB Plugin. However, adding or dropping a secondary index with the InnoDB Plugin is much faster, since it does not involve copying the data.
This new mechanism also means that you can generally speed the overall process of creating and loading an indexed table by creating the table with only the clustered index, and adding the secondary indexes after the data is loaded.
No syntax changes are required in the
CREATE INDEX or DROP INDEX
commands. However, there are some considerations of which you should be
aware (see Section 2.6, “Limitations”).
Because the ability to create and drop indexes does not require use of a new on-disk file format, it is possible to temporarily use the InnoDB Plugin to create or drop an index, and then fall back to using the standard built-in InnoDB in MySQL for normal operations if you wish. See Chapter 11, Downgrading from the InnoDB Plugin for more information.
It is possible to create multiple indexes on a table with
one ALTER TABLE command. This is relatively
efficient, because the clustered index of the table needs to be
scanned only once (although the data is sorted separately for each
new index). For example:
CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1)) ENGINE=InnoDB; INSERT INTO T1 VALUES (1,2,'a'), (2,3,'b'), (3,2,'c'), (4,3,'d'), (5,2,'e'); COMMIT; ALTER TABLE T1 ADD INDEX (B), ADD UNIQUE INDEX (C);
The above commands will create table T1
with the clustered index (primary key) on column
A, insert several rows, and then build two new
indexes on columns B and C.
If there were many rows inserted into T1 before
the ALTER TABLE command, this approach would be
much more efficient than creating the table with all its indexes
before loading the data.
You may also create the indexes one at a time, but then the
clustered index of the table is scanned (as well as sorted) once
for each CREATE INDEX command. Thus, the
following commands are not as efficient as the ALTER TABLE
command above, even though neither requires
recreating the clustered index for table
T1.
CREATE INDEX B ON T1 (B); CREATE UNIQUE INDEX C ON T1 (C);
Dropping indexes in the InnoDB Plugin does not require any
copying of table data. Thus, you can equally quickly drop multiple
indexes with a single ALTER TABLE command or
multiple DROP INDEX commands:
ALTER TABLE T1 DROP INDEX B, DROP INDEX C;
or
DROP INDEX B ON T1; DROP INDEX C ON T1;
Restructuring the clustered index in InnoDB always
requires copying the data in the table. For example, if you
create a table without a primary key, InnoDB chooses one for
you, which may be the first UNIQUE key defined
on NOT NULL columns, or a system-generated key.
Defining a PRIMARY KEY later causes the data
to be copied, as in the following example:
CREATE TABLE T2 (A INT, B INT) ENGINE=InnoDB; INSERT INTO T2 VALUES (NULL, 1); ALTER TABLE T2 ADD PRIMARY KEY (B);
Note that when you create a UNIQUE or
PRIMARY KEY index, InnoDB must do some extra
work. For UNIQUE indexes, InnoDB checks
that the table contains no duplicate values for the key. For a
PRIMARY KEY index, InnoDB also
checks that none of the PRIMARY KEY columns
contains a NULL. It is best to define the
primary key when you create a table, so you need not rebuild the
table later.
InnoDB has two types of indexes: the clustered index and secondary indexes. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index may be created or dropped without copying the data in the clustered index. Furthermore, because the secondary index contains the values of the primary key (used to access the clustered index when needed), when you change the definition of the primary key, thus recreating the clustered index, all secondary indexes are recreated as well.
Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables need to be updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows may use the space.
To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the value(s) of the secondary index key column(s). The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order with respect to the key values. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.
While a secondary index is being created or dropped, the table is locked in shared mode. That is, any writes to the table are blocked, but the data in the table may be read. When you alter the clustered index of a table, however, the table is locked in exclusive mode, because the data must be copied. Thus, during the creation of a new clustered index, all operations on the table are blocked.
Before it can start executing, a CREATE INDEX
or ALTER TABLE command must
always wait for currently executing transactions that are
accessing the table to commit or rollback before it can proceed.
In addition, ALTER TABLE commands that
create a new clustered index must wait for all
SELECT statements that access the table to
complete (or their containing transactions to commit). Even
though the original index exists throughout the creation of
the new clustered index, no transactions whose execution spans the
creation of the index can be accessing the table, because the
original table must be dropped when clustered index is
restructured.
Once a CREATE INDEX or ALTER TABLE command that creates a
secondary index begins
executing, queries may access the table for read access, but may
not update the table. If an ALTER TABLE
command is changing the clustered index, all queries must wait
until the operation completes.
A newly-created secondary index contains only data that is
current in the table as of the time the CREATE INDEX
or ALTER TABLE command begins to execute.
Specifically, a newly-created index
contains only the versions of data as of the most-recently committed
transactions prior to the creation of the index. The index thus does
not contain any rows that were deleted (and therefore marked for
deletion) by transactions that completed before the
CREATE INDEX or
ALTER TABLE began. Similarly, the index contains only current
versions of every row, and none of the old versions of rows that
were updated by transactions that ran before the index was
created.
Because a newly-created index contains only information about data current at the time the index was created, queries that need to see data that was deleted or changed before the index was created cannot use the index. The only queries that could be affected by this limitation are those executing in transactions that began before the creation of the index was begun. For such queries, unpredictable results could occur. Newer queries can use the index.
No data is lost if the server crashes while an
ALTER TABLE command is executing. Recovery,
however, is different for clustered indexes and secondary indexes.
If the server crashes while creating a secondary index, upon
recovery, InnoDB drops any partially created indexes. All
you need to do to create the index is to re-run the
ALTER TABLE or
CREATE INDEX command.
However, when a crash occurs during the creation of a clustered index, recovery is somewhat more complicated, because the data in the table must be copied to an entirely new clustered index. Remember that all InnoDB tables are stored as clustered indexes. In the following discussion, we use the word table and clustered index interchangeably.
The InnoDB Plugin creates the new clustered index by copying the existing data from the original table to a temporary table that has the desired index structure. Once the data is completely copied to this temporary table, the original table is renamed with a different temporary table name. The temporary table comprising the new clustered index is then renamed with the name of the original table, and the original table is then dropped from the database.
If a system crash occurs while creating a new clustered index, no data is lost, but users must complete the recovery process using the temporary tables that exist during the process.
Users rarely re-create a clustered index or re-define primary keys on large tables. Because system crashes are uncommon and the situation described here is rare, this manual does not provide information on recovering from this scenario. Instead, please see the InnoDB web site: http://www.innodb.com/support/tips.
You should be aware of the following considerations when creating or dropping indexes using the InnoDB Plugin:
If any of the indexed columns use the UTF-8 character encoding, MySQL will copy the table. This has been reported as MySQL Bug #33650.
Due to a limitation of MySQL, the table is copied, rather
than using “Fast Index Creation” when you create an
index on a TEMPORARY TABLE. This has been
reported as MySQL Bug
#39833.
The command ALTER IGNORE TABLE
does not
delete duplicate rows. This has been reported as MySQL Bug
#40344. The t ADD UNIQUE INDEXIGNORE keyword is ignored,
and duplicates cause failure of the operation with the
following error message:
ERROR 23000: Duplicate entry '347' for key 'pl'
As noted above, a newly-created index contains only information about data current at the time the index was created. Therefore, you should not run queries in a transaction that might use a secondary index that did not exist at the beginning of the transaction. There is no way for InnoDB to access “old” data that is consistent with the rest of the data read by the transaction. See the discussion of locking in Section 2.4, “Concurrency Considerations”.
Prior to InnoDB Plugin 1.0.4, unexpected results could occur if a query attempts to use an index created after the start of the transaction containing the query. If an old transaction attempts to access a “too new” index, InnoDB Plugin 1.0.4 and later reports an error:
ERROR HY000: Table definition has changed, please retry transaction
As the error message suggests, committing (or rolling back) the transaction, and restarting it, cures the problem.
InnoDB Plugin 1.0.2 introduces some improvements in error handling when users attempt to drop indexes. See section Section 8.7, “Better Error Handling When Dropping Indexes” for details.
Unfortunately, MySQL 5.1 does not support efficient
creation or dropping of FOREIGN KEY constraints. Therefore, if
you use ALTER TABLE to add or remove a REFERENCES constraint,
the child table will be copied, rather than using “Fast
Index Creation”.
Table of Contents
Over the years, processors and cache memories have become much faster, but mass storage based on rotating magnetic disks has not kept pace. While the storage capacity of disks has grown by about a factor of 1,000 in the past decade, random seek times and data transfer rates are still severely limited by mechanical constraints. Therefore, many workloads are i/o-bound. The idea of data compression is to pay a small cost in increased CPU utilization for the benefit of smaller databases and reduced i/o to improve throughput, potentially significantly.
The ability to compress user data is an important new capability of the InnoDB Plugin. Compressed tables reduce the size of the database on disk, resulting in fewer reads and writes needed to access the user data. For many InnoDB workloads and many typical user tables (especially with read-intensive applications where sufficient memory is available to keep frequently-used data in memory), compression not only significantly reduces the storage required for the database, but also improves throughput by reducing the i/o workload, at a modest cost in processing overhead. The storage cost savings can be important, but the reduction in i/o costs can be even more valuable.
The usual size of InnoDB data pages is 16K. Beginning with the
InnoDB Plugin, you can use the attributes
ROW_FORMAT=COMPRESSED or
KEY_BLOCK_SIZE in the
CREATE TABLE and ALTER TABLE
commands to request InnoDB to compress each page to
1K, 2K, 4K, 8K, or 16K bytes.
(The term KEY_BLOCK_SIZE may be
confusing. It does not refer to a “key” at all, but simply
specifies the size of compressed pages that will be used for the
table. Likewise, in the InnoDB Plugin, compression is
applicable to tables, not to individual rows, so the option
ROW_FORMAT really should be
TABLE_FORMAT. Unfortunately, MySQL
does not permit a storage engine to add syntax to SQL statements,
so the InnoDB Plugin simply re-uses the clauses originally defined
for MyISAM).
To create a compressed table, you might use a command like this:
CREATE TABLEname(column1 INT PRIMARY KEY) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
If KEY_BLOCK_SIZE not specified, the
default compressed page size of 8K will be used. If
KEY_BLOCK_SIZE is specified, the attribute
ROW_FORMAT=COMPRESSED may be omitted.
Setting KEY_BLOCK_SIZE=16 most often will
not result in much compression, since the normal InnoDB page
size is 16K. However, this setting may be useful for tables with
many long BLOB, VARCHAR or
TEXT columns, because such data often do
compress well, and might therefore require fewer
“overflow” pages as described in
Section 3.3.2.2, “
Compressing BLOB, VARCHAR
and TEXT columns
”.
Note that compression is specified on a table-by-table basis.
All indexes of a table (including the clustered index) will be
compressed using the same page size, as specified on the
CREATE TABLE or ALTER TABLE
command. Table attributes such as ROW_FORMAT and
KEY_BLOCK_SIZE are not part of
the CREATE INDEX syntax, and are ignored if they
are specified (although you will see them in the output of the
SHOW CREATE TABLE command).
Compressed tables are stored in a format previous versions
of InnoDB cannot process. Therefore, to preserve downward
compatibility of database files, compression can be specified only
when the “Barracuda” data file format is enabled using the
configuration parameter innodb_file_format.
Furthermore, table compression is not available for the
InnoDB system tablespace. The system tablespace (space 0, the
ibdata* files) may contain user data, but it
also contains internal InnoDB system information, and therefore
is never compressed. Thus, compression applies only to tables
(and indexes) stored in their own tablespaces.
To use compression, you must enable the
“file per table” mode using the configuration
parameter innodb_file_per_table and enable the
“Barracuda” disk file format using the parameter
innodb_file_format. You can set these
parameters in the MySQL option file my.cnf or
my.ini, but both are dynamic parameters that
you can change with the SET command without
shutting down the MySQL server, as noted in
Section 9.5, “Configuring the InnoDB Plugin”.
Specifying ROW_FORMAT=COMPRESSED or a KEY_BLOCK_SIZE in the
CREATE TABLE or ALTER TABLE commands if the “Barracuda”
file format has not been enabled will produce these warnings that
you can view with the SHOW WARNINGS command:
| Level | Code | Message |
|---|---|---|
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires
innodb_file_per_table. |
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires
innodb_file_format=1. |
| Warning | 1478 | InnoDB: ignoring
KEY_BLOCK_SIZE= |
| Warning | 1478 | InnoDB: ROW_FORMAT=COMPRESSED requires
innodb_file_per_table. |
| Warning | 1478 | InnoDB: assuming
ROW_FORMAT=COMPACT. |
Note that these messages are only warnings, not errors, and the table will be created as if the options were not specified. If you wish, you may enable InnoDB “strict mode” (see Section 8.5, “InnoDB Strict Mode”) to cause InnoDB to generate an error, not a warning. In strict mode, the table would not be created if the current configuration does not permit using compressed tables.
The “non-strict” behavior is intended to permit you to
import a mysqldump file into a database that does
not support compressed tables even if the source database contained
compressed tables. In that case the InnoDB Plugin will create the
table in ROW_FORMAT=COMPACT instead of preventing the operation.
Generally, however, when you import the dump file into a new
database, you should be sure the server is running the
InnoDB Plugin with the proper settings for the configuration
parameters innodb_file_format and innodb_file_per_table,
if you want to have the tables re-created as they exist in the
original database.
The attribute KEY_BLOCK_SIZE is permitted
only when ROW_FORMAT is specified as
COMPRESSED or is omitted. Specifying a
KEY_BLOCK_SIZE with any other
ROW_FORMAT will generate a warning that
you can view with SHOW WARNINGS. However, the
table will be as non-compressed (the specified
KEY_BLOCK_SIZE will be ignored).
| Level | Code | Message |
|---|---|---|
| Warning | 1478 |
InnoDB: ignoring KEY_BLOCK_SIZE=
|
If you are running in InnoDB strict mode, the combination of a
KEY_BLOCK_SIZE with any
ROW_FORMAT other than
COMPRESSED will generate an error, not a warning,
and the table will not be created.
Table 3.1, “
Meaning of CREATE TABLE and ALTER TABLE options
”
summarizes how the various options on CREATE TABLE and ALTER TABLE
are handled.
Table 3.1.
Meaning of CREATE TABLE and ALTER TABLE options
| Option | Usage | Description |
|---|---|---|
ROW_FORMAT=REDUNDANT | Storage format used prior to MySQL 5.0.3 |
Less efficient than ROW_FORMAT=COMPACT;
for backward compatibility
|
ROW_FORMAT=COMPACT | Default storage format since MySQL 5.0.3 | Stores a prefix of 768 bytes of long column values in the clustered index page, with the remaining bytes stored in an overflow page |
ROW_FORMAT=DYNAMIC | Available only with innodb_file_format=Barracuda | Will store values within the clustered index page if they fit; if not, will store only a 20-byte pointer to an overflow page (no prefix) |
ROW_FORMAT=COMPRESSED | Available only with innodb_file_format=Barracuda |
Compresses the table and indexes using zlib to default
compressed page size of 8K bytes; implies
ROW_FORMAT=DYNAMIC
|
KEY_BLOCK_SIZE=
| Available only with innodb_file_format=Barracuda |
Specifies compressed page size of 1, 2, 4, 8 or 16K bytes;
implies ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPRESSED
|
Table 3.2, “CREATE/ALTER TABLE Warnings and Errors
when InnoDB Strict Mode is OFF”
summarizes error conditions that occur with certain
combinations of configuration parameters and options on the
CREATE TABLE or ALTER TABLE commands,
and how the options appear in the output of
SHOW TABLE STATUS.
When InnoDB strict mode is OFF,
InnoDB will create or alter the table, but may ignore certain
settings, as shown below. You can see the warning messages in the
MySQL error log. When InnoDB strict mode is
ON, these specified combinations of options
generate errors, and the table is not created or altered.
You can see the full description of the error condition with
SHOW ERRORS. For example:
mysql>CREATE TABLE x (id INT PRIMARY KEY, c INT)->ENGINE=INNODB KEY_BLOCK_SIZE=33333;ERROR 1005 (HY000): Can't create table 'test.x' (errno: 1478) mysql>SHOW ERRORS;+-------+------+-------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------+ | Error | 1478 | InnoDB: invalid KEY_BLOCK_SIZE=33333. | | Error | 1005 | Can't create table 'test.x' (errno: 1478) | +-------+------+-------------------------------------------+ 2 rows in set (0.00 sec)
Table 3.2. CREATE/ALTER TABLE Warnings and Errors
when InnoDB Strict Mode is OFF
| Syntax | Warning or Error Condition |
Resulting ROW_FORMAT,
as shown in SHOW TABLE STATUS
|
|---|---|---|
ROW_FORMAT=REDUNDANT | None | REDUNDANT |
ROW_FORMAT=COMPACT | None | COMPACT |
ROW_FORMAT=COMPRESSED or
ROW_FORMAT=DYNAMIC or
KEY_BLOCK_SIZE is specified
|
Ignored unless you override the default settings
for innodb_file_format and innodb_file_per_table
| COMPACT |
Invalid KEY_BLOCK_SIZE is specified
(not 1, 2, 4, 8 or 16)
| KEY_BLOCK_SIZE is ignored | the requested one, or COMPACT by default |
ROW_FORMAT=COMPRESSED and valid
KEY_BLOCK_SIZE are specified
|
None; KEY_BLOCK_SIZE specified is used,
not the 8K default
| COMPRESSED |
KEY_BLOCK_SIZE is specified with
REDUNDANT, COMPACT or DYNAMIC row format
| KEY_BLOCK_SIZE is ignored | REDUNDANT, COMPACT or DYNAMIC |
ROW_FORMAT is not one of
REDUNDANT,
COMPACT,
DYNAMIC or
COMPRESSED
| Ignored if recognized by the MySQL parser. Otherwise, an error will be issued. | COMPACT or N/A |
When InnoDB strict mode is ON
(innodb_strict_mode=1), the InnoDB Plugin
will reject invalid ROW_FORMAT or KEY_BLOCK_SIZE parameters.
For compatibility with the built-in InnoDB in MySQL, InnoDB strict mode is
not enabled by default, and in this default non-strict mode, the
InnoDB Plugin will issue warnings (not errors) for ignored invalid
parameters.
Note that it is not possible to see the chosen
KEY_BLOCK_SIZE using
SHOW TABLE STATUS. The command
SHOW CREATE TABLE will display the
KEY_BLOCK_SIZE (even if it was ignored by InnoDB).
The real compressed page size inside InnoDB cannot
be displayed by MySQL.
This section describes some internal implementation details about compression in InnoDB. The information presented here may be helpful in tuning for performance, but is not necessary to know for basic use of compression.
Some operating systems implement compression at the file system level. Files are typically divided into fixed-size blocks that are compressed into variable-size blocks, which easily leads into fragmentation. Every time something inside a block is modified, the whole block is recompressed before it is written to disk. These properties make this compression technique unsuitable for use in an update-intensive database system.
The InnoDB Plugin implements a novel type of compression with the help of the well-known zlib library, which implements the LZ77 compression algorithm. This compression algorithm is mature, robust, and efficient in both CPU utilization and in reduction of data size. The algorithm is “lossless”, so that the original uncompressed data can always be reconstructed from the compressed form. LZ77 compression works by finding sequences of data that are repeated within the data to be compressed. The patterns of values in your data will determine how well it compresses, but typical user data will often compress by 50% or more.
Unlike compression performed by an application, or compression features of some other database management systems, InnoDB compression applies both to user data and to indexes. In many cases, indexes can constitute 40-50% or more of the total database size, so this difference is significant. When compression is working well for a data set, the size of the InnoDB data files (the .idb files) will be 25% to 50% of the uncompressed size or possibly smaller. Depending on the workload, this smaller database can in turn lead to a reduction in i/o, and an increase in throughput, at a modest cost in terms of increased CPU utilization.
All user data in InnoDB is stored in pages comprising a B-tree index (the so-called clustered index). In some other database systems, this type of index is called an “index-organized table”. Each row in the index node contains the values of the (user-specified or system-generated) primary key and all the other columns of the table.
Secondary indexes in InnoDB are also B-trees, containing pairs of values: the index key and a pointer to a row in the clustered index. The pointer is in fact the value of the primary key of the table, which is used to access the clustered index if columns other than the index key and primary key are required. Secondary index records must always fit on a single B-tree page.
The compression of B-tree nodes (of both clustered and
secondary indexes) is handled differently from compression of
overflow pages used to store long VARCHAR,
BLOB, or TEXT columns, as
explained in the following sections.
Because they are frequently updated, B-tree pages require special treatment. It is important to minimize the number of times B-tree nodes are split, as well as to minimize the need to uncompress and recompress their content.
One technique InnoDB uses is to maintain some system information in the B-tree node in uncompressed form, thus facilitating certain in-place updates. For example, this allows rows to be delete-marked and deleted without any compression operation.
In addition, InnoDB attempts to avoid unnecessary uncompression and recompression of index pages when they are changed. Within each B-tree page, the system keeps an uncompressed “modification log” to record changes made to the page. Updates and inserts of small records may be written to this modification log without requiring the entire page to be completely reconstructed.
When the space for the modification log runs out, InnoDB uncompresses the page, applies the changes and recompresses the page. If recompression fails, the B-tree nodes are split and the process is repeated until the update or insert succeeds.
Generally, InnoDB requires that each B-tree page can
accommodate at least two records. For compressed tables,
this requirement has been relaxed. Leaf pages of B-tree
nodes (whether of the primary key or secondary indexes) only
need to accommodate one record, but that record must fit in
uncompressed form, in the per-page modification log.
Starting with InnoDB Plugin version 1.0.2,
and if InnoDB strict mode is ON,
the InnoDB Plugin will check the maximum row size
during CREATE TABLE or CREATE INDEX. If the row will
not fit, the following error message is issued:
ERROR HY000: Too big row.
If you create a table when InnoDB strict mode is OFF, and
a subsequent INSERT or
UPDATE command attempts to create an
index entry that does not fit in the size of the compressed
page, the operation will fail with
ERROR 42000: Row size too large.
Unfortunately, this error message does not name the index
for which the record is too large, or mention the length
of the index record or the maximum record size on that
particular index page. The only remedy is to rebuild the
table with ALTER TABLE, in order to select a larger
compressed page size (KEY_BLOCK_SIZE), to shorten any
column prefix indexes, or to disable compression entirely
with ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPACT.
In a clustered index, BLOB,
VARCHAR and TEXT columns
that are not part of the primary key may be stored on
separately allocated (“overflow”) pages. We call
these “off-page columns” whose values are stored
on singly-linked lists of overflow pages.
For tables created in ROW_FORMAT=DYNAMIC or
ROW_FORMAT=COMPRESSED, the values of BLOB,
TEXT or VARCHAR columns may
be stored fully off-page, depending on their length and the length
of the entire row. For columns that are stored off-page, the
clustered index record only contains 20-byte pointers to the
overflow pages, one per column. Whether any columns are
stored off-page depends on the page size and the total size of
the row. When the row is too long to fit entirely within the
page of the clustered index, InnoDB will choose the longest
columns for off-page storage until the row fits on the
clustered index page. As noted above, if a row will not fit by
itself on a compressed page, an error will occur.
Tables created in previous versions of InnoDB use the
“Antelope” file format, which supports only
ROW_FORMAT=REDUNDANT and ROW_FORMAT=COMPACT. In these formats,
InnoDB stores the first 768 bytes of BLOB,
VARCHAR and TEXT columns in
the clustered index record along with the primary key. The 768-byte
prefix is followed by a 20-byte pointer to the overflow pages that
contain the rest of the column value.
When a table is in COMPRESSED format,
all data written to overflow pages is compressed
“as is”; that is, InnoDB applies the zlib
compression algorithm to the entire data item. Other than the data,
compressed overflow pages contain an uncompressed header and
trailer comprising a page checksum and a link to the next
overflow page, among other things. Therefore, very
significant storage savings can be obtained for longer
BLOB, TEXT or
VARCHAR columns if the data is highly
compressible, as is often the case with text data (but not
previously compressed images).
The overflow pages are of the same size as other pages. A row containing ten columns stored off-page will occupy ten overflow pages, even if the total length of the columns is only 8K bytes. In an uncompressed table, ten uncompressed overflow pages occupy 160K bytes. In a compressed table with an 8K page size, they occupy only 80K bytes. Thus, it is often more efficient to use compressed table format for tables with long column values.
Using a 16K compressed page size can reduce storage and
i/o costs for BLOB,
VARCHAR or TEXT columns,
because such data often will compress well, and might
therefore require fewer “overflow” pages, even
though the B-tree nodes themselves will take as many pages as
in the uncompressed form.
In a compressed InnoDB table, every compressed page (whether 1K, 2K, 4K or 8K) corresponds to an uncompressed page of 16K bytes. To access the data in a page, InnoDB must read the compressed page from disk (unless it is already in memory), and then uncompress the page to its original 16K byte form. This section describes how InnoDB manages the buffer pool with respect to pages of compressed tables.
To minimize i/o and to reduce the need to uncompress a page, at times the buffer pool will contain both the compressed and uncompressed form of a database page. However, to make room for other required database pages, InnoDB may “evict” from the buffer pool an uncompressed page, while leaving the compressed page in memory. Or, if a page has not been accessed in a while, the compressed form of the page may be written to disk, to free space for other data. Thus, at any given time, the buffer pool may (a) not contain any copy of a given database page, (b) contain only the compressed form of the page, or (c) contain both the compressed and uncompressed forms of the page.
InnoDB keeps track of which pages to retain in memory and which to evict using a least-recently-used (LRU) list, so that “hot” or frequently accessed data tends to stay in memory. When compressed tables are accessed, InnoDB uses an adaptive LRU algorithm to achieve an appropriate balance of compressed and uncompressed pages in memory. This adaptive algorithm is sensitive to whether the system is running in an i/o-bound or CPU-bound manner.
The essential idea is to avoid spending too much processing time uncompressing pages when the CPU is busy, and to avoid doing excess i/o when the CPU has spare cycles that can be used for uncompressing compressed pages (that may already be in memory). When the system is i/o-bound, the algorithm will prefer to evict the uncompressed copy of a page rather than both copies, in order to make more room for other disk pages to become memory resident. When the system is CPU-bound, InnoDB will preferentially evict both the compressed and uncompressed page, so that more memory can be used for “hot” pages and reducing the need to uncompress data in memory only in compressed form.
Before (but not necessarily at the same time as) a compressed page is written to a database file, InnoDB will write a copy of the page to the redo log (if it has been recompressed since the last time it was written to the database). This is done to ensure that redo logs will always be usable, even if a future version of InnoDB uses a slightly different compression algorithm. Therefore, some increase in the size of log files, or a need for more frequent checkpoints, can be expected when using compression. The amount of increase in the log file size or checkpoint frequency depends on the number of times compressed pages are modified in a way that requires reorganization and recompression.
Note that the redo log file format (and the database file format) are different from previous releases when using compression. The current release of InnoDB Hot Backup (version 3) therefore does not support databases that use compression. Only databases using the file format “Antelope” can be backed up online by InnoDB Hot Backup.
Most often, the internal optimizations in InnoDB described in section Section 3.3.2, “InnoDB Data Storage and Compression” will ensure that the system runs well with compressed data. However, because the efficiency of compression depends on the nature of your data, there are some factors you should consider to get best performance. You need to choose which tables to compress, and what compressed page size to use. You may also want to adjust the size of the buffer pool based on run-time performance characteristics such as the amount of time the system spends compressing and uncompressing data.
In general, compression will work best on tables that include a reasonable number of character string columns and where the data is read far more often than it is written. However, there are no universal and absolute ways to predict whether or not compression will benefit a particular situation. Only testing with a specific workload and data set running on representative configuration can lead to the appropriate decision. However, there are some general considerations that may help you decide when and how to use compression.
A key determinant of the efficiency of compression in
reducing the size of data files is the nature of the data
itself. Recall that compression works by identifying repeated
strings of bytes in a block of data. Completely randomized
data is the worst case. Typical data will often have repeated
values, and so will compress effectively. Character strings
will often compress well, whether defined in
CHAR, VARCHAR,
TEXT or BLOB columns.
On the other hand, tables containing mostly binary data
(integers or floating point numbers) or data that is
previously compressed (for example JPEG or
PNG images) may not generally compress
well, significantly or at all.
Compression is chosen on a table by table basis with the
InnoDB Plugin, and a table and all of its indexes use the same
(compressed) page size. It might be that the primary key
(clustered) index, which contains the data for all columns of
a table, will compress more effectively than the secondary
indexes. For those cases where there are long rows, the use
of compression may result in long column values being stored
“off-page”, as discussed in
Section 5.3, “DYNAMIC Row Format”. Those overflow pages
may compress well. Given these considerations, for many
applications, some tables will compress more effectively than
others, and you may find that your workload performs best only
with a subset of tables compressed.
Experimenting is the only way to determine whether or
not to compress a particular table. InnoDB compresses data
in 16K chunks corresponding to the uncompressed page size, and
in addition to user data, the page format includes some
internal system data that is not compressed. Compression
utilities compress an entire stream of data, and so may find
more repeated strings across the entire input stream than
InnoDB would find in a table compressed in 16K chunks. But
you can get a sense of how efficient compression will be by
using a utility that implements LZ77 compression (such as
gzip or WinZip) on your data file.
Another way to test compression on a specific table is
to copy some data from your uncompressed table to a similar,
compressed table (having all the same indexes) and look at the
size of the resulting file. When you do so (if nothing else
using compression is running), you can
examine the ratio of successful
compression operations to overall compression operations. (In
the INNODB_CMP table, compare
COMPRESS_OPS to
COMPRESS_OPS_OK. See
Section 6.2.1, “INNODB_CMP and
INNODB_CMP_RESET
” for more
information.) If a high percentage of compression operations
complete successfully, the table might be a good candidate for
compression.
Generally, it is not useful to attempt to compress already compressed data, so you should choose between compressing data in your application and having InnoDB do so. For some applications, it may make sense to compress data (such as text) before it is inserted into the database. This is the only choice for storage engines that do not do compression, of course, but you may still prefer to do that in some cases, even when using the InnoDB Plugin. You may design your application to compress some data and not other data, saving overhead for data that will not compress well. This approach also has the advantage of using CPU cycles on the client machine rather than the database server for compression and uncompression. However, is usually not wise to store data that is compressed by the application in an InnoDB compressed table, as further compression is extremely unlikely.
One disadvantage of compressing data in the application is that the data values are opaque to InnoDB. You cannot use SQL to compare values of externally compressed data, nor create an effective index on such data, as InnoDB would not be able to recognize and sort the uncompressed data values.
An advantage of using internal compression is that InnoDB compresses the entire contents of a table (including all its columns and the clustered index and secondary indexes). This means that even data that is indexed or used in comparisons can be stored in compressed form. Indexes are often a significant fraction of the total size of a database. Depending on how many indexes your table has, and how effectively the data in those indexes compresses, compression could result in significant savings in storage, i/o or processor time.
Of course, it is possible to combine these approaches. For some applications, it may be appropriate to use some compressed tables and some uncompressed tables. It may be best to externally compress some data (and store it in uncompressed InnoDB tables) and allow InnoDB to compress (some of) the other tables in the application. Careful schema and application design are always important, and this is of course true when deciding how to use compression. Different use cases will have different best solutions.
In addition to choosing which tables to compress (and
the page size), the workload is another key determinant of
performance. If the application is dominated by reads, rather
than updates, fewer pages will need to be reorganized and
recompressed after the index page runs out of room for the
per-page “modification log” that InnoDB
maintains for compressed data. If the updates predominantly
change non-indexed columns or those containing
BLOBs or large strings that happen to be
stored “off-page”, the overhead of compression
may be acceptable. If the only changes to a table are
INSERTs that use a monotonically increasing
primary key, and there are few secondary indexes, there will
be little need to reorganize and recompress index pages.
Since InnoDB can “delete-mark” and delete rows
on compressed pages “in place” by modifying
uncompressed data, DELETE operations on a
table are relatively efficient.
For some environments, the time it takes to load data can be as important as run-time retrieval. Especially in data warehouse environments, many tables may be read-only or read-mostly. In those cases, it might or might not be acceptable to pay the price of compression in terms of increased load time, unless the resulting savings in fewer disk reads or in storage cost is significant.
Fundamentally, compression works best when the CPU time is available for compressing and uncompressing data. Thus, if your workload is i/o bound, rather than CPU-bound, you may find that compression can improve overall performance. Therefore when you test your application performance with different compression configurations, it is important to test on a platform similar to the configuration on which it will run in production.
Reading and writing database pages from and to disk is the slowest aspect of system performance. Therefore, compression attempts to reduce i/o by using CPU time to compress and uncompress data, and thus is most effective when i/o is a relatively scarce resource compared to processor cycles.
This is often especially the case when running in a multi-user environment with fast, multi-core CPUs. When a page of a compressed table is in memory, InnoDB will often use an additional 16K in the buffer pool for an uncompressed copy of the page. The adaptive LRU algorithm in the InnoDB Plugin attempts to balance the use of memory between compressed and uncompressed pages to take into account whether the workload is running in an i/o-bound or CPU-bound manner. Nevertheless, a configuration with more memory dedicated to the InnoDB buffer pool will tend to run better when using compressed tables than a configuration where memory is highly constrained.
The optimal setting of the compressed page size depends on the type and distribution of data that the table and its indexes contain. The compressed page size should always be bigger than the maximum record size, or operations may fail as noted in Section 3.3.2.1, “Compression of B-tree Pages”.
Setting the compressed page size too large will waste some space, but the pages will not have to be compressed as often. If the compressed page size is set too small, inserts or updates may require time-consuming recompression, and the B-tree nodes may have to be split more frequently, leading to bigger data files and less efficient indexing.
Typically, one would set the compressed page size to 8K
or 4K bytes. Given that the maximum InnoDB record size is
around 8K, KEY_BLOCK_SIZE=8 is usually a
safe choice.
The current version of the InnoDB Plugin provides only a limited means to monitor the performance of compression at runtime. Overall application performance, CPU and i/o utilization and the size of disk files are the best indicators of how effective compression is for your application.
The InnoDB Plugin does include some Information Schema tables (see
Example 6.1, “Using the Compression Information Schema Tables”)
that reflect the internal use of memory and the rates of compression
used overall. The INNODB_CMP tables report
information about compression activity for each compressed page
size (KEY_BLOCK_SIZE) in use. The
information in these tables is system-wide, and includes summary
data across all compressed tables in your database. You can use
this data to help decide whether or not to compress a table by
examining these tables when no other compressed tables are being
accessed.
The key statistics to consider are the number of, and
amount of time spent performing, compression and uncompression
operations. Since InnoDB must split B-tree nodes when they
are too full to contain the compressed data following a
modification, you should also compare the number of
“successful” compression operations with the number
of such operations overall. Based on the information in the
INNODB_CMP tables and overall application
performance and hardware resource utilization, you may decide to
make changes in your hardware configuration, adjust the size of
the InnoDB buffer pool, choose a different page size, or
select a different set of tables to compress.
If the amount of CPU time required for compressing and uncompressing is high, changing to faster CPUs, or those with more cores, can help improve performance with the same data, application workload and set of compressed tables. You may also benefit by increasing the size of the InnoDB buffer pool, so that more uncompressed pages can stay in memory, reducing the need to uncompress pages which exist in memory only in compressed form.
A large number of compression operations overall (compared
to the number of INSERT,
UPDATE and DELETE
operations in your application and the size of the database)
could indicate that some of your compressed tables are being
updated too heavily for effective compression. You may want to
choose a larger page size, or be more selective about which
tables you compress.
If the number of “successful” compression
operations (COMPRESS_OPS_OK) is a high
percentage of the total number of compression operations
(COMPRESS_OPS), then the system is likely
performing well. However, if the ratio is low, then InnoDB is
being caused to reorganize, recompress and split B-tree nodes
more often than is desirable. In this case, you may want to
avoid compressing some tables or choose a larger
KEY_BLOCK_SIZE for some of the tables for
which you are using compression. You may not want to compress
tables which cause the number of
“compression failures” in
your application to be more than 1% or 2% of the total (although
this may be acceptable during a data load, for example, if your
application does not encounter such a ratio during normal
operations).
Table of Contents
As InnoDB evolves, new on-disk data structures are sometimes required to support new features. This release of InnoDB introduces two such new data structures: compressed tables (see Chapter 3, InnoDB Data Compression), and long variable-length columns stored off-page (see Chapter 5, Storage of Variable-Length Columns). These new data structures are not compatible with prior versions of InnoDB. Note, however, that the other new features of the InnoDB Plugin do not require the use of the new file format.
In general, a newer version of InnoDB may create a table or index that cannot safely be read or written with a prior version of InnoDB without risk of crashes, hangs, wrong results or corruptions. The InnoDB Plugin introduces a new mechanism to guard against these conditions, and to help preserve compatibility among database files and versions of InnoDB. Henceforth, users can take advantage of some new features of an InnoDB release (e.g., performance improvements and bug fixes), and still preserve the option of using their database with a prior version of InnoDB, by precluding the use of new features that create downward incompatible on-disk data structures.
The InnoDB Plugin introduces the idea of a named file format and a configuration parameter to enable the use of features that require use of that format. The new file format is the “Barracuda” format, and the file format supported by prior releases of InnoDB is called file format “Antelope”. Compressed tables and the new row format that stores long columns “off-page” require the use of the “Barracuda” file format or newer. Future versions of InnoDB may introduce a series of file formats, identified with the names of animals, in ascending alphabetical order.
Beginning with this release, every InnoDB per-table
tablespace file is labeled with a file format identifier. This
does not apply to the system tablespace (the
ibdata files) but only the files of separate
tablespaces (the *.ibd files where tables and
indexes are stored in their own tablespace). As noted below,
however, the system tablespace is tagged with the
“highest” file format in use in a group of InnoDB
database files, and this tag is checked when the files are opened.
In this release, when you create a compressed table, or a
table with ROW_FORMAT=DYNAMIC, the file header
for the corresponding .ibd file and the table
type in the InnoDB data dictionary are updated with the
identifier for the “Barracuda” file format. From that point
forward, the table cannot be used with a version of InnoDB that
does not support this new file format. To protect against
anomalous behavior, InnoDB version 5.0.21 and later will perform
a compatibility check when the table is opened, as described
below. (Note that the ALTER TABLE command
will, in many cases, cause a table to be recreated and thereby
change its properties. The special case of adding or dropping
indexes without rebuilding the table is described in
Chapter 2, Fast Index Creation in the InnoDB Storage Engine.)
If a version of InnoDB supports a particular file format (whether or not it is enabled), you can access and even update any table that requires that format or an earlier format. Only the creation of new tables using new features is limited based on the particular file format enabled. Conversely, if a tablespace contains a table or index that uses a file format that is not supported by the currently running software, it cannot be accessed at all, even for read access.
The only way to “downgrade” an InnoDB
tablespace to an earlier file format is to copy the data to a new
table, in a tablespace that uses the earlier format. This can be
done with the ALTER TABLE command, as described
in Section 4.6, “Downgrading the File Format”.
The easiest way to determine the file format of an existing
InnoDB tablespace is to examine the properties of the table it
contains, using the SHOW TABLE STATUS command
or querying the table INFORMATION_SCHEMA.TABLES.
If the Row_format of the table is reported as
'Compressed' or 'Dynamic',
the tablespace containing the table uses the “Barracuda” format.
Otherwise, it uses the prior InnoDB file format, “Antelope”.
The new configuration parameter
innodb_file_format controls whether such
commands as CREATE TABLE and
ALTER TABLE can be used to create tables that
depend on support for the “Barracuda” file format.
The file format is a dynamic, global parameter that can be
specified in the MySQL option file (my.cnf or
my.ini) or changed with the
SET GLOBAL command, as described in
Section 9.5, “Configuring the InnoDB Plugin”.
To avoid confusion, for the purposes of this discussion we define the term “ib-file set” to mean the set of operating system files that InnoDB manages as a unit. The ib-file set includes the following files:
ibdata files)
that contain internal system information (including internal
catalogs and undo information) and may include user data and
indexes
*.ibd files)
ib_logfile1
and ib_logfile2), used for crash recovery and
in backups
This collection of files is transactionally consistent, and
recoverable as a unit.
An “ib-file set” specifically does not include
the related MySQL .frm files that contain
meta data about InnoDB tables.
The .frm files are created and managed
exclusively by MySQL, and can sometimes get out of sync
with the internal meta data in InnoDB.
Instead of “ib-file set”, we might call such a collection a “database”. However, MySQL uses the word “database” to mean a logical collection of tables, what other systems term a “schema” or “catalog”. Given MySQL terminology, multiple tables (even from more than one database) can be stored in a single “ib-file set”.
The InnoDB Plugin incorporates several checks to guard against the possible crashes and data corruptions that might occur if you use an ib-file set in a file format that is not supported by the software release in use. These checks take place when the server is started, and when you first access a table. This section details these checks, how you can control them, and error and warning conditions that may arise.
To prevent possible crashes or data corruptions when
InnoDB Plugin opens an ib-file set, it will check that it can
fully support the file formats in use within the ib-file set. If the
system is restarted following a crash, or a “fast
shutdown” (i.e., innodb_fast_shutdown
is greater than zero), there may be
on-disk data structures (such as redo or undo entries, or
doublewrite pages) that are in a “too-new” format for
the current software. During the recovery process, serious damage
can be done to your data files if these data structures are
accessed. The startup check of the file format occurs before any
recovery process begins, thereby preventing the problems
described in Section 11.4, “Possible Problems”.
Beginning with version 1.0.1 of the InnoDB Plugin,
the system tablespace records an identifier or tag for the
“highest” file format used by any table in any of the
tablespaces that is part of the ib-file set. Checks against this
file format tag are controlled by the new configuration
parameter innodb_file_format_check, which is
ON by default.
If the file format tag in the system tablespace is newer or
higher than the highest version supported by the particular
currently executing software and if innodb_file_format_check
is ON, the following error will be issued
when the server is started:
InnoDB: Error: the system tablespace is in a file format that this version doesn't support
You can also set innodb_file_format to a file format name.
Doing so will prevent the InnoDB Plugin from starting if
the current software does not
support the file format specified. It also sets the
“high water mark” to the value you specify.
The ability to set innodb_file_format_check
will be useful (with future
releases of InnoDB) if you manually “downgrade”
all of the tables in an ib-file set (as described in
Chapter 11, Downgrading from the InnoDB Plugin). You can then rely on
the file format check at startup if you subsequently use an
older version of InnoDB to access the ib-file set.
In some limited circumstances, you might want to start the server
and use an ib-file set that is in a “too new” format
(one that is not supported by the software you’re using). If you
set the configuration parameter innodb_file_format_check to
OFF, the InnoDB Plugin will open the database,
but issue this warning message in the error log:
InnoDB: Warning: the system tablespace is in a file format that this version doesn't support
NOTE: This is a very dangerous setting, as it permits the
recovery process to run, possibly corrupting your database if
the previous shutdown was a crash or “fast shutdown”.
You should only set innodb_file_format_check to
OFF if you are sure that
the previous shutdown was done with
innodb_fast_shutdown=0, so
that essentially no recovery process will occur. In a future
release, this parameter setting may be renamed from
OFF to
UNSAFE. (However, until there are newer releases
of InnoDB that support additional file formats, even disabling
the startup checking is in fact “safe”.)
Note that the parameter
innodb_file_format_check affects only
what happens when a database is opened, not subsequently.
Conversely, the parameter innodb_file_format (which enables a
specific format) only determines whether or not a new table can
be created in the enabled format and has no effect on whether or
not a database can be opened.
The file format tag is a “high water mark”, and as
such it is increased after the server is started, if a
table in a “higher” format is created or an
existing table is accessed for read or write (assuming its
format is supported). If you access an
existing table in a format higher than the format the running
software supports, the system tablespace tag will not be
updated, but table-level compatibility checking will apply (and
an error will be issued), as described in Section 4.4.2, “Table-access File Format Compatibility Checking”.
Any time the high water mark is updated, the value of
innodb_file_format_check is updated as well,
so the command
SELECT @@innodb_file_format_check;
will display the name
of the newest file format known to be used by tables in the
currently open ib-file set and supported by the currently
executing software.
To best illustrate this behavior, consider the scenario described in Table 4.1, “InnoDB data file compatibility and related InnoDB parameters”. Imagine that some future version of InnoDB supports the “Cheetah” format and that an ib-file set has been used with that version.
Table 4.1. InnoDB data file compatibility and related InnoDB parameters
| innodb file format check | innodb file format | Highest file format used in ib-file set | Highest file format supported by InnoDB | Result |
|---|---|---|---|---|
OFF | Antelope or
Barracuda | Barracuda | Barracuda | Database will open; tables can be created which require “Antelope” or “Barracuda” file format |
OFF | Antelope or
Barracuda | Cheetah | Barracuda | Database will open with a warning, since the db contains files in a “too new” format; tables can be created which require “Antelope” or “Barracuda” file format; tables in “Cheetah” format cannot be accessed |
OFF | Cheetah | Barracuda | Barracuda | Database will not open; innodb_file_format
cannot be set to “Cheetah” |
ON | Antelope or
Barracuda | Barracuda | Barracuda | Database will open; tables can be created which require “Antelope” or “Barracuda” file format |
ON | Antelope or
Barracuda | Cheetah | Barracuda | Database will not open, since the db contains files in a “too new” format (“Cheetah”) |
ON | Cheetah | Barracuda | Barracuda | Database will not open; innodb_file_format
cannot be set to “Cheetah” |
When a table is first accessed, InnoDB (including some releases prior to InnoDB Plugin 1.0) check that the file format of the tablespace in which the table is stored is fully supported. This check prevents crashes or corruptions that would otherwise occur when tables using a “too new” data structure are encountered.
Note that all tables using any file format supported by a
release can be read or written (assuming the user has sufficient
privileges). The setting of the system configuration parameter
innodb_file_format can prevent creating a new table that uses
specific file formats, even if they are supported by a given
release. Such a setting might be used to preserve backward
compatibility, but it does not prevent accessing any table that
uses any supported format.
As noted in Section 4.2, “Named File Formats”, versions of InnoDB older than 5.0.21 cannot reliably use database files created by newer versions if a new file format was used when a table was created. To prevent various error conditions or corruptions, InnoDB checks file format compatibility when it opens a file (e.g., upon first access to a table). If the currently running version of InnoDB does not support the file format identified by the table type in the InnoDB data dictionary, MySQL will report the following error:
ERROR 1146 (42S02): Table 'test.t1' doesn't exist
Furthermore, InnoDB will write a message to the error log:
InnoDB: tabletest/t1: unknown table type33
The table type should be equal to the tablespace flags, which contains the file format version as discussed in Section 4.5, “Identifying the File Format in Use”.
Versions of InnoDB prior to 4.1 did not include table format identifiers in the database files, and versions prior to 5.0.21 did not include a table format compatibility check. Therefore, there is no way to ensure proper operations if a table in a “too new” format is used with versions of InnoDB prior to 5.0.21.
The new file format management capability introduced with the InnoDB Plugin (comprising tablespace tagging and run-time checks) allows InnoDB to verify as soon as possible that the running version of software can properly process the tables existing in the database.
If you permit InnoDB to open a database containing files in a
format it does not support (by setting the parameter
innodb_file_format_check to OFF),
the table-level checking described in this section will still apply.
Users are strongly urged not to use database files that contain “Barracuda” file format tables with releases of InnoDB older than the InnoDB Plugin. It is possible to “downgrade” such tables to the “Antelope” format (that the built-in InnoDB in MySQL up to version 5.1 supports) with the procedure described in Section 4.6, “Downgrading the File Format”.
Although you may have enabled a given
innodb_file_format at a particular time, unless
you create a new table, the database file format is unchanged.
If you do create a new table, the tablespace containing the table
is tagged with the “earliest” or “simplest”
file format that is required for the table’s features. For example,
if you enable file format “Barracuda”, and create a new table that
is not compressed and does not use ROW_FORMAT=DYNAMIC,
the new tablespace that contains the table will be tagged as using file
format “Antelope”.
It is easy to identify the file format used by a given
tablespace or table. The table uses the “Barracuda” format if
the Row_format reported by
SHOW CREATE TABLE or
INFORMATION_SCHEMA.TABLES is
one of 'Compressed' or
'Dynamic'. (Please note that the
Row_format is a separate column, and ignore the
contents of the Create_options column, which
may contain the string ROW_FORMAT.) If the
table in a tablespace uses neither of those features, the file
uses the format supported by prior releases of InnoDB, now
called file format “Antelope”. Then, the
Row_format will be one of
'Redundant' or
'Compact'.
The file format identifier is written as part of the
tablespace flags (a 32-bit number) in the *.ibd
file in the 4 bytes starting at position 54 of the file, most
significant byte first. (The first byte of the file is byte
zero.) On some systems, you can display these bytes in
hexadecimal with the command od -t x1 -j 54 -N 4
. If all bytes
are zero, the tablespace uses the “Antelope” file format (which
is the format used by the standard built-in InnoDB in MySQL up to version 5.1).
Otherwise, the least significant bit should be set in the
tablespace flags, and the file format identifier is written in the
bits 5 through 11. (Divide the tablespace flags by 32 and take
the remainder after dividing the integer part of the result by
128.)
tablename.ibd
The InnoDB tablespaces files (*.ibd file) are
tagged with the file format used to create its table and indexes.
The way to downgrade the tablespace is to re-create the table and
its indexes. The easiest way to recreate a table and its indexes
is to use the command
ALTER TABLEtROW_FORMAT=COMPACT;
on each table that you want to downgrade. The
COMPACT row format uses the file format “Antelope”.
It was introduced in MySQL 5.0.3.
The file format used by the standard built-in InnoDB in MySQL 5.1 is the “Antelope” format, and the new file format introduced with the InnoDB Plugin 1.0 is the “Barracuda” format. No definitive plans have been made to introduce new features that would require additional new file formats. However, the file format mechanism introduced with the InnoDB Plugin allows for further enhancements.
For the sake of completeness, these are the file format names that might be used for future file formats: Antelope, Barracuda, Cheetah, Dragon, Elk, Fox, Gazelle, Hornet, Impala, Jaguar, Kangaroo, Leopard, Moose, Nautilus, Ocelot, Porpoise, Quail, Rabbit, Shark, Tiger, Urchin, Viper, Whale, X, Y and Zebra. These file formats correspond to the internal identifiers 0..25.
Table of Contents
All data in InnoDB is stored in database pages comprising a B-tree index (the so-called clustered index or primary key index). The essential idea is that the nodes of the B-tree contain, for each primary key value (whether user-specified or generated or chosen by the system), the values of the remaining columns of the row as well as the key. In some other database systems, a clustered index is called an “index-organized table”. Secondary indexes in InnoDB are also B-trees, containing pairs of values of the index key and the value of the primary key, which acts as a pointer to the row in the clustered index.
There is an exception to this rule. Variable-length columns
(such as BLOB and VARCHAR)
that are too long to fit on a B-tree page are stored on separately
allocated disk (“overflow”) pages. We call these
“off-page columns”. The values of such columns are
stored on singly-linked lists of overflow pages, and each such
column has its own list of one or more overflow pages. In some
cases, all or a prefix of the long column values is stored in the
B-tree, to avoid wasting storage and eliminating the need to read
a separate page.
The new “Barracuda” file format provides a new option to control how much column data is stored in the clustered index, and how much is placed on overflow pages.
Previous versions of InnoDB used an unnamed file format
(now called “Antelope”) for database files. With that format,
tables were defined with ROW_FORMAT=COMPACT (or
ROW_FORMAT=REDUNDANT) and InnoDB stored up to
the first 768 bytes of variable-length columns (such as
BLOB and VARCHAR) in the
index record within the B-tree node, with the remainder stored on
the overflow page(s).
To preserve compatibility with those prior versions, tables
created with the InnoDB Plugin will use the prefix format, unless one
of ROW_FORMAT=DYNAMIC or
ROW_FORMAT=COMPRESSED is specified (or implied)
on the CREATE TABLE command.
With the “Antelope” file format, if the value of a column
is not longer than 768 bytes, no overflow page is needed, and some
savings in i/o may result, since the value is in the B-tree node.
This works well for relatively short BLOBs, but
may cause B-tree nodes to fill with data rather than key values,
thereby reducing their efficiency. Tables with many
BLOB columns could cause B-tree nodes to become
too full of data, and contain too few rows, making the entire
index less efficient than if the rows were shorter or if the
column values were stored off-page.
When innodb_file_format is set to “Barracuda” and a table is
created with ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED,
long column values are stored fully off-page, and the clustered
index record contains only a 20-byte pointer to the overflow page.
Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long, InnoDB will choose the longest columns for off-page storage until the clustered index record fits on the B-tree page.
The DYNAMIC row format maintains the efficiency of storing the
entire row in the index node if it will fit (as do the COMPACT and
REDUNDANT formats), but this new format avoids the problem of
filling B-tree nodes with a large number of data bytes of long
columns. The DYNAMIC format is predicated on
the idea that if a portion of a long data value is stored
off-page, it is usually most efficient to store all of the value
off-page. With DYNAMIC format, shorter columns
are likely to remain in the B-tree node, minimizing the number of
overflow pages needed for any given row.
The row format used for a table is specified with the
ROW_FORMAT clause of the
CREATE TABLE and ALTER TABLE
commands. Note that COMPRESSED format implies
DYNAMIC format. See
Section 3.2, “Specifying Compression” for more details on the
relationship between this clause and other clauses of these
commands.
Table of Contents
The seven
INFORMATION_SCHEMA
tables INNODB_CMP, INNODB_CMP_RESET, INNODB_CMPMEM,
INNODB_CMPMEM_RESET, INNODB_TRX, INNODB_LOCKS and
INNODB_LOCK_WAITS contain live information
about compressed InnoDB tables, the compressed InnoDB buffer
pool, all transactions currently executing inside InnoDB, the
locks that transactions hold and those that are blocking
transactions waiting for access to a resource (a table or
row).
Note that the Information Schema tables are themselves plugins to the MySQL server. As such they need to be INSTALLed as described in Chapter 9, Installing the InnoDB Plugin. If they are installed, but the InnoDB storage engine plugin is not installed, these tables will appear to be empty.
Following is a description of the new Information Schema tables introduced in the InnoDB Plugin, and some examples of their use.
Two new pairs of Information Schema tables provided by the InnoDB Plugin can give you some insight into how well compression is working overall. One pair of tables contains information about the number of compression operations and the amount of time spent performing compression. Another pair of tables contains information on the way memory is allocated for compression.
The tables INNODB_CMP and
INNODB_CMP_RESET contain status information
on the operations related to compressed tables, which are covered
in Chapter 3, InnoDB Data Compression. The compressed
page size is in the column PAGE_SIZE.
These two tables have identical contents, but reading from
INNODB_CMP_RESET will reset the statistics on
compression and uncompression operations. For example, if you
archived the output of INNODB_CMP_RESET every
60 minutes, it would show the hourly statistics. If you never
read INNODB_CMP_RESET and monitored the
output of INNODB_CMP instead, it would show
the cumulated statistics since InnoDB was started.
Table 6.1. Columns of INNODB_CMP and
INNODB_CMP_RESET
| Column name | Description |
|---|---|
PAGE_SIZE | Compressed page size in bytes. |
COMPRESS_OPS | Number of times a B-tree page of the size
PAGE_SIZE has been compressed. Pages
are compressed whenever an empty page is created or the
space for the uncompressed modification log runs out. |
COMPRESS_OPS_OK | Number of times a B-tree page of the size
PAGE_SIZE has been successfully
compressed. This count should never exceed
COMPRESS_OPS. |
COMPRESS_TIME | Total time in seconds spent in attempts to
compress B-tree pages of the size
PAGE_SIZE. |
UNCOMPRESS_OPS | Number of times a B-tree page of the size
PAGE_SIZE has been uncompressed.
B-tree pages are uncompressed whenever compression fails
or at first access when the uncompressed page does not
exist in the buffer pool. |
UNCOMPRESS_TIME | Total time in seconds spent in uncompressing
B-tree pages of the size PAGE_SIZE. |
You may consider the tables INNODB_CMPMEM and
INNODB_CMPMEM_RESET as the status information
on the compressed pages that reside in the buffer pool. Please
consult Chapter 3, InnoDB Data Compression for further
information on compressed tables and the use of the buffer pool.
The tables INNODB_CMP and INNODB_CMP_RESET should provide
more useful statistics on compression.
The InnoDB Plugin uses a so-called “buddy
allocator” system to manage memory allocated to pages of
various sizes, from 1KB to 16KB. Each row of the two tables
described here corresponds to a single page size, except for
rows with PAGE_SIZE<1024, which are
implementation artifacts. The smallest blocks
(PAGE_SIZE=64
or PAGE_SIZE=128, depending on the server
platform) are used for keeping track of compressed pages for
which no uncompressed page has been allocated in the buffer
pool. Other blocks of PAGE_SIZE<1024
should never be allocated (PAGES_USED=0).
They exist because the memory allocator allocates smaller blocks by
splitting bigger ones into halves.
These two tables have identical contents, but reading from
INNODB_CMPMEM_RESET will reset the statistics on relocation
operations. For example, if every 60 minutes you archived the output of
INNODB_CMPMEM_RESET, it would show the hourly
statistics. If you never read INNODB_CMPMEM_RESET and monitored
the output of INNODB_CMPMEM instead, it would show the
cumulated statistics since InnoDB was started.
Table 6.2. Columns of INNODB_CMPMEM and INNODB_CMPMEM_RESET
| Column name | Description |
|---|---|
PAGE_SIZE | Block size in bytes. Each record of this table describes blocks of this size. |
PAGES_USED | Number of blocks of the size
PAGE_SIZE that are currently in
use. |
PAGES_FREE | Number of blocks of the size
PAGE_SIZE that are currently available for
allocation. This column shows the external
fragmentation in the memory pool. Ideally, these numbers
should be at most 1. |
RELOCATION_OPS | Number of times a block of the size
PAGE_SIZE has been relocated. The
buddy system can relocate the allocated “buddy
neighbor” of a freed block when it tries to form
a bigger freed block. Reading from the table
INNODB_CMPMEM_RESET will reset this
count. |
RELOCATION_TIME | Total time in microseconds spent in relocating
blocks of the size PAGE_SIZE. Reading
from the table INNODB_CMPMEM_RESET will
reset this count. |
Example 6.1. Using the Compression Information Schema Tables
The following is sample output from a database that contains
compressed tables (see Chapter 3, InnoDB Data Compression,
Section 6.2.1, “INNODB_CMP and
INNODB_CMP_RESET
”, and
Section 6.2.2, “INNODB_CMPMEM and
INNODB_CMPMEM_RESET
”).
As can be seen in the following table,
the only compressed page size that the buffer pool contains is 8K.
Compressing or uncompressing pages has consumed less than a
second since the time the statistics were reset, because the
columns COMPRESS_TIME and
UNCOMPRESS_TIME are zero.
Table 6.3. Contents of
INFORMATION_SCHEMA.INNODB_CMP under light load
| page size | compress ops | compress ops ok | compress time | uncompress ops | uncompress time |
|---|---|---|---|---|---|
| 1024 | 0 | 0 | 0 | 0 | 0 |
| 2048 | 0 | 0 | 0 | 0 | 0 |
| 4096 | 0 | 0 | 0 | 0 | 0 |
| 8192 | 1048 | 921 | 0 | 61 | 0 |
| 16384 | 0 | 0 | 0 | 0 | 0 |
According to Table 6.4, “Contents of
INFORMATION_SCHEMA.INNODB_CMPMEM under light load
”,
there are 6169 compressed 8KB pages in the buffer pool. The
only other allocated
block size is 64 bytes. The smallest PAGE_SIZE
in INNODB_CMPMEM is used for block descriptors of those
compressed pages for which no uncompressed page exists in the
buffer pool. We see that there are 5910 such pages.
Indirectly, we see that 259 (6169-5910) compressed pages also
exist in the buffer pool in uncompressed form.
In the following table
we can see that some memory is unusable due to fragmentation of
the InnoDB memory allocator for compressed
pages: SUM(PAGE_SIZE*PAGES_FREE)=6784. This
is because small memory allocation requests are fulfilled by
splitting bigger blocks, starting from the 16K blocks that are
allocated from the main buffer pool, using the buddy
allocation system. The fragmentation is this low, because some
allocated blocks have been relocated (copied) to form bigger
adjacent free blocks. This copying of
SUM(PAGE_SIZE*RELOCATION_OPS) bytes has
consumed less than a second
(SUM(RELOCATION_TIME)=0).
Table 6.4. Contents of
INFORMATION_SCHEMA.INNODB_CMPMEM under light load
| page size | pages used | pages free | relocation ops | relocation time |
|---|---|---|---|---|
| 64 | 5910 | 0 | 2436 | 0 |
| 128 | 0 | 1 | 0 | 0 |
| 256 | 0 | 0 | 0 | 0 |
| 512 | 0 | 1 | 0 | 0 |
| 1024 | 0 | 0 | 0 | 0 |
| 2048 | 0 | 1 | 0 | 0 |
| 4096 | 0 | 1 | 0 | 0 |
| 8192 | 6169 | 0 | 5 | 0 |
| 16384 | 0 | 0 | 0 | 0 |
Three new Information Schema tables introduced in the
InnoDB Plugin make it much easier to monitor transactions and
diagnose possible locking problems. The three tables are
INNODB_TRX, INNODB_LOCKS and INNODB_LOCK_WAITS.
Contains information about every transaction currently executing inside InnoDB, including whether the transaction is waiting for a lock, when the transaction started, and the particular SQL statement the transaction is executing.
Table 6.5. INNODB_TRX columns
| Column name | Description |
|---|---|
TRX_ID
| Unique transaction ID number, internal to InnoDB. |
TRX_WEIGHT
| The weight of a transaction, reflecting (but not necessarily the exact count of) the number of rows altered and the number of rows locked by the transaction. To resolve a deadlock, InnoDB selects the transaction with the smallest weight as the “victim” to rollback. Transactions that have changed non-transactional tables are considered heavier than others, regardless of the number of altered and locked rows. |
TRX_STATE
|
Transaction execution state. One of 'RUNNING',
'BLOCKED', 'ROLLING BACK' or 'COMMITTING'.
|
TRX_STARTED
| Transaction start time; the transaction is created by executing a transactional query. |
TRX_REQUESTED_LOCK_ID
|
ID of the lock the transaction is currently waiting
for (if TRX_STATE is 'BLOCKED',
otherwise NULL). Details about the lock can be found
by joining with INNODB_LOCKS on LOCK_ID.
|
TRX_WAIT_STARTED
|
Time when the transaction started waiting on the lock
(if TRX_STATE is 'BLOCKED', otherwise
NULL).
|
TRX_MYSQL_THREAD_ID
|
MySQL thread ID. Can be used for joining with
PROCESSLIST on ID. See Section 6.4.3, “Possible inconsistency with PROCESSLIST”.
|
TRX_QUERY
| The SQL query that is being executed by the transaction. |
Each transaction in InnoDB that is waiting for another
transaction to release a lock
(INNODB_TRX.TRX_STATE='BLOCKED') is blocked
by exactly one “blocking lock request”. That
blocking lock request is for a row or table lock held by another
transaction in an incompatible mode. The waiting or blocked
transaction cannot proceed until the other transaction commits
or rolls back, thereby releasing the requested lock. For every
blocked transaction, INNODB_LOCKS contains one row that
describes each lock the transaction has requested, and for which
it is waiting. INNODB_LOCKS also contains one row for each
lock that is blocking another transaction, whatever the state of
the transaction that holds the lock ('RUNNING', 'BLOCKED',
'ROLLING BACK' or 'COMMITTING'). The lock that is blocking a
transaction will always be held in a mode (read vs. write,
shared vs. exclusive) incompatible with the mode of requested
lock.
Table 6.6. INNODB_LOCKS columns
| Column name | Description |
|---|---|
LOCK_ID
|
Unique lock ID number, internal to InnoDB. Should
be treated as an opaque string. Although LOCK_ID
currently contains TRX_ID, the format of the data in
LOCK_ID is not guaranteed to remain the same in
future releases. You should not write programs that
parse the LOCK_ID value.
|
LOCK_TRX_ID
|
ID of the transaction holding this lock. Details
about the transaction can be found by joining with
INNODB_TRX on TRX_ID.
|
LOCK_MODE
|
Mode of the lock. One of 'S', 'X', 'IS', 'IX',
'S,GAP', 'X,GAP', 'IS,GAP', 'IX,GAP', or 'AUTO_INC'
for shared, exclusive, intention shared, intention
exclusive row locks, shared and exclusive gap locks,
intention shared and intension exclusive gap locks,
and auto-increment table level lock, respectively.
Refer to the sections
InnoDB Lock Modes
and
InnoDB and TRANSACTION ISOLATION LEVEL
of the MySQL Manual for information on InnoDB locking.
|
LOCK_TYPE
|
Type of the lock. One of 'RECORD' or 'TABLE' for
record (row) level or table level locks, respectively.
|
LOCK_TABLE
| Name of the table that has been locked or contains locked records. |
LOCK_INDEX
|
Name of the index if LOCK_TYPE='RECORD',
otherwise NULL.
|
LOCK_SPACE
|
Tablespace ID of the locked record if
LOCK_TYPE='RECORD', otherwise NULL.
|
LOCK_PAGE
|
Page number of the locked record if
LOCK_TYPE='RECORD', otherwise NULL.
|
LOCK_REC
|
Heap number of the locked record within the page if
LOCK_TYPE='RECORD', otherwise NULL.
|
LOCK_DATA
|
Primary key of the locked record if
LOCK_TYPE='RECORD', otherwise
NULL. This column contains the value(s) of the
primary key column(s) in the locked row, formatted as
a valid SQL string (ready to be copied to SQL
commands). If there is no primary key then the
InnoDB internal unique row ID number is used. When
the page containing the locked record is not in the
buffer pool (in the case that it was paged out to disk
while the lock was held), InnoDB will not fetch the
page from disk to avoid unnecessary disk operations.
Instead, LOCK_DATA will be set to NULL.
|
Using this table, you can tell which transactions are
waiting for a given lock, or for which lock a given transaction
is waiting. This table contains one or more rows for each
blocked transaction, indicating the lock it
has requested and the lock(s) that is (are) blocking that
request. The REQUESTED_LOCK_ID refers to the lock that a
transaction is requesting, and the BLOCKING_LOCK_ID refers to
the lock (held by another transaction) that is preventing the
first transaction from proceeding. For any given blocked
transaction, all rows in INNODB_LOCK_WAITS have the same value
for REQUESTED_LOCK_ID and different values for
BLOCKING_LOCK_ID.
Table 6.7. INNODB_LOCK_WAITS columns
| Column name | Description |
|---|---|
REQUESTING_TRX_ID
| ID of the requesting transaction. |
REQUESTED_LOCK_ID
|
ID of the lock for which a transaction is waiting.
Details about the lock can be found by joining with
INNODB_LOCKS on LOCK_ID.
|
BLOCKING_TRX_ID
| ID of the blocking transaction. |
BLOCKING_LOCK_ID
|
ID of a lock held by a transaction blocking another
transaction from proceeding. Details about the lock
can be found by joining with INNODB_LOCKS on
LOCK_ID.
|
Example 6.2. Identifying blocking transactions
It is sometimes helpful to be able to identify which transaction is blocking another. You can use the Information Schema tables to find out which transaction is waiting for another, and which resource is being requested.
Suppose you have the following scenario, with three users running concurrently. Each user (or session) corresponds to a MySQL thread, and will execute one transaction after another. Consider the state of the system when these users have issued the following commands, but none has yet committed its transaction:BEGIN;
SELECT a FROM t FOR UPDATE;
SELECT SLEEP(100);SELECT b FROM t FOR UPDATE;SELECT c FROM t FOR UPDATE;SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread,
r.trx_query waiting_query,
b.trx_id blocking_trx_id, b.trx_mysql_thread_id blocking_thread,
b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;| waiting trx id | waiting thread | waiting query | blocking trx id | blocking thread | blocking query |
|---|---|---|---|---|---|
A4 | 6 | SELECT b FROM t FOR UPDATE | A3 | 5 | SELECT SLEEP(100) |
A5 | 7 | SELECT c FROM t FOR UPDATE | A3 | 5 | SELECT SLEEP(100) |
A5 | 7 | SELECT c FROM t FOR UPDATE | A4 | 6 | SELECT b FROM t FOR UPDATE |
'A4', thread
6) and User C (trx id
'A5', thread 7) are
both waiting for User A (trx id 'A3',
thread 5).INNODB_TRX, INNODB_LOCKS, and INNODB_LOCK_WAITS.
Table 6.8. Sample Contents of
INFORMATION_SCHEMA.INNODB_TRX
| trx id | trx state | trx started | trx requested lock id | trx wait started | trx weight | trx mysql thread id | trx query |
|---|---|---|---|---|---|---|---|
A3 | RUNNING | 2008-01-15 16:44:54 | NULL | NULL | 2 | 5 | SELECT SLEEP(100) |
A4 | LOCK WAIT | 2008-01-15 16:45:09 | A4:1:3:2 | 2008-01-15 16:45:09 | 2 | 6 | SELECT b FROM t FOR UPDATE |
A5 | LOCK WAIT | 2008-01-15 16:45:14 | A5:1:3:2 | 2008-01-15 16:45:14 | 2 | 7 | SELECT c FROM t FOR UPDATE |
Table 6.9. Sample Contents of
INFORMATION_SCHEMA.INNODB_LOCKS
| lock id | lock trx id | lock mode | lock type | lock table | lock index | lock space | lock page | lock rec | lock data |
|---|---|---|---|---|---|---|---|---|---|
A3:1:3:2 | A3 | X | RECORD | `test`.`t` | `PRIMARY` | 1 | 3 | 2 | 0x0200 |
A4:1:3:2 | A4 | X | RECORD | `test`.`t` | `PRIMARY` | 1 | 3 | 2 | 0x0200 |
A5:1:3:2 | A5 | X | RECORD | `test`.`t` | `PRIMARY` | 1 | 3 | 2 | 0x0200 |
Table 6.10. Sample Contents of
INFORMATION_SCHEMA.INNODB_LOCK_WAITS
| requesting trx id | requested lock id | blocking trx id | blocking lock id |
|---|---|---|---|
A4 | A4:1:3:2 | A3 | A3:1:3:2 |
A5 | A5:1:3:2 | A3 | A3:1:3:2 |
A5 | A5:1:3:2 | A4 | A4:1:3:2 |
Example 6.3. More Complex Example of Transaction Data in Information Schema Tables
Sometimes you would like to correlate the internal InnoDB locking information with session-level information maintained by MySQL. For example, you might like to know, for a given InnoDB transaction ID, the corresponding MySQL session ID and name of the user that may be holding a lock, and thus blocking another transaction.
The following output from the INFORMATION_SCHEMA tables is
taken from a somewhat loaded system.
As can be seen in the following tables, there are several transactions running.
The following INNODB_LOCKS and
INNODB_LOCK_WAITS tables shows that:
77F (executing
an INSERT) is waiting for transactions
77E, 77D and
77B to commit.77E (executing
an INSERT) is waiting for transactions
77D and 77B to
commit.77D (executing an
INSERT) is waiting for transaction
77B to commit.77B (executing an
INSERT) is waiting for transaction
77A to commit.77A is running,
currently executing SELECT.E56 (executing an
INSERT) is waiting for transaction
E55 to commit.E55 (executing an
INSERT) is waiting for transaction
19C to commit.19C is running,
currently executing an INSERT.Note that there may be an inconsistency between queries
shown in the two tables
INNODB_TRX.TRX_QUERY and
PROCESSLIST.INFO. The current transaction
ID for a thread, and the query being executed in that
transaction, may be different in these two tables for any
given thread. See Section 6.4.3, “Possible inconsistency with PROCESSLIST” for
an explanation.
Table 6.11. Contents of
INFORMATION_SCHEMA.PROCESSLIST in a loaded
system
| ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO |
|---|---|---|---|---|---|---|---|
384 | root | localhost | test | Query | 10 | update | insert into t2 values … |
257 | root | localhost | test | Query | 3 | update | insert into t2 values … |
130 | root | localhost | test | Query | 0 | update | insert into t2 values … |
61 | root | localhost | test | Query | 1 | update | insert into t2 values … |
8 | root | localhost | test | Query | 1 | update | insert into t2 values … |
4 | root | localhost | test | Query | 0 | preparing | SELECT * FROM processlist |
2 | root | localhost | test | Sleep | 566 | | NULL |
Table 6.12. Contents of
INFORMATION_SCHEMA.INNODB_TRX in a loaded
system
| trx id | trx state | trx started | trx requested lock id | trx wait started | trx weight | trx mysql thread id | trx query |
|---|---|---|---|---|---|---|---|
77F | LOCK WAIT | 2008-01-15 13:10:16 | 77F:806 | 2008-01-15 13:10:16 | 1 | 876 | insert into t09 (D, B, C) values … |
77E | LOCK WAIT | 2008-01-15 13:10:16 | 77E:806 | 2008-01-15 13:10:16 | 1 | 875 | insert into t09 (D, B, C) values … |
77D | LOCK WAIT | 2008-01-15 13:10:16 | 77D:806 | 2008-01-15 13:10:16 | 1 | 874 | insert into t09 (D, B, C) values … |
77B | LOCK WAIT | 2008-01-15 13:10:16 | 77B:733:12:1 | 2008-01-15 13:10:16 | 4 | 873 | insert into t09 (D, B, C) values … |
77A | RUNNING | 2008-01-15 13:10:16 | NULL | NULL | 4 | 872 | select b, c from t09 where … |
E56 | LOCK WAIT | 2008-01-15 13:10:06 | E56:743:6:2 | 2008-01-15 13:10:06 | 5 | 384 | insert into t2 values … |
E55 | LOCK WAIT | 2008-01-15 13:10:06 | E55:743:38:2 | 2008-01-15 13:10:13 | 965 | 257 | insert into t2 values … |
19C | RUNNING | 2008-01-15 13:09:10 | NULL | NULL | 2900 | 130 | insert into t2 values … |
E15 | RUNNING | 2008-01-15 13:08:59 | NULL | NULL | 5395 | 61 | insert into t2 values … |
51D | RUNNING | 2008-01-15 13:08:47 | NULL | NULL | 9807 | 8 | insert into t2 values … |
Table 6.13. Contents of
INFORMATION_SCHEMA.INNODB_LOCK_WAITS in a loaded
system
| requesting trx id | requested lock id | blocking trx id | blocking lock id |
|---|---|---|---|
77F | 77F:806 | 77E | 77E:806 |
77F | 77F:806 | 77D | 77D:806 |
77F | 77F:806 | 77B | 77B:806 |
77E | 77E:806 | 77D | 77D:806 |
77E | 77E:806 | 77B | 77B:806 |
77D | 77D:806 | 77B | 77B:806 |
77B | 77B:733:12:1 | 77A | 77A:733:12:1 |
E56 | E56:743:6:2 | E55 | E55:743:6:2 |
E55 | E55:743:38:2 | 19C | 19C:743:38:2 |
Table 6.14. Contents of
INFORMATION_SCHEMA.INNODB_LOCKS in a loaded
system
| lock id | lock trx id | lock mode | lock type | lock table | lock index | lock space | lock page | lock rec | lock data |
|---|---|---|---|---|---|---|---|---|---|
77F:806 | 77F | AUTO_INC | TABLE | `test`.`t09` | NULL | NULL | NULL | NULL | NULL |
77E:806 | 77E | AUTO_INC | TABLE | `test`.`t09` | NULL | NULL | NULL | NULL | NULL |
77D:806 | 77D | AUTO_INC | TABLE | `test`.`t09` | NULL | NULL | NULL | NULL | NULL |
77B:806 | 77B | AUTO_INC | TABLE | `test`.`t09` | NULL | NULL | NULL | NULL | NULL |
77B:733:12:1 | 77B | X | RECORD | `test`.`t09` | `PRIMARY` | 733 | 12 | 1 | supremum pseudo-record |
77A:733:12:1 | 77A | X | RECORD | `test`.`t09` | `PRIMARY` | 733 | 12 | 1 | supremum pseudo-record |
E56:743:6:2 | E56 | S | RECORD | `test`.`t2` | `PRIMARY` | 743 | 6 | 2 | 0, 0 |
E55:743:6:2 | E55 | X | RECORD | `test`.`t2` | `PRIMARY` | 743 | 6 | 2 | 0, 0 |
E55:743:38:2 | E55 | S | RECORD | `test`.`t2` | `PRIMARY` | 743 | 38 | 2 | 1922, 1922 |
19C:743:38:2 | 19C | X | RECORD | `test`.`t2` | `PRIMARY` | 743 | 38 | 2 | 1922, 1922 |
When a transaction updates a row in a table, or locks it
with SELECT FOR UPDATE, InnoDB establishes
a list or queue of locks on that row. Similarly, InnoDB
maintains a list of locks on a table for table-level locks
transactions hold. If a second transaction wants to update a
row or lock a table already locked by a prior transaction in an
incompatible mode, InnoDB adds a lock request for the row to the
corresponding queue. In order for a lock to be acquired by a
transaction, all incompatible lock requests previously entered
into the lock queue for that row or table must be removed (the
transactions holding or requesting those locks either commit or
rollback).
A transaction may have any number of lock requests for
different rows or tables. At any given time, a transaction may
be requesting a lock that is held by another transaction, in
which case it is blocked by that other transaction. The
requesting transaction must wait for the transaction that holds
the blocking lock to commit or rollback. If a transaction is
not waiting for a a lock, it is in the 'RUNNING' state. If a
transaction is waiting for a lock, it is in the 'BLOCKED' state.
The table INNODB_LOCKS holds one or more row for each
'BLOCKED' transaction, indicating the lock request(s) that is
(are) preventing its progress. This table also contains one row
describing each lock in a queue of locks pending for a given row
or table. The table INNODB_LOCK_WAITS shows which locks
already held by a transaction are blocking locks requested by
other transactions.
The data exposed by the transaction and locking tables represent a glimpse into fast-changing data. This is not like other (user) tables, where the data only changes when application-initiated updates occur. The underlying data is internal system-managed data, and can change very quickly.
For performance reasons, and to minimize the chance of
misleading JOINs between the
INFORMATION_SCHEMA tables, InnoDB collects the required
transaction and locking information into an intermediate buffer
whenever a SELECT on any of the tables is
issued. This buffer is refreshed only if more than 0.1 seconds
has elapsed since the last time the buffer was used. The data
needed to fill the three tables is fetched atomically and
consistently and is saved in this global internal buffer,
forming a point-in-time “snapshot”. If multiple
table accesses occur within 0.1 seconds (as they almost
certainly do when MySQL processes a join among these tables),
then the same snapshot is used to satisfy the query.
A correct result is returned when you
JOIN any of these tables together in a single
query, because the data for the three tables comes from the same
snapshot. Because the buffer is not refreshed with every query
of any of these tables, if you issue separate queries against
these tables within a tenth of a second, the results will be the
same from query to query. On the other hand, two separate
queries of the same or different tables issued more than a tenth
of a second apart may see different results, since the
data will come from different snapshots.
Because InnoDB must temporarily stall while the transaction and locking data is collected, too frequent queries of these tables can negatively impact performance as seen by other users.
As these tables contain sensitive information (at least
INNODB_LOCKS.LOCK_DATA and
INNODB_TRX.TRX_QUERY), for security reasons,
only the users with the PROCESS privilege are
allowed to SELECT from them.
As just described, while the transaction and locking data
is correct and consistent when these INFORMATION_SCHEMA tables
are populated, the underlying data changes so fast that similar
glimpses at other, similarly fast-changing data, will not
necessarily be in sync. This means that you should be careful
in comparing the data in the InnoDB transaction and locking
tables with that in the
MySQL table PROCESSLIST
. The data from the PROCESSLIST table does not come from the
same snapshot as the data about locking and transactions.
Even if you issue a single SELECT
(JOINing INNODB_TRX and PROCESSLIST, for example),
the content of those tables will generally not be consistent.
INNODB_TRX may reference rows that are not present in PROCESSLIST
or the currently executing SQL query of a transaction, shown in
INNODB_TRX.TRX_QUERY may be different from
the one in PROCESSLIST.INFO. The query in
INNODB_TRX will always be consistent with the rest of
INNODB_TRX, INNODB_LOCKS and INNODB_LOCK_WAITS when the
data comes from the same snapshot.
Table of Contents
InnoDB has always been highly efficient, and includes several unique architectural elements to assure high performance and scalability. InnoDB Plugin 1.0.6 includes several new features that take better advantage of recent advances in operating systems and hardware platforms, such as multi-core processors and improved memory allocation systems. In addition, this release permits you to better control the use of some InnoDB internal subsystems to achieve the best performance with your workload.
InnoDB Plugin 1.0.6 includes new capabilities in these areas:
In MySQL and InnoDB, multiple threads of execution access shared data structures. InnoDB synchronizes these accesses with its own implementation of mutexes and read/write locks. InnoDB has historically protected the internal state of a read/write lock with an InnoDB mutex. On Unix and Linux platforms, the internal state of an InnoDB mutex is protected by a Pthreads mutex, as in IEEE Std 1003.1c (POSIX.1c).
On many platforms, there is a more efficient way to implement mutexes and read/write locks. Atomic operations can often be used synchronize the actions of multiple threads more efficiently than Pthreads. Each operation to acquire or release a lock can be done in fewer CPU instructions, and thus result in less wasted time when threads are contending for access to shared data structures. This in turn means greater scalability on multi-core platforms.
Beginning with InnoDB Plugin 1.0.3, InnoDB implements
mutexes and read/write locks with the built-in
functions provided by the GNU Compiler Collection (GCC)
for atomic memory access instead of using the Pthreads
approach previously used. More specifically, an InnoDB Plugin
that is compiled with GCC version 4.1.2 or later will use the
atomic builtins instead of a pthread_mutex_t to
implement InnoDB mutexes and read/write locks.
On 32-bit Microsoft Windows, InnoDB has implemented mutexes (but not read/write locks) with hand-written assembler instructions. Beginning with Microsoft Windows 2000, it is possible to use functions for Interlocked Variable Access that are similar to the built-in functions provided by GCC. Beginning with InnoDB Plugin 1.0.4, InnoDB makes use of the Interlocked functions on Windows. Unlike the old hand-written assembler code, the new implementation supports read/write locks and 64-bit platforms.
Solaris 10 introduced library functions for atomic operations. Beginning with InnoDB Plugin 1.0.4, when InnoDB is compiled on Solaris 10 with a compiler that does not support the built-in functions provided by the GNU Compiler Collection (GCC) for atomic memory access, the library functions will be used.
This change improves the scalability of InnoDB on multi-core systems. Note that the user does not have to set any particular parameter or option to take advantage of this new feature. This feature is enabled out-of-the-box on the platforms where it is supported. On platforms where the GCC, Windows, or Solaris functions for atomic memory access are not available, InnoDB will use the traditional Pthreads method of implementing mutexes and read/write locks.
When MySQL starts, InnoDB will write a message to the
log file indicating whether atomic memory access will be used for
mutexes, for mutexes and read/write locks, or neither. If
suitable tools are used to build the InnoDB Plugin and the
target CPU supports the atomic operations required, InnoDB will
use the built-in functions for mutexing. If, in addition, the
compare-and-swap operation can be used on thread identifiers
(pthread_t), then InnoDB will use the
instructions for read-write locks as well.
Note: If you are building from source, see Section 9.4.1, “Building the InnoDB Plugin on Linux or Unix” to ensure that your build process properly takes advantage of your platform capabilities.
When InnoDB was developed, the memory allocators supplied
with operating systems and run-time libraries were often lacking
in performance and scalability. At that time, there were no
memory allocator libraries tuned for multi-core CPUs. Therefore,
InnoDB implemented its own memory allocator in the
mem subsystem. This allocator is guarded by a
single mutex, which may become a bottleneck. InnoDB also
implements a wrapper interface around the system allocator
(malloc and free) that is
likewise guarded by a single mutex.
Today, as multi-core systems have become more widely
available, and as operating systems have matured, significant
improvements have been made in the memory allocators provided with
operating systems. New memory allocators perform better and are
more scalable than they were in the past. The leading
high-performance memory allocators include
Hoard, libumem,
mtmalloc, ptmalloc,
tbbmalloc, and TCMalloc.
Most workloads, especially those where memory is frequently
allocated and released (such as multi-table joins) will benefit
from using a more highly tuned memory allocator as opposed to the
internal, InnoDB-specific memory allocator.
Beginning with InnoDB Plugin 1.0.3, you control whether
InnoDB uses its own memory allocator or an allocator of the
operating system, by setting the value of the new system
configuration parameter innodb_use_sys_malloc in the MySQL
option file (my.cnf or
my.ini). If set to ON or
1 (the default), InnoDB will use the
malloc and free functions of
the underlying system rather than manage memory pools itself.
This parameter is not dynamic, and takes effect only when the
system is started. To continue to use the InnoDB memory
allocator in InnoDB Plugin, you will have to set
innodb_use_sys_malloc to 0.
Note that when the InnoDB memory allocator is disabled,
InnoDB will ignore the value of the parameter
innodb_additional_mem_pool_size. The InnoDB memory allocator
uses an additional memory pool for satisfying allocation requests
without having to fall back to the system memory allocator. When
the InnoDB memory allocator is disabled, all such allocation
requests will be fulfilled by the system memory allocator.
Furthermore, since InnoDB cannot track all memory use when
the system memory allocator is used (innodb_use_sys_malloc is
ON), the section “BUFFER POOL AND
MEMORY” in the output of the SHOW ENGINE INNODB STATUS command
will only include the buffer pool statistics in the “Total
memory allocated”. Any memory allocated via the
mem subsystem or via
ut_malloc will be excluded.
On Unix-like systems that use dynamic linking, replacing the
memory allocator may be as easy as making the environment variable
LD_PRELOAD or
LD_LIBRARY_PATH point to the dynamic library
that implements the allocator. On other systems, some relinking
may be necessary. Please refer to the documentation of the memory
allocator library of your choice.
When INSERTs are done to a table, often
the values of indexed columns (particularly the values of
secondary keys) are not in sorted order. This means that the
inserts of such values into secondary B-tree indexes is
“random”, and this can cause excessive i/o if the
entire index does not fit in memory. InnoDB has an insert
buffer that caches changes to secondary index entries when the
relevant page is not in the buffer pool, thus avoiding I/O
operations by not reading in the page from the disk. The buffered
changes are written into a special insert buffer tree and are
subsequently merged when the page is loaded to the buffer pool.
The InnoDB main thread merges buffered changes when the server
is nearly idle.
Usually, this process will result in fewer disk reads and writes, especially during bulk inserts. However, the insert buffer tree will occupy a part of the buffer pool. If the working set almost fits in the buffer pool, it may be useful to disable insert buffering. If the working set entirely fits in the buffer pool, insert buffering will not be used anyway, because the index would exist in memory.
Beginning with InnoDB Plugin 1.0.3, you can control
whether InnoDB performs insert buffering with the system
configuration parameter innodb_change_buffering. The allowed
values of innodb_change_buffering are none
(do not buffer any operations) and inserts
(buffer insert operations, the default). You can set the value of
this parameter in the MySQL option file
(my.cnf or my.ini) or change
it dynamically with the SET GLOBAL command,
which requires the SUPER privilege. Changing
the setting affects the buffering of new operations; the merging
of already buffered entries is not affected.
If a table fits almost entirely in main memory, the fastest way to perform queries on it is to use hash indexes rather than B-tree lookups. InnoDB monitors searches on each index defined for a table. If it notices that certain index values are being accessed frequently, it automatically builds an in-memory hash table for that index. Based on the pattern of searches that InnoDB observes, it will build a hash index using a prefix of the index key. The prefix of the key can be any length, and it may be that only a subset of the values in the B-tree will appear in the hash index. InnoDB builds hash indexes on demand for those pages of the index that are often accessed.
The adaptive hash index mechanism allows InnoDB to take advantage of large amounts of memory, something typically done only by database systems specifically designed for databases that reside entirely in memory. Normally, the automatic building and use of adaptive hash indexes will improve performance. However, sometimes, the read/write lock that guards access to the adaptive hash index may become a source of contention under heavy workloads, such as multiple concurrent joins.
You can monitor the use of the adaptive hash index and the
contention for its use in the “SEMAPHORES” section of
the output of the SHOW ENGINE INNODB STATUS command. If you see many
threads waiting on an RW-latch created in
btr0sea.c, then it might be useful to disable
adaptive hash indexing.
The configuration parameter innodb_adaptive_hash_index can
be set to disable or enable the adaptive hash index. See
Section 8.3.4, “Dynamically Changing innodb_adaptive_hash_index”
for details.
InnoDB uses operating system threads to process requests from user transactions. (Transactions may issue many requests to InnoDB before they commit or roll back.) On today’s modern operating systems and servers with multi-core processors, where context switching is efficient, most workloads will run well without any limit on the number of concurrent threads. Thanks to several scalability improvements in InnoDB Plugin 1.0.3, and further changes in release 1.0.4, there should be less need to artificially limit the number of concurrently executing threads inside InnoDB.
However, for some situations, it may be helpful to minimize context switching between threads. InnoDB can use a number of techniques to limit the number of concurrently executing operating system threads (and thus the number of requests that are processed at any one time). When InnoDB receives a new request from a user session, if the number of threads concurrently executing is at a pre-defined limit, the new request will sleep for a short time before it tries again. A request that cannot be rescheduled after the sleep is put in a first-in/first-out queue and eventually will be processed. Threads waiting for locks are not counted in the number of concurrently executing threads.
The limit on the number of concurrent threads is given by
the settable global variable innodb_thread_concurrency. Once
the number of executing threads reaches this limit, additional
threads will sleep for a number of microseconds, set by the system
configuration parameter innodb_thread_sleep_delay, before being
placed into the queue.
The default value for innodb_thread_concurrency and the
implied default limit on the number of concurrent threads has been
changed in various releases of MySQL and the InnoDB Plugin.
Starting with InnoDB Plugin 1.0.3, the default value of
innodb_thread_concurrency is 0, so that by
default there is no limit on the number of concurrently executing
threads, as shown in Table 7.1, “Changes to innodb_thread_concurrency”.
Table 7.1. Changes to innodb_thread_concurrency
| InnoDB Version | MySQL Version | Default value | Default limit of concurrent threads | Value to allow unlimited threads |
|---|---|---|---|---|
| Built-in | Earlier than 5.1.11 | 20 | No limit | 20 or higher |
| Built-in | 5.1.11 and newer | 8 | 8 | 0 |
| InnoDB Plugin before 1.0.3 | (corresponding to Plugin) | 8 | 8 | 0 |
| InnoDB Plugin 1.0.3 and newer | (corresponding to Plugin) | 0 | No limit | 0 |
Note that InnoDB will cause threads to sleep only when the
number of concurrent threads is limited. When there is no limit
on the number of threads, all will contend equally to be
scheduled. That is, if innodb_thread_concurrency is
0, the value of innodb_thread_sleep_delay is
ignored.
When there is a limit on the number of threads, InnoDB
reduces context switching overhead by permitting multiple requests
made during the execution of a single SQL statement to enter InnoDB
without observing the limit set by innodb_thread_concurrency.
Since a SQL statement (such as a join) may comprise multiple row
operations within InnoDB, InnoDB assigns
“tickets” that allow a thread to be scheduled
repeatedly with minimal overhead.
When starting to execute a new SQL statement, a thread will
have no tickets, and it must observe innodb_thread_concurrency.
Once the thread is entitled to enter InnoDB, it will be assigned
a number of tickets that it can use for subsequently entering
InnoDB. If the tickets run out, innodb_thread_concurrency
will be observed again and further tickets will be assigned. The
number of tickets to assign is specified by the global option
innodb_concurrency_tickets, which is 500 by default. A thread
that is waiting for a lock will be given one ticket once the lock
becomes available.
The correct values of these variables are dependent on your
environment and workload. You will need to try a range of
different values to determine what value works for your
applications. Before limiting the number of concurrently
executing threads, you should review configuration options that
may improve the performance of InnoDB on multi-core and
multi-processor computers, such as innodb_use_sys_malloc and innodb_adaptive_hash_index.
A read ahead request is an I/O request to prefetch multiple pages in the buffer cache asynchronously in anticipation that these pages will be needed in the near future. InnoDB has historically used two read ahead algorithms to improve I/O performance.
Random read ahead is done if a certain number of pages from the same extent (64 consecutive pages) are found in the buffer cache. In such cases, InnoDB asynchronously issues a request to prefetch the remaining pages of the extent. Random read ahead added unnecessary complexity to the InnoDB code and often resulted in performance degradation rather than improvement. Starting with InnoDB Plugin 1.0.4, this feature has been removed from InnoDB, and users should generally see equivalent or improved performance.
Linear read ahead is based
on the access pattern of the pages in the buffer cache, not just their
number. In releases before 1.0.4, if most pages belonging to some
extent are accessed sequentially, InnoDB will issue an asynchronous
prefetch request for the entire next extent when it reads in the last
page of the current extent. Beginning with InnoDB Plugin 1.0.4,
users can control when InnoDB performs a read ahead, by adjusting
the number of sequential page accesses required to trigger an
asynchronous read request using the new configuration parameter
innodb_read_ahead_threshold.
If the number of pages read from an extent of 64 pages is greater or
equal to innodb_read_ahead_threshold, InnoDB will initiate an
asynchronous read ahead of the entire following extent. Thus, this
parameter controls how sensitive InnoDB is to the pattern of page
accesses within an extent in deciding whether to read the following
extent asynchronously. The higher the value, the more strict will
be the access pattern check. For example, if you set the value to 48,
InnoDB will trigger a linear read ahead request only when 48 pages
in the current extent have been accessed sequentially. If the value
is 8, InnoDB would trigger an asynchronous read ahead even if as
few as 8 pages in the extent were accessed sequentially.
The new configuration parameter innodb_read_ahead_threshold may be
set to any value from 0-64. The default value is 56, meaning that an
asynchronous read ahead is performed only when 56 of the 64 pages in
the extent are accessed sequentially. You can set the value of this
parameter in the MySQL option file (my.cnf or my.ini), or change it
dynamically with the SET GLOBAL command, which
requires the SUPER privilege.
Starting with InnoDB Plugin 1.0.5 more statistics are
provided through SHOW ENGINE INNODB STATUS command to
measure the effectiveness of the read ahead algorithm. See
Section 8.9, “More Read Ahead Statistics” for more
information.
InnoDB uses background threads to service various types
of I/O requests. Starting from InnoDB Plugin 1.0.4, the number of
background threads tasked with servicing read and write I/O
on data pages is configurable. In previous versions
of InnoDB, there was only one thread each for read and
write on non-Windows platforms. On Windows, the number of background
threads was controlled by innodb_file_io_threads. The
configuration parameter innodb_file_io_threads has been
removed in InnoDB Plugin 1.0.4. If you try to set a value
for this parameter, a warning will be written to the log file and
the value will be ignored.
In place of innodb_file_io_threads, two new configuration
parameters are introduced in the InnoDB Plugin 1.0.4, which are
effective on all supported platforms. The two parameters
innodb_read_io_threads and innodb_write_io_threads signify
the number of background threads used for read and write
requests respectively. You can set the value of these parameters in
the MySQL option file (my.cnf or
my.ini). These parameters cannot be changed
dynamically. The default value for these parameters is
4 and the permissible values range from
1-64.
The purpose of this change is to make InnoDB more scalable
on high end systems. Each background thread can handle up to 256
pending I/O requests. A major source of background I/O is the read
ahead requests. InnoDB tries to balance the load of incoming
requests in such way that most of the background threads share work
equally. InnoDB also attempts to allocate read requests from the
same extent to the same thread to increase the chances of coalescing
the requests together. If you have a high end I/O subsystem and you
see more than 64 times innodb_read_io_threads
pending read requests in SHOW ENGINE INNODB STATUS,
then you may gain by increasing the value of
innodb_read_io_threads.
InnoDB, like any other ACID compliant database engine, is required to flush the redo log of a transaction before it is committed. Historically InnoDB used group commit functionality to group multiple such flush requests together to avoid one flush for each commit. With group commit, InnoDB can issue a single write to the log file to effectuate the commit action for multiple user transactions that commit at about the same time, significantly improving throughput.
Group commit in InnoDB worked until MySQL 4.x. With the introduction of support for the distributed transactions and Two Phase Commit (2PC) in MySQL 5.0, group commit functionality inside InnoDB was broken.
Beginning with InnoDB Plugin 1.0.4, the group commit functionality inside InnoDB works with the Two Phase Commit protocol in MySQL. Re-enabling of the group commit functionality fully ensures that the ordering of commit in the MySQL binlog and the InnoDB logfile is the same as it was before. It means it is totally safe to use InnoDB Hot Backup with InnoDB Plugin 1.0.4.
Group commit is transparent to the user and nothing needs to be done by the user to take advantage of this significant performance improvement.
The master thread in InnoDB is a thread that performs various tasks in the background. Most of these tasks are I/O related like flushing of the dirty pages from the buffer cache or writing the buffered inserts to the appropriate secondary indexes. The master thread attempts to perform these tasks in a way that does not adversely affect the normal working of the server. It tries to estimate the free I/O bandwidth available and tune its activities to take advantage of this free capacity. Historically, InnoDB has used a hard coded value of 100 IOPs (input/output operations per second) as the total I/O capacity of the server.
Beginning with InnoDB Plugin 1.0.4, a new configuration
parameter is introduced to indicate the overall I/O capacity
available to InnoDB. The new parameter innodb_io_capacity
should be set to approximately the number of I/O operations
that the system can perform per second. The value will of course
depend on your system configuration. When innodb_io_capacity
is set, the master threads estimates the I/O bandwidth available
for background tasks based on the set value. Setting the value
to 100 reverts to the old behavior.
You can set the value of innodb_io_capacity to any number
100 or greater, and the default value is 200.
You can set the value of this parameter in the MySQL option file
(my.cnf or my.ini) or change
it dynamically with the SET GLOBAL command,
which requires the SUPER privilege.
InnoDB performs certain tasks in the background, including
flushing of dirty pages (those pages that have been changed but are
not yet written to the database files) from the buffer cache, a task
performed by the “master thread”. Currently, the master
thread aggressively flushes buffer pool pages if the percentage of
dirty pages in the buffer pool exceeds innodb_max_dirty_pages_pct.
This behavior can cause temporary reductions in throughput when excessive buffer pool flushing takes place, limiting the I/O capacity available for ordinary read and write activity. Beginning with release 1.0.4, InnoDB Plugin uses a new algorithm to estimate the required rate of flushing based on the speed of redo log generation and the current rate of flushing. The intent of this change is to smooth overall performance, eliminating steep dips in throughput, by ensuring that buffer flush activity keeps up with the need to keep the buffer pool “clean”.
Remember that InnoDB uses its log files in a circular
fashion. To make a log file (or a portion of it) reusable, InnoDB
must flush to disk all dirty buffer pool pages whose redo
entries are contained in that portion of the log file. When required,
InnoDB performs a so-called “sharp checkpoint” by
flushing the appropriate dirty pages to make space available in
the log file. If a workload is write intensive, it will generate
a lot of redo information (writes to the log file). In this case,
it is possible that available space in the log files will be used up,
even though innodb_max_dirty_pages_pct is not reached. This will
cause a sharp checkpoint, causing a temporary reduction in throughput.
Beginning with release 1.0.4, InnoDB Plugin uses a new heuristic-based algorithm to avoid such a scenario. The heuristic is a function of the number of dirty pages in the buffer cache and the rate at which redo is being generated. Based on this heuristic, the master thread will decide how many dirty pages to flush from the buffer cache each second. This self adapting heuristic is able to deal with sudden changes in the workload.
The primary aim of this feature is to smooth out I/O activity, avoiding sudden dips in throughput when flushing activity becomes high. Internal benchmarking has also shown that this algorithm not only maintains throughput over time, but can also improve overall throughput significantly.
Because adaptive flushing is a new feature that can
significantly affect the I/O pattern of a workload, the InnoDB Plugin
introduces a new configuration parameter that can be used to
disable this feature. The default value of the new boolean
parameter innodb_adaptive_flushing is TRUE,
enabling the new algorithm. You can set the value of this parameter
in the MySQL option file (my.cnf or
my.ini) or change it dynamically with the
SET GLOBAL command, which requires the
SUPER privilege.
Synchronization inside InnoDB frequently involves the use of spin loops (where, while waiting, InnoDB executes a tight loop of instructions repeatedly to avoid having the InnoDB process and threads be rescheduled by the operating system). If the spin loops are executed too quickly, system resources are wasted, imposing a relatively severe penalty on transaction throughput. Most modern processors implement the PAUSE instruction for use in spin loops, so the processor can be more efficient.
Beginning with 1.0.4, the InnoDB Plugin uses a PAUSE instruction in its spin loops on all platforms where such an instruction is available. This technique increases overall performance with CPU-bound workloads, and has the added benefit of minimizing power consumption during the execution of the spin loops.
Using the PAUSE instruction in InnoDB spin loops is transparent to the user. User does not have to do anything to take advantage of this performance improvement.
Many InnoDB mutexes and rw-locks are reserved for a short amount of time. On a multi-core system, it is often more efficient for a thread to actively poll a mutex or rw-lock for a while before sleeping. If the mutex or rw-lock becomes available during this polling period, the thread may continue immediately, in the same time slice. Alas, if a shared object is being polled too frequently by multiple threads, it may result in “cache ping-pong”, the shipping of cache lines between processors. InnoDB tries to avoid this by making threads busy, waiting a random time between subsequent polls. The delay is implemented as a busy loop.
Starting with InnoDB Plugin 1.0.4, it is possible to
control the maximum delay between sampling a mutex or rw-lock
using the new parameter innodb_spin_wait_delay. In the
100 MHz Pentium era, the unit of delay used to be one
microsecond. The duration of the delay loop depends on the C
compiler and the target processor. On a system where
all processor cores share a fast cache memory, it might be useful
to reduce the maximum delay or disable the busy loop altogether by
setting innodb_spin_wait_delay=0. On a system
that consists of multiple processor chips, the shipping of cache
lines can be slower and it may be useful to increase the maximum
delay.
The default value of innodb_spin_wait_delay is
6. The spin wait delay is a dynamic,
global parameter that can be specified in the MySQL option file
(my.cnf or my.ini) or
changed at runtime with the command SET GLOBAL
innodb_spin_wait_delay=,
where delay is the
desired maximum delay. Changing the setting requires the
delaySUPER privilege.
Historically, InnoDB has inserted newly read blocks into the
middle of the list representing the buffer cache,
to avoid pollution of the cache due to excessive
read-ahead. The idea is that the read-ahead algorithm should not
pollute the buffer cache by forcing the
frequently accessed (“hot”) pages out of the LRU list.
To achieve this,
InnoDB internally maintains a pointer at 3/8
from the tail of the LRU list, and all newly read pages are
inserted at this location in the LRU list. The pages are moved to
the from of the list (the most-recently used end) when they are
accessed from the buffer cache for the first time. Thus pages
that are never accessed never make it to the front
5/8 of the LRU list.
The above arrangement logically divides the LRU list into two segments where the 3/8 pages downstream of the insertion point are considered “old” and are desirable victims for LRU eviction. Starting with InnoDB Plugin 1.0.5, this mechanism has been extended in two ways.
You can control the insertion
point in the LRU list. A new configuration parameter
innodb_old_blocks_pct now controls the percentage of
“old” blocks in the LRU list. The default value of
innodb_old_blocks_pct is 37, corresponding
to the original fixed ratio of 3/8. The permissible value range
is 5 to 95.
The optimization that keeps the buffer cache from being
churned too much by read-ahead, is extended to avoid similar problems
resulting from table or index scans. During an index scan,
a data page is typically accessed a few times in quick succession
and is then never touched again. InnoDB Plugin 1.0.5 introduces
a new configuration parameter innodb_old_blocks_time
which specifies the time window (in milliseconds) after the first
access to a page during which it can be accessed without being
moved to the front (most-recently used end) of the LRU list.
The default value of innodb_old_blocks_time is 0,
corresponding to the original behavior of moving a page to the MRU
end of the LRU list on first access in the buffer pool.
Both the new parameters innodb_old_blocks_pct and
innodb_old_blocks_time are dynamic, global and can be
specified in the MySQL option file (my.cnf
or my.ini) or changed at runtime with the
SET GLOBAL command. Changing the setting
requires the SUPER privilege.
To help you gauge the effect of setting these
parameters, some additional statistics are reported by
SHOW ENGINE INNODB STATUS command. The
BUFFER POOL AND MEMORY section now looks like:
Total memory allocated 1107296256; in additional pool allocated 0 Dictionary memory allocated 80360 Buffer pool size 65535 Free buffers 0 Database pages 63920 Old database pages 23600 Modified db pages 34969 Pending reads 32 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 414946, not young 2930673 1274.75 youngs/s, 16521.90 non-youngs/s Pages read 486005, created 3178, written 160585 2132.37 reads/s, 3.40 creates/s, 323.74 writes/s Buffer pool hit rate 950 / 1000, young-making rate 30 / 1000 not 392 / 1000 Pages read ahead 1510.10/s, evicted without access 0.00/s LRU len: 63920, unzip_LRU len: 0 I/O sum[43690]:cur[221], unzip sum[0]:cur[0]
Old database pages is the number of
pages in the “old” segment of the LRU
list.
Pages made young and
not young is the total number of
“old” pages that have been made young or not
respectively.
youngs/s and
non-young/s is the rate at which page accesses
to the “old” pages have resulted in making such
pages young or otherwise respectively since the last invocation
of the command.
young-making rate and
not provides the same rate but in terms of
overall buffer cache accesses instead of accesses just to the
“old” pages.
The default values of both parameters leave the original behavior as of InnoDB Plugin 1.0.4 intact. To take advantage of this feature, you must set different values.
Because the effects of these parameters can vary widely based on your hardware configuration, your data, and the details of your workload, always benchmark to verify the effectiveness before changing these settings in any performance-critical or production environment.
In mixed workloads where most of the
activity is OLTP type with periodic batch reporting queries
which result in large scans, setting the value of
innodb_old_blocks_time during the batch runs can help keep the
working set of the normal workload in the buffer cache.
When scanning large tables that cannot fit entirely in the buffer pool,
setting innodb_old_blocks_pct to a small value keeps the
data that is only read once from consuming a significant portion of the
buffer pool. For example, setting innodb_old_blocks_pct=5 restricts
this data that is only read once to 5% of the buffer pool.
When scanning small tables that do fit into memory, there is less
overhead for moving pages around within the buffer pool, so you
can leave innodb_old_blocks_pct at its default value, or even
higher, such as innodb_old_blocks_pct=50.
The effect of the innodb_old_blocks_time parameter is harder to predict
than the innodb_old_blocks_pct parameter, is relatively small,
and varies more with the workload.
To arrive at an optimal value, conduct your own benchmarks if the
performance improvement from adjusting innodb_old_blocks_pct is
not sufficient.
Table of Contents
TRUNCATE TABLE Reclaims SpaceSHOW ENGINE INNODB MUTEXThis chapter describes several changes in the InnoDB Plugin that offer new flexibility and improve ease of use, reliability and performance:
TRUNCATE TABLE Reclaims Space”SHOW ENGINE INNODB MUTEX”The InnoDB Plugin introduces named file formats to improve compatibility between database file formats and various InnoDB versions.
To create new tables that require a new file format, you
must enable the new “Barracuda” file format, using the
configuration parameter innodb_file_format.
The value of this parameter will determine whether it will be
possible to create a table or index using compression or the new
DYNAMIC row format. If you omit
innodb_file_format or set it to “Antelope”,
you preclude the use of new features that would make your database
inaccessible to the built-in InnoDB in MySQL 5.1 and prior releases.
You can set the value of innodb_file_format on the command line
when you start mysqld, or in the option file
my.cnf (Unix operating systems) or
my.ini (Windows). You can also change it
dynamically with the SET GLOBAL command.
Further information about managing file formats is presented in Chapter 4, InnoDB File Format Management.
With the InnoDB Plugin it now is possible to chance certain system configuration parameters dynamically, without shutting down and restarting the server as was previously necessary. This increases up-time and facilitates testing of various options. You can now set these parameters dynamically:
Since MySQL version 4.1, InnoDB has provided two options
for how tables are stored on disk. You can choose to create a new
table and its indexes in the shared system tablespace (corresponding to
the set of files named ibdata files), along
with other internal InnoDB system information. Or, you can chose
to use a separate file (an .ibd file) to store
a new table and its indexes.
The tablespace style used for new tables is determined by
the setting of the configuration parameter
innodb_file_per_table at the time a table is
created. Previously, the only way to set this parameter was in
the MySQL option file (my.cnf or
my.ini), and changing it required shutting
down and restarting the server. Beginning with the
InnoDB Plugin, the configuration parameter
innodb_file_per_table is dynamic, and can be set
ON or OFF using the
SET GLOBAL command. The default setting is
OFF, so new tables and indexes are created in
the system tablespace. Dynamically changing the value of this
parameter requires the SUPER privilege and
immediately affects the operation of all connections.
Tables created when innodb_file_per_table is
disabled cannot use the new compression capability, or use the new
row format DYNAMIC. Tables created when
innodb_file_per_table is enabled can use
those new features, and each table and its indexes will be
stored in a new .ibd file.
The ability to change the setting of
innodb_file_per_table dynamically is useful for
testing. As noted above, the parameter
innodb_file_format is also dynamic, and must be
set to “Barracuda” to create new compressed tables, or tables
that use the new row format DYNAMIC. Since both
parameters are dynamic, it is easy to experiment with these table
formats and the downgrade procedure described in
Chapter 11, Downgrading from the InnoDB Plugin without a system shutdown and
restart.
Note that the InnoDB Plugin can add and drop a table’s secondary
indexes without re-creating the table, but must recreate the table
when you change the clustered (primary key) index (see
Chapter 2, Fast Index Creation in the InnoDB Storage Engine).
When a table is recreated as a result of creating or dropping an
index, the table and its indexes will be stored in the shared system
tablespace or in its own .ibd file just as if it were created using a
CREATE TABLE command (and depending on the setting
of innodb_file_per_table). When an index is created
without rebuilding the table, the index is stored in the same file as
the clustered index, regardless of the setting of
innodb_file_per_table.
As noted in Section 8.6, “Controlling Optimizer Statistics Estimation”, the
InnoDB Plugin allows you to control the way in which InnoDB
gathers information about the number of distinct values in an
index key. A related parameter, innodb_stats_on_metadata, has
existed since MySQL release 5.1.17 to control whether or not
InnoDB performs statistics gathering when metadata statements
are executed. See the MySQL manual on InnoDB Startup Options and System Variables for
details.
Beginning with release 1.0.2 of the InnoDB Plugin, it is
possible to change the setting of innodb_stats_on_metadata
dynamically at runtime with the command SET GLOBAL
innodb_stats_on_metadata=,
where mode is
either modeON or OFF (or
1 or 0). Changing this
setting requires the SUPER privilege and
immediately affects the operation of all connections.
When a transaction is waiting for a resource, it will wait for the resource to become free, or stop waiting and return with the error
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
The length of time a transaction will wait for a resource before
“giving up” is determined by the value of the
configuration parameter innodb_lock_wait_timeout. The default
setting for this parameter is 50 seconds. The minimum setting is
1 second, and values above 100,000,000 disable the timeout, so a
transaction will wait “forever”. Following a
timeout, the SQL statement that was executing will be rolled
back. (In MySQL 5.0.12 and earlier, the transaction rolled
back.) The user application may try the statement again
(usually after waiting for a while), or rollback the entire
transaction and restart.
Before InnoDB Plugin 1.0.2, the only way to set this
parameter was in the MySQL option file
(my.cnf or my.ini), and
changing it required shutting down and restarting the server.
Beginning with the InnoDB Plugin 1.0.2, the configuration
parameter innodb_lock_wait_timeout can be set at runtime with
the SET GLOBAL or SET
SESSION commands. Changing the
GLOBAL setting requires the
SUPER privilege and affects the operation of
all clients that subsequently connect. Any client can change
the SESSION setting for
innodb_lock_wait_timeout, which affects only that
client.
As described in Section 7.5, “Controlling Adaptive Hash Indexing”, it may be desirable, depending on your workload, to dynamically enable or disable the adaptive hash indexing scheme InnoDB uses to improve query performance.
Version 5.1.24 of MySQL introduced the start-up option
innodb_adaptive_hash_index that allows the adaptive hash index
to be disabled. It is enabled by default. Starting with
InnoDB Plugin 1.0.3, the parameter can be modified by the
SET GLOBAL command, without restarting the
server. Changing the setting requires the
SUPER privilege.
Disabling the adaptive hash index will empty the hash table immediately. Normal operations can continue while the hash table is emptied, and executing queries that have been using the hash table will access the index B-trees directly instead of attempting to utilize the hash index. When the adaptive hash index is enabled, the hash table will be populated during normal operation.
Starting with the InnoDB Plugin, when the user requests to
TRUNCATE a table that is stored in an
.ibd file of its own (because
innodb_file_per_table was enabled when
the table was created), and if the table is not referenced in a
FOREIGN KEY constraint, the InnoDB Plugin will
drop and re-create the table in a new .idb file. This
operation is much faster than deleting the rows
one by one, and will return disk space to the operating system and
reduce the size of page-level backups.
Previous versions of InnoDB
would re-use the existing .idb file,
thus releasing the space only
to InnoDB for storage management, but not to the operating system.
Note that when the table is truncated, the count of rows affected by
the TRUNCATE command is an arbitrary number.
Note: if there are referential constraints between the table
being truncated and other tables, MySQL instead automatically
converts the TRUNCATE command to a
DELETE command that operates row-by-row, so
that ON DELETE operations can occur on
“child” tables.
To guard against ignored typos and syntax errors in SQL, or other
unintended consequences of various combinations of operational
modes and SQL commands, the InnoDB Plugin provides a
“strict mode” of operations. In this mode, InnoDB
will raise error
conditions in certain cases, rather than issue a warning and
process the specified command (perhaps with some unintended
defaults). This is analogous to MySQL’s sql_mode,
which controls
what SQL syntax MySQL will accept, and determines whether it will
silently ignore errors, or validate input syntax and data values.
Note that there is no strict mode with the built-in InnoDB, so
some commands that execute without errors with the built-in
InnoDB will generate errors with the InnoDB Plugin, unless you
disable strict mode.
In the InnoDB Plugin, the setting of InnoDB strict
mode affects the handling of syntax errors on the CREATE TABLE,
ALTER TABLE and CREATE INDEX commands. Starting with
InnoDB Plugin version 1.0.2, the strict mode also enables a
record size check, so that an INSERT or
UPDATE will never fail due to the record
being too large for the selected page size.
Using the new clauses and settings for ROW_FORMAT and
KEY_BLOCK_SIZE on CREATE TABLE and
ALTER TABLE commands and the
CREATE INDEX can be confusing when not running
in strict mode. Unless
you run in strict mode, InnoDB will ignore certain syntax errors
and will create the table or index, with only a warning in the
message log. However if InnoDB strict mode is on, such errors
will generate an immediate error and the table or index will not
be created, thus saving time by catching the error at the time the
command is issued.
The default for strict mode is off, but in the future, the default may be changed. It is best to start using strict mode with the InnoDB Plugin, and make sure your SQL scripts use commands that do not generate warnings or unintended effects.
InnoDB strict mode is set with the configuration parameter
innodb_strict_mode, which can be specified as
on or off.
You can set the value on the command line when you start mysqld,
or in the configuration file my.cnf (Unix
operating systems) or
my.ini (Windows). You can also enable or disable
InnoDB strict
mode at runtime with the command SET [GLOBAL|SESSION]
innodb_strict_mode=,
where mode is either
modeON or OFF.
Changing the GLOBAL setting requires the
SUPER privilege and affects the operation of
all clients that subsequently connect. Any client can change
the SESSION setting for innodb_strict_mode,
which affects only that client.
The MySQL query optimizer uses estimated statistics about key distributions to select or avoid using an index in an execution plan, based on the relative selectivity of the index. Previously, InnoDB sampled 8 random pages from an index to get an estimate of the cardinality of (i.e., the number of distinct values in) the index. (This page sampling technique is frequently described as “index dives”.) This small number of page samples frequently was insufficient, and could give inaccurate estimates of an index’s selectivity and thus lead to poor choices by the query optimizer.
To give users control over the quality of the statistics
estimate (and thus better information for the query optimizer),
the number of sampled pages now can be changed via the parameter
innodb_stats_sample_pages.
This feature addresses user requests such as that as expressed in MySQL Bug #25640: InnoDB Analyze Table Should Allow User Selection of Index Dives.
You can change the number of sampled pages via the global
parameter innodb_stats_sample_pages, which can be set at
runtime (i.e., it is a dynamic parameter). The default value for
this parameter is 8, preserving the same behavior as in past
releases.
Note that the value of innodb_stats_sample_pages affects the
index sampling for all tables and indexes.
You should also be aware that there are the following potentially
significant impacts when you change the index sample size:
SHOW TABLE STATUS.
Note that the cardinality estimation can be disabled for
metadata commands such as SHOW TABLE STATUS
by executing the command SET GLOBAL
innodb_stats_on_metadata=OFF (or 0).
Before InnoDB Plugin
1.0.2, this variable could only be set in the MySQL option
file (my.cnf or my.ini),
and changing it required shutting down and restarting the
server.
The cardinality (the number of different key values) in every
index of a table is calculated when a table is opened,
at SHOW TABLE STATUS and ANALYZE TABLE and on other
circumstances (like when the table has changed too much).
Note that all tables are opened, and the statistics are
re-estimated, when the mysql client starts if the
auto-rehash setting is set on (the default).
The auto-rehash feature enables automatic name completion
of database, table, and column names for interactive
users. You may prefer setting auto-rehash off to improve
the start up time of the mysql client.
You should note that it does not make sense to increase
the index sample size, then run ANALYZE TABLE and decrease
sample size to attempt to obtain better statistics.
This is because the statistics are not persistent. They are
automatically recalculated at various times other than on
execution of ANALYZE TABLE. Sooner or later the
“better” statistics calculated by
ANALYZE running with a high value of
innodb_stats_sample_pages will be wiped away.
The estimated cardinality for an index will be more accurate
with a larger number of samples, but each sample might require
a disk read, so you do not want to make the sample size too
large. You should choose a value for innodb_stats_sample_pages
that results in reasonably accurate estimates for all tables in
your database without requiring excessive I/O.
Although it is not possible to specify the sample size on a
per-table basis, smaller tables generally would require fewer
index samples than larger tables require. If your database
has many large tables, you may want to consider using a higher
value for innodb_stats_sample_pages than if you have mostly
smaller tables.
For efficiency, InnoDB requires an index to exist on
foreign key columns so that UPDATE and
DELETE operations on a “parent”
table can easily check for the existence or non-existence of
corresponding rows in the “child” table. To ensure
that there is an appropriate index for such checks, MySQL will
sometimes implicitly create or drop such indexes as a side-effect
of CREATE TABLE, CREATE
INDEX, and ALTER TABLE
statements.
When you explicitly DROP an index,
InnoDB will check that an index suitable for referential
integrity checking will still exist following the
DROP of the index. InnoDB will prevent you
from dropping the last usable index for enforcing any given
referential constraint. Users have been confused by this
behavior, as reported in MySQL Bug
#21395.
In releases prior to InnoDB Plugin 1.0.2, attempts to drop the only usable index would result in an error message such as
ERROR 1025 (HY000): Error on rename of './db2/#sql-18eb_3' to './db2/foo'(errno: 150)
Beginning with InnoDB Plugin 1.0.2, this error condition is reported with a more friendly message:
ERROR 1553 (HY000): Cannot drop index 'fooIdx':
needed in a foreign key constraintAs a related matter, because all user data in InnoDB is
maintained in the so-called “clustered index” (or
primary key index), InnoDB ensures that there is such an index
for every table, even if the user does not declare an explicit
PRIMARY KEY. In such cases, InnoDB will
create an implicit clustered index using the first columns of the
table that have been declared UNIQUE and
NOT NULL.
When the InnoDB Plugin is used with a MySQL version
earlier than 5.1.29, an attempt to drop an implicit clustered
index (the first UNIQUE NOT NULL index) will
fail if the table does not contain a PRIMARY
KEY. This has been reported as MySQL Bug
#31233. Attempts to use the DROP INDEX or ALTER TABLE
command to drop such an index will generate this error:
ERROR 42000: This table type requires a primary key
Beginning with MySQL 5.1.29 when using the
InnoDB Plugin, attempts to drop such an index will copy the
table, rebuilding the index using a different UNIQUE NOT
NULL group of columns or a system-generated key. Note
that all indexes will be re-created by copying the table, as
described in Section 2.3, “Implementation”.
In those versions of MySQL that are affected by this bug,
one way to change an index of this type is to create a new table
and copy the data into it using INSERT INTO
, and then
newtable SELECT * FROM
oldtableDROP the old table and rename the new
table.
However, if there are existing tables with references to the
table whose index you are dropping, you will first need to use the
ALTER TABLE command to remove foreign key references from or to
other tables. Unfortunately, MySQL does not support dropping or
creating FOREIGN KEY constraints, even though dropping a
constraint would be trivial. Therefore, if you use ALTER TABLE
to add or remove a REFERENCES constraint, the
child table will be copied, rather than using “Fast Index
Creation”.
The command SHOW ENGINE INNODB MUTEX displays information
about InnoDB mutexes and rw-locks. It can be a useful
tuning aid on multi-core systems. However, with a big buffer
pool, the size of the output may be overwhelming. There is a
mutex and rw-lock in each 16K buffer pool block. It is highly
improbable that an individual block mutex or rw-lock could become
a performance bottleneck, and there are 65,536 blocks per
gigabyte.
Starting with InnoDB Plugin 1.0.4, SHOW ENGINE INNODB MUTEX
will skip the mutexes and rw-locks of buffer pool blocks.
Furthermore, it will not list any mutexes or rw-locks that have
never been waited on (os_waits=0). Therefore,
SHOW ENGINE INNODB MUTEX only displays information about such mutexes
and rw-locks that does not belong to the buffer pool blocks and
for whom there have been at least one OS level wait.
As described in Section 7.7, “Changes in the Read Ahead Algorithm”
a read ahead request is an asynchronous IO request issued in
anticipation that the page being read in will be used in near
future. It can be very useful if a DBA has the information
about how many pages are read in as part of read ahead and how many
of them are evicted from the buffer pool without ever being
accessed. Based on this information a DBA can then fine tune the
degree of aggressiveness of the read ahead using the parameter
innodb_read_ahead_threshold.
Starting from InnoDB Plugin 1.0.5 two new status variables
are added to the SHOW STATUS output. These
global status variables Innodb_buffer_pool_read_ahead
and Innodb_buffer_pool_read_ahead_evicted
indicate the number of pages read in as part of read ahead and the
number of such pages evicted without ever being accessed
respectively. These counters provide global values since the start
of the server. Please also note that the status variables
Innodb_buffer_pool_read_ahead_rnd and
Innodb_buffer_pool_read_ahead_seq have been
removed from the SHOW STATUS output.
In addition to the two counters mentioned above
SHOW INNODB STATUS will also show the rate at
which the read ahead pages are being read in and the rate at
which such pages are being evicted without being accessed. The per
second averages are based on the statistics collected since the last
invocation of SHOW INNODB STATUS and are
displayed in the BUFFER POOL AND MEMORY section
of the output.
Table of Contents
The InnoDB Plugin is available at the download section of the InnoDB website. You can download the plugin in these formats:
As a platform-specific executable binary file that is dynamically linked or “plugged in” to the MySQL server.
In source code form, available under the GNU General Public License (GPL), version 2.
While it is possible to use the source code to build a special version of MySQL containing the InnoDB Plugin, we recommend you install the binary shared library for the InnoDB Plugin instead, without building from source. Replacing the shared library is simpler and much less error prone than building from source.
The InnoDB Plugin is also included in the MySQL distribution,
starting from MySQL 5.1.38.
We recommend that if you plan to adopt the plugin for production use,
you download from the InnoDB web site, because that site might
have fixes that are not yet included in the MySQL distribution.
You can install the MySQL-supplied plugin as a shared library
using the instructions in that chapter, with one change:
the filename of the shared library as supplied by MySQL
is
ha_innodb_plugin.so
or
ha_innodb_plugin.dll,
as opposed to
ha_innodb.so
or
ha_innodb.dll
when you download from the InnoDB web site.
This discussion pertains to using the InnoDB Plugin with the MySQL Community Edition, whether source or binary. Except for download locations for MySQL software, the procedures documented here should work without change when you use MySQL Enterprise.
Whether you dynamically install the binary InnoDB Plugin or build from source, configure MySQL by editing the configuration file to use InnoDB as the default engine (if desired) and set appropriate configuration parameters to enable use of new InnoDB Plugin features, as described in Section 9.5, “Configuring the InnoDB Plugin”.
At this time, the InnoDB Plugin has not been
compiled or tested with the Intel C Compiler
(icc), so you should use a version of
MySQL compiled with the GNU Compiler Collection
(gcc).
Use the following table to confirm that the version of the InnoDB Plugin (whether source or binary) is compatible with your platform, hardware type (including 32-bit vs 64-bit) and with your version of MySQL. In general, a specific release of the InnoDB Plugin is designed to be compatible with a range of MySQL versions, but this may vary, so check the information on the download page.
When building MySQL from source, you can generally use the source for the InnoDB Plugin in place of the source for the built-in InnoDB. However, due to limitations of MySQL, a given binary version of the InnoDB Plugin is compatible only with a specific version of MySQL, as follows.
| Note: MySQL Bug #42610 prevents using the binary InnoDB Plugin with MySQL 5.1.31 or 5.1.32. There is no binary InnoDB Plugin for MySQL 5.1.33. The only way to use InnoDB Plugin with MySQL 5.1.31 through 5.1.35 is by building from source. This issue was resolved in MySQL 5.1.37 and InnoDB Plugin 1.0.4. |
The simplest way to install the InnoDB Plugin is to use a precompiled (binary) shared library file, when one is available. The procedures are similar for installing the InnoDB Plugin using the binary on different hardware and operating systems platforms, but the specific details differ between Unix or Linux and Microsoft Windows. See below for notes specific to your platform.
Note that due to MySQL Bug #42610, the procedure of installing the binary InnoDB Plugin changed in MySQL 5.1.33. If your version of MySQL is older than 5.1.33, refer to Appendix B, Using the InnoDB Plugin with MySQL 5.1.30 or Earlier.
The steps for installing the InnoDB Plugin as a shared library are as follows:
Download, extract and install the suitable MySQL executable for your platform.
Make sure the MySQL server is not running. If you have to shut down the database server, you use a special “slow” shutdown procedure, described later.
Download and extract the corresponding binary for the InnoDB Plugin and copy it to the proper directory or folder where MySQL looks for plugins.
On database startup, make MySQL ignore the builtin InnoDB, and load the InnoDB Plugin and all new InnoDB Information Schema tables implemented in the InnoDB Plugin, using one of these alternatives:
Edit the option file (my.cnf, or my.ini) to contain the necessary options.
Specify equivalent options on the MySQL command line.
Edit the option file to disable InnoDB, then use
INSTALL statements on the MySQL command line
after startup.
These procedures are described in detail in the following sections.
Set appropriate configuration parameters to enable new InnoDB Plugin features.
Start MySQL, and verify the installation of the plugins.
The following sections detail these steps for Unix or Linux systems, and for Microsoft Windows.
For Unix and Linux systems, use the following procedure to install the InnoDB Plugin as a shared library:
Download, extract and install the suitable MySQL executable for your server platform and operating system from the MySQL download section for MySQL Database Server 5.1. Be sure to use a 32-bit or 64-bit version as appropriate for your hardware and operating system.
Make sure the MySQL server is not running. If the server is running, do a “slow” shutdown by issuing the following command before performing the shutdown:
SET GLOBAL innodb_fast_shutdown=0;
Then finish the shutdown process, as described in The Shutdown Process in the MySQL documentation. This option setting performs a full purge and an insert buffer merge before the shutdown, which can typically take minutes, or even hours for very large and busy databases.
Download and extract the files for the InnoDB Plugin from the InnoDB Plugin download page, which indicates the names of corresponding InnoDB Plugin and MySQL archive files.
Copy the file ha_innodb.so to the
folder where MySQL looks for plugins, which is generally the
lib/plugin subdirectory of the directory
specified by basedir where the MySQL
server is installed.
Edit the option file (my.cnf) to ignore the builtin InnoDB, and load the InnoDB Plugin and all Information Schema tables implemented in the InnoDB Plugin when the server starts:
ignore_builtin_innodb plugin-load=innodb=ha_innodb.so;innodb_trx=ha_innodb.so; innodb_locks=ha_innodb.so;innodb_lock_waits=ha_innodb.so; innodb_cmp=ha_innodb.so;innodb_cmp_reset=ha_innodb.so; innodb_cmpmem=ha_innodb.so;innodb_cmpmem_reset=ha_innodb.so
Note that all plugins for plugin-load
should be on the same line in the option file.
Alternatively, you can use the equivalent options on the MySQL command line:
mysqld --ignore-builtin-innodb --plugin-load=innodb=ha_innodb.so; innodb_trx=ha_innodb.so;innodb_locks=ha_innodb.so; innodb_lock_waits=ha_innodb.so;innodb_cmp=ha_innodb.so; innodb_cmp_reset=ha_innodb.so;innodb_cmpmem=ha_innodb.so; innodb_cmpmem_reset=ha_innodb.so
You can also install the InnoDB Plugin and the new InnoDB
Information Schema tables implemented in
ha_innodb.so with INSTALL
commands:
INSTALL PLUGIN INNODB SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_TRX SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_LOCKS SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_LOCK_WAITS SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_CMP SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_CMP_RESET SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_CMPMEM SONAME 'ha_innodb.so'; INSTALL PLUGIN INNODB_CMPMEM_RESET SONAME 'ha_innodb.so';
If you use INSTALL PLUGIN statement to
install the InnoDB Plugin and the Information Schema tables,
ensure the following conditions are set up:
In the mysqld command line or
my.cnf option file, prepend each InnoDB
option with loose_, so that MySQL will
start even when InnoDB is unavailable. For example,
write loose_innodb_file_per_table
instead of innodb_file_per_table.
Start the MySQL server while it is configured to
skip loading the built-in InnoDB and to make MyISAM the
default storage engine. This can be done by editing the
option file my.cnf to contain these two
lines:
ignore_builtin_innodb default_storage_engine=MyISAM
Or, you can use the equivalent options on the MySQL command line:
mysqld --ignore-builtin-innodb --default-storage-engine=MyISAM …
See the MySQL Manual section on INSTALL
PLUGIN Syntax for information on how these commands work.
Edit the option file my.cnf to use
InnoDB as the default engine (if desired) and set
appropriate configuration parameters to enable use of new
InnoDB Plugin features, as described in
Section 9.5, “Configuring the InnoDB Plugin”.
In particular, we recommend that you set the following
specific parameters as follows:
default-storage-engine=InnoDB innodb_file_per_table=1 innodb_file_format=barracuda innodb_strict_mode=1
The MySQL server always must be started with the option
ignore_builtin_innodb, as long as you want to use the
InnoDB Plugin as a shared library. Also, remember that the startup
option skip_grant_tables prevents MySQL from loading
any plugins.
Verify the installation of the plugins with the
MySQL command SHOW PLUGINS, which should
produce the following output:
| Name | Status | Type | Library | License |
|---|---|---|---|---|
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
| CSV | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MEMORY | ACTIVE | STORAGE ENGINE | NULL | GPL |
| InnoDB | ACTIVE | STORAGE ENGINE | ha_innodb.so | GPL |
| INNODB_TRX | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_LOCKS | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_LOCK_WAITS | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_CMP | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_CMP_RESET | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_CMPMEM | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| INNODB_CMPMEM_RESET | ACTIVE | INFORMATION SCHEMA | ha_innodb.so | GPL |
| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MyISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
If the plugins fail to load properly, see Section 9.3.1.1, “Errors When Installing the InnoDB Plugin on Unix or Linux” for possible causes and corrections.
After verifying that the Plugin is recognized by MySQL, create an InnoDB table as another confirmation of success.
If MySQL or its associated daemon process cannot start,
or a post-startup INSTALL PLUGIN statement fails,
look at the MySQL error log (usually named
and located in the MySQL machine_name.errdata directory) for the
detailed error message.
The log is in chronological order, so look at the end of the file.
Try to resolve the problem based on other information in the
message.
| Error Condition or Message | Possible Solution |
|---|---|
Can't open shared library | Diagnose the cause from the following message details. |
API version for STORAGE ENGINE plugin is too different | The version of the Plugin is not compatible with the version of the MySQL server. Consult the compatibility chart . |
No such file or directory | Check that the file ha_innodb.so
or .dll
was copied to the
correct location. Confirm that you specified the right file name
(ha_innodb.so
or .dll
for the library from the InnoDB web site;
ha_innodb_plugin.so
or .dll
for the library supplied along with
MySQL 5.1.38 and up.)
|
Permission denied | Check that the directory and file access
permissions are set properly, or change them using
chmod on Unix-like systems . The
mysqld process must have permission
to read (r) the file
ha_innodb.so and to access files
(x) in the plugin directory. |
wrong ELF class
or any other message | Ensure that ha_innodb.so is
for the same system platform as
mysqld. In particular, note
that a 32-bit mysqld is unable to
load a 64-bit plugin, and vice versa. Be sure to
download an InnoDB Plugin that is compatible with your
platform. |
The Information Schema tables are themselves plugins to the MySQL server, but they depend on having the InnoDB storage engine plugin installed as well. These tables will appear to be empty if the storage engine is not installed.
The InnoDB Plugin is supported on any of the Windows operating system versions supported by MySQL. In particular, this includes Microsoft Windows 2008 Server, Windows Vista, Windows 2003 Server and Windows XP. Note that on Vista certain special procedures must be followed that are not documented here.
Use the following procedure to dynamically install the InnoDB Plugin on Microsoft Windows.
Download, extract and install the suitable MySQL executable for your server platform and operating system from the MySQL download section for MySQL Database Server 5.1. Be sure to use a 32-bit or 64-bit version as appropriate for your hardware and Windows version.
Make sure the MySQL server is not running. You do a “slow” shutdown by issuing the following command before performing the shutdown:
SET GLOBAL innodb_fast_shutdown=0;
Then finish the shutdown process, as described in The Shutdown Process in the MySQL documentation. This option setting performs a full purge and an insert buffer merge before the shutdown, which can typically take minutes, or even hours for very large and busy databases.
Download and extract the files for the InnoDB Plugin for your versions of Windows and MySQL from the InnoDB Plugin download page.
Copy the file ha_innodb.dll to the
folder where MySQL looks for plugins, which is generally the
lib\plugin sub-folder in the folder where
the MySQL server is installed (which is identified by
basedir in the my.cnf or
my.ini option file). If the
lib\plugin folder does not exist, you may
create it manually.
Edit the option file (my.ini) to ignore the builtin InnoDB, and load the InnoDB Plugin and all Information Schema tables implemented in the InnoDB Plugin when the server starts:
ignore_builtin_innodb plugin-load=innodb=ha_innodb.dll;innodb_trx=ha_innodb.dll; innodb_locks=ha_innodb.dll;innodb_lock_waits=ha_innodb.dll; innodb_cmp=ha_innodb.dll;innodb_cmp_reset=ha_innodb.dll; innodb_cmpmem=ha_innodb.dll;innodb_cmpmem_reset=ha_innodb.dll
All plugins for plugin-load
should be on the same line in the option file.
Be careful when copying and pasting that the line
does not split.
Alternatively, you can use the equivalent options on the MySQL command line:
mysqld --ignore-builtin-innodb --plugin-load= innodb=ha_innodb.dll; innodb_trx=ha_innodb.dll;innodb_locks=ha_innodb.dll; innodb_lock_waits=ha_innodb.dll;innodb_cmp=ha_innodb.dll; innodb_cmp_reset=ha_innodb.dll;innodb_cmpmem=ha_innodb.dll; innodb_cmpmem_reset=ha_innodb.dll
You can also install the InnoDB Plugin and the new InnoDB
Information Schema tables implemented in
ha_innodb.so with INSTALL
commands, as follows:
INSTALL PLUGIN INNODB SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_TRX SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_LOCKS SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_LOCK_WAITS SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_CMP SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_CMP_RESET SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_CMPMEM SONAME 'ha_innodb.dll'; INSTALL PLUGIN INNODB_CMPMEM_RESET SONAME 'ha_innodb.dll';
If you use INSTALL PLUGIN statements to
install the InnoDB Plugin and the Information Schema tables,
ensure the following conditions are set up:
In the mysqld command line or
my.ini option file, prepend each InnoDB
option with loose_, so that MySQL will
start even when InnoDB is unavailable. For example,
write loose_innodb_file_per_table
instead of innodb_file_per_table.
Start the MySQL server while it is configured to
skip loading the built-in InnoDB and to make MyISAM the
default storage engine. This can be done by editing the
option file my.cnf to contain these two
lines:
ignore_builtin_innodb default_storage_engine=MyISAM
Or, you can use the equivalent options on the MySQL command line:
mysqld --ignore-builtin-innodb --default-storage-engine=MyISAM …
See the MySQL Manual section on INSTALL
PLUGIN Syntax for
information on how these commands work.
Edit the option file my.ini to use
InnoDB as the default engine (if desired) and set
appropriate configuration parameters to enable use of new
InnoDB Plugin features, as described in Section 9.5, “Configuring the InnoDB Plugin”. In
particular, we recommend that you set the following
specific parameters as follows:
default-storage-engine=InnoDB innodb_file_per_table=1 innodb_file_format=barracuda innodb_strict_mode=1
IMPORTANT: The MySQL
server always must be started with the option
ignore_builtin_innodb, as long as you want to use the
dynamic InnoDB Plugin. Also, remember that the startup
option skip_grant_tables prevents MySQL from loading
any plugins.
Verify the installation of the plugins with the
MySQL command SHOW PLUGINS, which should
produce the following output:
| Name | Status | Type | Library | License |
|---|---|---|---|---|
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
| CSV | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MEMORY | ACTIVE | STORAGE ENGINE | NULL | GPL |
| InnoDB | ACTIVE | STORAGE ENGINE | ha_innodb.dll | GPL |
| INNODB_TRX | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_LOCKS | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_LOCK_WAITS | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_CMP | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_CMP_RESET | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_CMPMEM | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| INNODB_CMPMEM_RESET | ACTIVE | INFORMATION SCHEMA | ha_innodb.dll | GPL |
| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
| MyISAM | ACTIVE | STORAGE ENGINE | NULL | GPL |
If the plugins fail to load properly, see Section 9.3.3, “Errors When Installing the InnoDB Plugin on Microsoft Windows” for possible causes and corrections.
After verifying that the Plugin is recognized by MySQL, create an InnoDB table as another confirmation of success.
If MySQL or the associated Windows service can not start,
or a post-startup INSTALL PLUGIN statement fails,
look at the MySQL error log (usually named
and located in the MySQL machine_name.errdata directory) for the
detailed error message.
The log is in chronological order, so look at the end of the file.
Try to resolve the problem based on other information in the
message.
| Error Condition or Message | Possible Solution |
|---|---|
Can't open shared library | Diagnose the cause from the following message details. |
API version for STORAGE ENGINE plugin is too different | The version of the Plugin is not compatible with the version of the MySQL server. Consult the compatibility chart . |
No such file or directory | Check that the file ha_innodb.so
or .dll
was copied to the
correct location. Confirm that you specified the right file name
(ha_innodb.so
or .dll
for the library from the InnoDB web site;
ha_innodb_plugin.so
or .dll
for the library supplied along with
MySQL 5.1.38 and up.)
|
Permission denied | Check that the folder and file access permissions are set
properly. The mysqld process
must have permission to read the file
ha_innodb.dll and to read files in
the plugin folder. On Windows XP, file permissions
can be seen or changed by right-clicking on a file and
pressing Properties, and then the Security Tab. To see
the Security Tab, you may need to adjust the Folder
Options on the Control Panel to turn off “Use
Simple File Sharing”. |
Can't open shared library
' | Ensure that |
Note: The Information Schema tables are themselves plugins to the MySQL server, but they depend on having the InnoDB storage engine plugin installed as well. These tables will appear to be empty if the storage engine is not installed.
Sometimes, you may wish to build the plugin from the source
code using special compilation options, or there might be no
binary plugin available for your server platform. With the
resulting special version of MySQL containing the new InnoDB
functionality, it is not necessary to INSTALL
any plugins or be concerned about startup parameters that preclude
loading plugins.
To build the InnoDB Plugin from the source code, you also need the MySQL source code and some software tools. You should become familiar with the MySQL manual section on MySQL Installation Using a Source Distribution.
The general steps for building MySQL from source, containing the InnoDB Plugin in place of the standard built-in InnoDB, are as follows:
Download the MySQL source code.
Download the InnoDB Plugin source code.
Replace the source code for the built-in InnoDB with the InnoDB Plugin source tree.
Compile MySQL as usual, generating a new
mysqld executable file.
Configure the MySQL server by editing the configuration file to use InnoDB as the default engine (if desired) and set appropriate configuration parameters to enable use of new InnoDB Plugin features.
The following sections detail these steps for Linux or Unix systems, and for Microsoft Windows.
Download the MySQL source code, version 5.1.24 or later from http://dev.mysql.com/downloads/mysql/5.1.html#source and extract it.
Download the InnoDB Plugin source code from http://www.innodb.com/innodb_plugin/download/.
Replace the contents of the
storage/innobase
directory in the MySQL
source tree with the InnoDB Plugin source tree.
In MySQL 5.1.38 and up, the MySQL source tree also contains a
storage/innodb_plugin directory, but that
does not affect this procedure. The source that you download
from the InnoDB web site may contain additional changes and fixes.
Compile and build MySQL. Instead of building a
dynamic InnoDB Plugin, it is advisable to build a version of
MySQL that contains the InnoDB Plugin. This is because
a dynamic InnoDB Plugin must be
built with exactly the same tools and options as the
mysqld executable, or spurious
errors may occur. Example:
% wget ftp://ftp.easynet.be/mysql/Downloads/MySQL-5.1/mysql-5.1.37.tar.gz % tar -zxf mysql-5.1.37.tar.gz % cd mysql-5.1.37/storage % wget http://www.innodb.com/download/innodb_plugin/innodb_plugin-1.0.6.tar.gz % tar -zxf innodb-1.0.6.tar.gz % rm -fr innobase % mv innodb-1.0.6 innobase % cd .. % ./configure --with-plugins=innobase % make
Reconfigure the MySQL server by editing the
my.cnf option file to use InnoDB as the
default engine (if desired) and set appropriate configuration
parameters to enable use of new InnoDB Plugin features, as
described in section Section 9.5, “Configuring the InnoDB Plugin”. In
particular, we recommend that you set the following
specific parameters as follows:
default_storage_engine=InnoDB innodb_file_per_table=1 innodb_file_format=barracuda innodb_strict_mode=1
If you build a version of MySQL that contains the
InnoDB Plugin
(--with-plugins=innobase), you do not
have to tell MySQL to specify ignore_builtin_innodb or
specify plugin_load, or issue any
INSTALL PLUGIN statements.
The mysqld executable that you compiled
will contain the new InnoDB Plugin features.
Note: In order to fully exploit the performance improvements discussed in Section 7.2, “Faster Locking for Improved Scalability”, the InnoDB Plugin source code and build process makes some compile-time tests of platform capabilities to automatically use instructions for atomic memory access where available. If this logic fails, you may need to perform some additional steps as described in a Support Tip on the InnoDB website.
The MySQL website includes some information about building from source on Windows. The following discussion is specifically focused on building a version of MySQL containing the InnoDB Plugin.
You need the following tools:
A compiler environment, one of the following:
Microsoft Visual C++ 2003
Microsoft Visual C++ 2005
Microsoft Visual C++ 2008 (Note: for building MySQL 5.1.31 or later)
Microsoft Visual C++ 2005 Express Edition (free of charge)
Download and install the Microsoft Visual C++ 2005 Express Edition.
Download and install the Windows Platform SDK.
Configure the Visual Studio Express Edition to use the Platform SDK according to the instruction.
Microsoft Visual C++ 2008 Express Edition (free of charge, for building MySQL 5.1.31 or later)
Download and install the Microsoft Visual C++ 2008 Express Edition. The Visual C++ 2008 Express Edition has already been integrated with the Windows SDK.
GNU Bison for Windows, a general-purpose parser generator that is largely compatible with Berkeley Yacc. This tool is used automatically as part of compiling and building MySQL. For most users, it is sufficient to download and run the “complete package” to install GNU Bison.
CMake 2.6.0 or later, a cross-platform make system that can generate MSVC project files.
In addition to installing these tools, you must also set CScript as the default Windows script host by executing the following command in the Command Prompt:
cscript //H:CScriptAfter you have installed and configured all the required tools, you may proceed with the compilation.
Download the MySQL source code, version 5.1.24 or later from the MySQL website and extract the source files.
Download the InnoDB plugin source code from the InnoDB download page.
Extract the files from the source code archives.
Replace the contents of the
storage\innobase folder in the MySQL
source tree with the InnoDB plugin source tree.
In MySQL 5.1.38 and up, the MySQL source tree also contains a
storage\innodb_plugin directory, but that
does not affect this procedure. The source that you download
from the InnoDB web site may contain additional changes and fixes.
Compile and build MySQL under the Microsoft Visual Studio Command Prompt as follows:
win\configure WITH_INNOBASE_STORAGE_ENGINE __NT__ win\build-vs7.bat devenv mysql /build release /project ALL_BUILD
win\configure WITH_INNOBASE_STORAGE_ENGINE __NT__ win\build-vs8.bat devenv mysql /build release /project ALL_BUILD
win\configure WITH_INNOBASE_STORAGE_ENGINE __NT__ win\build-vs9.bat vcbuild mysql.sln "Release"
For the 64-bit version, use
win\build-vs
instead of
N_x64.batwin\build-vs.
N.bat
Install the compiled mysqld.exe
from the sql\release folder of the source
tree by doing one of the following:
Copy the mysqld.exe to the
bin folder of an earlier MySQL 5.1
installation.
Make a distribution package and unpack it to the
folder where MySQL will be installed. See the MySQL
manual section on make_win_bin_dist—Package MySQL
Distribution as ZIP Archive. Note that
scripts\make_win_bin_dist requires the
Cygwin environment.
Reconfigure the MySQL server by editing the
my.cnf or my.ini option
file to use InnoDB as the default engine (if desired) and
set appropriate configuration parameters to enable use of new
InnoDB Plugin features, as described in section Section 9.5, “Configuring the InnoDB Plugin”. In
particular, we recommend that you set the following
specific parameters as follows:
default_storage_engine=InnoDB innodb_file_per_table=1 innodb_file_format=barracuda innodb_strict_mode=1
Since you built a version of MySQL that contains
the InnoDB Plugin, you do not have to specify
ignore_builtin_innodb or specify plugin_load, or
issue any INSTALL PLUGIN statements. The
mysqld.exe that you compiled contains
the new InnoDB Plugin features.
Because the MySQL server as distributed by MySQL
includes a built-in copy of InnoDB, if you are using the dynamic
InnoDB Plugin and have INSTALLed it into the
MySQL server, you must always start the server with the option
ignore_builtin_innodb, either in the option file or on
the mysqld command line. Also, remember that
the startup option skip_grant_tables prevents MySQL from
loading any plugins. Neither of these options is needed when
using a specialized version of MySQL that you build from
source.
By default, the InnoDB Plugin does not create tables in a format that is incompatible with the built-in InnoDB in MySQL. Tables in the new format may be compressed, and they may store portions of long columns off-page, outside the B-tree nodes. You may wish to enable the creation of tables in the new format, using one of these techniques:
Include innodb_file_per_table=1 and
innodb_file_format=barracuda in the
[mysqld] section of the MySQL option file.
Add --innodb_file_per_table=1 and
--innodb_file_format=barracuda to the
mysqld command line.
Issue the statements:
SET GLOBAL innodb_file_format=barracuda; SET GLOBAL innodb_file_per_table=ON;
in the MySQL client when running with SUPER
privileges.
You may also want to enable the new InnoDB strict mode, which
guards SQL or certain operational errors that otherwise generate
warnings and possible unintended consequences of ignored or
incorrect SQL commands or parameters. As described in
Section 8.5, “InnoDB Strict Mode”, the
GLOBAL parameter innodb_strict_mode can be
set ON or OFF in the same
way as the parameters just mentioned. You can also use
the command
SET SESSION innodb_strict_mode=
(where modemode is
ON or OFF) to enable or
disable InnoDB strict mode on a per-session basis.
Take care when using new InnoDB configuration
parameters or values that apply only when using the
InnoDB Plugin. When the MySQL server encounters an unknown option,
it fails to start and returns an error: unknown
variable. This happens, for example, if you include
the new parameter innodb_file_format when you
start the MySQL server with the built-in InnoDB rather than
the plugin. This can cause a problem if you accidentally use the
built-in InnoDB after a system crash, because InnoDB crash
recovery runs before MySQL checks the startup parameters. See
Section 11.4, “Possible Problems” why this can be a
problem. One safeguard is to specify the prefix loose_
before the names of new options, so that if they are not recognized on startup,
the server gives a warning instead of a fatal error.
The Plugin that you download from the InnoDB web site should always be at the same level or newer than the shared library that is included with the MySQL distribution starting with version 5.1.38. To pick up the very latest fixes, download from the InnoDB site.
For the types of errors and how to diagnose them, see
Section 9.3.3, “Errors When Installing the InnoDB Plugin on
Microsoft Windows”.
Be especially careful that the plugin-load line in the
option file does not get split across lines when you copy and paste from the README
or this manual, which can produce an “unrecognized option” error in the error log.
You automatically benefit from the “fast index creation”
feature for every index you create on a large InnoDB table.
If you switch to the “Barracuda” file format using the
innodb_file_format option, you can
take advantage of other features such as table compression.
For the full list of features, refer to
Section 1.2, “Features of the InnoDB Plugin”.
The Plugin must loaded whenever the MySQL database server is started.
As we saw earlier, there are several ways to configure MySQL to use
the Plugin rather than the built-in InnoDB: in the option file,
with mysqld command-line options, or with
INSTALL statements after the server starts.
To ensure that you do not accidentally revert to the older InnoDB,
be careful to carry that configuration forward in the future,
such as when transitioning from a development system to a test system,
setting up a replication slave, or when writing new mysqld
startup scripts.
Table of Contents
Thanks to the pluggable storage engine architecture of MySQL, upgrading the InnoDB Plugin should be a simple matter of shutting down MySQL, replacing a platform-specific executable file, and restarting the server. If you wish to upgrade and use your existing database, it is essential to perform a “slow” shutdown, or the new plugin may fail when merging buffered inserts or purging deleted records. If your database does not contain any compressed tables, you should be able to use your database with the newest InnoDB Plugin without problems after a slow shutdown.
However, if your database contains compressed tables, it may not be compatible with InnoDB Plugin 1.0.6. Because of an incompatible change introduced in InnoDB Plugin version 1.0.2, some compressed tables may need to be rebuilt, as noted in Section 10.3, “Converting Compressed Tables Created Before Version 1.0.2”. Please follow these steps carefully.
You may, of course, rebuild your database using
mysqldump or other methods. This may be a
preferable approach if your database is small or there are many
referential constrains among tables.
Note that once you have accessed your database with InnoDB Plugin 1.0.6, you should not try to use it with the Plugin prior to 1.0.2.
Before shutting down the MySQL server containing the InnoDB Plugin, you must enable “slow” shutdown:
SET GLOBAL innodb_fast_shutdown=0;
For the details of the shutdown procedure, see the MySQL manual on The Shutdown Process.
In the directory where the MySQL server looks for
plugins, rename the executable file of the old InnoDB Plugin
(ha_innodb.so or
ha_innodb.dll), so that you can restore it
later if needed. You may remove the file later. The plugin
directory is specified by the system variable
plugin_dir. The default location is usually
the lib/plugin subdirectory of the directory
specified by basedir.
Download a suitable package for your server platform,
operating system and MySQL version. Extract the contents of
the archive using tar or a similar tool for
Linux and Unix, or Windows Explorer or WinZip or similar utility
for Windows. Copy the file ha_innodb.so or
ha_innodb.dll to the directory where the
MySQL server looks for plugins.
Start the MySQL server. Follow the procedure in Section 10.3, “Converting Compressed Tables Created Before Version 1.0.2” to convert any compressed tables if needed.
As with a dynamically installed InnoDB Plugin, you must
perform a “slow” shutdown of the MySQL server.
If you have built MySQL from source code and replaced
the built-in InnoDB in MySQL with the InnoDB Plugin in the source tree as discussed
in Section 9.4, “Building the InnoDB Plugin from Source Code”, you will
have a special version of the mysqld
executable that contains the InnoDB Plugin.
If you intend to upgrade to a dynamically linked InnoDB Plugin, you can follow the advice of Section 11.3.4, “Uninstalling a Statically Built InnoDB Plugin” and Section 9.3, “Installing the Precompiled InnoDB Plugin as a Shared Library”.
If you intend to upgrade a statically built
InnoDB Plugin to another statically built plugin, you will
have to rebuild the mysqld executable, shut
down the server, and replace the mysqld
executable before starting the server.
Either way, please be sure to follow the instructions of Section 10.3, “Converting Compressed Tables Created Before Version 1.0.2” if any compressed tables were created.
The InnoDB Plugin version 1.0.2 introduces an incompatible change to the format of compressed
tables. This means that some compressed tables that
were created with an earlier version of the InnoDB Plugin may
need to be rebuilt with a bigger KEY_BLOCK_SIZE before they can
be used.
If you must keep your existing database when you upgrade to InnoDB Plugin 1.0.2 or newer, you will need to perform a “slow” shutdown of MySQL running the previous version of the InnoDB Plugin. Following such a shutdown, and using the newer release of the InnoDB Plugin, you will need to determine which compressed tables need conversion and then follow a procedure to upgrade these tables. Because most users will not have tables where this process is required, this manual does not detail the procedures required. If you have created compressed tables with the InnoDB Plugin prior to release 1.0.2, you may want to review the relevant information on the InnoDB website.
Table of Contents
There are times when you might want to use the InnoDB Plugin with a given database, and then downgrade to the built-in InnoDB in MySQL. One reason to do this is because you want to take advantage of a new InnoDB Plugin feature (such as “Fast Index Creation”), but revert to the standard built-in InnoDB in MySQL for production operation.
If you have created new tables using the InnoDB Plugin, you may
need to convert them to a format that the built-in InnoDB in MySQL can read.
Specifically, if you have created tables that use
ROW_FORMAT=COMPRESSED or
ROW_FORMAT=DYNAMIC you must convert them to a
different format, if you plan to access these tables with the
built-in InnoDB in MySQL. If you do not do so, anomalous results may
occur.
Although InnoDB will check the format of tables and
database files (specifically *.ibd files) for
compatibility, it is unable to start if there are buffered changes
for “too new format” tables in the redo log or in the
system tablespace. Thus it is important to carefully follow these
procedures when downgrading from the InnoDB Plugin to the built-in InnoDB in MySQL,
version 5.1.
This chapter describes the downgrade scenario, and the steps you should follow to ensure correct processing of your database.
Starting with version 5.0.21, the built-in InnoDB in MySQL checks the table type before opening a table. Until now, all InnoDB tables have been tagged with the same type, although some changes to the format have been introduced in MySQL versions 4.0, 4.1, and 5.0.
One of the important new features introduced with the InnoDB Plugin is support for identified file formats. This allows the InnoDB Plugin and versions of InnoDB since 5.0.21 to check for file compatibility. It also allows the user to preclude the use of features that would generate downward incompatibilities. By paying attention to the file format used, you can protect your database from corruptions, and ensure a smooth downgrade process.
In general, before using a database file created with the
InnoDB Plugin with the built-in InnoDB in MySQL you should verify that the tablespace
files (the *.ibd files) are compatible with the
built-in InnoDB in MySQL. The InnoDB Plugin can read and write tablespaces in both
the formats “Antelope” and “Barracuda”. The built-in InnoDB
can only read and write tablespaces in “Antelope” format. To make all
tablespaces “legible” to the built-in InnoDB in MySQL, you should
follow the instructions in Section 11.3, “How to Downgrade” to reformat all
tablespaces to be in the “Antelope” format.
Generally, after a “slow” shutdown of the
InnoDB Plugin (innodb_fast_shutdown=0),
it should be safe to open the data files with the built-in InnoDB in MySQL. See
Section 11.4, “Possible Problems” for a discussion of
possible problems that can arise in this scenario and workarounds
for them.
The built-in InnoDB in MySQL can access only tables in the “Antelope”
file format, that is, in the REDUNDANT or COMPACT row format.
If you have created tables in COMPRESSED or DYNAMIC format,
then they will use tablespaces in the new “Barracuda” file format,
and it will be necessary to downgrade these tables.
First, identify the tables that require conversion, by executing this command:
SELECT table_schema, table_name, row_format
FROM information_schema.tables
WHERE engine='innodb'
AND row_format NOT IN ('Redundant', 'Compact');Next, for each table that requires conversion, run the following command:
ALTER TABLE table_name ROW_FORMAT=COMPACT;This command will copy the table and its indexes to a new tablespace in the “Antelope” format. See Chapter 2, Fast Index Creation in the InnoDB Storage Engine for a discussion of exactly how such index creation operations are performed.
Before you shut down the InnoDB Plugin and start the basic built-in InnoDB in MySQL, you should review the configuration files. Changes to the startup options will not take effect until the server is restarted, or the InnoDB Plugin is uninstalled and reinstalled.
The InnoDB Plugin introduces several configuration
parameters that are not recognized by the built-in InnoDB in MySQL,
including: innodb_file_format, innodb_file_format_check,
and innodb_strict_mode. See
Section C.1, “New Parameters” for a complete list
of new configuration parameters in the InnoDB Plugin.
You can include these parameters in the configuration file,
only if you use the loose_ form of the
parameter names, so that the built-in InnoDB in MySQL will be able to start.
If the InnoDB Plugin was installed as a dynamic plugin, the
startup option ignore_builtin_innodb or skip_innodb must
have been set to disable the built-in InnoDB in MySQL. These options have to
be removed, so that the built-in InnoDB in MySQL will be enabled the next
time the server is started.
If the InnoDB Plugin was loaded via plugin_load option.
This option has to be removed too.
In MySQL, configuration options can be specified in the
mysqld command line or the option file
(my.cnf or my.ini). See
the MySQL manual on Using Option Files for more information.
The following applies if the InnoDB Plugin was installed as a
dynamic library with the INSTALL PLUGIN
command, as described in Section 9.3, “Installing the Precompiled InnoDB Plugin as a Shared Library”.
Issue the command UNINSTALL PLUGIN for
every “plugin” supplied by the library
ha_innodb.so (or ha_innodb.dll
on Windows). Note that the following commands will initiate a
shutdown of the InnoDB Plugin:
SET GLOBAL innodb_fast_shutdown=0; UNINSTALL PLUGIN INNODB; UNINSTALL PLUGIN INNODB_CMP; UNINSTALL PLUGIN INNODB_CMP_RESET; UNINSTALL PLUGIN INNODB_CMPMEM; UNINSTALL PLUGIN INNODB_CMPMEM_RESET; UNINSTALL PLUGIN INNODB_TRX; UNINSTALL PLUGIN INNODB_LOCKS; UNINSTALL PLUGIN INNODB_LOCK_WAITS;
Due to MySQL Bug #33731, please ensure that the plugin definitions were actually deleted from the database, so that they will not be loaded again:
SELECT * FROM mysql.plugin;
DELETE FROM mysql.plugin WHERE name='…';Restart the server. For the details of the shutdown procedure, see the MySQL manual on The Shutdown Process.
If you have built MySQL from source code and replaced
the built-in InnoDB in MySQL with the InnoDB Plugin in the source tree as discussed
in Section 9.4, “Building the InnoDB Plugin from Source Code”, you will
have a special version of the mysqld
executable that contains the InnoDB Plugin. To
“uninstall” the InnoDB Plugin, you will replace this
executable with something that is built from an unmodified
MySQL source code distribution.
Before shutting down the version of the MySQL server with built-in InnoDB Plugin, you must enable “slow” shutdown:
SET GLOBAL innodb_fast_shutdown=0;
For the details of the shutdown procedure, see the MySQL manual on The Shutdown Process.
Failure to follow the downgrading procedure described in Section 11.3, “How to Downgrade” may lead to compatibility issues when files written by the InnoDB Plugin are accessed by the built-in InnoDB in MySQL. This section describes some internal recovery algorithms, to help explain why it is important to follow the downgrade procedure described above. It discusses the issues that may arise, and covers possible ways to fix them.
A general fix is to install the plugin as described in Chapter 9, Installing the InnoDB Plugin and then follow the downgrading procedure described in Section 11.3, “How to Downgrade”.
In the future, the file format management features described in Chapter 4, InnoDB File Format Management will guard against the types of problems described in this section.
The built-in InnoDB in MySQL can only open tables that were created in
REDUNDANT or COMPACT
format. Starting with MySQL version 5.0.21, an attempt to
open a table in some other format will result in ERROR
1146 (42S02): Table '. Furthermore, a message “test.t' doesn't
existunknown
table type” will appear in the error
log.
In the InnoDB Plugin, you may rebuild an incompatible table
by issuing a statement
ALTER TABLE .
table_name
ROW_FORMAT=COMPACT
As noted in Section 11.3, “How to Downgrade”, you should ensure a
“slow” shutdown is done with the InnoDB Plugin, before
running with the built-in InnoDB in MySQL, in order to clean up all buffers.
To initiate a slow shutdown, execute the command SET
GLOBAL innodb_fast_shutdown=0 before initiating the
shutdown of the InnoDB Plugin.
The reason “slow” shutdown
(innodb_fast_shutdown=0) is recommended is
because the InnoDB Plugin may write special records to the
transaction undo log that will cause problems if the built-in InnoDB in MySQL
attempts to read the log. Specifically, these special records
will be written when a record in a COMPRESSED
or DYNAMIC table is updated or deleted and
the record contains columns stored off-page. The built-in InnoDB in MySQL
cannot read these undo log records. Furthermore, the built-in InnoDB in MySQL
will fail to roll back incomplete transactions that affect
tables that it is unable to read (tables in
COMPRESSED or DYNAMIC
format).
Note that a “normal” shutdown
will not necessarily
empty the undo log. A normal shutdown occurs when
innodb_fast_shutdown=1, the default. When
InnoDB is shut down, some active transactions may have
uncommitted modifications, or they may be holding a read view
that prevents the purging of some version information from the
undo log. The next time InnoDB is started after a normal
shutdown (innodb_fast_shutdown=1), it will
roll back any incomplete transactions and purge old version
information. Therefore, it is important to perform a
“slow” shutdown
(innodb_fast_shutdown=0) as part of the
downgrade process.
In case it is not possible to have the InnoDB Plugin clear the
undo log, you can prevent the built-in InnoDB in MySQL from accessing the undo
log by setting innodb_force_recovery=3.
However, this is not a recommended approach, since in addition to
preventing the purge of old versions, this recovery mode will
prevent the rollback of uncommitted transactions. For more
information, see the MySQL manual on Forcing InnoDB Recovery.
When it comes to downgrading, there are also
considerations with respect to redo log information. For the
purpose of crash recovery, InnoDB will write to the log files
information about every modification to the data files. When
recording changes to tables that were created in
DYNAMIC or COMPRESSED
format, the InnoDB Plugin writes redo log entries that cannot be
recognized by the built-in InnoDB in MySQL. The built-in InnoDB in MySQL will refuse to start
if it sees any unknown entries in the redo log.
When InnoDB is shut down cleanly,
it will flush all unwritten changes from the buffer pool to the
data files and make a checkpoint in the redo log. When InnoDB
is subsequently restarted, it will scan the redo log starting
from the last checkpoint. After a clean shutdown, InnoDB
crash recovery will only then see the end-of-log marker in the
redo log. In this case, the built-in InnoDB in MySQL would not see any
unrecognizable redo log entries. This is a second reason why
you should ensure a clean, slow shutdown of MySQL
(innodb_fast_shutdown=0) before you attempt a
downgrade.
In an emergency, you may prevent the redo log scan and the
crash recovery from the redo log by setting the parameter
innodb_force_recovery=6. However, this is
strongly discouraged, because
may lead into severe corruption. See the MySQL manual on Forcing InnoDB Recovery for more information.
InnoDB uses a novel file flush technique called doublewrite. Before writing pages to a data file, InnoDB first writes them to a contiguous area called the doublewrite buffer. Only after the write and the flush to the doublewrite buffer have completed does InnoDB write the pages to their proper positions in the data file. If the operating system crashes in the middle of a page write, InnoDB can later find a good copy of the page from the doublewrite buffer during recovery.
The doublewrite buffer may also contain compressed pages.
However, the built-in InnoDB in MySQL will fail to recognize such pages, and it
will assume that compressed pages in the doublewrite buffer are
corrupted. It will also wrongly assume that the tablespace (the
.ibd file) will consist of 16K byte pages.
Thus, you may find InnoDB warnings in the error log of the
form “a page in the doublewrite buffer is not within space
bounds”.
The doublewrite buffer will not be scanned after a
clean shutdown. In an emergency, you may
prevent crash recovery by setting
innodb_force_recovery=6. However, this is strongly discouraged, because it may
lead into severe corruption. For more information, see the
MySQL manual on Forcing InnoDB Recovery.
Secondary indexes are usually non-unique, and insertions into secondary indexes happen in a relatively random order. This would cause a lot of random disk I/O operations without a special mechanism used in InnoDB called the insert buffer.
When a record is inserted into a non-unique secondary index page that is not in the buffer pool, InnoDB will insert the record into a special B-tree: the insert buffer. Periodically, the insert buffer is merged into the secondary index trees in the database. A merge will also occur whenever a secondary index page is loaded to the buffer pool.
A “normal” shutdown will not clear
the insert buffer. A normal shutdown occurs when
innodb_fast_shutdown=1, the default. If the
insert buffer is not empty when the InnoDB Plugin is shut
down, it may contain changes for tables in DYNAMIC or
COMPRESSED format. Thus, starting the built-in InnoDB in MySQL on the data
files may lead into a crash if the insert buffer is not
empty.
A “slow” shutdown will merge all
changes from the insert buffer. To initiate a slow
shutdown, execute the command SET GLOBAL
innodb_fast_shutdown=0 before initiating the shutdown
of the InnoDB Plugin.
To disable insert buffer merges, you may set
innodb_force_recovery=4 so that you can back up the
uncompressed tables with the built-in InnoDB in MySQL. Be sure not to use any
WHERE conditions that would require access to
secondary indexes. For more information, see the MySQL manual
on Forcing InnoDB Recovery.
In the InnoDB Plugin 1.0.3 and later, you can disable
the buffering of new operations by setting the parameter
innodb_change_buffering. See Section 7.4, “Controlling InnoDB Insert Buffering” for
details.
Table of Contents
The complete change history of the InnoDB Plugin can be
viewed in the file ChangeLog that is included in
the source and binary distributions as well as at the InnoDB
website.
Fixed MySQL Bug #48782: On lock wait timeout, CREATE INDEX
attempts DROP TABLE.
Report duplicate table names to the client connection, not to the error log.
Allow CREATE INDEX to be interrupted.
Fixed MySQL Bug #47167: InnoDB Plugin
"set global innodb_file_format_check" cannot set value
by User-Defined Variable.
Fixed MySQL Bug #45992: InnoDB memory not freed after shutdown; and MySQL Bug
#46656:
InnoDB Plugin memory leaks (Valgrind).
Clean up after a crash during DROP INDEX.
When InnoDB crashes while dropping an index, ensure that the
index will be completely dropped during crash recovery.
When a secondary index exists in the MySQL .frm file but not in
the InnoDB data dictionary, return an error instead of letting an
assertion fail in index_read.
Prevent the reuse of tablespace identifiers after InnoDB has crashed during table creation. Also, refuse to start if files with duplicate tablespace identifiers are encountered.
Fixed MySQL Bug #47055: Unconditional exit on ERROR_WORKING_SET_QUOTA
1453 (0x5AD) for InnoDB backend.
Fixed MySQL Bug #37232: InnoDB might get too many read locks for
DML with repeatable-read.
Fixed MySQL Bug #31183: Tablespace full problems not reported in error log; error message unclear.
Modified innodb-zip.test so that the
test will pass with zlib 1.2.3.3. Apparently, the zlib
function compressBound() has been slightly
changed, and the maximum record size of a table with 1K compressed
page size has been reduced by one byte.
Fixed a regression introduced by the fix for MySQL Bug #26316.
Fixed MySQL Bug #44571: InnoDB Plugin crashes on ADD INDEX.
Fixed a bug in the merge sort that can corrupt indexes in fast index creation.
Introduced the settable global variables innodb_old_blocks_pct
and innodb_old_blocks_time for controlling the buffer pool eviction
policy, making it possible to tune the buffer pool LRU eviction policy to
be more resistant against index scans. See
Section 7.14, “Making Buffer Cache Scan Resistant”.
Fixed MySQL Bug #42885: buf_read_ahead_random,
buf_read_ahead_linear counters, thread wakeups. See
Section 8.9, “More Read Ahead Statistics”.
Fixed MySQL Bug #46650: InnoDB assertion autoinc_lock == lock
in lock_table_remove_low on INSERT
SELECT.
Fixed MySQL Bug #46657: InnoDB Plugin: invalid read in
index_merge_innodb test (Valgrind).
Fixed MySQL Bug #42829: binlogging enabled for all schemas regardless of binlog-db-db / binlog-ignore-db.
Enabled inlining of functions and prefetch with Sun Studio.
Changed the defaults for innodb_sync_spin_loops
from 20 to 30 and innodb_spin_wait_delay from 5 to 6.
Implemented adaptive flushing of dirty pages, which uses
heuristics to avoid I/O bursts at checkpoint.
A new parameter
is added to control whether the new flushing algorithm should
be used. See Section 7.11, “Controlling the Flushing Rate of Dirty Pages”.
innodb_adaptive_flushing
Implemented I/O capacity tuning. A new
parameter is added to
control the master threads innodb_io_capacityI/O rate. (To preserve
the former behavior, set this parameter to a value of 100.) The
ibuf merge is also changed from synchronous to
asynchronous. See
Section 7.10, “Controlling the Master Thread I/O Rate”.
Introduced the PAUSE instruction inside
spin-loop where available. See
Section 7.12, “Using the PAUSE instruction in InnoDB spin loops”.
Fixed a crash on
SET GLOBAL innodb_file_format=DEFAULT or
SET GLOBAL innodb_file_format_check=DEFAULT.
Changed the default values for innodb_max_dirty_pages_pct,
innodb_additional_mem_pool_size, innodb_buffer_pool_size, and
innodb_log_buffer_size.
Enabled group commit functionality that was broken in 5.0 when distributed transactions were introduced. See Section 7.9, “Group Commit”.
Enabled the functionality of having multiple background
threads, with two new configuration parameters, innodb_read_io_threads
and innodb_write_io_threads. The Windows only parameter
innodb_file_io_threads has been removed. See
Section 7.8, “Multiple Background I/O Threads”.
Changed the linear read ahead algorithm and disabled
random read ahead. Also introduced a new configuration parameter
innodb_read_ahead_threshold to control the sensitivity of the
linear read ahead. See Section 7.7, “Changes in the Read Ahead Algorithm”.
Standardized comments that allow the extraction of documentation
from code base with the Doxygen tool.
Fixed a bug that could cause failures in secondary index lookups in consistent reads right after crash recovery.
Corrected the estimation of space needed on a compressed page
when performing an update by delete-and-insert.
Removed the statically linked copies of
the zlib and strings
libraries from the binary Windows plugin. Invoke the copies of
these libraries in the mysqld executable, like
the binary InnoDB Plugin does on other platforms.
Trimmed the output of SHOW ENGINE INNODB MUTEX. See
Section 8.8, “More compact output of SHOW ENGINE INNODB MUTEX”.
On Microsoft Windows, make use of atomic memory access to implement mutexes and rw-locks more efficiently. On Sun Solaris 10, if GCC built-in functions for atomic memory access are unavailable, use library functions instead. See Section 7.2, “Faster Locking for Improved Scalability”.
Fixed MySQL Bug #44032: in ROW_FORMAT=REDUNDANT, update
UTF-8 CHAR to/from NULL is
not in-place.
Fixed MySQL Bug #43660: SHOW
INDEXES/ANALYZE does not update
cardinality for indexes of InnoDB table.
Made the parameter innodb_change_buffering settable by
mysqld start-up option. Due to a programming
mistake, it was only possible to set this parameter by the
SET GLOBAL command in InnoDB Plugin
1.0.3.
Added a parameter innodb_spin_wait_delay for controlling
the polling of mutexes and rw-locks. See
Section 7.13, “Control over spin lock polling”.
In consistent reads, issue an error message on attempts to use newly created indexes that may lack required history. See Section 2.6, “Limitations”.
Improved the scalability of InnoDB on multi-core CPUs. See Section 7.2, “Faster Locking for Improved Scalability”.
Added a parameter innodb_change_buffering for controlling
the insert buffering. See Section 7.4, “Controlling InnoDB Insert Buffering”.
Added a parameter innodb_use_sys_malloc for using an
operating system memory allocation rather than the InnoDB
internal memory allocator. See Section 7.3, “Using Operating System Memory Allocators”.
Made it possible to dynamically enable or disable adaptive hash indexing. See Section 7.5, “Controlling Adaptive Hash Indexing”.
Changed the default value of innodb_thread_concurrency
from 8 to 0, for
unlimited concurrency by default. See Section 7.6, “Changes Regarding Thread Concurrency”.
Fixed an issue that the InnoDB Plugin fails if
innodb_buffer_pool_size is defined bigger than
4095M on 64-bit Windows.
Fixed MySQL bug #41676: Table names are case insensitive in locking.
Fixed MySQL bug #41904: Create unique index problem.
Fixed MySQL bug #43043: Crash on BLOB delete operation.
Fixed a bug in recovery when dropping incomplete indexes left behind by fast index creation.
Fixed a crash bug when all rows of a compressed table are deleted.
Fixed a corruption bug when a table is dropped on a busy system that contains compressed tables.
Fixed an assertion failure involving the variable
ut_total_allocated_memory that was caused by
unprotected access during fast index creation.
Implemented the dynamic plugin
(ha_innodb.dll) on Windows.
Added a parameter innodb_stats_sample_pages for
controlling the index cardinality estimates.
Made innodb_stats_on_metadata a settable global parameter.
(MySQL bug #38189)
Made innodb_lock_wait_timeout a settable session parameter.
(MySQL bug #36285)
Fixed bugs related to off-page columns (see Section 5.3, “DYNAMIC Row Format”).
Fixed various bugs related to compressed tables. This includes MySQL bug #36172, a possible but rare corruption, and an incompatible file format change relating to very long rows in compressed tables, and to off-page storage of long column values.
Fixed a bug in crash recovery which was a side effect of incorrect implementation of the system tablespace tagging.
Fixed MySQL bugs related to auto_increment
columns: #26316, #35498, #35602, #36411, #37531, #37788, #38839, #39830, #40224.
Fixed some race conditions, hangs or crashes related to
INFORMATION_SCHEMA tables, fast index creation,
and to the recovery of PREPARED transactions.
Fixed crashes on
DROP TABLE or CREATE TABLE
when there are FOREIGN KEY constraints.
(MySQL bug #38786)
Fixed a crash caused by a conflict between TRUNCATE
TABLE and LOCK TABLES. (MySQL bug
#38231)
Fixed MySQL bug #39939: DROP TABLE or
DISCARD TABLESPACE takes a long time.
Fixed MySQL bug #40359: InnoDB plugin error/warning message during shutdown.
Fixed MySQL bug #40360: Binlog related errors with binlog off.
Applied all changes from MySQL through version 5.1.30.
Fixed bugs related to the packaging of the InnoDB Plugin: MySQL bugs #36222, #36434.
Fixed crash bugs related to the new features of the InnoDB Plugin: MySQL bugs #36169, #36310.
Implemented the system tablespace tagging discussed in Section 4.4.1, “Startup File Format Compatibility Checking”.
Applied all changes from MySQL through version 5.1.25.
The initial release of the InnoDB Plugin is based on the built-in InnoDB in MySQL version 5.1. See Section 1.2, “Features of the InnoDB Plugin” for the main features.
Table of Contents
Innobase Oy acknowledges that certain Third Party and Open Source software has been used to develop or is incorporated in InnoDB (including the InnoDB Plugin). This appendix includes required third-party license information.
Innobase Oy gratefully acknowledges the following contributions from Google, Inc. to improve InnoDB performance:
I/O rate,
as discussed in Section 7.10, “Controlling the Master Thread I/O Rate”.
The master thread in InnoDB is a thread that performs various
tasks in the background. Historically, InnoDB has used a hard
coded value as the total I/O capacity of the
server. With this change, user can control the number of
I/O operations that can be performed per
second based on their own workload.Changes from the Google contributions were incorporated in
the following source code files: btr0cur.c,
btr0sea.c, buf0buf.c,
buf0buf.ic, ha_innodb.cc,
log0log.c, log0log.h,
os0sync.h,
row0sel.c, srv0srv.c,
srv0srv.h, srv0start.c,
sync0arr.c, sync0rw.c,
sync0rw.h, sync0rw.ic,
sync0sync.c, sync0sync.h,
sync0sync.ic, and
univ.i.
These contributions are incorporated subject to the
conditions contained in the file
COPYING.Google, which are reproduced
here.
Copyright (c) 2008, 2009, Google Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of the Google Inc. nor the names of its
contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
“AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.Innobase Oy gratefully acknowledges the contribution of Percona, Inc. to improve InnoDB performance by implementing configurable background threads, as discussed in Section 7.8, “Multiple Background I/O Threads”. InnoDB uses background threads to service various types of I/O requests. The change provides another way to make InnoDB more scalable on high end systems.
Changes from the Percona, Inc. contribution were
incorporated in the following source code files:
ha_innodb.cc, os0file.c,
os0file.h, srv0srv.c,
srv0srv.h, and srv0start.c.
This contribution is incorporated subject to the
conditions contained in the file
COPYING.Percona, which are reproduced
here.
Copyright (c) 2008, 2009, Percona Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of the Percona Inc. nor the names of its
contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
“AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.Innobase Oy gratefully acknowledges the following contributions from Sun Microsystems, Inc. to improve InnoDB performance:
Changes from the Sun Microsystems, Inc. contribution were
incorporated in the following source code files:
univ.i, ut0ut.c, and
ut0ut.h.
This contribution is incorporated subject to the
conditions contained in the file
COPYING.Sun_Microsystems, which are reproduced
here.
Copyright (c) 2009, Sun Microsystems, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of Sun Microsystems, Inc. nor the names of its
contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
“AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.Up to MySQL 5.1.30, the InnoDB Plugin replaced the
built-in InnoDB in MySQL when the server was started with the option skip_innodb.
Due to MySQL Bug
#42610, it was impossible to replace the built-in InnoDB in MySQL with a
plugin in MySQL 5.1.31 and 5.1.32. MySQL 5.1.33 introduced the
option ignore_builtin_innodb to allow InnoDB Plugin
installation in the binary release.
Up to MySQL 5.1.30, installing the binary InnoDB Plugin
requires that MySQL be shut down and restarted after issuing
the INSTALL PLUGIN statements. This is because
the INSTALL PLUGIN statement started the plugin
with default options. The options would only be read from the
option file (my.cnf or my.ini)
after restarting the server. The InnoDB Plugin worked around this
limitation by copying parameters from the internal data structures
of the built-in InnoDB in MySQL. Beginning with MySQL 5.1.33,
the INSTALL PLUGIN statement will re-read the
option file and pass all options to the plugin, even those that are
not recognized by the built-in InnoDB in MySQL.
To use the binary InnoDB Plugin with MySQL 5.1.30 or
earlier, you may follow the instructions given in
Section 9.3, “Installing the Precompiled InnoDB Plugin as a Shared Library”, with one
change: Replace the option ignore_builtin_innodb with
skip_innodb. The general steps for dynamically
installing the binary InnoDB Plugin will thus become as
follows:
loose_,
e.g., loose_innodb_file_per_table instead
of innodb_file_per_table, so that MySQL will
start even when InnoDB is unavailable.skip_innodb
and default_storage_engine=MyISAM to the options,
to prevent the built-in InnoDB from starting.INSTALL the InnoDB Plugin and the
Information Schema tables, using the supplied script or equivalent
commands.This change only affects the binary distributions of MySQL and InnoDB Plugin. The procedure for building from source code is unchanged.
Throughout the course of development, the InnoDB Plugin has introduced new configuration parameters. The following table summarizes those parameters:
Table C.1. InnoDB Plugin new parameter summary
| Name | Cmd-Line | Option File | System Var | Scope | Dynamic | Default |
|---|---|---|---|---|---|---|
innodb_adaptive_flushing | YES | YES | YES | GLOBAL | YES | TRUE |
innodb_change_buffering | YES | YES | YES | GLOBAL | YES | inserts |
innodb_file_format | YES | YES | YES | GLOBAL | YES | Antelope |
innodb_file_format_check | YES | YES | YES | GLOBAL | YES | ON |
innodb_io_capacity | YES | YES | YES | GLOBAL | YES | 200 |
innodb_old_blocks_pct | YES | YES | YES | GLOBAL | YES | 37 |
innodb_old_blocks_time | YES | YES | YES | GLOBAL | YES | 0 |
innodb_read_ahead_threshold | YES | YES | YES | GLOBAL | YES | 56 |
innodb_read_io_threads | YES | YES | YES | GLOBAL | NO | 4 |
innodb_spin_wait_delay | YES | YES | YES | GLOBAL | YES | 6 |
innodb_stats_sample_pages | YES | YES | YES | GLOBAL | YES | 8 |
innodb_strict_mode | YES | YES | YES | GLOBAL|SESSION | YES | FALSE |
innodb_use_sys_malloc | YES | YES | YES | GLOBAL | NO | TRUE |
innodb_write_io_threads | YES | YES | YES | GLOBAL | NO | 4 |
innodb_adaptive_flushingTRUE. This parameter was added in
InnoDB Plugin 1.0.4.
See Section 7.11, “Controlling the Flushing Rate of Dirty Pages” for
more information.innodb_change_buffering"inserts" (buffer insert
operations). This parameter was added in InnoDB Plugin 1.0.3.
See Section 7.4, “Controlling InnoDB Insert Buffering”
for more information.innodb_file_formatinnodb_file_format_checkON. This parameter was added in
InnoDB Plugin 1.0.1. See
Section 4.4.1, “Startup File Format Compatibility Checking”
for more information.innodb_io_capacityI/O
operations that can be performed per second. The allowable value
range is any number 100 or greater, and the default value
is 200. This parameter was added in
InnoDB Plugin 1.0.4. To reproduce the earlier behavior,
use a value of 100.
See Section 7.10, “Controlling the Master Thread I/O Rate”
for more information.innodb_old_blocks_pct37 and the allowable value range is
5 to 95. This parameter was added in
InnoDB Plugin 1.0.5.
See Section 7.14, “Making Buffer Cache Scan Resistant”
for more information.innodb_old_blocks_time0
which means that blocks are moved to the “young”
end of the LRU list at the first access. This parameter
was added in InnoDB Plugin 1.0.5.
See Section 7.14, “Making Buffer Cache Scan Resistant”
for more information.innodb_read_ahead_threshold0 to 64 and the
default value is 56. This parameter was added
in InnoDB Plugin 1.0.4. See
Section 7.7, “Changes in the Read Ahead Algorithm” for more
information.innodb_read_io_threadsI/O
threads used for reads. The allowable value range
is 1 to 64 and the default
value is
4. This parameter was added in
InnoDB Plugin 1.0.4. See
Section 7.8, “Multiple Background I/O Threads” for more
information.innodb_spin_wait_delay0 (meaning
unlimited) or positive integers and the default value
is 6. This parameter was added in
InnoDB Plugin 1.0.4. See
Section 7.13, “Control over spin lock polling” for more
information.innodb_stats_sample_pages1-unlimited and the default value is
8. This parameter was added in InnoDB Plugin
1.0.2. See
Section 8.6, “Controlling Optimizer Statistics Estimation” for
more information.innodb_strict_modeOFF. This parameter was added in
InnoDB Plugin 1.0.2. See
Section 8.5, “InnoDB Strict Mode” for more
information.innodb_use_sys_mallocON (use an allocator of the underlying
system). This parameter was added in InnoDB Plugin 1.0.3. See
Section 7.3, “Using Operating System Memory Allocators” for more
information.innodb_write_io_threadsI/O
threads used for writes. The allowable value range
is 1 to 64 and the default
value is 4. This parameter was added in
InnoDB Plugin 1.0.4. See
Section 7.8, “Multiple Background I/O Threads” for more
information.Beginning in InnoDB Plugin 1.0.4 the following configuration parameter has been removed:
innodb_file_io_threadsinnodb_read_io_threads and
innodb_write_io_threads.
See Section 7.8, “Multiple Background I/O Threads”
for more information.For better out-of-the-box performance, InnoDB Plugin 1.0.4 changes the default values for the following configuration parameters:
Table C.2. InnoDB Plugin parameters with new defaults
| Name | Old Default | New Default |
|---|---|---|
innodb_additional_mem_pool_size | 1MB | 8MB |
innodb_buffer_pool_size | 8MB | 128MB |
innodb_log_buffer_size | 1MB | 8MB |
innodb_max_dirty_pages_pct | 90 | 75 |
innodb_sync_spin_loops | 20 | 30 |
innodb_thread_concurrency | 8 | 0 |