Saturday, March 13, 2010

Survey of MySQL Storage Engines

 MySQL has an interesting architecture that sets it apart from some other enterprise database systems. It allows you to plug in different modules to handle storage. What that means to end users is that it is quite flexible, offering an interesting array of different storage engines with different features, strengths, and tradeoffs.

Introduction

MySQL has an interesting architecture that sets it apart from some other enterprise database systems.


It allows you to plug in different modules to handle storage.  What that means to end users is that it is quite flexible, offering an interesting array of different storage engines with different features, strengths, and tradeoffs.


MySQL's Storage Engine Architecture
The various storage engines as the name implies manage how data is stored and retrieved on disk, and the features available therein.  One important thing to keep in mind however is that MySQL doesn't manage transactions at the server level, but at the storage engine level.  This is great if you have a mix of InnoDB tables in a query, or a mix of MyISAM ones, but if you have both in the same transaction, not only might performance be impacted, but also results may not be reliable.


Although applications can use a mix of different types of tables using different underlying engines, single queries and transactions should not in general do this.


Do you want to see what storage engines you have available?  Just query your MySQL variables as follows:

mysql&g; show variables like 'have%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| have_archive          | NO       | 
| have_bdb              | YES      | 
| have_blackhole_engine | NO       | 
| have_compress         | YES      | 
| have_crypt            | YES      | 
| have_csv              | NO       | 
| have_dynamic_loading  | YES      | 
| have_example_engine   | NO       | 
| have_federated_engine | NO       | 
| have_geometry         | YES      | 
| have_innodb           | YES      | 
| have_isam             | NO       | 
| have_merge_engine     | YES      | 
| have_ndbcluster       | NO       | 
| have_openssl          | DISABLED | 
| have_ssl              | DISABLED | 
| have_query_cache      | YES      | 
| have_raid             | NO       | 
| have_rtree_keys       | YES      | 
| have_symlink          | YES      | 
+-----------------------+----------+
 
Note these are features denoted by "have_*" so some of these listed are not storage engines.


MyISAM
This storage engine was one of the first.  Locking is done at the table level so that can be a bottleneck very quickly for highly concurrent applications.


It does not support transactions; however, it does support fulltext indexes, which can certainly come in handy for some applications.


With MyISAM, each table has three files, one for the table data, one for index data, and one for the definition or frm file.


MySQL can allocate a key cache for MyISAM index data, however caching of the table data is left to the filesystem.


So be sure to leave enough memory to the OS if you are using large MyISAM tables.


MyISAM supports some table repair after a crash, but does suffer from corruption more often than other table types, such as InnoDB.


Archive
This storage engine is optimized for high speed inserting, and compresses data as it is inserted.


As of 5.1 indexes can be added on these tables, so they might make sense on a replicated slave where you want to do reporting, as slaves do not have to use the same storage engine as the same table on the master.


It does not support transactions.


InnoDB
Perhaps the best-known storage engine, InnoDB, supports row-level locking and multi-version concurrency control.


This means in high concurrency applications such as web-facing ones, InnoDB is ideal.  InnoDB stores data in tablespaces although the underlying storage is managed internally.


It reads and writes data in chunks called blocks, and caches in a configurable buffer cache both data and index blocks.


InnoDB also supports crash recovery.


If you pull the plug on the box, and then reboot, when MySQL starts up, it will rollback all standing and open transactions that were interrupted by the power failure.


Merge
The Merge storage engine operates on underlying MyISAM tables similar to how a VIEW would.  In other words, it's not storing data itself per se, but providing a container of sorts for them.


As it turns out this code is very closely related to how partitions are handled and created.


Merge tables help you manage large volumes of data more easily.


They can be useful for logging in applications as you can easily remove old data by dropping the underlying tables.


You can think of them as UNION ALL Views.


One thing to keep in mind about MERGE tables is that the create statement does not check the underlying tables for compatibility so you may not find out they're incompatible until you use them.


Memory
These tables as you might expect are stored in-memory. The structure will persist past a database restart, but the data will not.


They're good for lookup tables, or caching results during aggregation.


Although access is fast, concurrency may be limited because they do table-level locking.


CSV
This storage engine is a little less known and kind of an interesting one.  It allows you to treat filesystem files in csv format as tables!


Want to select certain rows, do some manipulations on those, and insert them into a table?  This is a great way to do it.


This is very similar to what Oracle calls "External Tables".


NDB Cluster
Another well-known storage engine.  It allows one to cluster tables.  That means you have multiple masters, which all can do inserts, updates and deletes on the same table.


NDB has row-level locking, but not full multi-version concurrency control.  Due to some architectural limitations of how the MySQL server executes joins, they perform rather poorly with NDB storage engine.


For that reason, it is best used for single table primary key lookups. So, you would not likely port your entire web-facing database to NDB Cluster for example.


Conclusions
The MySQL architecture offers a rich selection of storage engines with different characteristics, strengths, and weaknesses.


We've covered some of the most common ones here.


Next month we'll look at some of the newer and also some more exotic ones such as Blackhole, Maria, PBXT, and more.

No comments:

Post a Comment