Saturday, October 29, 2011

MySQL, Postgres, and InnoDB: Some Critiques of Index Organized Tables

This discussion of InnoDB versus Postgres' storage engine (and index organized table structures generally, as this could also apply to Oracle's IOT's) is worth a careful read. Since our schema makes heavy use of the index organized property of InnoDB - and since I have a bit of a fanboy wish to use Postgres as I worked on it for a few years versus helping to buy Larry another yacht - here's some discussion of where index organized structures add value and some criticisms of my own.

Even though one used a primary key definition to define the "index organization structure", in my opinion, the real power of index organized table structures is in larger, "bulk" searches over single-record lookups of the sort one would typically do with a primary key. Assuming very careful schema design and querying, the advantage with "bulk" searches is that a single B-tree ingress is all that is necessary to answer the search, while secondary indexes require at least some indirection into the table, which gets expensive if the working set involves millions of records (as it often does, at least in our case). See the below discussion on telemetry schemas for more.

Index Organized Table Wishlist

  1. Have an explicit "ORGANIZATION KEY" table declaration (or some such) and not rely on the PRIMARY KEY. If the organization key doesn't completely define the storage, allow "..." or something to allow for an incompletely defined organization key. If "ORGANIZATION KEY" is omitted, default to the PRIMARY KEY.
  2. Secondary indexes should have the option of using storage row-ids to "point at" the main rows as organization keys can be relatively long, and require walking two different B-trees to find the recs. Implementing this could be hard and make inserts slow as otherwise unchanged rows can "move around" while the B-tree is having stuff done to it - requiring secondary indexes to have pointers redone - so there's a design trade-off here. In our environment, we mostly avoid secondary indexes in favor of "search tables", which are basically user-managed indexes.
If I get the energy to do an open-source project, I want to do what I call "vector indexes" and a "vectoring execution engine" that we've basically implemented in user code in our app. If we had "vector indexes", we wouldn't be so reliant on index-organized table structures.

No comments: