Saturday, May 12, 2012

Thoughts on VoltDB and performance vs flexibility

VoltDB is the latest project of Mike Stonebraker (the creator of Postgres), aimed at very high-performance OLTP.  It is fairly clearly a niche product, however, and not really usable in the ERP space for reasons I will discuss below.

Stonebraker's own presentations (one example, at 11:25) suggest he is aware of this issue, given his triangle framework.

At the vertices of the triangle he places NoSQL, high-performance OLTP (like VoltDB), and column-oriented databases, while the lower-performance generalists, what he calls the legacy databases or elephants, sit in the middle.  The argument is that you can get a significant performance boost by specializing your systems beyond what you could when every system does everything.  This is a fair argument to a point, but the implementation details show that it is true only in some areas.

Stonebraker's analysis of processing time in traditional databases (see the above presentation, at 14:36) places virtually all of it in four areas, namely buffer management, locking, thread locks/semaphores, and recovery tasks, and he suggests that in order to get high degrees of performance one must eliminate these tasks.  This requires, however, rethinking how we address concurrency (the locking issues) and durability (recovery and disk-based storage).  The VoltDB approach is to get rid of concurrency entirely and to treat durability as a network property rather than a system property.  Eliminating concurrency is acceptable because the database is fast enough that queries can be run one at a time, with intraquery parallelism used instead of interquery parallelism.  However, this imposes significant limitations on the database, because it means every transaction is limited to a single query.  You can't do round-tripping in your transactions, because that would impose locking requirements on the database.
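The serialized execution model can be sketched roughly like this.  This is my own toy illustration, not VoltDB code: a single worker thread drains a queue of transactions, where each "transaction" is a plain function standing in for a stored procedure that bundles all its queries and logic into one shot.

```python
import queue
import threading

class SerialEngine:
    """Toy model of serialized execution: a single worker thread drains
    a transaction queue, so the data needs no locks or latches at all."""

    def __init__(self):
        self.data = {}              # in-memory store, no buffer pool
        self.inbox = queue.Queue()  # transactions wait their turn here

    def submit(self, txn):
        """A 'transaction' is a plain function: it receives the whole
        store, runs all of its queries and logic, and returns, in one
        shot.  No client round-trip mid-transaction is possible."""
        result = queue.Queue(maxsize=1)
        self.inbox.put((txn, result))
        return result

    def run_forever(self):
        while True:
            txn, result = self.inbox.get()
            result.put(txn(self.data))  # runs alone: no concurrency

engine = SerialEngine()
threading.Thread(target=engine.run_forever, daemon=True).start()

# Two transactions submitted "concurrently" still execute one at a time.
r1 = engine.submit(lambda d: d.setdefault("balance", 0))
r2 = engine.submit(lambda d: d.update(balance=100) or d["balance"])
v1, v2 = r1.get(), r2.get()
print(v1, v2)  # 0 100
```

The point of the sketch is the structural one: because only one transaction touches the data at a time, all the locking machinery simply disappears, but so does any chance of pausing mid-transaction to talk to the client.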

This approach works very well for certain kinds of processing, such as consuming rapid data feeds and then feeding that information into a data warehouse at specified intervals.  However, one thing generally missing from the discussion is that the more complex the application, the more general the database solution needs to be.  One solution might be to separate data entry from reporting and use multiple tools, but who wants to pull their trial balance from a different system than the one they enter invoices on?  Since ERP systems usually tightly integrate decision support and OLTP, there isn't really room here for specialized databases like VoltDB.

As the database market has expanded, it has created niches for products like VoltDB.  These niches may indeed grow with time.  However, I think it is incorrect to treat generalists as legacy, outdated approaches.

In essence VoltDB typifies what I call the specialism paradox, namely that to perform well in a specialist niche one must give up generalist capabilities.  It is true that sometimes new systems improve on and supplant older ones, but at least as often they do so only when they are not vastly more complex than the systems they are replacing.  For example, Stonebraker's comparison of high-performance OLTP to the shift from cable-operated excavation equipment to hydraulics misses the fact that hydraulic systems are not that much more complex than cables and such.  There is some additional complexity, but it isn't a huge gap.  A better comparison might be semi trucks, race cars, and locomotives.  These are all highly engineered systems, but the basic design considerations are very different, and specialization comes at the cost of flexibility.  You will never see race cars hauling twenty-foot containers of furniture cross-country, and you will never see locomotives travelling along our nation's highways to reach out-of-the-way towns.  The semi, however, will never pull the loads that the locomotive can, and will never win the Indy 500.  The semi truck, however, can go fast enough, and pull the loads it needs to pull.


  1. Hi Chris. Thanks for taking the time to write a bit about VoltDB.

    As a VoltDB engineer, the first thing I want to make sure is clear is that VoltDB allows for multiple queries per transaction, as well as complex decision logic within a transaction. The limitation is that this bundle of logic and SQL must be executed with a single round-trip from the client. This is indeed a limitation, but not quite as severe as it's worded in your post.

    Presumptively speaking for Dr. Stonebraker, I think he largely agrees with you. The point of that triangle diagram is to show that generalist, open-source RDBMSs like Postgres are so good these days that it’s difficult to offer value as a commercial generalist. Meanwhile, specialized systems like VoltDB are so good at the corners of the triangle that all that’s really left for legacy commercial RDBMSs is the middle of the edges.

    So speaking personally again, if you can solve a problem with Postgres on a single node, by all means do so. Postgres isn’t about to be replaced by VoltDB. VoltDB is, for the most part*, only interesting at scale, though what’s interesting to me is watching scale problems trickle down to more and more companies.

    Though whether your scale pain is tremendous or mild, we hear you that flexibility is important. First we made the system fast and safe; much of the work we do today is to make this system easier to use and applicable to more use cases.

    Specific feedback is always welcome, especially in the form, “If you did/supported X, it would be easier for me to build an application”. Multiple round-trips per transaction are probably not going to happen anytime soon, but a lot of other ideas are on the table.

    Thanks again.

    *Some of our users aren’t in absolute scale pain, but have chosen VoltDB for other reasons. HA is a good example; VoltDB is probably the easiest system in the world to run in redundant HA mode. Getting HA working with traditional RDBMSs has gotten better, but it’s still not trivial.

    1. Hi John;

      Thanks for the correction. My sense is that multiple round-trips per transaction are excluded as a design choice, and that adding them would pose fundamental performance challenges for VoltDB (greatly increased memory requirements, possible locking requirements, etc.). I don't see how you could do this without sliding back towards the generalist solutions of on-disk, row-oriented database management systems, or drastically limiting the amount of data stored in the DBMS.

      Secondly, regarding single query vs. single round-trip, we may simply be speaking different languages, since I am used to invoking stored procedures from queries (as that's the way Pg does it). There is no limitation on what you can do in a single query unless there is a need for intermediate user input. Obviously what you can't do is hold a transaction open and ask, "do you really want to reserve this seat on this airplane?" My experience in Pg is that when you try to code this sort of thing across an application and stored procedures, you get a huge performance hit (often larger than an order of magnitude). Indeed, the example that comes to mind saw a query take approximately 20x as long due to having to handle cross-transactional locking in the application. The locks had to persist between subsequent HTTP requests in a single application session via CGI.

      My larger point, though, is that VoltDB is a neat product, but it is also one which illustrates the tradeoffs between generalization and specialization. It is extremely good at what it does, but that comes at the cost of doing a lot of other things less well. For the technical problems it is designed to solve, it is unmatched in the open source world.

      However, when looking at something like an ERP solution, the design goals just don't match, nor do they match for anything else on the vertices of the triangle. You can't go with NoSQL (since on-the-fly aggregation and reporting is central), or column stores (since most of your work is still OLTP), or New OLTP solutions (for the same reasons you can't go with NoSQL). In these complex applications, the only reasonable approach is that of the generalist RDBMSs.

      I am not saying Stonebraker would disagree. In fact, judging from his presentations, he probably agrees more than he disagrees.

    2. Re-reading my reply I see two things that should probably be clarified.

      First, when I think about the possibility of holding transactions open for additional round-trips, it seems to me that MVCC becomes a lot more problematic. Instead of main memory having to hold your working data set plus the rows a single stored procedure might write, it now has to hold your working data set plus the rows your maximum number of concurrent transactions might write across an arbitrary number of stored procedures, and so you slide back toward a disk-based RDBMS. This is why I see this as the result of a fundamental design decision, and not something that can be changed in the future.
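      A back-of-the-envelope calculation makes the scaling argument concrete. All the figures below are hypothetical, chosen only to illustrate why open transactions multiply memory requirements:

```python
# Rough memory model for an in-memory MVCC store.  All numbers are
# hypothetical, chosen only to illustrate the scaling argument.
working_set_gb = 64      # the live data set, resident in RAM
row_bytes = 200          # average size of one row version
rows_per_txn = 10_000    # rows one stored procedure might write

def extra_versions_gb(open_txns):
    """Row versions that must be retained while transactions stay open."""
    return open_txns * rows_per_txn * row_bytes / 1024**3

# Single-round-trip model: effectively one transaction in flight,
# so the overhead is negligible next to the working set.
print(round(working_set_gb + extra_versions_gb(1), 2))       # 64.0

# Hold transactions open across client round-trips, and the retained
# versions scale with every concurrently open transaction instead.
print(round(working_set_gb + extra_versions_gb(50_000), 2))  # 157.13
```

With these made-up numbers, RAM demand more than doubles, which is the sense in which the design slides back toward disk-based storage.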

      The reason why ERP systems don't match VoltDB is similar to the NoSQL case, but a bit narrower. While NoSQL doesn't typically allow reasonably wide-ranging ad hoc reporting, ERP systems rely heavily on ad hoc reporting over large data sets which must be instantly up to date with OLTP entry. A small retail environment (say, two cash registers in a convenience store) might generate reports that run across millions of rows, aggregating them and returning a few thousand to the user. Larger environments may aggregate tens of millions of rows on an as-needed basis.

      The thing is, these reports have to be absolutely up to date. In other words, if an accountant approves a GL transaction, it needs to show in the trial balance report right away. Sure, you can do materialized views and the like, but that adds complexity. Consequently, ERP applications sit essentially in the center of the triangle. They do some OLTP, some OLAP, and some other stuff, and all of it is tightly connected.

  2. One interesting thought occurred to me. There often seems to be a 1-2 order of magnitude performance difference between using specialized tools and using generalized tools. Trying to do row locking in the application, because locks must persist across stateless HTTP requests, seems to cause an *instant* performance hit of approximately 20x (in one case I tested, a query went from 2 sec to 45 sec). In essence that's going from a specialized tool (the database's locking mechanisms) to a general one (storing and checking locking data in the database). With further effort, maybe we could get it down to only one order of magnitude slower, but such is life. The real solution, where this is an issue, is to move to a thick client and use SELECT ... FOR UPDATE.
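    The application-level workaround I have in mind looks roughly like this. This is a toy in-memory stand-in for a lock table (the dict plays the role of the table, and the names `acquire`/`release` are my own); in the real case every check and insert is another query against the database, which is where the hit comes from:

```python
import time

# Toy stand-in for a "locks" table the application must maintain because
# HTTP is stateless and SELECT ... FOR UPDATE cannot outlive a request.
locks = {}  # row_id -> (session_id, expiry)

def acquire(row_id, session_id, ttl=300):
    """Each check/insert here is really an extra round trip to the DB."""
    now = time.time()
    holder = locks.get(row_id)
    if holder and holder[0] != session_id and holder[1] > now:
        return False                       # another session holds the row
    locks[row_id] = (session_id, now + ttl)
    return True

def release(row_id, session_id):
    """Drop the lock, but only if this session actually owns it."""
    if locks.get(row_id, (None,))[0] == session_id:
        del locks[row_id]

# Request 1 (session "a") locks a row; request 2 (session "b") is refused
# until "a" releases, even though the two HTTP requests share no state.
assert acquire("invoice:42", "a")
assert not acquire("invoice:42", "b")
release("invoice:42", "a")
assert acquire("invoice:42", "b")
```

Note everything the database's own row locks give for free now has to be re-implemented by hand: expiry (the `ttl`), ownership checks, and cleanup after crashed sessions.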

    This comes down to using the right tool for the job and suggesting that if the database market grows, there will be room for more tools.

  3. Great post, really enjoyed reading!

    I work for ScaleBase. I invite you to look at our solution to the well-known problem of RDBMS performance, latency and scalability.
    I couldn't agree more with Stonebraker's point at 14:36: most of a DB's processing power goes to "DB engine stuff", and the more data and the more concurrent sessions, the more severe the problem gets.

    We're going with a simple-to-understand approach of just splitting data, and also sessions, across multiple DBs. The heaviness of the good old RDBMS is still there, but it's isolated and contained to x sessions and y GB of data. We never hear people complain about their databases when there are 100 users and 10GB of data... Data appears only once in the array, and the database still locks, manages, and does all the good things, but there are 10 databases doing it in parallel over much smaller portions.
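    As I understand the approach being described, it amounts to hash-based routing of rows and sessions to one of several ordinary databases. A minimal sketch of that routing idea (my own illustration, not ScaleBase code; the node names are invented):

```python
import hashlib

SHARDS = [f"mysql-{i}" for i in range(10)]  # ten ordinary MySQL nodes

def shard_for(key):
    """Route a row (or a session) to one shard by hashing its key, so
    each node sees only ~1/10 of the data and of the sessions, while
    still doing its own locking, MVCC, recovery, and so on."""
    digest = hashlib.sha1(str(key).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every query about the same customer lands on the same node, so that
# node's own locking and transaction machinery still applies locally.
assert shard_for("customer:12345") == shard_for("customer:12345")
print(shard_for("customer:12345"))
```

The open question such a sketch leaves out, of course, is what happens to queries and transactions that span shards, which is exactly where the generalist single-node database has the advantage.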

    The DB is still the same DB, MySQL... And in the "generic" tests we seem to do quite well, as well as on scale and performance, of course.

    I would love to hear your thoughts about it.

    1. How does this compare to, or differ from, the emerging PostgreSQL solution in this area, Postgres-XC, which offers Teradata-style clustering for PostgreSQL?