Monday, February 18, 2013

A look back on something I wrote 9 years ago

I recently came across a comment I wrote in 2004 on Slashdot.  I read it and was surprised actually both where my approach had changed and where it hadn't.  The entire comment is worth posting with a few very minor edits:

First, I think that stored procs are not only often good but lead to better, more powerful, secure, and flexible applications that would be feasible without them. But on the other hand, they lead to hard to maintain applications which are explicitly tied to one database. So they are necessary but often misused.

Triggers are more important and usually use stored procedures to ensure that the information in the database is always meaningful or that some other automated activity happens on an insert or update.

Unless I absolutely have to, I try to avoid having my application call stored procedures directly. A relational database manager *should* be able to hide the stored procedures behind a view allowing a nice standard interface for your data. This means that if you have to move to another RDBMS later, porting is much more simple and mostly confined to the backend.

BTW, I agree with the points about having your business logic in one place. Stored procedures allow you to move business logic which is inherent to the database into the database, thus making it available from all clients regardless of the language they are written in. For a single app/db pair this is not an issue but if you have a large DB with many different clients, it is a larger issue. Maintaining your application in one location is a LOT less work than reimplementing it all in every one of your apps.

Triggers, BTW, as I mentioned before are very powerful mechanisms. They are not called by the app directly but are run every time a row is inserted, updated, or deleted (some RDBMS's also allow select triggers, though some have alternate ways of implementing this). They can be used to implement additional security restrictions, enforce referential integrity, or more advanced stuff such as send email when a new order is added to the database. Again, this is done regardless of whether the order is filed using a web app or a Windows app for the rep in the call center. Since the logic is unknown to the app, it isn't even aware of what happens after the order is placed. Talk about clean interfaces..... This requires stored procedures.

So, these are the great things about stored procedures. But when they are used badly, you end up with the stored procedures reducing the agility of the application because they tie it in too closely to the database. What do you do when your app is tied to Sybase and your customers want MS SQL Server? What if *all* your logic is in that database? Do you rewrite *all* of it for MS SQL Server? Probably Not. You are stuck and out of your investment.

In my opinion, it is important to maintain a clear line between what is the database's job and what is the application's job. If this line is blurred, bad things can result. Stored procedures are a very easy way to blur this line.

I design my apps with the following layers to them:

Application-specific logic
Database Access Layer
Information Presentation (SQL API using views)
Information Management and Data Logic (usually stored procedures)
Information Storage (highly normalized, usually).

The comment was made as part of a discussion on stored procedures. I was discussing at the time an open source CRM solution I was working on which never gained much traction in part because it was one of those things I did primarily to learn PHP, and so I made some important errors in architecture on the PHP side, and that I made a bunch of other errors in managing the project.  I learned a lot from the failure both in what to not to do but just as importantly what was done right.

In reading the post I am struck by two things.  The first is that my understanding of the problems has advanced less than my understanding of solutions, and also because my approach to the solutions at the time and today aim squarely at the same problems.

I have said before many times that one of the primary causes of grief with stored procedures is the fact that they very often create very brittle API's which require way too many changes in code to work right.  The choice to go stored procedure-based then is one which has a number of possible hazards (and different db's have different ways of addressing this), and rolling out applications against a new revision of stored procedures can be a problem which can become more difficult to coordinate over time (Oracle's "Edition-based revision" system helps with this a great deal).

The way I tried to solve this back  a decade ago was to wrap functions in views, so that an application would not need to know the semantics of the call, and in fact everything could be done in dynamic sql.  This has the advantage of entirely hiding the call structure of the stored procedure, but it has the disadvantage of making it somewhat difficult to come up with a view structure sufficiently robust and flexible as to be generally useful.  This also was intended to ensure that the application could just send ansi-SQL and get answers back.

As I look at this solution I recognize that today there are no perfect solutions but that this was a failed but noble effort to deal with a real problem (but one that worked surprisingly well for the app I was building.

The approach we take in LedgerSMB is both similar and different from the approach we took in HERMES.  The basic layers are still there, but the tools in each layer are different:

UIPHPTemplate Toolkit
Application-specific logicPHPPerl
Database Access LayerHand written PHP moduleDBI
Information PresentationViewsMoved up a level, a service locator written in Perl
Information Management and Data LogicStored Procedures, pretty basicStored procs plus named args
Information StorageNormalized tablesNormalized tables

The biggest differences between then and now are worth considering.  Instead of insisting that the public API is relational, the stored procedures form a discoverable API, and this allows one to utilize much more sophisticated stored procedures.  I couldn't imagine doing something like a trial balance or income statement in a view-based API.  Having access to function arguments makes a huge difference there.

To accomplish this the information presentation was moved above data access, and turned into a service locator rather than a simple wrapper.  More needs to be done on this wrapper, but it is a significant improvement over my previous attempts to approach these problems,  The approach is still a work in progress but the automation is sufficiently simple currently as to add a level of robustness not present before but not sufficient to have opacity-related bugs frequently.

I am much happier with the current approach I am using, but it is interesting to note how how little my approach has changed over the last decade or so.


  1. Could you expand on how the service locator works?

    1. I think the best way to do this would be to do an in-depth post of how this works next week.