Monday, October 8, 2012

Three Approaches to Object-Relational Databases: PostgreSQL, Oracle, and Informix

PostgreSQL vs Informix

Probably the closest database object-relationally to Postgres is Informix.  Informix in fact got its object-relational approach with the purchase of Illustra, a Postgres fork.  Illustra however split from Postgres before the latter adopted SQL, and so the SQL implementations are entirely independent.

Informix has perhaps had the most influence of any database software on how the industry sees object-relational databases and so because of the share heritage and Informix's place in the industry, it is the first comparison to make.

Table Methods

I was not able to find information on class.method notation in the Informix documentation.  As far as I can tell, Informix requires the methods to be called using function(object) syntax.  This is along the lines of Stonebraker's image processing example.  In this way the connection between structure and function feels somewhat more bolted on than it does in PostgreSQL.  However it would be a mistake to see this as the entirity of Informix's object-relational capabilities.


Informix supports single inheritance for both types and tables using the UNDER supertype syntax.  UNDER, similar to INHERITS in PostgreSQL establish an "is-a" relationship between the supertype and the subtype.  Unlike in PostgreSQL, indexes and referential integrity is inherited in Informix, meaning that foreign keys work properly in both directions.  Single inheritance is thus quite a bit more mature on Informix than on PostgreSQL, but the lack of multiple inheritance prevents composition of tables by in-lining type attributes (which is possible on PostgreSQL but not on Informix).

This shows that Informix has a different approach to table inheritance, namely that there is a simple use case which it supports very well and is quite well polished, but more complex use cases are beyond it.  In PostgreSQL, on the other hand, declarative referential integrity doesn't work and thus referential integrity requires writing your own constraint triggers.

Return Results

A select * from parent_table in informix returns all attributes of all rows from the parent table and all descendant tables.  This can lead to a situation where the result set is "jagged" (i.e. where the rows have different numbers of elements), where child tables add additional columns.  In this case, it is necessary to check the row definition when receiving each row.

One way to look at it is that both PostgreSQL and Informix return a set of objects, bot PostgreSQL coerces them into the types that are asked for, while informix returns them as they are.  Thus if you select * from person, and employee inherits person, then you automatically get all the employee information as well.

This feature is as far as I know, unique to Informix.  I know of no other ORDBMS that allows a result set to be jagged in this way, but it shows one important problem that happens when one starts merging object-oriented and relational approaches.  What does select * mean in this case?  It isn't self-evident, and therefore Informix and PostgreSQL take different approaches.

Alternative Approaches

In the absence of a system of composing tables by in-lining types, the way multiple inheritance works in PostgreSQL, table composition requires using row types as columns in Informix.  The use of complex types in this way in Informix is much more mature than it is in PostgreSQL (which only even began to support this very recently recently).

Consequently composition is done primarily through member types, but this evokes an important tradeoff between ease of querying in a relational sense and rich modelling.

I personally find the SQL required to make elegant use of columns as types in this way somewhat ugly but I recognize this is a personal practice.  Still, consider the following:

SELECT print(, mailing_label(e.address) from employees e;

If address and name are both complex types then whatever we are going to do is going to be a bit ugly.  There isn't much we can do about that.

PostgreSQL and Informix vs Oracle

Oracle in some ways shows some influence from Informix but it takes the approach in a somewhat different direction.  Oracle makes no bones about being a relational database first and foremost and has adopted an approach which avoids, rather than answers, questions of proper behavior.  Oracle Objects in some ways resemble Informix's approach and in some ways PostgreSQL's, but they are in general more different than the same.

Tables and Types 

Oracle objects tends approaches the question of object to relation equivalence by allowing types to be inherited, while tables cannot be.  Tables can copy type structures for their schemas however.  Thus one generally has to create an object model first and then create tables to store those objects.  This approach sets up very different patterns and antipatterns than one would get in Informix and PostgreSQL where tables themselves can inherit.  On one hand this separates (forcibly) data holding tables and their abstract parents.  On the other, this makes it harder to work with, except where complex types are being used in columns of a table.

It is worth noting that types support methods in Oracle and this simplifies things greatly.  However, I am left wondering why one would use Oracle objects instead of virtual columns and just treat Oracle as a form of relational database management system with little use of object extensions.

Object Composition Approaches

In Oracle the only approach that works is to use columns to store complex types.  Forunately those types can have methods so the SQL is not as ugly as it would be on Informix.  You can:

select, e.address.mailing_label() from employees e;

This strikes me as a bit more readable.

It seems to me that Oracle Objects have two main uses.  The first is in the logical model of the database although this role can be taken over by ORMs to some extent.  The second and more attractive approach is to use Oracle Objects not for their advertised use of modelling of things like customers or invoices but rather to create intelligent datatypes for columns.

For example, if I want to store images and dynamically determine what type of image we are looking for, I could still do something like:

SELECT id FROM slides s
 WHERE s.picture.is_of('sunset');

This is far cleaner SQL-wise than the equivalent syntax in Informix or PostgreSQL:

SELECT id FROM slides s
 WHERE is_of(s.picture, 'sunset'); 

This allows the same sorts of problems to be addressed as Stonebraker talks about, and it allows structure and logic to be clearly tied together, at the expense of substitutability as is found in table inheritance in both Informix and PostgreSQL.

The complexity costs are likely to be worth it in Oracle in a very different set of cases than would be the case in PostgreSQL or Informix.  These cases intersect at queries which must return data based on very complex criteria which cannot be reduced to relational algebra and predicate calculus.

Final Thoughts

The above discusses three different approaches to the question of how to encapsulate complex data into a database which may need to query based on arbitrary criteria not readily reducible to relational algebra and predicate calculus.

PostgreSQL and Informix both tightly integrate object handling far more deeply than Oracle.  Informix seems more polished in the areas it supports, but PostgreSQL has a broader support for the overall idea.

Oracle's approach is essentially to move object handling capabilities to the column level for the most part.  Yes, you an create tables of objects, but you cannot inherit them directly and you cannot compose your object model using multiple parents without moving everything to the column level.  This makes object behavior mostly useful in the creation of "smart column types."

Each of these three ORDBMS's takes its own approach.  All three allow the development of very intelligent database models.  All three pose different complexity issues.

No comments:

Post a Comment