Saturday, July 28, 2012
The perpetual argument over putting logic in the database will continue, but here is one advantage that is often overlooked: stored procedures provide a powerful way to integrate programs written in different environments.
Case in point: Yesterday, when I got tired of staring at Perl code, I wrote some basic PHP classes to provide integration logic for LedgerSMB. The basic interface, which performs the query-mapping functions, took me a couple of hours including debugging (and I haven't programmed in PHP since 4.2 was current), and it includes an additional class for retrieving and saving company records as well. Not everything works yet, but this will probably be finished up today.
These classes will make it quite easy to write integration logic against the portions of the software that have been moved over to stored procedures, and code that calls the stored procedures this way will be provably free of SQL injection and privilege escalation.
I haven't decided where to put this, whether in the main LedgerSMB project or in another Sourceforge project, but it is coming soon. Once these are working I intend to teach myself enough Java to write the same classes in that language.
In the long run, there is a lot of boilerplate code here that can probably be generated by code generators in all of these languages. The code generators will eventually have to query the system catalogs, and some custom application catalogs. This will minimize the amount of code that has to be written by hand.
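To make the integration point concrete, here is a minimal sketch of the database side (the function name, table, and columns are hypothetical, not the actual LedgerSMB API). Any client language, PHP, Perl, or Java, then calls the function with bound parameters only, which is why the calling code cannot introduce SQL injection:

CREATE OR REPLACE FUNCTION company__get(in_control_code text)
RETURNS TABLE (id int, control_code text, legal_name text) AS
$$
BEGIN
    -- All SQL lives here, server-side; clients never build query strings.
    RETURN QUERY
        SELECT c.id, c.control_code, c.legal_name
          FROM company c
         WHERE c.control_code = in_control_code;
END;
$$ LANGUAGE plpgsql;

-- A client simply prepares:  SELECT * FROM company__get($1)
-- and binds the argument, regardless of the language it is written in.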
Friday, July 20, 2012
On Long Queries/Stored Procedures
In general, in most languages, programmers try to keep their subroutines down to a couple screens in size. This helps, among other things, with readability and ease of debugging. Many SQL queries are significantly longer than that. I have queries which are over 100 lines long in LedgerSMB, and have worked on debugging queries that are more than 200 lines long. In general, though, the maximum length of a query is not a problem if certain practices are followed. This post describes why I think SQL allows for longer procedures and what is required to maintain readability.
Why Long Queries Aren't a Problem Per Se
In a typical programming language, there is no enforced structure within a subroutine. What this means is that as a subroutine becomes longer, maintaining internal patterns and an understanding of state becomes more difficult. SQL queries themselves, however, are defined by their structure. For example, a standard SELECT statement (excluding common table expressions) includes the following parts, in order (a short annotated sketch follows the list):
- Returned column list
- Starting relation
- Join operations
- Filters
- Aggregation criteria
- Aggregation filters
- Postprocessing (ordering, limits, and offset)
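To make the structure concrete, here is a minimal sketch of a query (the table and column names are hypothetical) with each structural part labelled in the order it appears:

SELECT c.name, sum(i.amount) AS total        -- returned column list
  FROM invoice i                             -- starting relation
  JOIN customer c ON c.id = i.customer_id    -- join operations
 WHERE i.transdate >= '2012-01-01'           -- filters
 GROUP BY c.name                             -- aggregation criteria
HAVING sum(i.amount) > 0                     -- aggregation filters
 ORDER BY total DESC                         -- postprocessing
 LIMIT 50 OFFSET 0;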
Because of this structure, it is pretty easy to dive into even a very long query and quickly figure out exactly where the problems are. Maintainability is not dependent on length or overall complexity.
Moreover, if you have common table expressions, it is easy to jump back to the beginning, where they are defined, and reference them as needed.
A second difference is that SQL statements work with state which is typically assumed to be unchanging for purposes of the operation. With rare exceptions, the order of execution doesn't matter (order of operations however does). Consequently you don't have to read line-by-line to track state.
In essence debugging an SQL statement is very much like searching a b-tree, while debugging Perl, Python, or C is very much like traversing a singly linked list.
What Can Be a Problem
Once I spent several days helping a customer troubleshoot a long, complex query. The problem turned out to be a bad filter parameter, but we couldn't tell that right away. The reason was that the structure of the query had decayed a bit and this made maintenance difficult.
The query was not only long, but it was also difficult to understand, because it didn't conform to the above structure. The problem in that case was the use of many inline views to create a cross-tab-type report.
If you can't understand a long query, or if you don't know immediately where to look, it is hard to troubleshoot.
Best Practices
The following recommendations are about keeping long queries well-structured. Certain features of the SQL language are generally to be avoided more as queries become longer, and other features should be used carefully.
- Avoid inline views. OK, sometimes I use inline views in longer queries too, but usually these are well-tested units in their own right that are re-used elsewhere. They should probably be moved into defined views, or into CTEs where applicable (see the sketch after this list).
- Avoid UNION and UNION ALL in long queries. These complicate query maintenance in a number of ways. It is better to move them into small testable units, like a defined view based on a shorter query or a common table expression.
- In long stored procedures, keep your queries front and center, and move as much logic as is reasonable into them.
- Avoid implicit joins, which list every table in the FROM clause as a cross join and then push the join conditions into the WHERE clause alongside the filters. Join logic should be kept separate from filter logic.
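As a sketch of the first point (the table names are hypothetical), an inline view buried in the FROM clause can usually be pulled up into a named CTE, where it can be read, tested, and referenced on its own:

-- Before: inline view, harder to find and test
SELECT c.name, o.order_total
  FROM customer c
  JOIN (SELECT customer_id, sum(amount) AS order_total
          FROM orders GROUP BY customer_id) o
    ON o.customer_id = c.id;

-- After: the same unit named up front as a CTE
WITH order_totals AS (
    SELECT customer_id, sum(amount) AS order_total
      FROM orders
     GROUP BY customer_id
)
SELECT c.name, ot.order_total
  FROM customer c
  JOIN order_totals ot ON ot.customer_id = c.id;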
Tuesday, July 17, 2012
Why LedgerSMB uses Moose. An intro for PostgreSQL folks.
In LedgerSMB 1.4 we are moving to using a Perl object system called Moose for new code. This post will discuss why we are doing so, what we get out of it, etc. For those new to Moose this may serve as a starting point to decide whether to look further into this object framework.
To start with, however, we will need to address our overall strategy regarding data consistency and accuracy.
Why we Use Database Constraints Aggressively
Before getting into Moose specifically it's important to re-iterate why we use database constraints aggressively in LedgerSMB.
In general you can divide virtually all application bugs into a few categories which fall within two major classifications:
- State Errors
- Execution Errors
Database constraints allow us to declare mathematically the constraints of meaningful data and thus drastically reduce the chance of state errors occurring in the application. Foreign keys protect against orphaned records, type constraints protect against invalid data, and check constraints can be used to ensure data falls within meaningful parameters. NOT NULL constraints can protect against necessary but missing information.
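As a minimal sketch of what declaring the constraints of meaningful data looks like (the table and columns here are illustrative, not the actual LedgerSMB schema):

CREATE TABLE invoice_line (
    id          serial PRIMARY KEY,
    invoice_id  int NOT NULL REFERENCES invoice (id),   -- no orphaned lines
    description text NOT NULL,                          -- required information
    qty         numeric NOT NULL CHECK (qty <> 0),      -- meaningful parameters
    sell_price  numeric NOT NULL CHECK (sell_price >= 0)
);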
The other type of error is an execution error. These can be divided into two further categories: misdirected execution (confused deputy problems and the like) and bad execution (cases where the application takes the right information and does the wrong things with it).
Any reduction in state errors has a significant number of benefits for the application. Troubleshooting is simplified, because the chance of a cascade of problems from previously stored bad data is reduced. This leads to faster bugfixes and a more robust and secure application generally. Not only are the remaining problems simpler, there are also fewer of them. We see this as a very good thing. While it is not necessarily a question of alternatives, testing cannot compare to proof.
One really nice thing about PostgreSQL is its very rich type system and the ability to add equally rich constraints against those types. The two together make it an ideal database for this sort of work.
Enter Moose
Recently, Perl programmers have been adopting a system of object-oriented programming based on metaclasses and other concepts borrowed from the LISP world. The leading object framework to result is called Moose, which bills itself as a post-modern object system for a post-modern language (Perl). Moose offers a large number of features; the following are the ones we consider most important:
- A rich type system for declaring object properties and constraints
- Transparent internal structures
- Automatic creation of constructors and accessors
- Method modifiers.
Brief Overview of Useful Type Features
In plain old Perl 5 we would normally build accessors and constructors to check specific types for sanity. This is the imperative approach to programming. With Moose we do so declaratively and everything just works.
For example we can do something like this, if we want to make sure the value is an integer:
has location_class => (is => 'rw', isa => 'Int');
This would be similar to having a part of a table definition in SQL being:
location_class int,
This tells Moose to create a read/write accessor and to check, both in the constructor and whenever the accessor sets the value, that the value is a valid Int.
We can also specify that this is a positive integer:
subtype 'PosInt', as 'Int', where { $_ > 0 };
has location_class => (is => 'rw', isa => 'PosInt');
This would be similar to using a domain:
CREATE DOMAIN posint AS int check(VALUE > 0)
Then the table definition fragment might be:
location_class posint,
But there is a huge difference. In the SQL example, the domain check constraint (at least in PostgreSQL) is only checked at data storage. If I do:
select -1::posint;
I will get -1 returned, not an error. Thus domains are useful only when defining data storage rules. The Moose subtype however, is checked on every instantiation. It defines the runtime allowed values, so any time a negative number is instantiated as a posint, it will generate an error. This then closes a major hole in the use of SQL domains and allows us to tighten up constraints even further.
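For completeness, here is a minimal sketch (with a hypothetical table name) of the storage-time enforcement the domain does give us:

CREATE TABLE location_test (location_class posint);
INSERT INTO location_test VALUES (1);   -- succeeds
INSERT INTO location_test VALUES (-1);  -- fails with a check constraint violation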
So here we can simply specify the rules which define the data and the rest works. Typical object oriented systems in most languages do not include such a rich declarative system for dealing with state constraints.
The type definition system is much richer than the above examples show: we can require that attributes belong to specific classes, specify default values, control when the default is created, and much more. Moose is a rich system in this regard.
Transparent Internal Structures
Moose treats objects as blessed references to hash tables (in Perl we'd call this a hashref), where every attribute is simply listed by name. This ensures that when we copy and sanitize data for display, for example, the output is exactly what we expect. Consequently, if I have a class like:
package location;
use Moose;   # provides 'has' and the type constraints used below
has location_class => (is => 'rw', isa => 'Int');
has line_one => (is => 'rw', isa => 'Str');
has line_two => (is => 'rw', isa => 'Maybe[Str]');
has line_three => (is => 'rw', isa => 'Maybe[Str]');
has city => (is => 'rw', isa => 'Str');
has state => (is => 'rw', isa => 'Str');
has zip => (is => 'rw', isa => 'Maybe[Str]');
has country_name => (is => 'rw', isa => 'Maybe[Str]');
has country_id => (is => 'rw', isa => 'Int');
In that case, when we go to look at the object as a hash, we can print values from it directly in our templates. Our template wrapper can copy the resulting hash value for value, escape each value (in a new hash) as appropriate for the output format, and then pass that on to the templating engine. This provides cross-format safety that accessor-based reading of attributes would not provide.
In other words, we copy these objects as hashrefs, pass them to the templates, and they just work.
Accessors, Constructors, and Methods
Most of the important features here follow from the above quite directly. The declarative approach is used to create constructors and accessors, and this provides a very different way to think about your code.
Now Moose also has some interesting features regarding methods which are useful in integrating LedgerSMB with other programs. These include method modifiers which allow you to specify code to run before, instead of, or after functions. These can take the place of database triggers on the Perl level.
Pitfalls
There are two or three reasons folks may choose not to use Moose, aside from the fact that it does have some overhead.
The first is complexity. Moose has a lot of features and learning how to program effectively in Moose takes some time, though this is reduced by approaching it from a database perspective.
The second is that sometimes it is tempting to look into the raw data structure directly and fiddle with it, and if you do this then the proofs of accuracy are invalid. This is mitigated by the use of correct controls on the database level.
Features we are Still Getting Used To
This isn't to say we are Moose experts already. Moose has a number of concepts we haven't started using yet, like roles, which add cross-class functionality orthogonal to inheritance and which are highly recommended by other Moose programmers.
There are probably many other features too that will eventually come to be indispensable but that we aren't yet excited about.
Future Steps
On the LedgerSMB side, in the long run, I would like to move toward creating our classes directly from the database using code generators. We could query the system catalogs for the stored procedures to expose as methods as well. This is an advanced area we probably won't see much of for a while.
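As a rough sketch of the kind of catalog query such a generator could start from (the naming pattern shown is hypothetical), PostgreSQL can already tell us the argument and return types of every stored procedure:

SELECT p.proname,
       pg_get_function_arguments(p.oid) AS arguments,
       pg_get_function_result(p.oid)    AS returns
  FROM pg_proc p
  JOIN pg_namespace n ON n.oid = p.pronamespace
 WHERE n.nspname = 'public'
   AND p.proname ~ '^company__'
 ORDER BY p.proname;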
It's also likely that roles will start to be used, and perhaps DBObject will become a role instead of an inheritance base.
Update:
Matt Trout has pointed out that setting required => 0 is far preferable to Maybe[] and helped me figure out why that wasn't working before. I agree. Removing the Maybe[] from the type definitions would be a great thing. Thanks Matt!
Wednesday, July 11, 2012
PostgreSQL CTEs and LedgerSMB
LedgerSMB trunk (which will become 1.4) has moved from connectby() to WITH RECURSIVE common table expressions (CTEs). This post describes our experience, why we have found that CTEs are an extremely useful way to write queries, and why we are moving more and more towards them. We have also started using CTEs more frequently in place of inline views where appropriate.
CTEs were first supported in PostgreSQL 8.4. They represent a way to write a subquery whose results remain stable for the duration of the query. CTEs can be simple or recursive (the recursive form generates tree-like structures via self-joins). The stability of CTEs adds both opportunities for better tuning of a query and some performance gotchas: it is important to remember that the optimizer cannot correlate expected results from the rest of the query with a CTE in order to optimize the CTE (though it presumably can go the other way).
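As a small sketch of that last point (true of the PostgreSQL versions discussed here), a filter in the outer query is not pushed down into the CTE; the CTE body runs as written and the filter is applied only to its result. The column names below come from the queries later in this post:

WITH totals AS (
    SELECT trans_id, sum(amount) AS total
      FROM acc_trans
     GROUP BY trans_id
)
SELECT * FROM totals WHERE trans_id = 12345;
-- The aggregate runs over the whole acc_trans table first;
-- the outer WHERE filters the CTE's result afterwards.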
The First Example
The simplest example would be our view, "menu_friendly" which shows the structure and layout of the menu. The menu is generated using a self-join and this view returns the whole menu.
CREATE VIEW menu_friendly AS
WITH RECURSIVE tree (path, id, parent, level, positions)
AS (select id::text as path, id, parent, 0 as level, position::text
FROM menu_node where parent is null
UNION
select path || ',' || n.id::text, n.id, n.parent, t.level + 1,
t.positions || ',' || n.position
FROM menu_node n
JOIN tree t ON t.id = n.parent)
SELECT t."level", t.path,
(repeat(' '::text, (2 * t."level")) || (n.label)::text) AS label,
n.id, n."position"
FROM tree t
JOIN menu_node n USING (id)
ORDER BY string_to_array(t.positions, ',')::int[];
This replaced an older view using connectby() from the tablefunc module in contrib:
CREATE VIEW menu_friendly AS
SELECT t."level", t.path, t.list_order,
(repeat(' '::text, (2 * t."level")) || (n.label)::text) AS label,
n.id, n."position"
FROM (connectby('menu_node'::text, 'id'::text, 'parent'::text,
'position'::text, '0'::text, 0, ','::text
) t(id integer, parent integer, "level" integer, path text,
list_order integer)
JOIN menu_node n USING (id));
The replacement query is longer, it is true. However, it has a number of advantages. The biggest is that we no longer have to depend on a contrib module which may or may not come with PostgreSQL (most Linux distributions package the contrib modules separately) and whose installation methodology may change (as it did in PostgreSQL 9.1).
I also think there is some value in having the syntax for the self-join a bit more obvious rather than having to run to the tablefunc documentation every time one wants to check. The code is clearer and this adds value too.
I haven't checked performance between these two examples, because this isn't very performance-sensitive. After all it is only there for troubleshooting menu structure, and it is only a set of read operations. Given other tests however I would not expect a performance cost.
Better Performance
LedgerSMB generates the menu for the tree-view in a stored procedure called menu_generate() which takes no arguments. The 1.3 version is:
CREATE OR REPLACE FUNCTION menu_generate() RETURNS SETOF menu_item AS
$$
DECLARE
item menu_item;
arg menu_attribute%ROWTYPE;
BEGIN
FOR item IN
SELECT n.position, n.id, c.level, n.label, c.path,
to_args(array[ma.attribute, ma.value])
FROM connectby('menu_node', 'id', 'parent', 'position', '0',
0, ',')
c(id integer, parent integer, "level" integer,
path text, list_order integer)
JOIN menu_node n USING(id)
JOIN menu_attribute ma ON (n.id = ma.node_id)
WHERE n.id IN (select node_id
FROM menu_acl
JOIN (select rolname FROM pg_roles
UNION
select 'public') pgr
ON pgr.rolname = role_name
WHERE pg_has_role(CASE WHEN coalesce(pgr.rolname,
'public')
= 'public'
THEN current_user
ELSE pgr.rolname
END, 'USAGE')
GROUP BY node_id
HAVING bool_and(CASE WHEN acl_type ilike 'DENY'
THEN FALSE
WHEN acl_type ilike 'ALLOW'
THEN TRUE
END))
or exists (select cn.id, cc.path
FROM connectby('menu_node', 'id', 'parent',
'position', '0', 0, ',')
cc(id integer, parent integer,
"level" integer, path text,
list_order integer)
JOIN menu_node cn USING(id)
WHERE cn.id IN
(select node_id FROM menu_acl
JOIN (select rolname FROM pg_roles
UNION
select 'public') pgr
ON pgr.rolname = role_name
WHERE pg_has_role(CASE WHEN
coalesce(pgr.rolname,
'public')
= 'public'
THEN current_user
ELSE pgr.rolname
END, 'USAGE')
GROUP BY node_id
HAVING bool_and(CASE WHEN acl_type
ilike 'DENY'
THEN false
WHEN acl_type
ilike 'ALLOW'
THEN TRUE
END))
and cc.path like c.path || ',%')
GROUP BY n.position, n.id, c.level, n.label, c.path, c.list_order
ORDER BY c.list_order
LOOP
RETURN NEXT item;
END LOOP;
END;
$$ language plpgsql;
This query performs adequately for users with lots of permissions but it doesn't do so well when a user has few permissions. Even on decent hardware the query may take several seconds to run if the user only has access to items within one menu section. In many cases this isn't acceptable performance but on PostgreSQL 8.3 we don't have a real alternative outside of trigger-maintained materialized views. For PostgreSQL 8.4, however, we can do this:
CREATE OR REPLACE FUNCTION menu_generate() RETURNS SETOF menu_item AS
$$
DECLARE
item menu_item;
arg menu_attribute%ROWTYPE;
BEGIN
FOR item IN
WITH RECURSIVE tree (path, id, parent, level, positions)
AS (select id::text as path, id, parent,
0 as level, position::text
FROM menu_node where parent is null
UNION
select path || ',' || n.id::text, n.id,
n.parent,
t.level + 1,
t.positions || ',' || n.position
FROM menu_node n
JOIN tree t ON t.id = n.parent)
SELECT n.position, n.id, c.level, n.label, c.path, n.parent,
to_args(array[ma.attribute, ma.value])
FROM tree c
JOIN menu_node n USING(id)
JOIN menu_attribute ma ON (n.id = ma.node_id)
WHERE n.id IN (select node_id
FROM menu_acl acl
LEFT JOIN pg_roles pr on pr.rolname = acl.role_name
WHERE CASE WHEN role_name
ilike 'public'
THEN true
WHEN rolname IS NULL
THEN FALSE
ELSE pg_has_role(rolname,
'USAGE')
END
GROUP BY node_id
HAVING bool_and(CASE WHEN acl_type ilike 'DENY'
THEN FALSE
WHEN acl_type ilike 'ALLOW'
THEN TRUE
END))
or exists (select cn.id, cc.path
FROM tree cc
JOIN menu_node cn USING(id)
WHERE cn.id IN
(select node_id
FROM menu_acl acl
LEFT JOIN pg_roles pr
on pr.rolname = acl.role_name
WHERE CASE WHEN rolname
ilike 'public'
THEN true
WHEN rolname IS NULL
THEN FALSE
ELSE pg_has_role(rolname,
'USAGE')
END
GROUP BY node_id
HAVING bool_and(CASE WHEN acl_type
ilike 'DENY'
THEN false
WHEN acl_type
ilike 'ALLOW'
THEN TRUE
END))
and cc.path::text
like c.path::text || ',%')
GROUP BY n.position, n.id, c.level, n.label, c.path, c.positions,
n.parent
ORDER BY string_to_array(c.positions, ',')::int[]
LOOP
RETURN NEXT item;
END LOOP;
END;
$$ language plpgsql;
This function is not significantly longer, it is quite a bit clearer, and it performs a lot better. For users with full permissions it runs twice as fast as the 1.3 version, but for users with few permissions, this runs about 70x faster. The big improvement comes from the fact that we are only instantiating the menu node tree once.
So not only do we have fewer external dependencies now but there are significant performance improvements.
A Complex Example
CTE's can be used to "materialize" a subquery for a given query. They can also be nested. The ordering can be important because this can give you a great deal of control as to what exactly you are joining.
I won't post the full source code for the new trial balance function here, but the main query, which contains a nested CTE, is worth walking through.
Here we are using the CTEs to achieve some stability across joins and avoid join projection issues, as well as to build a tree of projects, departments, and the like (business_unit is the table that stores these).
The relevant part of the stored procedure is:
RETURN QUERY
WITH ac (transdate, amount, chart_id) AS (
WITH RECURSIVE bu_tree (id, path) AS (
SELECT id, id::text AS path
FROM business_unit
WHERE parent_id = any(in_business_units)
OR (parent_id IS NULL
AND (in_business_units = '{}'
OR in_business_units IS NULL))
UNION
SELECT bu.id, bu_tree.path || ',' || bu.id
FROM business_unit bu
JOIN bu_tree ON bu_tree.id = bu.parent_id
)
SELECT ac.transdate, ac.amount, ac.chart_id
FROM acc_trans ac
JOIN (SELECT id, approved, department_id FROM ar UNION ALL
SELECT id, approved, department_id FROM ap UNION ALL
SELECT id, approved, department_id FROM gl) gl
ON ac.approved and gl.approved and ac.trans_id = gl.id
LEFT JOIN business_unit_ac buac ON ac.entry_id = buac.entry_id
LEFT JOIN bu_tree ON buac.bu_id = bu_tree.id
WHERE ac.transdate BETWEEN t_roll_forward + '1 day'::interval
AND t_end_date
AND ac.trans_id <> ALL(ignore_trans)
AND (in_department is null
or gl.department_id = in_department)
AND ((in_business_units = '{}' OR in_business_units IS NULL)
OR bu_tree.id IS NOT NULL)
)
SELECT a.id, a.accno, a.description, a.gifi_accno,
case when in_date_from is null then 0 else
CASE WHEN a.category IN ('A', 'E') THEN -1 ELSE 1 END
* (coalesce(cp.amount, 0)
+ sum(CASE WHEN ac.transdate <= coalesce(in_date_from,
t_roll_forward)
THEN ac.amount ELSE 0 END)) end,
sum(CASE WHEN ac.transdate BETWEEN coalesce(in_date_from,
t_roll_forward)
AND coalesce(in_date_to,
ac.transdate)
AND ac.amount < 0 THEN ac.amount * -1 ELSE 0 END) -
case when in_date_from is null then coalesce(cp.debits, 0) else 0 end,
sum(CASE WHEN ac.transdate BETWEEN coalesce(in_date_from,
t_roll_forward)
AND coalesce(in_date_to,
ac.transdate)
AND ac.amount > 0 THEN ac.amount ELSE 0 END) +
case when in_date_from is null then coalesce(cp.credits, 0) else 0 end,
CASE WHEN a.category IN ('A', 'E') THEN -1 ELSE 1 END
* (coalesce(cp.amount, 0) + sum(coalesce(ac.amount, 0)))
FROM account a
LEFT JOIN ac ON ac.chart_id = a.id
LEFT JOIN account_checkpoint cp ON cp.account_id = a.id
AND end_date = t_roll_forward
WHERE (in_accounts IS NULL OR in_accounts = '{}'
OR a.id = ANY(in_accounts))
AND (in_heading IS NULL OR in_heading = a.heading)
GROUP BY a.id, a.accno, a.description, a.category, a.gifi_accno,
cp.end_date, cp.account_id, cp.amount, cp.debits, cp.credits
ORDER BY a.accno;
This query does a lot, including rolling balances forward from checkpoints (which also effectively close the books), pulling only transactions which match specific departments or projects, and more.
Conclusion
We have found the CTE to be a tool whose importance and usefulness are hard to overstate. CTEs have given us maintainability and performance advantages, and they are very flexible. We will probably move towards using more of them in the future as appropriate.
Saturday, July 7, 2012
Progress towards LedgerSMB 1.4
I have been relatively submerged in work and haven't gotten back to blogging as much as I would have liked. On the whole this is a good thing.
A lot of things are changing in the LedgerSMB community for the better. For starters there has been a lot of work towards making the site more manageable. We expect this to be a lot better than it has been in the past.
Additionally my business is planning to address important as-yet-unmet needs that must be addressed if open source accounting/ERP software is to reach the mainstream. I will probably post more on this later.
Similarly, development had accelerated a great deal until the last couple of weeks, when I have had to focus on other things. In the next series of posts I expect to talk a bit about how we are utilizing CTEs and the like in LedgerSMB 1.4 and how new features of PostgreSQL are making our lives a lot easier. I will say that the more I work with CTEs the more I am convinced this is a feature which is highly underrated. I can't wait until we can start requiring PostgreSQL 9.1 so we can use them in write operations.
I will probably also talk a bit about Moose, why we are moving to it, our experiences, and the like. Some other discussions like changes to our architecture may be discussed as well.
However for now, I figure a progress report is due.
- A basic web services framework has been completed and work is ongoing to expose customers and vendors over the framework.
- Many CRM improvements have been added, including lead tracking and the ability to attach documents to customer/vendor records or to companies and persons. Many of these improvements affect employees too.
- A lot of code and schema clean-up in the CRM and employee management areas
- Projects and departments have been replaced by a more flexible system of reporting dimensions supporting funds accounting and more
- A new report framework has been written and many reports have already been moved to it.
I would say we are most of the way there.