Perspectives on LedgerSMB: November 2013

Saturday, November 23, 2013

Reporting in LedgerSMB 1.4: Part 5, Conclusions

I hope many of you have enjoyed this series. We've tried hard to avoid inner platform syndrome here by making reporting something that a developer does.

Skills Required to Write Basic Reports

The developer skills required to write reports tend to fall on the database side. In general one should have:

A good, solid understanding of SQL and PL/PGSQL in a PostgreSQL environment. This is the single most important skill and it is where most of the reporting effort lies.
Basic understanding of Perl syntax. Any basic tutorial will do.
A basic understanding of Moose. A general understanding of the documentation is sufficient, along with existing examples.
A very basic understanding of the LedgerSMB reporting framework as discussed in this series.

These are required for general tabular reports, and they allow you to build basic tabular reports that can be output in HTML, CSV, ODS, and PDF formats.

Skills Required to Write More Advanced Reports

For more advanced reports, such as new financial statements, government forms, and the like, the following skills are required. These are not fully discussed here. These typically require, additionally:

An in-depth understanding of our HTML elements abstraction system (this will be discussed in a future post here)
A general proficiency with Template Toolkit, possibly including the LaTeX filter for running portions of the template through a LaTeX filter.

Strengths

The reporting framework here is very database-centric. In general you cannot have a non-database-centric reporting structure because the data resides in the database, and some knowledge there is required to get it out in a working form. We have tried hard to make a system where only minimal knowledge elsewhere is required to do this. If you have db folks who work with your finance folks, they can write the reports.

Weaknesses

Ad hoc reporting is outside the scope of this reporting. A one-off report is unlikely to be particularly helpful. Additionally this generates reports as documents that can be shared. Updating the data requires running the report again, and while this can be done as a sharable URL, it is not necessarily ideal for all circumstances.

Other Reporting Options

In cases where this reporting framework is not ideal, there are a number of other options available:

Views can be made which can be pulled in via ODBC into spreadsheets like Excel.
Third party report engines like JasperReports can be used instead, and
One-off SQL queries in PSQL can be used to generate HTML and (in the most recent versions) LaTeX documents that can be shared.

Wednesday, November 20, 2013

Writing Reports in LedgerSMB 1.4: (Mostly-) Declarative Perl Modules

So far we have talked about getting data in, and interacting with the database. Now we will talk about the largest of the modules and cover workflow scripts in relation with this stage,

At this point you would have a filter screen, a user defined function which would take the arguments from that screen's inputs (prefixed with 'in_' usually to avoid column name conflicts), and a tabular data structure you expect to return.

As a note here, all the code I am trashing here is my own, in part because I have learned a lot about how to code with Moose over the course of 1.4 development.

In your workflow script you are likely to need to add the following:

use LedgerSMB::Report::MyNewReport;

and

sub run_my_new_report {
my ($request) = @_;
LedgerSMB::Report::MyNewReport->new(%$request)->render($request);
}

That's all you need in the workflow script.

Overall Structure and Preamble

The actual Perl module basically defines a number of parameters for the report, and the LedgerSMB::Report.pm provides a general framework to cut down on the amount of code (and knowledge of Perl) required to write a report. Minimally we must, however, define inputs, if any, output structure, and how to create the output structure. We can also define buttons for further actions on the workflow script. The same workflow script would have to handle the button's actions.

Typically a report will start with something like this (of course MyNewReport is the name of the report here):

package LedgerSMB::Report::MyNewReport;
use Moose;
extends 'LedgerSMB::Report';
with 'LedgerSMB::Report::Dates'; # if date inputs used, use standard defs

This preamble sets up the basic reporting framework generally along with all the features discussed below. If you need to handle numeric input or secondary dates you will want to change:

with 'LedgerSMB::Report::Dates';

to

with 'LedgerSMB::Report::Dates', 'LedgerSMB::MooseTypes';

so that you can use type coercions for numeric and/or date fields (for processing localized formattings and the like).

Defining Inputs

Inputs are defined as report properties. Usually you want these properties to be read-only because you want them to correspond to the report actually run. You can use the full Moose capabilities in restricting inputs. However typically inputs should be read-only and you are likely to want to restrict to type and possibly coerce as well (at least when using the types defined in LedgerSMB::MooseTypes).

When including the following line you do not have to define the date_from and date_to inputs:

with 'LedgerSMB::Report::Dates';

Typically our conventions are to document inputs inline with POD. While this is (obviously) not necessary for the functioning of the report, it is helpful for future maintenance and highly recommended. It is also worth noting in the POD how a match is made (this should be in SQL also if applicable, in a COMMENT ON statement for easy checking of common assumptions regarding API contracts).

For example, from the GL report:

=item amount_from

The lowest value that can match, amount-wise.

=item amount_to

The highest value that can match, amount-wise.

=cut

has 'amount_from' => (is => 'rw', coerce => 1,
isa => 'LedgerSMB::Moose::Number');
has 'amount_to' => (is => 'rw', coerce => 1,
isa => 'LedgerSMB::Moose::Number');

Those lines demonstrate the full power of Moose in the definition. One obvious thing that will be fixed in beta is making these read-only (is => 'ro') while they are currently read-write. There is no reason for these to be read-write.

From the LedgerSMB::Report::PNL you see the following optional string input defined:

=item partnumber

This is the control code of the labor/overhead, service, or part consumed.

=cut

has partnumber => (is => 'ro', isa => 'Str', required => 0);

This would probably be improved by mentioning that the partnumber is an exact match in the POD, but it shows how to designate a read-only, optional string input.

If an input is not listed, it won't be passed on to the stored procedure. It is critical that all inputs are defined whether using standard modular definitions (LedgerSMB::Report::Dates) or explicit ones. If an input is being ignored this is one of the first places to check. Additionally note that because of other aspects of the reporting, it is not currently possible to use strict or slurpy constructors in any sane way. It is likely we will build our own constructor handling in the future, but currently this is a hard limitation.

Input Definition Antipatterns

There are a few things which someone who has not worked with Moose before is likely to do in this area, and while many of these are relatively harmless in the web interface because of a number of failsafes, but if you ever want to re-use the code in a more stateful environment you will have difficulties. The examples given are, alas, my own code but I have the benefit of being a new-comer to Moose here and so the lessons are fresh in my mind, or rather codebase.

The first is in use of read-write inputs. A report output is closely bound to its inputs, so read-write inputs allows the application to misrepresent the report. The example I gave above is:

has 'amount_to' => (is => 'rw', coerce => 1,
isa => 'LedgerSMB::Moose::Number');

Now this allows the application to do something like this:

my $report = LedgerSMB::Report::GL->new(%request);
$report->run_report();
$report->amount_to('10000000');
$report->render;

The above will represent that the report includes a bunch of transactions that may, in fact, be excluded. This is no good. On the other hand, if amount_to was read-only (is => 'ro'), then the above code would throw an error instead.

The second major anti-pattern is in the use of Maybe[] as an alternative to required => 0. For example see the following:

has 'approved' => (is => 'rw', isa => 'Maybe[Bool]');

Oh the joys of looking at code I wrote that is in need of rewrite.... Not only do we have a read-write input, but it is maybe boolean (i.e. true, false, or undef).

Now, this appears to work because undef is passed as NULL to the database, and the same is achieved by the more proper:

has approved => (is => 'ro', isa => 'Bool', required => 0);

The difference is that we will not accept as input a case where $request->{approved} = undef has been set. Our query handlers drop empty inputs so there is no case where this should happen. Additionally, this prevents unsetting the attribute after running the report and thus decoupling output from purported input.

Defining Report Structure

Report structure is defined using a series of functions which are overridden by actual reports. Some of these functions are optional and some are not. The required ones are covered first.

There are three required functions, namely columns, header_lines, and name. These are expected to return very specific data structures, but function in a largely declarative way. In other words, the functional interface effectively defines them as pseudo-constant (they are not fully constant because they are expected to return the localized names).

In all cases, LedgerSMB::Report::text() can be used to translate a string into its local equivalent (assuming localized strings in the .po files).

The columns function returns an arrayref of hashrefs, each of which is a column definition for our "dynatable" templates. The following are required:

col_id --- the name of the row field to use
type --- the display type of the field (text, href, checkbox, hidden, etc)
name --- localized header for the column

The following are conditionally required or optional:

href_base --- the base of the href. To this is appended the row_id (see below). Only used by href columns, and then required.
pwidth --- Used for width factors for PDF reports.

Here's an example of a columns function for a very simple report (which just lists all SIC codes in the system):

sub columns {
    return [
      { col_id => 'code',
          type => 'href',
     href_base => 'am.pl?action=edit_sic&code=',
          name => LedgerSMB::Report::text('Code'), },

      { col_id => 'description',
          type => 'text',
          name => LedgerSMB::Report::text('Description'), }
    ];
}

In most reports, the columns function is much longer.

The header_lines function provides an arrayref of hashrefs, for displaying inputs on the report. To this, the reporting engine adds the name of the report and the database connected to. If you want no header lines added, you can just return an empty arrayref:

sub header_lines { return []; }

In many cases however, such inputs should be displayed. Each hashref has two components:

name is the name of the input
text is the text label of the input.

Here's a more useful example from LedgerSMB::Report::GL:

sub header_lines {
    return [{name => 'from_date',
             text => LedgerSMB::Report::text('Start Date')},
            {name => 'to_date',
             text => LedgerSMB::Report::text('End Date')},
            {name => 'accno',
             text => LedgerSMB::Report::text('Account Number')},
            {name => 'reference',
             text => LedgerSMB::Report::text('Reference')},
            {name => 'source',
             text => LedgerSMB::Report::text('Source')}];
}

Finally name() returns the localized name of the report. This is usually a very simple function:

sub name {
    return LedgerSMB::Report::text('General Ledger Report');
}

Additionally there are two optional functions, buttons and template, which allow additional flexibility. These are rarely used.

The template function overrides our dynatable-based template as the template to use. This is used mostly in financial statements but is not used in the trial balance, or other fully tabular reports.

If you use it, just return the path to the template to use.

sub template { return 'Reports/PNL' }

Our PNL reporting module has additional features and beyond the scope of this post.

Finally buttons returns a list of buttons to be included on the report. These follow the element_data format of UI/lib/elements.html and are used to add HTML form callbacks to the report. Here's an example:

sub buttons {
    return [{
         text => LedgerSMB::Report::text('Add New Tax Form'),
         name => 'action',
         type => 'submit',
         class => 'submit'
    }];
}

How Columns are Selected

The columns to display are dynamically selected according to the following rules:

If no column selection criteria is found, then all columns are shown
If the render() method is called with a hashref as arguments that includes a a member with the name of the column ID prefixed with 'col_' then the column is shown and those which are not so selected are not.

What this means is that typically you will define inputs for selection of columns in the $request object before passing it through if you want to have a baseline of columns which always show. Otherwise you will usually allow selection of columns in the filter screen using inputs named as above (i.e. an 'id' field would have a selector named 'col_id').

The run_report() Function

The run_report function populates the rows of the report. It should exist and set $self->rows(@array) at the end. This is the only portion where specific knowledge of programming in Perl is particularly helpful. However, assuming nothing more than a little knowledge, here is a basic template from the SIC listing:

sub run_report{
    my ($self) = @_;
    my @rows = $self->exec_method(funcname => 'sic__list');
    for my $row(@rows){
        $row->{row_id} = $row->{code};
    }
    $self->rows(\@rows);
}

Going through this line by line:

my ($self) = @_;

The first line of the function body is boilerplate here. It is possible to accept the $request object here as a second parameter but not typical unless you have very specific needs for it. In that case, you simply:

my ($self, $request) = @_;

Remember that in Perl, @_ is the argument list.

my @rows = $self->exec_method(funcname => 'sic__list');

This line says "Take run the database function named 'sic__list' and map inputs of the report to function arguments. You would typically just copy this line and change the function name.

for my $row(@rows){
    $row->{row_id} = $row->{code};
}

If you have any hyperlinks in the report, it is necessary to set a row_id so that this can be properly handled. In any case the row_id is appended to the link from href_base in the column definition. It is possible to override this on a per-column basis but that's beyond the scope of this introduction.

$self->rows(\@rows);

This assigns the rows() of the report to the rows returned. Currently this is handled as a read/write property of reports, but long-run this will probably be changed so that programs cannot override this after running the report.

Ending the File

Always end report classes with the following line

__PACKAGE__->meta->make_immutable;

This improves performance and ensures that no more attributes can be dynamically added to your report. There are cases where such may be less than desirable outside of the reports of this sort, but such would be outside the reporting use case.

Others in series:

Introduction
Filter Screens
Best Practices regarding Stored Procedures
(this piece)
Conclusions

Tuesday, November 12, 2013

On CPAN, Community, and P: A Case Study in What Not to Do

I am going to try to do this piece as respectfully as I can. I understand people put a lot of work into developing things and they submit them, and when they get panned, it can be difficult. At the same time, community resources are community resources and so a failure to conduct such case studies in things gone amiss can lead to all kinds of bad things. Failure to get honest feedback can lead to people not improving, but worse, it can leave beginners sometimes mistakenly believing that bad practices are best practices. There is also a period of time after which bad practices become less easily excused.

So somewhat reluctantly I am going to undertake such a study here. This is solely about community interfacing. I am not going to critique code. Rather I would hope that this post can be a good one regarding understanding some of the problems regarding community interfaces generally, whether CPAN, PGXN, or others. The lessons apply regardless of language or environment and the critiques I offer are at a very different level than critiques of code.

So with this, I critique the P CPAN module from a community coding perspective. This module exports a single function called "P" which acts kind of like printf and sprintf. It would be an interesting exercise in learning some deep aspects of Perl but from a community resource perspective, it suffers from enough issues to make it a worthy case study.

The gist of this is that community resources require contemplating how something fits into the community and working with the community in mind. I cool idea or something one finds useful is not always something that is a candidate for publishing as a community resource, at least not without modifications aimed at carefully thinking how things fits into more general processes.

Four of my own CPAN modules are actually rewrites of code that I wrote in other contexts (particularly for LedgerSMB), and rewrote specifically for publication on CPAN. In general there is a huge gulf between writing a module for one project or one developer and writing it for everyone. I believe, looking at P, that it is just something the developer thought was useful personally and published it as is without thinking through any of these issues. This is all too common and so going through these I hope will prevent too many from making the same mistakes.

Namespace Issues

The name 'P' as an extraordinarily bad choice of naming for a public module. Perl uses nested namespaces, and nesting implies a clear relationship, such as inheritance (though other relationships are possible too).

Taking a top-level namespace is generally discouraged on CPAN where a second or third level namespace will suffice. There are times and places for top-level namespaces, for example for large projects like Moose, Moo, and the like. In general these are brand names for conglomerates of modules, or they are functional categories. They are not shorthand ways of referring to functionality to save typing.

'P' as a name is not helpful generally, and moreover it denies any future large project that namespace. The project is not ambitious enough to warrant a top-level namespace. There is no real room for sub-modules and so there are real problems with having this as a top-level module.

Proper Namespacing

It's helpful, I think, to look at three different cases for how to address namespacing. All three of these are ones I maintain or am planning to write. I believe they follow generally acceptable practices generally although I have received some criticism for PGObject being a top-level namespace as well.

PGObject is a top-level namespace housing, currently three other modules (perhaps ten by next year). I chose to make it a top-level namespace because it is a framework for making object frameworks, and not a simple module. While the top-level module is a thin "glue" module, it offers services which go in a number of different directions, defying simple categorization.

Additionally the top-level module is largely a framework for building object frameworks, which complicates the categorization further,. In this regard it is more like Moose than like Class::Struct. Sub-modules include PGObject::Simple (a simple framework using PGObject, not a simple version of PGObject), PGObject::Simple::Role, and PGObject;:Type::BigFloat.
Mail::RoundTrip is a module which allows web-based applications to request email verification by users. The module offers only a few pieces of functionality and is not really built for extensibility. This should not be a top-level module.
Device::POS::Printer is a module I have begun to write for point of sale printers, providing a basic interface for printing, controlling cash drawers, getting error messages, etc. The module is likely to eventually have a large number of sub-modules, drivers for various printers etc, but putting Device:: in front does no real harm and adds clarity. There's no reason to make it a top-level namespace.

The main point is thinking about how your module will fit into the community, how it will be found, etc. 'P' a name which suggests these have not been considered.

Exports

The P module exports a single function, P() which functions like printf and sprintf. The major reason for this, according to the author, is both to add additional checking and to save typing. Saving typing is not a worthy goal by itself, though neither is verbosity. Condensing a function which takes over two different functions to a single letter, however, is not a recipe for good, readable code. If others follow suit, you could get code like this:

P(A(R("This is a string", 3));

Now maybe this code is supposed to print the ASCII representation of "This is a string" repeated three times. However that is not obvious from the code, leading to code that is hard to read or debug.

Proper Exports

In Perl, exports affect the language. Exports are thus to be used sparingly as they can lead to conflicts which can lead to hard to maintain code. Exports should be rare, well documented, and not terribly subject to name collision. They should also be verbose enough they can be understood without tremendous prior knowledge of the module. P() as an exported function meets none of these criteria.

A good example of exports done right would be a function like has() used by Moose, Mouse, and Moo. The function is exported and used to declaratively define object properties. The convention has become widespread because it is obvious what it does. Again this does not matter so much for personal projects, but it does for published modules on a community repository.

Test Failure Report Management

The CPANTesters page for P shows that every version on CPAN has had test failures. This is unusual. Most modules have clear passes most of the time. Even complex modules like DBD::Pg show a general attention to test failures and a relatively good record. A lack of this attention shows a lack of interest in community use, and that fixes to test failures, needed for people to use the library, are just not important. So if you manage a module, you really want to take every test failure seriously.

Conclusions

Resources like CPAN, CTAN, PGXN, and the like are subject to one important rule. Just because it is good for your own personal use does not make it appropriate for community publication as a community resources. Writing something that fits the needs of a specific project, or a specific coder's style is very different from writing something that helps a wide range of programmers solve a wide range of problems. These community resources are not places to upload things just because one wrote them. They are places to interface with the community through work. Getting pre-review, post-review, and paying attention to the needs of others is critically important.

Monday, November 11, 2013

Writing Reports in LedgerSMB 1.4, part 3, Stored Procedure Conventions and Best Practices

The stored procedures/UDFs form the heart of the LedgerSMB reports. Everything else basically is there to handle input to the stored procedure and output from it. It's for this reason that the stored procedures are covered before the Perl modules.

The stored procedures follow the conventions and design principles laid out in the very first post on this blog. Arguments are prefixed with 'in_' and given consistent names from report criteria. Anyone wanting to write reports should start with that post in terms of understanding where to start.

Additionally there are a number of standardized conventions for common elements. The most important of these currently is that many dates require a date range, and the arguments here are standardized as in_date_from and in_date_to (corresponding to date_from and date_to on the application side). The conventions are set up on the Perl side so these will be covered in the next section.

In general, there are a number of do's and dont's associated with these procedures.

1. Do provide a default ordering that is useful.

2. Don't provide windowing functions that depend on ordering that could be overridden. Running totals should not be implemented here. As a good general rule, if you use ROWS UNBOUNDED PROCEDING in one of these functions, that's a red flag.

3. Do try to have one central query with, possibly, some small support logic.

4. Do define your types so several different reports can share the same output type.

5. Do write test scripts to test your reporting functions. Use transactions that roll back.

7. Do use plpgsql instead of sql if you have more than a few arguments. named arguments are easier to read. Once we can drop support for 9.2 and lower this will cease to be an issue though.

An example:

CREATE OR REPLACE FUNCTION report__gl
(in_reference text, in_accno text, in_category char(1),
in_source text, in_memo text, in_description text, in_from_date date,
in_to_date date, in_approved bool, in_from_amount numeric, in_to_amount numeric,
in_business_units int[])
RETURNS SETOF gl_report_item AS
$$
DECLARE
         retval gl_report_item;
         t_balance numeric;
         t_chart_id int;
BEGIN

IF in_from_date IS NULL THEN
   t_balance := 0;
ELSIF in_accno IS NOT NULL THEN
   SELECT id INTO t_chart_id FROM account WHERE accno = in_accno;
   t_balance := account__obtain_balance(in_from_date ,
                                       (select id from account
                                         where accno = in_accno));
ELSE
   t_balance := null;
END IF;

FOR retval IN
       WITH RECURSIVE bu_tree (id, path) AS (
            SELECT id, id::text AS path
              FROM business_unit
             WHERE parent_id is null
            UNION
            SELECT bu.id, bu_tree.path || ',' || bu.id
              FROM business_unit bu
              JOIN bu_tree ON bu_tree.id = bu.parent_id
            )
       SELECT g.id, g.type, g.invoice, g.reference, g.description, ac.transdate,
              ac.source, ac.amount, c.accno, c.gifi_accno,
              g.till, ac.cleared, ac.memo, c.description AS accname,
              ac.chart_id, ac.entry_id,
              sum(ac.amount) over (rows unbounded preceding) + t_balance
                as running_balance,
              compound_array(ARRAY[ARRAY[bac.class_id, bac.bu_id]])
         FROM (select id, 'gl' as type, false as invoice, reference,
                      description, approved,
                      null::text as till
                 FROM gl
               UNION
               SELECT ar.id, 'ar', invoice, invnumber, e.name, approved, till
                 FROM ar
                 JOIN entity_credit_account eca ON ar.entity_credit_account
                      = eca.id
                 JOIN entity e ON e.id = eca.entity_id
               UNION
               SELECT ap.id, 'ap', invoice, invnumber, e.name, approved,
                      null as till
                 FROM ap
                 JOIN entity_credit_account eca ON ap.entity_credit_account
                      = eca.id
                 JOIN entity e ON e.id = eca.entity_id) g
         JOIN acc_trans ac ON ac.trans_id = g.id
         JOIN account c ON ac.chart_id = c.id
    LEFT JOIN business_unit_ac bac ON ac.entry_id = bac.entry_id
    LEFT JOIN bu_tree ON bac.bu_id = bu_tree.id
        WHERE (g.reference ilike in_reference || '%' or in_reference is null)
              AND (c.accno = in_accno OR in_accno IS NULL)
              AND (ac.source ilike '%' || in_source || '%'
                   OR in_source is null)
              AND (ac.memo ilike '%' || in_memo || '%' OR in_memo is null)
             AND (in_description IS NULL OR
                  g.description
                  @@
                  plainto_tsquery(get_default_lang()::regconfig, in_description))
              AND (transdate BETWEEN in_from_date AND in_to_date
                   OR (transdate >= in_from_date AND in_to_date IS NULL)
                   OR (transdate <= in_to_date AND in_from_date IS NULL)
                   OR (in_to_date IS NULL AND in_from_date IS NULL))
              AND (in_approved is false OR (g.approved AND ac.approved))
              AND (in_from_amount IS NULL OR ac.amount >= in_from_amount)
              AND (in_to_amount IS NULL OR ac.amount <= in_to_amount)
              AND (in_category = c.category OR in_category IS NULL)
     GROUP BY g.id, g.type, g.invoice, g.reference, g.description, ac.transdate,
              ac.source, ac.amount, c.accno, c.gifi_accno,
              g.till, ac.cleared, ac.memo, c.description,
              ac.chart_id, ac.entry_id, ac.trans_id
       HAVING in_business_units is null or in_business_units
                <@ compound_array(string_to_array(bu_tree.path, ',')::int[])
     ORDER BY ac.transdate, ac.trans_id, c.accno
LOOP
   RETURN NEXT retval;
END LOOP;
END;
$$ language plpgsql;

The above is the function that provides the GL report and search. It is a monster query and we could use RETURN QUERY but for older versions this seems more reliable. Ideally, the inline view would be moved to a formal one, and a few other tweaks, but it works and works well enough and despite the length and the inline view, it is not hard to debug.

Others in series:

Part 1: Overview

Part 2: Filter Screens

Part 4: Perl Modules

Part 5: Conclusions

Friday, November 8, 2013

Interlude: A cool PL/PGSQL trick

I will be getting back to the report writing in LedgerSMB next or the week after.   I have an open source, PostgreSQL-based point of sale project I am hoping to make to a minimally useful version in the mean time.

In the mean time, here's a useful trick I discovered while trying to solve a problem. The basic problem is one of Object-Relational design, in the sense that we may want to use types as classes and therefore return a tuple which is garanteed to be of the type defined by one table or view.

However, we may want to override some of values in this tuple with values from another table.

In older versions of PostgreSQL, I seem to recall it was possible to add additional columns onto the end of a result which would be silently ignored on return. This doesn't work in current versions and isn't very safe anyway. A better way is to work with composite tuples, where each member is a tuple you need for processing.

So for example, suppose we have a table which contains pricing information, and a second table which contains temporary pricing information for a given customer, and these are different table structures. Suppose, for sake of argument we have tables like this:

CREATE TABLE goods (
   id int serial not null unique,
   sku text primary key,
   sell numeric not null, -- sell price
   buy numeric not null -- purchase price
);

CREATE TABLE pricematrix (
   customer_id int not null references customer(id),
   goods_id int not null references goods(id),
   valid_from date,
   valid_to date,
   sell numeric not null
);

CREATE VIEW pricematrix_today AS
select * from pricematrix
WHERE (valid_from is null or valid_from <= 'today') and
              (valid_to is null or valid_to >= 'today');

Now, what I want to do is pull records from the goods table, but override the sell column with the one from the pricematrix if it exists, choosing the lowest available price.

A simple approach (this code is not tested unlike my actual production code based on the same principle):

CREATE OR REPLACE FUNCTION get_goods_by_sku(in_sku text, in_customer_id int)
RETURNS goods LANGUAGE PLPGSQL AS
$$
DECLARE t_rec record;
   retval goods;
    pmatrix record;
BEGIN

   SELECT goods, pm INTO t_rec
   FROM goods
        LEFT JOIN (SELECT *,
                          rank() OVER
                           (ORDER BY sell
                            PARTITION BY goods_id) as ranking
                     FROM pricematrix_today
                    WHERE customer_id = in_customer_id
                  ) pm ON (goods.id = pm.goods_id and ranking = 1)
            WHERE goods.sku = in_sku

   pmatrix := t_rec.pm;
   retval := t_rec.goods;
   IF pmatrix.sell < retval.sell THEN
retval.sell = pmatrix.sell;
   END IF;
   RETURN retval;
END;
$$;

By keeping our types isolated we can safely manipulate them inside the function before returning them and still be guaranteed a named type tuple on output.

Now obviously this is a contrived example and it is less likely we will usually be selecting from tables than views, but the same goes for views since we can define "type methods" on those as well.