Perspectives on LedgerSMB: PostgreSQL, PL/Perl, and CVE-2016-1238

This post is about the dangers in writing user defined functions in untrusted languages, but it is also specifically about how to avoid CVE-2016-1238-related problems when writing PL/PerlU functions. The fix is simple and straight-forward and it is important for it to be in pl/perlU stored procedures and user defined functions for reasons I will discuss. This discusses actual exploits and the severity of being able to inject abritrary Perl code into the running database backend is a good reason to be disciplined and careful about the use of this language.

It is worth saying at the outset that I have been impressed by how well sensible design choices in PostgreSQL generally mitigate problems like this. In essence you have to be working in an environment where a significant number of security measures have been bypassed either intentionally or not. This speaks volumes on the security of PostgreSQL's design since it is highly unlikely that this particular attack vector was an explicit concern. In other words these decisions make many attacks even against untrusted languages far more difficult than they would be otherwise.

The potential consequences are severe enough, however, that secure coding is particularly important in this environment even with the help of the secure design. And in any language it is easy to code yourself into corners where you aren't aware you are introducing security problems until they bite you.

The current implementation, we will see, already has a fair bit of real depth of defense behind it. PostgreSQL is not, by default, vulnerable to the problems in the CVE noted in the title. However, with a little recklessness, it is possible to open the door to the possibility of real problems and it is possible for these problems to be hidden from the code reviewer by error rather than actual malice. Given the seriousness of what can happen if you can run arbitrary code in the PostgreSQL back-end, my view is that all applicable means should be employed to prevent problems.

PL/PerlU can be used in vulnerable ways, but PL/Perl (without the U) is by design safe. Even with PL/PerlU it is worth noting that multiple safety measures have to be bypassed before vulnerability becomes a concern. This is not about any vulnerabilities in PostgreSQL, but what vulnerabilities can be added through carelessly writing stored procedures or user-defined functions.

There are two lessons I would like people to take away from this. The first is how much care has been taken with regard to PostgreSQL regarding security in design. The second is how easily one can accidentally code oneself into a corner. PL/PerlU often is the right solution for many problem domains and it can be used safely but some very basic rules need to be followed to stay out of trouble.

Extending SQL in PostgreSQL using Arbitrary Languages

PostgreSQL has a language handler system that allows user defined functions and stored procedures to be written in many different languages. Out of the box, Python, TCL, C, and Perl come supported out of the box. Perl and TCL come in trusted and untrusted variants (see below), while Python and C are always untrusted.

There are large numbers of external language handlers as well, and it is possible to write ones own. Consequently, PostgreSQL allows, effectively, SQL to be extended by plugins written in any language. The focus here will be on untrusted Perl, or PL/perlU. Untrusted languages have certain inherent risks in their usage and these become important, as here, when there is concern about specific attack vectors.

Trusted vs Untrusted Languages, and what PL/Perl and PL/PerlU can do

PostgreSQL allows languages to be marked as 'trusted' or 'untrusted' by the RDBMS. Trusted languages are made available for anyone to write functions for while untrusted languages are restricted to database superusers. Trusted languages are certified by the developers not to interact with file handles, not to engage in any other activity other than manipulating data in the database.

There is no way to 'use' or 'require' a Perl module in pl/perl (trusted) and therefore it is largely irrelevant for our discussion. However PL/PerlU can access anything else on the system and it does so with the same permissions as the database manager itself. Untrusted languages, like PL/PerlU make it possible to extend SQL in arbitrary ways by injecting (with the database superuser's permission!) code into SQL queries written in whatever langauges one wants. Untrusted languages make PostgreSQL one of the most programmable relational databases in the world, but they also complicate database security in important ways which are well beyond the responsibility of PostgreSQL as a project.

Like test harnesses, untrusted languages are insecure by design (i.e. they allow arbitrary code injection into the database backend) and this issue is as heavily mitigated as possible, making PostgreSQL one of the most security-aware databases in the market.

To define a function in an untrusted language, one must be a database superuser, so the PostgeSQL community places primary trust in the database administration team not to do anything stupid, which is generally a good policy. This article is largely in order to help that policy be effective.

Paths, Permissions, and CVE-2016-1238

The CVE referenced in the title of this article and section is one which allows attackers to inject code into a running Perl routine by placing it in the current working directory of a Perl process. If an optional dependency of a dependency is placed i the current working directory, it may be included in the current Perl interpreter without the understanding of the user. This is a problem primarily because Perl programs are sufficiently complex that people rarely understand the full dependency tree of what they are running.

If the current working directory ends up being one which includes user-writeable data of arbitrary forms, the problem exists. If not, then one is safe.

The problem however is that changing directories changes the include path. You can demonstrate this by doing as follows:

In ./test.pl create a file that contains:

use test;
use test2;

./test.pm is:

chdir test;
\
test/test2.pm is:

warn 'haxored again';

What happens is that the use test statement includes ./test.pm which changes directory, so when you use test2, you are including from test/test2.pm. This means that if any period during this cycle of the Perl interpreter's life you are in a world-writable directory, and including or requring files, you are asking for trouble.

Usual Cases for PL/PerlU

The usual use case of PL/PerlU is where you need a CPAN module to process or handle information.. For example you may want to return json using JSON.pm, or you may want to write to a (non-transactional log).

For example, the most recent PL/PerlU function I wrote used basic regular expressions to parse a public document record stored in a database to extract information relating to patents awarded on protein sequences. It was trivial but was easier to use JSON than to write json serialization inline (and yes, performed well enough given the data sizes operate on).

Are usual cases safe? Deceptively so!

Here are a few pl/perlU functions which show the relevant environment that PL/PerlU functions run in by default:

postgres=# create or replace function show_me_inc() returns setof text language plperlu as
$$
return \@INC;
$$;
CREATE FUNCTION
postgres=# select show_me_inc();
show_me_inc
------------------------------
/usr/local/lib64/perl5
/usr/local/share/perl5
/usr/lib64/perl5/vendor_perl
/usr/share/perl5/vendor_perl
/usr/lib64/perl5
/usr/share/perl5
.
(7 rows)

postgres=# create or replace function get_cwd() returns text language plperlu as
postgres-# $$
postgres$# use Cwd;
postgres$# return getcwd();
postgres$# $$;
CREATE FUNCTION
postgres=# select get_cwd();
get_cwd
---------------------
/var/lib/pgsql/data
(1 row)

Wow. I did not expect such a sensible, secure implementation. PostgreSQL usually refuses to start if non-superusers of the system have write access to the data directory. So this is, by default, a very secure configuration right up until the first chdir() operation......

Now, in the normal use case of a user defined function using PL/PerlU, you are going to have no problems. The reason is that most of the time, if you are doing sane things, you are going to want to write immutable functions which have no side effects and maybe use helpers like JSON.pm to format data. Whether or not there are vulnerabilities exploitable via JSON.pm, they cannot be exploited in this manner.

However, sometimes people do the wrong things in the database and here, particularly, trouble can occur.

What gets you into trouble?

To have an exploit in pl/perlU code, several things have to happen:

Either a previous exploit must exist which has written files to the PostgreSQL data directory, or one must change directories
After changing directories, a vulnerable module must be loaded while in a directory the attacker has write access to.

It is possible, though unlikely, for the first to occur behind your back. But people ask me all the time how to send email from the database backend and so you cannot always guarantee people are thinking through the consequences of their actions.

So vulnerabilities can occur when people are thinking tactically and coding in undisciplined ways. This is true everywhere but here the problems are especially subtle and they are as dangerous as they are rare.

An attack scenario.

So suppose company A receives a text file via an anonymous ftp drop box in X12 format (the specific format is not relevant, just that it comes in with contents and a file name that are specified by the external user). A complex format like X12 means they are highly unlikely to do the parsing themselves so they implement a module loader in PostgreSQL. The module loader operates on a file handle as such:

On import, the module loader changes directories to the incoming directory. In the actual call to get_handle, it opens a file handle, and creates an iterator based on that, and returns it. Nobody notices that the directory is not changed back because this is then loaded into the db and no other file access is done here. I have seen design problems of this magnitude go undetected for an extended period, so coding defensively means assuming they will exist.

Now, next, this is re-implemented in the database as such:

CREATE OR REPLACE FUNCTION load_incoming_file(fiilename text)
RETURNS int
language plperlu as
$$
use CompanyFileLoader '/var/incoming'; # oops, we left the data directory
use CompanyConfig 'our_format'; # oops optional dependency that falls back to .
# in terms of exploit, the rest is irrelevant
# for a proof of concept you could just return 1 here
use strict;
use warnings;

my $filename = shift;
my $records = CompanyFileLoader->get_handle($filename);
while ($r = $records->next){
# logic to insert into the db omitted
}

return $records->count;

$$;

The relevant poart of CompanyFileLoader.pm is (the rest could be replaced with stubs for a proof of concept):

package CompanyFileLoader;
use strict;
use warnings;

my @dirstack;
sub import {
my $dir = pop;
push @dirstack, $dir;
chdir $dir;
}

Now, in a real-world published module, this would cause problems but in a company's internal operations it might not pose discovered problems in a timely fashion.

The relevant part of CompanyConfig is:

package CompanyConfig;
use strict;
use warnings;

eval { require 'SupplementalConfig' };

Now, if a SupplementalConfig.pm is loaded into the same directory as the text files, it will get loaded and run as part of the pl/perlU function.

Now to exploit this someone with knowledge of the system has to place this in that directory. It could be someone internal due to failure to secure the inbound web service properly. It could be someone external who has inside knowledge (a former employee for example). Or a more standard exploit could be tried on the basis that some other shared module might have a shared dependency.

The level of inside knowledge required to pull that off is large but the consequences are actually pretty dire. When loaded, the perl module interacts with the database with the same permissions as the rest of the function, but it also has filesystem access as the database server. This means it could do any of the following:

Write perl modules to exploit this vulnerability in other contexts to the Pg data directory
delete or corrupt database files
possibly alter log files depending on setup.
Many other really bad things.

These risks are inherent with the use of untrusted languages, that you can write vulnerable code and introduce security problems into your database. This is one example of that and I think the PostgreSQL team has done an extremely good job of making the platform secure.

Disciplined coding to prevent problems

The danger can be effectively prevented by following some basic rules:

All user defined functions and stored procedures in PL/PerlU should include the line:

no lib '.';

It is possible that modules could add this back in behind your back, but for published modules this is extremely unlikely. So local development projects should not use lib '.' in order to prevent this.

Secondly, never use chdir in a pl/perl function. Remember you can always do file operations with absolute paths. Without chdir, no initial exploit against the current working directory is possible through pl/perlu. Use of chdir circumvents important safety protections in PostgreSQL.

Thirdly it is important that one sticks to well maintained modules. Dangling chdir's in a module's load logic are far more likely to be found and fixed when lots of other people are using a module, and a dangling chdir is a near requirement to accidental vulnerability. For internal modules, they need to be reviewed both for optional dependencies usage and dangling chdir's in the module load logic.

Perspectives on LedgerSMB

Tuesday, August 2, 2016

PostgreSQL, PL/Perl, and CVE-2016-1238