Beyond Upsert - Coming in PostgreSQL 9.5

Published Jul 2, 2015

With PostgreSQL 9.5 just released as an alpha (and betas from August till its ready) the new feature list for the next version of the SQL database has become clearer. We've previously talked about the addition of upsert functionality to 9.5 but that's just one of the new user visible features and now's a good time to look at what should also be coming in the latter part of this year.

Row security

One of the more interesting additions, at least for databases with many users looking at the same data, is the addition of Row level security. This lets you create a policy for a table that decides on whether or not a user can see particular rows in a table. Policies can be associated with a table but only get activated when they are enabled, in this case with the command ALTER TABLE tablename ENABLE ROW LEVEL SECURITY. The policy, at heart, is an expression that has to be true for the row to be visible.

One example might be, if the table has an "owner" column then (owner == current_user) would only show rows where the current user was the rows owner. (And if the currrent_user thing is new to you, you may want to check out the System Information Functions in PostgreSQL which gives access to interesting runtime and session information). The superuser will still be able to see all rows as there's a BYPASSRLS attribute set on the superuser roles by default, and table owners won't be subjected to row level security for tables they own. A great feature if you need to ensure isolated data flow. You can read more about this DSHL Blog post where you'll also see how long the feature has been maturing for.

BRIN Indexes

If you have very large tables and columns which naturally relate to where that row is within the table, like a timestamp in a log, then BRIN indexes offer a chance to index those columns without the cost of creating and maintaining a B-Tree index. At its simplest, the Block Range Index works by breaking the data up into ranges, then storing the pages which are within those ranges and summary information. (PostgreSQL tables are made up of pages, each of which can contain one or more rows). The summary information for the block contains the minimum and maximum value for that set of pages.

When a query using the BRIN index comes in, the first pass over the index returns the pages which could contain records that match the query. In a very large table, this allows the rapid skipping of all the pages which don't match the query. The next step is the query executor still going sequentially through the returned pages to check the rows within them to see if they match.

The usefulness of BRIN indexes is contingent on balancing the tradeoffs. A B-Tree index is quick at finding records but takes more disk space and has a cost to maintain it. A BRIN index is slower than a B-Tree index but takes far less disk space and work to maintain as long as your indexed column can be broken down into sensible ranges. You can read more about BRIN indexes in Michael Paquier's BRIN indexes overview and this DSHL Blog post.

JSONB

While row security and BRIN indexes have been coming to Postgres for some time, the developments in JSONB are more recent and build on the work in 9.4 and the extensions built around it. As we mentioned when we covered the arrival of JSONB, modifying JSONB data was convoluted and involved extracting the entire document, modify it and then put it back. With 9.5 that changes with the appearance of JSONB modifiers. There's the || operator which will let you concatenate two JSONB values. The new - operator has a couple of modes - it can remove a top level field when given a string or an array element when given an int(eger). While - only does top-level keys, the #- operation will, given a nested path, remove nested keys and their values.

There's also three new functions for JSONB: jsonb_set() will let you update the value of a key, given by a nested path, json_strip_nulls() will remove keys whose value is null and finally jsonb_pretty() will pretty-print your JSONB values (rather than dump one long string).

This goes some way to addressing the issue of JSON manipulation in PostgreSQL. It'll definitely cut down on crufty JSON mangling scripting. For more examples, see Michael Paquir's recent blog post on the new JSONB features.

Faster sorting

A pair of enhancements for sorting in PostgreSQL also point to much faster sorting in 9.5. The enhancements began with a commit in January which extended sorting support with a framework for abbreviated keys. The first use of this was for text fields and used the first 8 characters of text fields and creates a blob using strxfrm(), a compressor for sortable strings. It then uses that blob to text for sorting and uses the old style comparison where there's a tie. The code uses a HyperLogLog to detect the worst-case (where the first 8 characters of all the values are similar) and takes appropriate action. The framework was then applied to give abbreviated keys for numeric sorts.

According to various benchmarks, in a general realistic case, this change can speed up indexing by three times. That's a massive step forward for any databases performance, especially as sorting is so core to what the database does.

And more

These are only some of the enhancements in PostgreSQL 9.5, selected here on the basis of what will be most visible to a Compose PostgreSQL user. There are many more enhancements we're looking forward to at Compose which will help to strengthen the infrastructure with which we deliver PostgreSQL to you.