pg_tmp(1)

Run tests on an isolated, temporary PostgreSQL database

overview | download 2.9 | man page

Usage

Get an interactive shell to a new database with a predefined schema

uri=$(pg_tmp) 
psql $uri -f schema.sql 
psql $uri

Screencast

Tagwatch: using pg_tmp to facilitate application development

WEBM (31MB, vp8-vorbis) MP4 (27MB, h264-aac)

Theory and Operation

pg_tmp is a compact shell script designed to make unit testing, integration testing with PostgreSQL easy in any language.

Sometimes SQLite is used as drop-in replacement for PostgreSQL when running tests in order to make tests self-contained and therefore free of side-effects. This technique seems to suffice for simple interactions, but it does so by reducing the set of features that an application can use to the functionality common to both platforms. PostgreSQL is not a heavy weight key-value store it is more like a specialized programming environment.

The engineering genius of unit tests is mainly the ability they have to shape the design of the internal interfaces used to build an application. To ensure that we are testing the module and not the environment we mock APIs to external resources, but never the runtime or the language itself. An application built on PostgreSQL is an application where individual functions span more than process—and more than one runtime. Testing concretely requires verification of methods as complete “units”.

To facilitate the creation of tests that run autonomously and concurrently two libraries have been developed to start up and tear down a database in the test code itself: Test-postgresql (Perl), and testing.postgresql (Python). Both of these is configurable, but they have some inherent limitations:

  1. Each is tied tightly to it's implementation language and supporting libraries, making them cumbersome to integrate when the main project is not built with Python or Perl
  2. Using automatic setup and teardown within the library itself creates deadlocks if the programmer is not careful to use objects in the correct scope
  3. The instance of PostgreSQL that is created is not optimized for limited memory and low latency by default

pg_tmp is less configurable from the command line, is impervious to deadlocks, requires no external libraries, and is already tuned for the singular task of reducing the startup time to less than a second. This makes a full-featured PostgreSQL database useful for unit testing and other tests that require rapid feedback.

Performance Techniques

Total Wait Time Speedup Strategy
7.2 Naive in-process creation and teardown
6.4 0.8 Disabling runtime fsync ( pg_ctl -F )
6.0 0.4 Spinning to make the first connection (instead of pg_ctl -w )
2.2 3.8 Background database initialization ( initdb --nosync )
1 second 1.3 Shut down database asynchronously

Space Optimization

On a production server, generous space for pg_wal is always a good idea, but an ephemeral instance has reason to limit space used on /tmp . To this end, the configuration installed by pg_tmp allows the Postgres server clean up quickly by writing checkpoint after 64MB of WAL (the default is 1GB).

In some cases it is useful to use pg_tmp to construct a snapshot of larger database that is populated by pg_dump | pg_restore . Since PostgreSQL is already trimming WAL the on-disk representation is already reasonably compact.

Other Utilities

There is one notable utility provided in the contrib/ directory. flattenjs is line-oriented way of viewing JSON data structures that is also compatible with PostgreSQL JSON operators. To install:

cp contrib/flattenjs $HOME/local/bin/

Schema Comparison (experimental)

The 3.0 release will include a new utility ddl_compare for generating a representation of two schemas that is suitable for diff.

ddl_compare -g roles.sql test.sql prod.sql
----
/home/eradman/localharvest/{a,b}/products    |  18 ------------------
/home/eradman/localharvest/{a,b}/farmer      |   5 ++---
2 files changed, 2 insertions(+), 21 deletions(-)

This works by

  1. Starting up two temporary instances of PostgreSQL
  2. Optionally applying global changes to both databases
  3. Loading a patched version of the ddlx extension
  4. Using ddlx to generate a complete definition of each table and it's dependencies
  5. Executing git diff to display a summary of chnages

News & Discussion

March 15, 2017
E-mail notification when new releases are tagged
» release-monitoring.org

April 19, 2016
Overcoming First Principles:
A guide for accessing the features of PostgreSQL in test-driven development
» pgconf.us video

November 17, 2015
Database testing in Python
» stackoverflow.com/

July 29, 2015
Minimal postgres instance for testing in Java
» stackoverflow.com/


Last updated on November 29, 2019
Send questions or comments to ericshane@eradman.com