Database Test Isolation
Some applications read from a database but never perform updates. In that case writing non-interacting tests is easy enough: load a schema and sample data, then run queries against that data. As soon as tests modify data, we need a strategy for preventing test interaction. The options are:
- Wrap each test in a transaction
- Truncate tables and load fixtures before each test
- Create a copy of the database for each test
- Revert the database to a restore point after each test
Each of these approaches incurs a different level of complexity. Options 1–3 provide responsive feedback; options 2–4 provide the full set of capabilities:
- Overhead of less than 50 ms for each test
- Any data update is permissible
- Full capability to exercise transactions
- Can use session-level features
- Connection pools are okay
- External utilities can be invoked using a connection URI
Option 1: Begin, Rollback
PostgreSQL has a feature called savepoints, which at first appears to allow for nested transactions, but this is not the case. Not only is the syntax different, but any error aborts the entire outer transaction, which must then be rolled back manually.
As long as the system under test does not perform its own transaction handling, wrapping each test in a transaction should suffice:
```sql
BEGIN;
-- test 1
ROLLBACK;

BEGIN;
-- test 2
ROLLBACK;
```
The challenge in writing a test runner is ensuring that the same database connection is used throughout all of the tests. For some components (such as external binaries) this may be impossible.
Option 2: Truncate
Truncating tables and loading test data at the start of each test is a sensible approach, although it adds overhead to each test that is difficult to estimate in advance. You may need to disable triggers and constraints while reloading the data:
```sql
SET session_replication_role = 'replica';
TRUNCATE table_a;
TRUNCATE table_b;
-- COPY/INSERT data
SET session_replication_role = 'origin';
```
This approach may require a bit of extra work, such as refreshing materialized views or resetting sequences, but is still a clean solution since each test is not responsible for cleanup.
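A sketch of the reset-before-each-test loop, again using sqlite3 (which has DELETE rather than TRUNCATE) to stay self-contained; on PostgreSQL you would issue TRUNCATE and, if needed, toggle `session_replication_role` around the reload. The fixture table and rows are illustrative:

```python
# Option 2: wipe tables and reload fixtures before every test.
import sqlite3

FIXTURES = {
    "users": [(1, "alice"), (2, "bob")],
}

def reset_database(conn):
    for table, rows in FIXTURES.items():
        conn.execute(f"DELETE FROM {table}")  # TRUNCATE equivalent in sqlite
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

reset_database(conn)
conn.execute("DELETE FROM users WHERE name = 'bob'")  # test 1 mutates freely
conn.commit()

reset_database(conn)  # test 2 starts from clean fixtures
count = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(count)  # prints 2
```

Because the reset happens before each test rather than after, a test that crashes mid-way leaves no cleanup burden behind.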
Option 3: Copy Database
In my view the most capable approach is to create a new database for each test, using the initial test schema and data as the template:
```sql
-- create template: load schema and test data
CREATE DATABASE test_0 TEMPLATE test;

CREATE DATABASE test_1 TEMPLATE test_0;
-- test 1 (pg_url + '&dbname=test_1')

CREATE DATABASE test_2 TEMPLATE test_0;
-- test 2 (pg_url + '&dbname=test_2')
```
The setup for each test is quick: about 30ms. Because you have a complete database there are very few restrictions.
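PostgreSQL does the cloning server-side with `CREATE DATABASE ... TEMPLATE`; the sketch below mimics the idea with sqlite3 by copying a template database file per test, just to make the isolation property concrete. All names are illustrative:

```python
# Option 3: one database per test, cloned from a shared template.
import os
import shutil
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
template = os.path.join(tmp, "template.db")

# Build the template once: schema plus test data.
conn = sqlite3.connect(template)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'widget')")
conn.commit()
conn.close()

def database_for_test(name):
    # Each test gets its own copy of the template database.
    path = os.path.join(tmp, f"{name}.db")
    shutil.copy(template, path)
    return sqlite3.connect(path)

db1 = database_for_test("test_1")
db1.execute("DELETE FROM items")   # test 1 may destroy everything
db1.commit()

db2 = database_for_test("test_2")  # test 2 is unaffected
count = db2.execute("SELECT count(*) FROM items").fetchone()[0]
print(count)  # prints 1
```

Because every test owns a whole database, session-level features, commits, and connection pools all work without special handling, matching the capability list above.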
Option 4: Point-in-Time Recovery
A fork of Postgres such as Zenith that supports "branching" provides a means of creating an isolated database using copy-on-write at the storage layer. It is also possible to do this using a file system that can restore to a snapshot.
- Create a restore point
- Run the test
- Stop Postgres
- Adjust the recovery targets
- Start Postgres
- Wait for recovery to complete
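On PostgreSQL 12+, the recovery step can be driven by settings along these lines, assuming the WAL covering the test run is still available; the restore-point name here is an illustrative assumption:

```ini
# postgresql.conf (an empty recovery.signal file must also exist in the
# data directory to trigger targeted recovery on startup)
recovery_target_name = 'test_baseline'  # name given to pg_create_restore_point()
recovery_target_action = 'shutdown'     # or 'pause' / 'promote'
```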
The only time I would deem this approach worthwhile is if the test database is large.