Database Test Isolation
Some applications access a database, but never perform updates—in this case writing non-interacting tests is easy enough: load a schema and sample data, then run queries against that data. Whenever data is modified we need a strategy for preventing test interaction. The options are:
- Wrap each test in a transaction
- Truncate tables and load fixtures before each test
- Create a copy of the database for each test
- Revert the database to a restorepoint after each test
Each of these approaches incurs a varying level of complexity. Options 1–3 provide responsive feedback; options 2–4 provide the full set of capabilities:
✔ | Overhead of less then 50ms for each test |
✔ | Any data update is permissible |
✔ | Full capability to exercise transactions |
✔ | Can use session-level features |
✔ | Connection pools are okay |
✔ | External utilities are provided a connection URI |
Option 1: Begin, Rollback
PostgreSQL has a feature called savepoints, which at first to allow for nested transactions, but this is not the case. Not only is the syntax different, but any error causes the entire outer transaction to fail, and must be manually aborted.
If the system under test does not perform it's own transaction handling wrapping each test in a transaction may suffice
BEGIN; -- test 1 ROLLBACK; BEGIN; -- test 2 ROLLBACK;
The challenge to writing a test runner is to ensure that the same database connection is used throughout all of the tests. For some components (such as external binaries) this is may be impossible.
Option 2: Truncate
Truncating tables and loading test data at the start of each test is a sensible approach, although it is difficult to estimate the overhead added to run each test. Disable triggers and constraints while reloading the data
SET session_replication_role = 'replica'; TRUNCATE table_a; TRUNCATE table_b; -- COPY/INSERT data SET session_replication_role = 'origin';
This approach may require a bit of extra work, such as refreshing materialized views or resetting sequences, but is still a clean solution since each test is not responsible for cleanup.
Option 3: Copy Database
In my view the most capable approach is to create a new database for each test, using the initial test schema and data as the template
-- create template -- load schema and test data CREATE DATABASE test_0 TEMPLATE test; CREATE DATABASE test_1 TEMPLATE test_0; -- test 1 (pg_url + '&dbname=test_1') CREATE DATABASE test_2 TEMPLATE test_0; -- test 2 (pg_url + '&dbname=test_2')
Test setup is quick: about 30ms.
Option 4: Point-in-Time Recovery
It is possible to do restore a file system snapshot. The only time I would deep this approach is worthwhile is if test database is large.
-
Create a restore point using
pg_create_restore_point()
- Run the test
- Stop Postgres
-
Adjust the recovery targets in
postgresql.conf
- Start Postgres
- Wait for recovery to complete