Eric Radman : a Journal

Introducing Multi-Master


Thanks to work excellent work of 2ndQuadrant, the components needed to support unidirectional and bidirectional replication are [mostly] integrated into PostgreSQL. Logical replication allows a range of options including the replication of part of a cluster and multi-master configurations.


on Linux

At the time of this writing (PG 9.5) still does not bundle everything required to run logical replication out of the box, so fetch the latest tarballs from and build them

cd postgres-bdr
./configure --prefix=/usr/pgsql-9.4-bdr --enable-debug
sudo make install
cd contrib
sudo make install

cd bdr-plugin
PATH=/usr/pgsql-9.4-bdr/bin:$PATH ./configure --prefix=/usr/pgsql-9.4-bdr --enable-bdr
sudo make install

on BSD

Installation on the BSDs is similar

FLEX=/usr/local/bin/gflex ./configure
gmake -j2
doas gmake install
cd contrib
doas gmake install

PATH=/usr/local/pgsql/bin:$PATH ./configure
gmake -j2
doas gmake install

Starting up the Cluster

On each node initialize the databases

su -l postgres
/usr/pgsql-9.4/bin/initdb -D /pg_data/9.4-bdr -A trust

Enable BDR in postgresql.conf thusly:

listen_addresses = '*'

shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on
max_connections = 100
max_wal_senders = 10
max_replication_slots = 10
max_worker_processes = 10

And then set up the appropriate permissions in pg_hba.conf

local   replication     postgres                                trust
host    replication     postgres            trust
host    replication     postgres        ::1/128                 trust

host all all  password

host replication postgres trust
host replication postgres trust

host replication bdrsync password
host replication bdrsync password

Next fire up the servers (as the user postgres) and create a user to be used for BDR

/usr/pgsql-9.4/bin/pg_ctl -D /pg_data/9.4-bdr start
psql -c "CREATE USER bdrsync superuser;"
psql -c "ALTER USER bdrsync WITH PASSWORD '12345#';"

For each host we will create an unprivileged user and a blank database

/usr/pgsql-9.4/bin/createuser amsv2
/usr/pgsql-9.4/bin/createdb -O amsv2 amstest
psql amstest -c 'CREATE EXTENSION btree_gist;'
psql amstest -c 'CREATE EXTENSION bdr;'

Finally it is time to join the nodes together in a bdr group:

SELECT bdr.bdr_group_create(
    local_node_name := 'node1',
    node_external_dsn := 'host= user=bdrsync dbname=amstest password=12345#'

SELECT bdr.bdr_group_join(
    local_node_name := 'node2',
    node_external_dsn := 'host= user=bdrsync dbname=amstest password=12345#',
    join_using_dsn := 'host= user=bdrsync dbname=amstest password=12345#'

Verifying Associations

Each peer uses a replication slot to identify other nodes that changes propagate to

amstest=# select * from bdr.bdr_node_slots;
node_name |                slot_name
node1     | bdr_17168_6231255027739465518_1_17948
node3     | bdr_17168_6231542136481997963_1_17162
(2 rows)

Also see the documentation on BDR-related system catalogs.

In production a lot of useful summary data can be seen in the bdr.pg_stat_bdr table.

Caveats of BDR

DDL Aquires Global Locks

All data manipulation of any sort is blocked on other masters if a table is altered:

amstest=# insert into t2 values (70);
ERROR:  database is locked against ddl by another node
HINT:  Node (6226337458219448371,1,16385) in the cluster is already performing DDL

Properties of Roles

  1. Roles are not part of a database and are therefore not replicated
  2. Other masters will repeatedly try to change ownership
  3. All ownership changes must be applied. Setting the owner of a table to A must succeed before setting the owner to B.

Since clients receive errors DDL needs to be planned carefully

Global IDs

With logical replication sequences are local to the database. This is a challenge because for each database a scheme for avoiding ID conflicts. postgres-bdr includes a special sequence that will host an election between nodes and will negotiate ID blocks

CREATE SEQUENCE global_job_id USING bdr;
ALTER TABLE job ALTER COLUMN id SET DEFAULT nextval('global_job_id');

If we do not have enough nodes to establish a quorum then each node will eventually use up it's pre-allocated block of IDs. This does not block; it raises an error

ERROR:  could not find free sequence value for global sequence public.global_job_id

While the initial ID of a sequence can be set when importing data; there is no direct way to increment the ID of an existing counter. We can force it this way:

ALTER SEQUENCE global_job_id RESTART WITH 1000000;
--ERROR:  ALTER SEQUENCE ... restart is not supported for bdr sequences

Perhaps the best we can do is increment the counter manually

SELECT max(nextval('global_job_id')) FROM generate_series(1,1000);


Import Existing Data

Before importing the data itself it is important to ensure that the schema itself works

pg_restore -c -U amstest -d amstest -O -Fc -x -s /archive/ams.dump

Before trying to make a logical dump, remove BDR-related bookkeeping

psql amstest -c 'truncate bdr.bdr_queued_commands;'
psql amstest -c 'truncate bdr.bdr_queued_drops;'
pg_dump amstest > /tmp/amsv2.dump

To convert all sequences to global add a directive to the top of the dump file

SET LOCAL default_sequenceam = 'bdr';
This requires a restore using the single-transaction flag
psql amstest -1 < /tmp/amsv2.dump

Recreating a Database

To remove the replication slots and demote a remote node run the following on each instance

SELECT bdr.bdr_part_by_node_names(ARRAY['node2']);
DELETE FROM bdr.bdr_nodes WHERE node_name='node2';
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE client_port is null;

To destroy a database locally you first need to ensure that all clients have disconnected. Force this using

SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE database='amstest';
dropdb amstest

There is no simple way to convert a node to a sremove an orphaned disconnected node from a
group. This is the <a
href="">current workaround</a>
SET LOCAL bdr.skip_ddl_locking = on;
SET LOCAL bdr.permit_unsafe_ddl_commands = on;
SET LOCAL bdr.skip_ddl_replication = on;
DELETE FROM bdr.bdr_connections;
DELETE FROM bdr.bdr_nodes;
SELECT bdr.bdr_connections_changed();

Last updated on November 26, 2016