ZFS Quickstart
ZFS is not only a full-featured file system; it also handles volume management, RAID, and network sharing.
Rocky/Alma Linux Install
```
dnf install https://zfsonlinux.org/epel/zfs-release-2-2.el9.noarch.rpm
dnf config-manager --disable zfs
dnf config-manager --enable zfs-kmod
dnf install zfs
echo zfs > /etc/modules-load.d/zfs.conf
```
See also OpenZFS RHEL install.
FreeBSD
```
# /etc/rc.conf
zfs_enable="YES"
```
ZFS does not require a partition table, but initializing a disk with a GUID partition table (GPT) avoids spurious warnings and makes it clear what kind of file system is on a device.
```
geom disk list                 # List block devices
gpart destroy -F nda1          # Delete partition data
gpart create -s gpt nda1       # New GUID partition table
gpart add -t freebsd-zfs nda1  # Create and label partition
```
Create a zpool and a new dataset on the first partition:

```
zpool create -O compression=lz4 zpool2 /dev/nda1p1
zfs create -o mountpoint=/ci zpool2/ci
```
Automated Snapshot Management
To automate snapshot retention, take a daily snapshot and prune all but the most recent ones (the script below keeps the newest 29, roughly a month of history):

```
#!/bin/sh -e
today=$(date +"%Y-%m-%d")
for fs in zpool2/ci; do
    zfs snapshot "$fs@$today"
    # List snapshots newest-first, skip the 29 most recent, destroy the rest
    for snap in $(zfs list -t snapshot -H -o name "$fs" | sort -r | tail -n +30); do
        zfs destroy "$snap"
    done
done
```
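The pruning expression can be sanity-checked without touching a pool: sorting the names in reverse puts the newest date first, and skipping the first 29 lines leaves only the snapshots to destroy. A minimal sketch with fabricated snapshot names (the dates are placeholders, not real snapshots):

```shell
# Build 35 fake snapshot names in date order (the day numbers are just labels)
snaps=$(for d in $(seq -w 1 35); do echo "zpool2/ci@2024-01-$d"; done)
# Newest first, then drop the 29 most recent; what's left would be destroyed
pruned=$(echo "$snaps" | sort -r | tail -n +30)
echo "$pruned" | wc -l    # 6 names selected for deletion
echo "$pruned" | tail -1  # the oldest: zpool2/ci@2024-01-01
```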
Run daily from cron:

```
15 20 * * * /usr/local/bin/zfs-snap.sh
```
NFS Export
While any local mount point can be added to `/etc/exports`, ZFS can share mount points automatically when the `sharenfs` property is set on each dataset:
```
$ doas zfs set sharenfs='-network 192.168.2.0/24' zpool2/ci
$ zfs get sharenfs zpool2/ci
NAME       PROPERTY  VALUE                    SOURCE
zpool2/ci  sharenfs  -network 192.168.2.0/24  local
```
To unshare a pool or dataset:

```
$ doas zfs set sharenfs=off zpool2
```
On FreeBSD, `rpcbind` must also be enabled. Optionally, allow clients to connect without the `resvport` mount option:

```
# /etc/rc.conf
rpcbind_enable="YES"
nfs_reserved_port_only="NO"
```
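On the client side the export can then be mounted the usual way; a sketch for `/etc/fstab`, where the hostname `fileserver` is a placeholder:

```
# /etc/fstab on an NFS client (fileserver is an assumed hostname)
fileserver:/ci  /ci  nfs  rw,nosuid  0  0
```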
Virtual Machines
When running KVM or bhyve, ZFS can provide a block device that can be attached to a virtual machine directly. This is referred to as a *zvol*:

```
zfs create -sV 100G -o volmode=dev zpool2/vm/mykube2
```
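On FreeBSD the zvol then appears under `/dev/zvol` and can be handed to the hypervisor as a block device. A minimal bhyve sketch, where the CPU/memory sizes, slot numbers, and VM name are illustrative and boot-loader setup is omitted:

```
# Attach the zvol to a bhyve guest as a virtio block device
bhyve -c 2 -m 4G \
  -s 0,hostbridge -s 1,lpc \
  -s 4,virtio-blk,/dev/zvol/zpool2/vm/mykube2 \
  -l com1,stdio mykube2
```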
Encryption
For manual unlock:

```
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase zroot/home
zfs set setuid=off zroot/home
zfs set devices=off zroot/home
zfs set mountpoint=/home zroot/home
```
Then in `rc.local`:

```
zfs load-key -r zroot/home
zfs mount zroot/home
```
Import all Pools after Reinstall
```
zpool import -a
```
Formulas
Create a RAID-1 (mirror) pool:

```
zpool create zpool0 /dev/nda0p1
zpool attach zpool0 /dev/nda0p1 /dev/nda1p1
zpool status
```
Add a spare:

```
zpool add zpool0 spare /dev/nda2p1
```
Replace a disk:

```
zpool replace zpool0 /dev/nda1p1 /dev/nda2p1
zpool detach zpool0 /dev/nda1p1
zpool add zpool0 spare /dev/nda1p1
```
Create a RAID-Z pool:

```
zpool create zpool0 raidz /dev/nda0p1 /dev/nda1p1 /dev/nda2p1
```
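As a back-of-the-envelope check (a sketch that ignores parity padding and metadata overhead), a raidz1 vdev of n disks yields roughly n − 1 disks of usable capacity:

```shell
disks=3        # disks in the raidz1 vdev above
size_gb=1000   # assumed capacity of each disk, in GB
# raidz1 spends one disk's worth of space on parity
usable=$(( (disks - 1) * size_gb ))
echo "${usable} GB usable"   # prints "2000 GB usable"
```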
Tuning for PostgreSQL
The OpenZFS Workload Tuning page indicates that `full_page_writes` can be disabled, since there is no need to guard against torn pages on ZFS. Disabling this parameter will likely lead to corruption if the database is replicated to a non-ZFS volume.
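The corresponding `postgresql.conf` fragment, safe only when the primary and every replica store their data on ZFS:

```
# postgresql.conf -- only when every copy of the data lives on ZFS
full_page_writes = off
```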
Configuring Oracle Solaris ZFS for an Oracle Database (September 2020) provides a detailed guide that could be translated for PostgreSQL:
| Dataset | recordsize | logbias | primarycache | compression |
|---|---|---|---|---|
| Tables | 32K | latency | all (data and metadata) | LZ4 |
| Redo | 128K | latency | Do not use | off (default) |
| Index | 32K | throughput | all (data and metadata) | off (default) |
| Undo | 1 MB | throughput | all (data and metadata) | off (default) |
| Temp | 128K | latency | Do not use | off (default) |
| Archive | 1 MB | throughput | Do not use | LZ4 |
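Translated to ZFS properties, a couple of the table's rows might look like the following sketch; the dataset names, the mapping from Oracle's file classes to PostgreSQL's data directory and WAL, and the reading of "Do not use" as `primarycache=metadata` are all assumptions:

```
# Table/index data: small records, cached, compressed
zfs create -o recordsize=32K -o logbias=latency -o compression=lz4 zpool2/pg/data
# WAL (analogous to redo): larger records, metadata-only cache, no compression
zfs create -o recordsize=128K -o logbias=latency -o primarycache=metadata -o compression=off zpool2/pg/wal
```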
From this we can also see a problem: these options are very difficult to validate, and this complexity can easily make it impossible to reach a consensus view within a team.