Recovery
In EDB Postgres Distributed for Kubernetes, recovery is available as a way to bootstrap a new PGD group starting from an available physical backup of a PGD node. Recovery can't be performed in place on an existing PGD group.
EDB Postgres Distributed for Kubernetes also supports point-in-time recovery (PITR), which allows you to restore a PGD group up to any point in time, from the first available backup in your catalog to the last archived WAL. Having a WAL archive is mandatory for PITR.
Prerequisites
Before recovering from a backup:
Make sure that the PostgreSQL configuration (.spec.cnp.postgresql.parameters) of the recovered cluster is compatible with the original one from a physical replication standpoint.
When recovering in a newly created namespace, first set up a cert-manager CA issuer before deploying the recovered PGD group.
For more information, see EDB Postgres for Kubernetes recovery - Additional considerations in the EDB Postgres for Kubernetes documentation.
Recovery from an object store
You can recover from a PGD node backup created by Barman Cloud and stored on supported object storage.
For example, given a PGD group named pgdgroup-example with three instances with backups available, your object storage will contain a directory for each node: pgdgroup-example-1, pgdgroup-example-2, pgdgroup-example-3.
This example defines a full recovery from the object store.
The operator transparently selects the latest backup among the defined serverNames and replays up to the last available WAL.
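A minimal sketch of such a manifest, assuming the Barman Cloud options are set under .spec.restore.barmanObjectStore in the same way as in EDB Postgres for Kubernetes. The group name pgdgroup-restore, the parent group world, the secret backup-storage-creds, the destination path, and the gzip and AES256 values are illustrative placeholders, not values taken from a real cluster:

```yaml
# Sketch only: adjust every value to match the source PGD group.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore              # hypothetical name of the new PGD group
spec:
  instances: 3
  pgd:
    parentGroup:
      name: world                     # hypothetical parent group
      create: true
  restore:
    # One entry per node directory in the object store
    serverNames:
      - pgdgroup-example-1
      - pgdgroup-example-2
      - pgdgroup-example-3
    barmanObjectStore:
      destinationPath: "<destination_path>"   # placeholder
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds  # hypothetical secret
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
      wal:
        compression: gzip             # must match the source cluster
        encryption: AES256            # must match the source cluster
        maxParallel: 8                # up to eight parallel WAL restore jobs
```

Because no recovery target is set, a definition like this one replays WALs up to the last one available in the archive.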
Important
Make sure to correctly configure the WAL section according to the source cluster. In the example, since the pgdgroup-example PGD group uses compression and encryption, make sure to set the proper parameters also in the PGD group that's being created by the restore.
Note
The example takes advantage of the parallel WAL restore feature, dedicating up to eight jobs to concurrently fetch the required WAL files from the archive. This feature can appreciably reduce the recovery time. Plan ahead for this scenario and tune the value of this parameter for your environment before you need to run a recovery.
PITR from an object store
Instead of replaying all the WALs up to the latest one, after extracting a base backup, you can ask PostgreSQL to stop replaying WALs at any point in time. PostgreSQL uses this technique to achieve PITR. (The presence of a WAL archive is mandatory.)
This example defines a time-based target for the recovery:
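A minimal sketch, assuming targetTime sits under .spec.restore.recoveryTarget next to the backupID option described later on this page. The timestamp, group names, and storage settings are hypothetical placeholders:

```yaml
# Sketch only: the timestamp and all names are illustrative.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore              # hypothetical name of the new PGD group
spec:
  instances: 3
  pgd:
    parentGroup:
      name: world                     # hypothetical parent group
      create: true
  restore:
    recoveryTarget:
      # Stop replaying WALs at this point in time (hypothetical value)
      targetTime: "2023-08-11T11:14:21+02:00"
    serverNames:
      - pgdgroup-example-1
      - pgdgroup-example-2
      - pgdgroup-example-3
    barmanObjectStore:
      destinationPath: "<destination_path>"   # placeholder
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds  # hypothetical secret
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
```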
Important
PITR requires you to specify a targetTime recovery target by using the options described in Recovery targets. When you use targetTime or targetLSN, the operator selects the closest backup that was completed before that target. Otherwise, it selects the last available backup in chronological order among the specified serverNames.
Recovery from an object store specifying a backupID
The .spec.restore.recoveryTarget.backupID option allows you to specify a base backup from which to start the recovery process. By default, this value is empty.
If you assign a value to it, the operator uses that backup as the base for the recovery. The value must be in the form of a Barman backup ID.
This example recovers a new PGD group from a specific backupID of the pgdgroup-backup-1 PGD node:
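A minimal sketch of a restore that pins the base backup. The backup ID shown is a made-up value in the Barman backup ID format, and the group name, parent group, secret, and destination path are hypothetical placeholders:

```yaml
# Sketch only: replace the backup ID with one from your own catalog.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore              # hypothetical name of the new PGD group
spec:
  instances: 3
  pgd:
    parentGroup:
      name: world                     # hypothetical parent group
      create: true
  restore:
    recoveryTarget:
      backupID: "20230824T133000"     # hypothetical Barman backup ID
    # List only the node that owns the chosen backup
    serverNames:
      - pgdgroup-backup-1
    barmanObjectStore:
      destinationPath: "<destination_path>"   # placeholder
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds  # hypothetical secret
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
```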
Important
When you specify a backupID, make sure to list only the related PGD node in the serverNames option, and avoid listing the other ones.
Note
Defining a specific backupID is especially needed when using one of the following recovery targets: targetName, targetXID, and targetImmediate. In such cases, it's important to specify backupID, unless the last available backup in the catalog is acceptable.
Recovery from volumeSnapshot
You can also recover a PGD group from a volumeSnapshot backup. The spec.restore.volumeSnapshots stanza defines the criteria for selecting volumeSnapshot restore candidates. The operator transparently selects the latest volumeSnapshot among the candidates.
The operator requires the following annotations/labels in the volumeSnapshot. These annotations/labels are automatically added if volumeSnapshots are taken by the operator.
Annotations:
k8s.enterprisedb.io/backupEndTime is used to compare and select the latest snapshot.
k8s.enterprisedb.io/pvcRole represents the pvcRole of the volumeSnapshot. Supported roles include PG_WAL and PG_DATA.
Labels:
k8s.enterprisedb.io/cluster indicates the node where the volumeSnapshot is taken. It's crucial for fetching the serverName in the object store for WAL replaying.
k8s.enterprisedb.io/backupName is the backup name of the volumeSnapshot. It's used to group volumeSnapshots when more volumes are defined in the backup.
k8s.enterprisedb.io/tablespaceName represents the tablespace name of the volumeSnapshot when the volumeSnapshot role is PG_TABLESPACE.
This example shows a full recovery from volumeSnapshots. After the volumeSnapshot recovery, WAL replaying for full recovery will target server pgdgroup-backup-vs-1.
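A minimal sketch of such a manifest. Only the spec.restore.volumeSnapshots stanza and the k8s.enterprisedb.io/cluster label come from the text above. The selector.matchLabels layout is an assumption modeled on standard Kubernetes label selectors, so check it against the API reference before use. The group name, parent group, secret, and destination path are hypothetical placeholders:

```yaml
# Sketch only: the selector layout under volumeSnapshots is an assumption.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore-vs           # hypothetical name of the new PGD group
spec:
  instances: 3
  pgd:
    parentGroup:
      name: world                     # hypothetical parent group
      create: true
  restore:
    volumeSnapshots:
      selector:                       # assumed field names
        matchLabels:
          # Restrict candidates to snapshots taken on this node
          k8s.enterprisedb.io/cluster: pgdgroup-backup-vs-1
    # The object store is still needed to replay WALs after the snapshot restore
    barmanObjectStore:
      destinationPath: "<destination_path>"   # placeholder
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds  # hypothetical secret
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
```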
For more information, see Recovery from volumeSnapshot objects in the EDB Postgres for Kubernetes documentation.
PITR from volumeSnapshot
You can instruct PostgreSQL to halt the replay of write-ahead logs (WALs) at any specific moment during volumeSnapshot recovery. This is the same capability as when recovering from an object store.
This example shows setting a time-based target for recovery using volume snapshots:
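A minimal sketch, assuming the same recoveryTarget stanza used for object store recovery also applies here. The selector.matchLabels layout under volumeSnapshots is an assumption modeled on standard Kubernetes label selectors, and the timestamp, names, and storage settings are hypothetical placeholders:

```yaml
# Sketch only: the selector layout and all values are illustrative.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore-vs           # hypothetical name of the new PGD group
spec:
  instances: 3
  pgd:
    parentGroup:
      name: world                     # hypothetical parent group
      create: true
  restore:
    recoveryTarget:
      # Halt WAL replay at this point in time (hypothetical value)
      targetTime: "2023-08-11T11:14:21+02:00"
    volumeSnapshots:
      selector:                       # assumed field names
        matchLabels:
          k8s.enterprisedb.io/cluster: pgdgroup-backup-vs-1
    barmanObjectStore:
      destinationPath: "<destination_path>"   # placeholder
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds  # hypothetical secret
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
```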
Recovery targets
Beyond PITR, you can use other recovery target criteria. For more information on all the available recovery targets, see Recovery in the EDB Postgres for Kubernetes documentation.