This presentation was given at the PostgreSQL Backup & Recovery virtual event by Pavel Konotopov (Clustering/HA Team Lead, Postgres Professional) on November 30, 2021.
• … in IT;
• Last 5 years working with PostgreSQL;
• Database engineer specializing in PostgreSQL high availability;
• Experience administering 300+ PostgreSQL clusters in production environments;
• Last year working at Postgres Professional.
LinkedIn: https://www.linkedin.com/in/pavel-konotopov-262028119
Email: [email protected]
• What do we mean by “database backup”?
• What is good for PostgreSQL?
• Overview of PostgreSQL-specific backup tools.
• Advanced backup techniques.
• Backup techniques in PostgreSQL HA clusters.
Recovering after a disaster (the obvious case):
• Unexpected power outage;
• Sudden advent of the Out of Memory killer;
• Data corruption;
• Random cloud instance death;
• Malicious alteration or deletion of data;
• And … whatever you cannot imagine :)
Beyond disaster recovery:
• Fast creation of new replicas in highly available PostgreSQL installations;
• Creating Sandbox/Dev/Stage/QA/Preprod/UAT environments;
• Point-in-time database recovery;
• Data archiving for future analysis;
• Security standards requirements – HIPAA, PCI DSS, etc.;
• Potential response to future challenges.
[Diagram: a Control Node drives the DB Nodes and the Cloud/Hypervisor API, producing snapshots snap01 … snap03]
• Open a connection to PostgreSQL;
• Run SELECT pg_start_backup();
• Hold the connection;
• Call the Cloud/Hypervisor API to create a snapshot on each DB node;
• Run SELECT pg_stop_backup();
• Close the connection.
• Before recovery, do not forget to remove the postmaster.pid file from the snapshot!
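A minimal sketch of this flow, assuming PostgreSQL <= 14 and the exclusive low-level backup API (two-argument pg_start_backup writes the backup_label file into PGDATA for you), with make_snapshot as a hypothetical wrapper around your cloud/hypervisor snapshot API. In non-exclusive mode (and in PostgreSQL 15+, where the calls are pg_backup_start/pg_backup_stop) the session really must stay open the whole time, which is why the slide says “hold the connection”:

    # run on the control node; make_snapshot is a hypothetical helper
    psql -h db-node -U postgres -c "SELECT pg_start_backup('storage-snapshot', true)"
    make_snapshot pgdata-volume   # snapshot taken while backup mode is on
    psql -h db-node -U postgres -c "SELECT pg_stop_backup()"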
[Diagram: a Control Node orchestrates the DB node and the Backup Node]
• Open a connection to PostgreSQL;
• Run SELECT pg_start_backup();
• Hold the connection;
• Copy PGDATA somewhere as a new backup, copying the WAL files in parallel;
• Run SELECT pg_stop_backup();
• Close the connection;
• Trigger a remote copy task to move the newly created backup to the backup node.
• The directories must be copied and traversed in a certain order;
• You must create a backup_label file.
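A minimal sketch of the copy step, assuming password-less SSH to a host named backup-node and the PostgreSQL <= 14 exclusive backup mode, in which backup_label is written into PGDATA automatically (in non-exclusive mode you must create it yourself from the pg_stop_backup output, as the slide notes):

    DEST=backup-node:/backups/pgdata-$(date +%F)
    psql -U postgres -c "SELECT pg_start_backup('file-copy', true)"
    # copy the cluster, excluding the pid file and WAL (WAL is copied separately)
    rsync -a --exclude=postmaster.pid --exclude='pg_wal/*' "$PGDATA"/ "$DEST"/
    psql -U postgres -c "SELECT pg_stop_backup()"
    # now copy the WAL generated during the backup to make it consistent
    rsync -a "$PGDATA"/pg_wal/ "$DEST"/pg_wal/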
• Too many potential points of failure;
• In both cases we have to maintain the snapshots or backups ourselves;
• Taking a backup can take too long – we want it to be faster!
• There is no well-known implementation of these approaches; in both cases we have to write our own scripts;
• Big database changes mean large snapshots and large backups;
• No incremental or differential backups are possible;
• No Point-In-Time Recovery (PITR);
• We need advanced backup tools!
pg_dump and pg_restore are the main utilities for this;
• Makes a dump as SQL code;
• The dump is for one particular point in time (PostgreSQL takes a snapshot when the dump begins).
BUUUT …
• Recovery takes a very long time if the database is large:
  • data loading;
  • index creation;
  • no statistics!
• No streaming backups, no point-in-time recovery possible!
DUMP
✓ First dumps only the schema, then the data;
✓ pg_dump/pg_restore can parallelize for a speedy data dump/restore;
✓ Uses the COPY command to save/load data into/from separate files/tables.
RESTORE
✓ Load schema and data separately;
✓ Parallel data loading;
✓ Background index creation (CONCURRENTLY);
✓ But still no statistics!
✓ Need to run VACUUM ANALYZE.
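A minimal sketch of a parallel dump and restore, assuming a database named app and a writable /backups directory (both hypothetical). Parallel mode requires the directory output format (-Fd):

    pg_dump -Fd -j 8 -f /backups/app app           # directory format, 8 dump workers
    createdb app_restored
    pg_restore -j 8 -d app_restored /backups/app   # 8 parallel restore workers
    psql -d app_restored -c "VACUUM ANALYZE"       # rebuild planner statistics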
• When you are migrating to a new major PostgreSQL version;
• When pg_upgrade is not possible for some reason;
• When you don't want to drag "garbage" in the binary files into the new version;
• When you can afford to stop the service for a while;
• When the size of the database is relatively small (< 1 TB).
• Also useful for validating a newly restored database or testing backups (let's remember this)!
pg_basebackup is the “out of the box” tool;
• Every PostgreSQL installation has it;
• Can take a backup locally or remotely over the PostgreSQL protocol.
BUUUT…
• What if the database is large (> 1 TB)?
• What if we have a very small maintenance window for the database restore?
• What if we have limited backup storage?
• What if we are not sure the backups are valid?
• What if we want to restore the DB state to a point in time?
• More “what ifs”!!!
• We need a more advanced backup tool!
• It's cool to have backups, but not cool to be unable to recover in a reasonable amount of time!
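For reference, a minimal pg_basebackup invocation, assuming a replicator role with the REPLICATION privilege (hypothetical role and host names):

    # -D: target directory; -X stream: stream WAL alongside the backup; -P: progress
    pg_basebackup -h db-node -U replicator -D /backups/base-$(date +%F) -X stream -P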
WAL-G / WAL-E
• Your responsibility (DIY)
pgBackRest
• --stanza option
• Common repository for many instances
pg_probackup
• --instance option
• Common repository for many instances
Barman
• <server_name> option
• Common repository for many instances
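For example, with pgBackRest the per-cluster repository is initialized once per stanza (a stanza name of main is assumed here):

    pgbackrest --stanza=main stanza-create   # initialize the repository for this cluster
    pgbackrest --stanza=main check           # verify archiving and repository access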
WAL-G / WAL-E
• archive_command = “wal-g/wal-e wal-push …”
• restore_command = “wal-g/wal-e wal-fetch …”
pgBackRest
• archive_command = “pgbackrest archive-push …”
• restore_command = “pgbackrest archive-get …”
• Can work in asynchronous mode!
pg_probackup
• archive_command = “pg_probackup archive-push …”
• restore_command = “pg_probackup archive-get …”
Barman
• archive_command = “rsync …”
• restore_command = ”barman get-wal …”
What is WAL? The Write-Ahead Log – the files where all changes occurring in the DBMS are recorded before they are applied to the data files, to make recovery possible. Having the WAL, we can replay it from a starting point (usually the last backup) up to a certain moment, thereby restoring the state of the DBMS at any point within that period.
Why do we need WAL archiving? To ensure that the DBMS can be restored to a point in time – Point-In-Time Recovery (PITR).
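A minimal sketch of the corresponding postgresql.conf settings for pgBackRest, assuming a stanza named main (the %p/%f placeholders are expanded by PostgreSQL itself):

    archive_mode = on
    archive_command = 'pgbackrest --stanza=main archive-push %p'
    # used when restoring / replaying archived WAL:
    restore_command = 'pgbackrest --stanza=main archive-get %f "%p"'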
WAL-G / WAL-E
• delete retain N – number of backups to keep in place
• delete before <wal-segment>
pgBackRest
• Full & differential backup retention – number of backups to retain
• Archive retention
• Defined in the configuration file
pg_probackup
• --retention-redundancy
• --retention-window
• delete --expired --wal
Barman
• retention_policy = {REDUNDANCY value | RECOVERY WINDOW OF value {DAYS | WEEKS | MONTHS}}
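For instance, with pg_probackup a time-based retention window can be set once in the catalog config and then enforced on each purge (the backup directory /backups and instance name main are assumptions):

    # keep everything needed to restore to any point in the last 7 days
    pg_probackup set-config -B /backups --instance main --retention-window=7
    # purge expired backups and the WAL that only they required
    pg_probackup delete -B /backups --instance main --expired --wal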
Validation: how to ensure that a backup is valid?
• Restore the selected backup;
• Calculate stats and save them inside the restored DB (or in an external one), together with backup versioning and other metadata…
• Calculate and compare stats;
• Run SQL tests (business logic);
• Save the results to an external DB;
• pg_dump -d dbname > /dev/null
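The last item is the cheapest smoke test: dumping to /dev/null forces a full logical read of every table, so corrupted data surfaces as an error. A minimal sketch (dbname is a placeholder):

    # read every row of every table in the restored cluster;
    # any corruption shows up as an error and a non-zero exit code
    if pg_dump -d dbname > /dev/null; then
        echo "backup restored and readable"
    else
        echo "backup validation FAILED" >&2
    fi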
WAL-G
• WALG_GPG_[KEY,PATH,PASSPHRASE]
• Yandex Cloud KMS support
• WALG_LIBSODIUM_[KEY,PATH]
pgBackRest
• --repo-cipher-type = aes-256-cbc
• --repo-cipher-pass
pg_probackup
• Not yet implemented
• There is a problem with Russian law: a special license is required to include the term “encryption” in the product.
Barman
• Not yet implemented
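A minimal sketch of repository encryption in pgbackrest.conf. Note the option spelling varies slightly across pgBackRest versions (newer multi-repo releases use the repo1-cipher-* form shown here); the passphrase value is a placeholder:

    [global]
    repo1-cipher-type=aes-256-cbc
    repo1-cipher-pass=change-me-long-random-passphrase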
WAL-G
• Creates a copy of a PostgreSQL instance using the backup catalog.
• wal-g catchup-push /path/to/master/postgres --from-lsn replica_lsn
• wal-g catchup-fetch /path/to/replica/postgres backup_name
pg_probackup
• Creates a copy of a PostgreSQL instance without using the backup catalog.
• pg_probackup catchup -b catchup_mode --source-pgdata=path_to_pgdata_on_remote_server --destination-pgdata=path_to_local_dir
• We can also catch up the primary by using the backup catalog and incremental copies.
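A minimal sketch of catching up a stale replica with pg_probackup, assuming the primary is reachable as primary-node and both data directories use the same layout (host name and paths are hypothetical). DELTA mode copies only the pages that changed since the replica diverged:

    pg_probackup catchup -b DELTA --stream \
        --source-pgdata=/var/lib/postgresql/14/main \
        --destination-pgdata=/var/lib/postgresql/14/main \
        --remote-host=primary-node --remote-user=postgres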