Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Postgres Architectures in Production | Postgres...

Citus Data
December 09, 2020

Postgres Architectures in Production | Postgres Build 2020 | Dimitri Fontaine

When using PostgreSQL in production it is important to implement a strategy for High Availability. With a database service, the HA properties apply to both the service itself and of course to the data set. In this talk, we learn how to think about the specific HA needs of your production environment and how to achieve your requirements with Open Source tooling when it comes to your database of choice, PostgreSQL. In particular, we dive into many options that could be implemented for Postgres to evolve its offering from being a toolset to being “batteries included”. What does it mean in the context of HA? How to achieve it?

Citus Data

December 09, 2020
Tweet

More Decks by Citus Data

Other Decks in Technology

Transcript

  1. PostgreSQL Architectures in Production Dimitri Fontaine PostgreSQL Major Contributor P

    O S T G R E S B U I L D 2 0 2 0 , D E C 8 - 9 , O N L I N E
  2. PostgreSQL for developers Dimitri Fontaine PostgreSQL Major Contributor A B

    O O K A B O U T P O S T G R E S Q L B Y D I M I T R I F O N T A I N E
  3. S p e c i a l D i s

    c o u n t Use Code for 20% “Postgres Build 20”
  4. PostgreSQL Architectures in Production Dimitri Fontaine PostgreSQL Major Contributor P

    O S T G R E S B U I L D 2 0 2 0 , D E C 8 - 9 , O N L I N E
  5. High Availability • Maintain service • Failover • Automated? •

    Maintain data • Disaster Recovery • Streaming Replication
  6. Included in Postgres • Streaming Replication • Sync or Async

    • Per transaction setting • ALTER ROLE vip SET synchronous_commit TO ‘remote_apply’; • ALTER ROLE … SET … • ALTER DATABASE … SET … A B C
  7. Included in Postgres • Online Streaming changes • Replication Slots

    • pg_basebackup • pg_rewind • Fast forward • Cascading Replication • Online Standby Promotion A B C
  8. Included in Postgres • Disaster Recovery basics: PITR • Setup

    a “standby” with a target • Archiving provided separately A B C
  9. Included in Postgres • Client Side HA • Multi-Host Connection

    String • Client still has to reconnect when connection is lost, of course A B C postgres://hostA:5502,hostB:5503,hostC:5501/dbname?target_session_attrs=read-write&sslmode=require
  10. Not included in Postgres • Archiving • Base Backups •

    WAL Archiving • Retention setup for Disaster Recovery
  11. Not included in Postgres • Online “Membership” changes • Idea

    that roles are DYNAMIC • Ensure retention during failover • For WALs • And for base backups • Automated node role changes • Restart to reconnect to new primary
  12. Not included in Postgres • Configuration Management • postgresql.conf •

    pg_hba.conf • Some setting change require a restart • Extension Management • Postgres Upgrades • Minor upgrade (scheduled restarts) • Major upgrades
  13. Primary Secondary Application Monitor SQL SQL (fallback) Streaming Replication Health

    checks Health checks Single Standby Only provides failover, no HA
  14. Single Standby $ pg_autoctl create monitor —auth trust —self-signed-ssl —run

    $ pg_autoctl create postgres \ —auth trust \ —self-signed-ssl. \ —monitor postgres://host/pg_auto_failover \ —run $ pg_autoctl create postgres \ —auth trust \ —self-signed-ssl. \ —monitor postgres://host/pg_auto_failover \ —run
  15. number_sync_standby = Node A Primary replication quorum = true candidate

    priority = Node B Secondary replication quorum = true candidate priority = Node C Secondary replication quorum = true candidate priority = Application Monitor SQL SQL (fallback) Streaming Replication Health checks Two Standby Nodes Allows to implement HA
  16. Two Standby Nodes $ pg_autoctl create postgres \ —auth trust

    \ —self-signed-ssl. \ —monitor postgres://host/pg_auto_failover \ —run
  17. number_sync_standby = Node A Primary replication quorum = true candidate

    priority = Node B Secondary replication quorum = true candidate priority = Node C Secondary replication quorum = true candidate priority = Node D Secondary replication quorum = true candidate priority = Application Monitor SQL SQL (fallback) Health checks Streaming Replication Three Standby Nodes
  18. Three Standby Nodes $ pg_autoctl create postgres \ —auth trust

    \ —self-signed-ssl. \ —monitor postgres://host/pg_auto_failover \ —run
  19. number_sync_standby = Node A Primary replication quorum = true candidate

    priority = Node B Secondary replication quorum = true candidate priority = Node C Secondary replication quorum = false candidate priority = Node D Secondary replication quorum = true candidate priority = Application Monitor SQL SQL (fallback) Health checks Streaming Replication Three Standby Nodes, one async
  20. Three Standby Nodes, One Async $ pg_autoctl set node candidate-priority

    0 $ pg_autoctl set node replication-quorum false $ pg_autoctl create postgres \ —auth trust \ —self-signed-ssl. \ —monitor postgres://host/pg_auto_failover \ —replication-quorum false \ —candidate-priority 0 \ —run
  21. Included in pg_auto_failover • Streaming Replication • Replication slots •

    Maintained on standby nodes • Easy setup including SSL certificates • Dynamic settings, online changes $ pg_autoctl create postgres
  22. Included in pg_auto_failover • Online “Membership” changes • Idea that

    roles are DYNAMIC • Ensure retention during failover • For WALs • And for base backups • Automated HBA editing $ pg_autoctl show uri $ pg_autoctl show state $ pg_autoctl get formation settings
  23. Included in pg_auto_failover • Online membership changes • Online formation

    setting changes • Maintenance operations $ pg_autoctl set node candidate-priority 0 $ pg_autoctl set node replication-quorum false $ pg_autoctl set formation number-sync-standbys 2 $ pg_autoctl enable maintenance $ pg_autoctl disable maintenance
  24. Included in pg_auto_failover • Network split detection / protection •

    Active monitoring and role changes • Automated and manual Failover • Including manual Switchover • Including manual choice of target node $ pg_autoctl perform failover $ pg_autoctl perform switchover $ pg_autoctl perform promotion —name node_d
  25. Not included in pg_auto_failover yet • Archiving and Disaster Recovery

    • Configuration Management • postgresql.conf • pg_hba.conf • Some setting change require a restart • Extension Management • Postgres Upgrades • Minor upgrade (scheduled restarts) • Major upgrades
  26. PostgreSQL Architectures in Production Dimitri Fontaine PostgreSQL Major Contributor P

    O S T G R E S B U I L D 2 0 2 0 , D E C 8 - 9 , O N L I N E