Upgrade to Pro — share decks privately, control downloads, hide ads and more …

How to Ride Elephants Safely: Working with Post...

How to Ride Elephants Safely: Working with PostgreSQL when your DBA is not around

Every so often, one may be called upon to perform the tasks of a database administrator, especially in cases where application performance seems to be affected by the database. This talk will give non-DBAs an insight into how the world’s most powerful open-source database works, and the kind of tools and features that are readily available for people who unexpectedly find themselves in the DBA pilot’s seat. Topics covered will include query tuning, monitoring, indexing, and a basic run-down of some configurables to help you know what options are at your disposal

As presented at DjangoCon US 2023-10-16

Video recording: https://2023.djangocon.us/talks/how-to-ride-elephants-safely-working-with-postgresql-when-your-dba-is-not-around/

Richard Yen

October 16, 2023
Tweet

More Decks by Richard Yen

Other Decks in Technology

Transcript

  1. © EnterpriseDB Corporation 2023 - All Rights Reserved Riding Elephants

    Safely DjangoConUS 2023.10.16 Richard Yen Working with PostgreSQL When Your DBA is Not Around 1
  2. © EnterpriseDB Corporation 2023 - All Rights Reserved 2 •

    Software Developer/Support Engineer @ EDB since 2015 • Perl web developer before getting thrust into a DBA role • Worked on my first Django app in 2020 • Been using PostgreSQL since v. 7.4 About Me
  3. © EnterpriseDB Corporation 2023 - All Rights Reserved 3 •

    Django developer! • Someone else manages your development or UAT databases • Not intimidated by command line interfaces • Somewhat familiar with Linux • DBA is on vacation • ... or quit • ... or you never had a DBA 😱 Is this you?
  4. © EnterpriseDB Corporation 2023 - All Rights Reserved 4 •

    Installation • Performance tuning • Query tuning • Memory management • Indexes • Views • Tablespaces • Backups • Replication • Pooling • Foreign Data Wrappers • Statistics collection • Autovacuum tuning • Monitoring • Triggers • Logical Decoding • Encodings • Timestamps • Query planning • PL/SQL • Access control • WAL • Disaster recovery • PITR • Logging • Constraints • Data types • ... and more! There's a Lot to Cover!
  5. © EnterpriseDB Corporation 2023 - All Rights Reserved 5 •

    Log into the database, start and stop it • Take a backup before catastrophic damage occurs • Diagnose performance/stability issues by reading the logs • Identify any schema changes that could improve performance • Understand PostgreSQL's file/directory structure What We Hope to Achieve in 40 Minutes
  6. © EnterpriseDB Corporation 2023 - All Rights Reserved 6 •

    Getting in to the database • Looking around in the database • Understanding how the database is set up • Maintaining the database • Improving database performance • What NOT to do • Where to find help Our Roadmap
  7. © EnterpriseDB Corporation 2023 - All Rights Reserved 7 •

    SSH access allows start/stop the database as needed • Managed databases will need to be stopped/started from the console (i.e. RDS) • Perform sanity checks before starting up • Disk space - know if the database is capable of starting up • Database logs - know why the database shut down earlier • sudo systemctl start postgresql • May need to use postgresql-${version} in some cases • pg_ctl • Need to know ${PGDATA} to perform start/stop • pg_ctl -d ${PGDATA} start • pg_ctl -d ${PGDATA} stop • pg_ctl -d ${PGDATA} -m f stop • pg_ctl -d ${PGDATA} -m i stop • This technically crashes the database, starts in recovery mode Starting and Stopping the Database
  8. © EnterpriseDB Corporation 2023 - All Rights Reserved 8 •

    What you need: • Hostname • Port (5432 by default) • Username • Password • Check application config if defaults don't work • psql -- Postgres' default command line interface • GUI database applications • PgAdmin • DBeaver Connecting to the Database
  9. © EnterpriseDB Corporation 2023 - All Rights Reserved 9 Connecting

    to the Database psql -h "database.example.com" -U edb_admin edb_admin psql (15.3 (Homebrew), server 15.3 (Debian 15.3-1.pgdg100+1)) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) Type "help" for help. edb_admin=> \d List of relations Schema | Name | Type | Owner --------+-------------------------+------+---------- public | pg_stat_statements | view | postgres public | pg_stat_statements_info | view | postgres (2 rows) edb_admin=> \dn List of schemas Name | Owner ---------------+------------------- my_schema | edb_admin public | pg_database_owner results | edb_admin (4 rows)
  10. © EnterpriseDB Corporation 2023 - All Rights Reserved 10 •

    SELECT * FROM pg_stat_activity; • Shows what is going on at this very instant • Limited visibility in managed database (i.e. RDS, Google Cloud, Azure) • Postgres logs are also very informative (SHOW log_directory;) • RDS logs may be helpful if available • Superusers can cancel a query or terminate a session • SELECT pg_cancel_backend(${pid}) • SELECT pg_terminate_backend(${pid}) What's Going On in the Database?
  11. © EnterpriseDB Corporation 2023 - All Rights Reserved 11 datid

    | datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query_id | query | backend_type -------+----------+--------+------------+----------+----------+------------------+-------------+-----------------+------------- +-------------------------------+-------------------------------+-------------------------------+------------------------------- +-----------------+---------------------+--------+-------------+--------------+----------+---------------------------------- +------------------------------ | | 7212 | | 10 | postgres | | | | | 2023-08-16 21:56:45.037403-07 | | | | Activity | LogicalLauncherMain | | | | | | logical replication launcher | | 7211 | | | | | | | | 2023-08-16 21:56:45.033525-07 | | | | Activity | AutoVacuumMain | | | | | | autovacuum launcher 5 | postgres | 212820 | | 10 | postgres | psql | | | -1 | 2023-08-31 00:23:18.95734-07 | 2023-08-31 00:23:23.702319-07 | 2023-08-31 00:23:23.702319-07 | 2023-08-31 00:23:23.702323-07 | | | active | | 6796 | | select * from pg_stat_activity ; | client backend | | 7208 | | | | | | | | 2023-08-16 21:56:44.813366-07 | | | | Activity | BgWriterMain | | | | | | background writer | | 7207 | | | | | | | | 2023-08-16 21:56:44.807436-07 | | | | Activity | CheckpointerMain | | | | | | checkpointer | | 7210 | | | | | | | | 2023-08-16 21:56:45.041297-07 | | | | Activity | WalWriterMain | | | | | | walwriter (6 rows) What's Going On in the Database?
  12. © EnterpriseDB Corporation 2023 - All Rights Reserved 12 postgres=#

    \x Expanded display is on. postgres=# select * from pg_stat_activity limit 2 offset 1; -[ RECORD 1 ]----+------------------------------------------------- datid | datname | pid | 7211 leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | 2023-08-16 21:56:45.033525-07 xact_start | query_start | state_change | wait_event_type | Activity wait_event | AutoVacuumMain state | backend_xid | backend_xmin | query_id | query | backend_type | autovacuum launcher What's Going On in the Database? -[ RECORD 2 ]----+------------------------------------------------- datid | 5 datname | postgres pid | 212820 leader_pid | usesysid | 10 usename | postgres application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2023-08-31 00:23:18.95734-07 xact_start | 2023-08-31 00:24:40.08706-07 query_start | 2023-08-31 00:24:40.08706-07 state_change | 2023-08-31 00:24:40.087064-07 wait_event_type | wait_event | state | active backend_xid | backend_xmin | 6796 query_id | query | select * from pg_stat_activity limit 2 offset 1; backend_type | client backend
  13. © EnterpriseDB Corporation 2023 - All Rights Reserved 13 •

    All contained in postgresql.conf • Can be viewed in psql with SHOW ALL; • Some can be changed without a system restart • SELECT name, setting FROM pg_settings WHERE context IN ('sighup','user'); • SET <param> TO <value>; • ALTER SYSTEM SET <param> TO <value>; • Commit changes in psql: SELECT pg_reload_conf(); • From the OS: systemctl reload or kill -HUP ${pid} Configuration
  14. © EnterpriseDB Corporation 2023 - All Rights Reserved 14 Some

    Useful Config Params • search_path -- namespaces to look for tables • work_mem -- memory to allocated for sorting and hashing • Be careful, as this can affect the entire system • log_* params -- controls what gets logged • log_line_prefix • log_checkpoints • log_connections • log_disconnections • log_autovacuum_min_duration • log_hostname • Database logs != WAL logs
  15. © EnterpriseDB Corporation 2023 - All Rights Reserved 15 WAL

    Logs • WAL stands for Write Ahead Log • WAL files live in ${PGDATA}/pg_wal/ • Basically a journal of all the write activity on the database • Provides a means of disaster recovery • Eventually synced/merged with actual database files in ${PGDATA}/base/ • DO NOT DELETE THESE FILES
  16. © EnterpriseDB Corporation 2023 - All Rights Reserved 16 postgres@davinci:~/15/main$

    pwd /var/lib/postgresql/15/main postgres@davinci:~/15/main$ ls -al total 92 drwx------ 19 postgres postgres 4096 Aug 16 21:56 . drwxr-xr-x 3 postgres postgres 4096 Aug 16 21:52 .. drwx------ 6 postgres postgres 4096 Aug 17 01:34 base drwx------ 2 postgres postgres 4096 Aug 18 09:25 global drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_commit_ts drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_dynshmem drwx------ 4 postgres postgres 4096 Aug 28 09:33 pg_logical drwx------ 4 postgres postgres 4096 Aug 16 21:52 pg_multixact drwx------ 2 postgres postgres 4096 Aug 25 01:42 pg_notify drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_replslot drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_serial drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_snapshots drwx------ 2 postgres postgres 4096 Aug 16 21:56 pg_stat drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_stat_tmp drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_subtrans drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_tblspc drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_twophase -rw------- 1 postgres postgres 3 Aug 16 21:52 PG_VERSION drwx------ 3 postgres postgres 4096 Aug 25 02:22 pg_wal drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_xact -rw------- 1 postgres postgres 88 Aug 16 21:52 postgresql.auto.conf -rw------- 1 postgres postgres 130 Aug 16 21:56 postmaster.opts -rw------- 1 postgres postgres 100 Aug 16 21:56 postmaster.pid postgres@davinci:~/15/main$ ls -al pg_wal/ total 65548 drwx------ 3 postgres postgres 4096 Aug 25 02:22 . drwx------ 19 postgres postgres 4096 Aug 16 21:56 .. -rw------- 1 postgres postgres 16777216 Aug 28 09:33 00000001000000000000000B -rw------- 1 postgres postgres 16777216 Aug 17 01:24 00000001000000000000000C -rw------- 1 postgres postgres 16777216 Aug 17 01:30 00000001000000000000000D -rw------- 1 postgres postgres 16777216 Aug 25 02:18 00000001000000000000000E drwx------ 2 postgres postgres 4096 Aug 16 21:52 archive_status WAL Logs
  17. © EnterpriseDB Corporation 2023 - All Rights Reserved 17 •

    Contained in pg_hba.conf • Allows connections to specific databases by specific users and IP addresses • Changes are committed with HUP or pg_reload_conf() Authentication # TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only local all all peer # IPv4 local connections: host all all 127.0.0.1/32 password # IPv6 local connections: host all all ::1/128 password # Allow replication connections from localhost, by a user with the # replication privilege. local replication all password host replication all 127.0.0.1/32 password host replication all ::1/128 password
  18. © EnterpriseDB Corporation 2023 - All Rights Reserved 18 Vacuuming

    • Upholds performance by preventing bloat • UPDATE or DELETE simply flag rows as deleted • Vacuum flags deleted rows as re-usable for future INSERT or UPDATE • Autovacuum will vacuum certain tables after some time • Usually best to wait for any heavy vacuuming to finish • If absolutely necessary, use pg_terminate_backend() • Manually vacuum the table immediately. • Run with SET vacuum_cost_delay TO 0;
  19. © EnterpriseDB Corporation 2023 - All Rights Reserved 19 •

    pg_dump • plaintext dump of the database • Can filter based on namespace, table • Can dump a compressed/binary version as well, to save space • Less likely to copy corruption • pg_basebackup • Takes a snapshot of the entire ${PGDATA} directory, includes indexes, FK constraints, etc. • Requires max_wal_senders, and a user with REPLICATION privilege • Faster, but if the database is corrupt, the corruption will be copied Backups
  20. © EnterpriseDB Corporation 2023 - All Rights Reserved 20 •

    Important logging parameters • log_line_prefix • log_min_duration_statement • Other logging parameters • log_statement - Logs statement before executing • log_min_error_statement - Logs specific types of messages • WARNING, ERROR, FATAL, PANIC • log_duration - Logs a duration only • Extension pg_stat_statements might be more useful Monitoring
  21. © EnterpriseDB Corporation 2023 - All Rights Reserved 21 •

    EXPLAIN v. EXPLAIN ANALYZE • Query performance can be evaluated in the logs • As a developer, auto_explain is a very helpful tool, especially if you're using an ORM Performance
  22. © EnterpriseDB Corporation 2023 - All Rights Reserved 22 Explain

    v. Explain Analyze postgres=# EXPLAIN SELECT * FROM pgbench_accounts a JOIN pgbench_branches b ON (a.bid=b.bid) WHERE a.aid < 100000; QUERY PLAN -------------------------------------------------------------------------------- Nested Loop (cost=0.00..4141.00 rows=99999 width=461) Join Filter: (a.bid = b.bid) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=364) -> Seq Scan on pgbench_accounts a (cost=0.00..2890.00 rows=99999 width=97) Filter: (aid < 100000) (5 rows) postgres=# EXPLAIN ANALYZE SELECT * FROM pgbench_accounts a JOIN pgbench_branches b ON (a.bid=b.bid) WHERE a.aid < 100000; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..4141.00 rows=99999 width=461) (actual time=0.039..56.582 rows=99999 loops=1) Join Filter: (a.bid = b.bid) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=364) (actual time=0.025..0.026 rows=1 loops=1) -> Seq Scan on pgbench_accounts a (cost=0.00..2890.00 rows=99999 width=97) (actual time=0.008..25.752 rows=99999 loops=1) Filter: (aid < 100000) Rows Removed by Filter: 1 Planning Time: 0.306 ms Execution Time: 61.031 ms (8 rows)
  23. © EnterpriseDB Corporation 2023 - All Rights Reserved 23 •

    Data types • Be sure to use the right one • Don't use all text • JSON when needed • Indexing • Very important to have proper indexes • Identify any needed indexes with EXPLAIN ANALYZE Improving Performance
  24. © EnterpriseDB Corporation 2023 - All Rights Reserved 24 Improving

    Performance postgres=# UPDATE pgbench_accounts SET bid = aid; UPDATE 100000 postgres=# EXPLAIN ANALYZE SELECT * FROM pgbench_accounts WHERE bid = 1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Seq Scan on pgbench_accounts (cost=0.00..5778.24 rows=199939 width=97) (actual time=19.322..45.161 rows=1 loops=1) Filter: (bid = 1) Rows Removed by Filter: 99999 Planning Time: 0.101 ms Execution Time: 45.191 ms (5 rows) postgres=# CREATE INDEX pgba_bid_idx ON pgbench_accounts (bid); CREATE INDEX postgres=# EXPLAIN ANALYZE SELECT * FROM pgbench_accounts WHERE bid = 1; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- --------- Index Scan using pgba_bid_idx on pgbench_accounts (cost=0.29..8.31 rows=1 width=97) (actual time=0.076..0.077 rows=1 loops=1) Index Cond: (bid = 1) Planning Time: 0.312 ms Execution Time: 0.119 ms (4 rows)
  25. © EnterpriseDB Corporation 2023 - All Rights Reserved 25 •

    kill -9 on any Postgres process • Causes Postgres to crash and enter into recovery mode • Idle Transactions • Always commit/rollback any transactions • Otherwise other users will be held up • Look in pg_stat_activity for idle in transaction sessions (different from simply idle) • Cross reference with pg_locks • Don't drop anything (columns, schemas, indexes, etc.) • Rename them (or wait until the DBA comes back) • Do not delete any files from $PGDATA (especially files in pg_wal or pg_xlog) What NOT to do
  26. © EnterpriseDB Corporation 2023 - All Rights Reserved 26 •

    Postgres Slack (postgresteam.slack.com) • Postgres Community Mailing Lists (postgresql.org/list/) • IRC (postgresql.org/community/irc/) • Wiki (wiki.postgresql.org) • Docs (postgresql.org/docs/current) • EDB Support (enterprisedb.com/support-center) Where to Find Help
  27. © EnterpriseDB Corporation 2023 - All Rights Reserved 27 THANK

    YOU 6 Enjoy your stay at DjangoCon 2023!