Upgrade to Pro — share decks privately, control downloads, hide ads and more …

How to Ride Elephants Safely: Working with Post...

How to Ride Elephants Safely: Working with Postgres when Your DBA is not Around

This conference presentation caters to individuals occasionally thrust into the role of a database administrator, particularly when application performance is at stake due to database intricacies. Geared towards non-DBAs, the talk provides a comprehensive overview of the world's most powerful open-source database, PostgreSQL, offering insights into its functionalities and the array of tools available for those unexpectedly steering the DBA ship.

As presented at COSCUP Taiwan 2024-08-03

Richard Yen

August 03, 2024
Tweet

More Decks by Richard Yen

Other Decks in Technology

Transcript

  1. ©EDB 2024 — ALL RIGHTS RESERVED. Riding Elephants Safely Working

    with PostgreSQL When Your DBA is Not Around Richard Yen COSCUP 2024 2024-08-03
  2. © EDB 2024 - All Rights Reserved 2 • Software

    Developer/Support Engineer @ EDB since 2015 • Using Postgres to back Golang and Python apps • Perl web developer before getting thrust into a DBA role • Been in love with PostgreSQL since 2002 ❤ About Me
  3. © EDB 2024 - All Rights Reserved 3 • Software

    developer, QA Engineer, Data scientist • Someone else manages your development or UAT databases • Not intimidated by command line interfaces • Somewhat familiar with Linux • DBA is on vacation • ... or quit • ... or you never had a DBA 😱 Is this you?
  4. © EDB 2024 - All Rights Reserved 4 • Installation

    • Performance tuning • Query tuning • Memory management • Indexes • Views • Tablespaces • Backups • Replication • Pooling • Foreign Data Wrappers • Statistics collection • Autovacuum tuning • Monitoring • Triggers • Logical Decoding • Encodings • Timestamps • Query planning • PL/SQL • Access control • WAL • Disaster recovery • PITR • Logging • Constraints • Data types • ... and more! There's a Lot to Cover!
  5. © EDB 2024 - All Rights Reserved 5 • How

    to connect to the database • How to start/stop the database • What is going on in the database • Where the data is located • How to back up the data • Understand performance issues Keeping Safe Without a DBA
  6. © EDB 2024 - All Rights Reserved 6 • SSH

    access allows start/stop the database as needed • Managed databases will need to be stopped/started from the console (i.e. RDS) • Perform sanity checks before starting up • Disk space - know if the database is capable of starting up • Database logs - know why the database shut down earlier • sudo systemctl start postgresql • May need to use postgresql-${version} in some cases • pg_ctl • 📣 Need to know ${PGDATA} to perform start/stop • pg_ctl -d ${PGDATA} start • pg_ctl -d ${PGDATA} stop • pg_ctl -d ${PGDATA} -m f stop • pg_ctl -d ${PGDATA} -m i stop • This technically crashes the database, starts in recovery mode Starting and Stopping the Database
  7. © EDB 2024 - All Rights Reserved 7 • What

    you need: • Hostname • Port (5432 by default) • Username • Password • Check application config (your app) and database config (postgresql.conf) if defaults don't work • psql -- Postgres' default command line interface • GUI database applications • PgAdmin • DBeaver Connecting to the Database
  8. © EDB 2024 - All Rights Reserved 8 Connecting to

    the Database psql -h "database.example.com" -U edb_admin edb_admin psql (15.3 (Homebrew), server 15.3 (Debian 15.3-1.pgdg100+1)) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) Type "help" for help. edb_admin=> \d List of relations Schema | Name | Type | Owner --------+-------------------------+------+---------- public | pg_stat_statements | view | postgres public | pg_stat_statements_info | view | postgres (2 rows) edb_admin=> \dn List of schemas Name | Owner ---------------+------------------- my_schema | edb_admin public | pg_database_owner results | edb_admin (4 rows)
  9. © EDB 2024 - All Rights Reserved 9 • SELECT

    * FROM pg_stat_activity; • Shows what is going on at this very instant • Limited visibility in managed databases (i.e. RDS, Google Cloud, Azure) • Postgres logs are also very informative (SHOW log_directory;) • RDS logs may be helpful if available • Superusers can cancel a query or terminate a session • SELECT pg_cancel_backend(${pid}) • SELECT pg_terminate_backend(${pid}) What's Going On in the Database?
  10. © EDB 2024 - All Rights Reserved 10 datid |

    datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query_id | query | backend_type -------+----------+--------+------------+----------+----------+------------------+-------------+-----------------+------------- +-------------------------------+-------------------------------+-------------------------------+------------------------------- +-----------------+---------------------+--------+-------------+--------------+----------+---------------------------------- +------------------------------ | | 7212 | | 10 | postgres | | | | | 2023-08-16 21:56:45.037403-07 | | | | Activity | LogicalLauncherMain | | | | | | logical replication launcher | | 7211 | | | | | | | | 2023-08-16 21:56:45.033525-07 | | | | Activity | AutoVacuumMain | | | | | | autovacuum launcher 5 | postgres | 212820 | | 10 | postgres | psql | | | -1 | 2023-08-31 00:23:18.95734-07 | 2023-08-31 00:23:23.702319-07 | 2023-08-31 00:23:23.702319-07 | 2023-08-31 00:23:23.702323-07 | | | active | | 6796 | | select * from pg_stat_activity ; | client backend | | 7208 | | | | | | | | 2023-08-16 21:56:44.813366-07 | | | | Activity | BgWriterMain | | | | | | background writer | | 7207 | | | | | | | | 2023-08-16 21:56:44.807436-07 | | | | Activity | CheckpointerMain | | | | | | checkpointer | | 7210 | | | | | | | | 2023-08-16 21:56:45.041297-07 | | | | Activity | WalWriterMain | | | | | | walwriter (6 rows) What's Going On in the Database?
  11. © EDB 2024 - All Rights Reserved 11 postgres=# \x

    Expanded display is on. postgres=# select * from pg_stat_activity limit 2 offset 1; -[ RECORD 1 ]----+------------------------------------------------- datid | datname | pid | 7211 leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | 2023-08-16 21:56:45.033525-07 xact_start | query_start | state_change | wait_event_type | Activity wait_event | AutoVacuumMain state | backend_xid | backend_xmin | query_id | query | backend_type | autovacuum launcher What's Going On in the Database? -[ RECORD 2 ]----+------------------------------------------------- datid | 5 datname | postgres pid | 212820 leader_pid | usesysid | 10 usename | postgres application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2023-08-31 00:23:18.95734-07 xact_start | 2023-08-31 00:24:40.08706-07 query_start | 2023-08-31 00:24:40.08706-07 state_change | 2023-08-31 00:24:40.087064-07 wait_event_type | wait_event | state | active backend_xid | backend_xmin | 6796 query_id | query | select * from pg_stat_activity limit 2 offset 1; backend_type | client backend
  12. © EDB 2024 - All Rights Reserved 12 Vacuuming •

    Upholds performance by preventing bloat • UPDATE or DELETE simply flag rows as deleted • Vacuum flags deleted rows as re-usable for future INSERT or UPDATE • Autovacuum will vacuum certain tables after some time • Usually best to wait for any heavy vacuuming to finish • If absolutely necessary, use pg_terminate_backend() • Manually vacuum the table immediately. • Run with SET vacuum_cost_delay TO 0;
  13. © EDB 2024 - All Rights Reserved 13 • EXPLAIN

    v. EXPLAIN ANALYZE • Query performance can be evaluated in the logs • As a developer, auto_explain is a very helpful tool, especially if you're using an ORM • Helps identify where an index might be necessary Performance
  14. © EDB 2024 - All Rights Reserved 14 Explain v.

    Explain Analyze postgres=# EXPLAIN SELECT * FROM pgbench_accounts a JOIN pgbench_branches b ON (a.bid=b.bid) WHERE a.aid < 100000; QUERY PLAN -------------------------------------------------------------------------------- Nested Loop (cost=0.00..4141.00 rows=99999 width=461) Join Filter: (a.bid = b.bid) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=364) -> Seq Scan on pgbench_accounts a (cost=0.00..2890.00 rows=99999 width=97) Filter: (aid < 100000) (5 rows) postgres=# EXPLAIN ANALYZE SELECT * FROM pgbench_accounts a JOIN pgbench_branches b ON (a.bid=b.bid) WHERE a.aid < 100000; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..4141.00 rows=99999 width=461) (actual time=0.039..56.582 rows=99999 loops=1) Join Filter: (a.bid = b.bid) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=364) (actual time=0.025..0.026 rows=1 loops=1) -> Seq Scan on pgbench_accounts a (cost=0.00..2890.00 rows=99999 width=97) (actual time=0.008..25.752 rows=99999 loops=1) Filter: (aid < 100000) Rows Removed by Filter: 1 Planning Time: 0.306 ms Execution Time: 61.031 ms (8 rows)
  15. © EDB 2024 - All Rights Reserved 15 • All

    contained in postgresql.conf • postgresql.auto.conf controlled by ALTER SYSTEM commands • Can be viewed in psql with SHOW ALL; • Some can be changed without a system restart • SELECT name, setting FROM pg_settings WHERE context IN ('sighup','user'); • SET <param> TO <value>; • ALTER SYSTEM SET <param> TO <value>; • Commit changes in psql: SELECT pg_reload_conf(); • From the OS: systemctl reload or kill -HUP ${pid} Configuration
  16. © EDB 2024 - All Rights Reserved 16 Some Useful

    Config Params • search_path -- namespaces to look for tables • work_mem -- memory to allocated for sorting and hashing • Be careful, as this can affect the entire system • log_* params -- controls what gets logged • log_line_prefix • log_checkpoints • log_connections • log_disconnections • log_autovacuum_min_duration • log_hostname
  17. © EDB 2024 - All Rights Reserved 17 • Important

    logging parameters • log_line_prefix • %m [%p]: [%l] [txid=%x] user=%u,db=%d,app%a,client=%r • log_min_duration_statement • Other logging parameters • log_statement - Logs statement before executing (including args) • log_min_error_statement - Logs specific types of messages • WARNING, ERROR, FATAL, PANIC • ⚠ Database logs != WAL logs Monitoring
  18. © EDB 2024 - All Rights Reserved 18 WAL Logs

    • WAL stands for Write Ahead Log • WAL files live in ${PGDATA}/pg_wal/ • Basically a journal of all the write activity on the database • Provides a means of disaster recovery • Eventually synced/merged with actual database files in ${PGDATA}/base/ • ⛔ DO NOT DELETE THESE FILES ⛔
  19. © EDB 2024 - All Rights Reserved 19 postgres@davinci:~/15/main$ pwd

    /var/lib/postgresql/15/main postgres@davinci:~/15/main$ ls -al total 92 drwx------ 19 postgres postgres 4096 Aug 16 21:56 . drwxr-xr-x 3 postgres postgres 4096 Aug 16 21:52 .. drwx------ 6 postgres postgres 4096 Aug 17 01:34 base drwx------ 2 postgres postgres 4096 Aug 18 09:25 global drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_commit_ts drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_dynshmem drwx------ 4 postgres postgres 4096 Aug 28 09:33 pg_logical drwx------ 4 postgres postgres 4096 Aug 16 21:52 pg_multixact drwx------ 2 postgres postgres 4096 Aug 25 01:42 pg_notify drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_replslot drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_serial drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_snapshots drwx------ 2 postgres postgres 4096 Aug 16 21:56 pg_stat drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_stat_tmp drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_subtrans drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_tblspc drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_twophase -rw------- 1 postgres postgres 3 Aug 16 21:52 PG_VERSION drwx------ 3 postgres postgres 4096 Aug 25 02:22 pg_wal drwx------ 2 postgres postgres 4096 Aug 16 21:52 pg_xact -rw------- 1 postgres postgres 88 Aug 16 21:52 postgresql.auto.conf -rw------- 1 postgres postgres 130 Aug 16 21:56 postmaster.opts -rw------- 1 postgres postgres 100 Aug 16 21:56 postmaster.pid postgres@davinci:~/15/main$ ls -al pg_wal/ total 65548 drwx------ 3 postgres postgres 4096 Aug 25 02:22 . drwx------ 19 postgres postgres 4096 Aug 16 21:56 .. -rw------- 1 postgres postgres 16777216 Aug 28 09:33 00000001000000000000000B -rw------- 1 postgres postgres 16777216 Aug 17 01:24 00000001000000000000000C -rw------- 1 postgres postgres 16777216 Aug 17 01:30 00000001000000000000000D -rw------- 1 postgres postgres 16777216 Aug 25 02:18 00000001000000000000000E drwx------ 2 postgres postgres 4096 Aug 16 21:52 archive_status WAL Logs
  20. © EDB 2024 - All Rights Reserved 20 • pg_dump

    • plaintext dump of the database • Can filter based on namespace, table • Can dump a compressed/binary version as well, to save space • Less likely to copy corruption • pg_basebackup • Takes a snapshot of the entire ${PGDATA} directory, includes indexes, FK constraints, etc. • Requires max_wal_senders, and a user with REPLICATION privilege • Faster, but if the database is corrupt, the corruption will be copied Backups
  21. © EDB 2024 - All Rights Reserved 21 • Contained

    in pg_hba.conf • Allows connections to specific databases by specific users and IP addresses • Changes are committed with HUP or pg_reload_conf() Authentication # TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only local all all peer # IPv4 local connections: host all all 127.0.0.1/32 password # IPv6 local connections: host all all ::1/128 password # Allow replication connections from localhost, by a user with the # replication privilege. local replication all password host replication all 127.0.0.1/32 password host replication all ::1/128 password
  22. © EDB 2024 - All Rights Reserved 22 • kill

    -9 on any Postgres process • Causes Postgres to crash and enter into recovery mode • Idle Transactions • Always commit/rollback any transactions • Otherwise other users will be held up • Look in pg_stat_activity for idle in transaction sessions (different from simply idle) • Cross reference with pg_locks • Don't drop anything (columns, schemas, indexes, etc.) • Rename them (or wait until the DBA comes back) • Do not delete any files from $PGDATA (especially files in pg_wal or pg_xlog) What NOT to do
  23. © EDB 2024 - All Rights Reserved 23 • Postgres

    Slack (postgresteam.slack.com) • Postgres Community Mailing Lists (postgresql.org/list) • IRC (postgresql.org/community/irc) • Wiki (wiki.postgresql.org) • Docs (postgresql.org/docs/current) • EDB Support (enterprisedb.com/support-center) • 🌴 linktr.ee/postgres_help 🌴 Where to Find Help