These are the slides from Marco Slot's keynote talk at Citus Con: An Event for Postgres 2023, a virtual event organized by Microsoft. Building distributed PostgreSQL is perhaps one of the most challenging software engineering projects imaginable. In this keynote, Marco explores the distributed PostgreSQL problem and how Citus solves it.
Early on, the Citus team decided to architect Citus as a PostgreSQL extension. That way, Citus remains part of the PostgreSQL ecosystem even as PostgreSQL keeps developing. Moreover, architecting Citus as an extension made distribution a feature that can simply be added to PostgreSQL without losing the versatile feature set of Postgres or its mature, efficient implementations.
The goal of the Citus database is to provide high PostgreSQL performance at any scale, but simply distributing data across machines is rarely sufficient to achieve that. Crisp distribution concepts and careful trade-offs are important to favor workload patterns that benefit from scaling out. There are also many complex engineering problems given the large PostgreSQL feature set, failures and concurrency in distributed systems, and the mission-critical nature of databases.
Marco discusses the main engineering challenges faced over the past 10 years of developing the fastest, most mature, open-source distributed PostgreSQL implementation: Citus.