One of the defining characteristics of PostgreSQL is its extensibility, which enables developers to add new database functionality without forking from the original project.
Citus is an open source PostgreSQL extension that transforms PostgreSQL into a distributed database. The goal of Citus is to make the versatile set of data processing capabilities in PostgreSQL available at any scale. Citus can scale transactional workloads by routing transactions across nodes, and analytical workloads by parallelizing operations across all cores in the cluster. Citus also has distributed data placement features such as co-location and reference tables to efficiently scale relational database features such as foreign keys and joins.
In this talk, you’ll learn how Citus uses the PostgreSQL extension APIs, such as the planner and executor hooks, in order to implement: a sharding layer, a distributed query planner, an adaptive query executor, and distributed transactions—all in a way that is transparent to the application without changing a line of Postgres code. We will also discuss the design and trade-offs of each component and what we learned along the way.