Citus Data
February 27, 2020

Distributed PostgreSQL is a game changer | PGConf India 2020 | Parikshit Savjani

Slides from Parikshit Savjani's talk representing Microsoft at PGConf India 2020 in Bengaluru. Postgres is one of the fastest-growing DBMSs in the industry in terms of popularity. Its extensible architecture, combined with truly open source community development, makes it a very feature-rich database engine with an unprecedented speed of innovation. But as a developer or DBA, scaling your Postgres workload can be a complex, daunting task. Microsoft loves Postgres, and its Azure Database for PostgreSQL Hyperscale (Citus) offering has significantly simplified the scaling and manageability of your PostgreSQL workloads.

In this session, we will discuss the distributed PostgreSQL architecture of Hyperscale (Citus) and some of the common use cases and patterns where it shines. You will learn the concept of distributed tables and how you can apply it to achieve massively parallel processing with Hyperscale (Citus). The session will also give you a glimpse of the new Azure Arc data services platform, which allows you to deploy Hyperscale (Citus) anywhere, from multi-cloud and on-premises to edge environments, using Kubernetes. Come and attend this session to learn how you can leverage Hyperscale (Citus) to run your Postgres workloads at any scale, anywhere.

=====================

Parikshit Savjani is a Principal Program Manager with the Azure Open Source Database engineering team, working on enabling customers and the community to be successful on managed database services on Azure, especially Azure Database for PostgreSQL. Based on his decade-long experience working with relational databases, he has developed a deep understanding of database concepts and fundamentals. Parikshit enjoys talking about solution architectures and sharing what he has learned about database design and applications with the community via blogs, conferences, and webinars, and by presenting at developer and community events.

Transcript

  1. PostgreSQL is more popular than ever • One of the most loved and wanted databases in Stack Overflow’s 2019 Developer Survey (https://insights.stackoverflow.com/survey/2019?utm_source=so-owned&utm_medium=blog&utm_campaign=dev-survey-2019&utm_content=launch-blog) • Ranked DBMS of the Year for 2 consecutive years by DB-Engines (https://db-engines.com/en/blog_post/76) @talktosavjani
  2. PostgreSQL is more popular than ever [Chart: DB-Engines ranking trend for PostgreSQL, 2013 to 2020, vertical axis from 100 to 500] Source: https://db-engines.com/en/ranking_trend/system/PostgreSQL © February 2020, DB-Engines.com @talktosavjani
  3. Why Postgres? Why Now? • Truly Open Source (No single owner) • Feature rich & Highly Extensible @talktosavjani
  4. Why Postgres? Why Now? • Truly Open Source (No single owner) • Feature rich & Highly Extensible • Robust, Reliable & Rich Ecosystem @talktosavjani
  5. Why Postgres? Why Now? • Truly Open Source (No single owner) • Feature rich & Highly Extensible • Robust, Reliable & Rich Ecosystem • Speed of innovation and releases @talktosavjani
  6. Why Postgres? Why Now? • Truly Open Source (No single owner) • Feature rich & Highly Extensible • Robust, Reliable & Rich Ecosystem • Speed of innovation and releases • Available as a service in all major clouds @talktosavjani
  7. Common cloud native app patterns • Multi-tenant SaaS applications • Real-time operational analytics • Transactional/OLTP applications • Massive data volume and processing needs @talktosavjani
  8. Postgres Parallel Processing capabilities • Postgres 9.6: Parallel sequential scans, joins and aggregates • Postgres 10: Parallel B-tree, bitmap heap scans, merge join, non-correlated subqueries • Postgres 11: Parallel hash joins, DDL, index builds, parallel partition scans • Postgres 12: Parallel queries in Serializable isolation mode • Postgres 13: Parallel vacuum. Credits: Amit Kapila (@kapila_amit) @talktosavjani
  9. Distributed PostgreSQL scales better • APPLICATION: SELECT company_id, avg(spend) AS avg_campaign_spend FROM campaigns GROUP BY company_id; • COORDINATOR NODE (METADATA) • WORKER NODES W1 W2 W3 … Wn: SELECT company_id, sum(spend), count(spend) … FROM campaigns_2001 … | SELECT company_id, sum(spend), count(spend) … FROM campaigns_2009 … | SELECT company_id, sum(spend), count(spend) … FROM campaigns_2017 … @talktosavjani
  10. Microsoft Windows team relies on Citus and Postgres (on Azure) for mission-critical shiproom decisions. Read more in our aka.ms/azure-postgres-blog: https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/architecting-petabyte-scale-analytics-by-scaling-out-postgres-on/ba-p/969685 @talktosavjani
  11. How do you manage a distributed Citus cluster? Your new best friend to manage distributed systems @talktosavjani
  12. @tapoueh PGCONFINDIA ~ a 40% off discount on any edition. Source: theartofpostgresql.com @talktosavjani
  13. More popular than ever • Distributed PostgreSQL = need of the hour • Citus shards Postgres & enables distributed processing • Kubernetes = cluster management & portability • Microsoft loves Postgres @talktosavjani
  14. A good newsletter is like a good GIN index. Sign up for the Citus Newsletter @talktosavjani
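
Slide 9 shows an avg(spend) query from the application being answered by per-shard queries that return sum(spend) and count(spend). The reason for that rewrite is that averages of averages are not correct in general, while sums and counts can be added up and divided once at the end. The following is a minimal Python sketch of that idea only, not Citus's actual implementation. The shard names come from the slide; the row data, the function names `worker_partial` and `coordinator_merge`, and the way rows are spread across shards are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical shard contents as (company_id, spend) rows.
# Shard names follow the slide; the data itself is made up.
SHARDS = {
    "campaigns_2001": [(1, 100.0), (2, 50.0)],
    "campaigns_2009": [(1, 300.0), (3, 10.0)],
    "campaigns_2017": [(2, 150.0), (3, 30.0)],
}


def worker_partial(rows):
    """Worker step: per-shard sum(spend) and count(spend), grouped by company_id."""
    partial = defaultdict(lambda: [0.0, 0])
    for company_id, spend in rows:
        partial[company_id][0] += spend
        partial[company_id][1] += 1
    return {company_id: (s, c) for company_id, (s, c) in partial.items()}


def coordinator_merge(partials):
    """Coordinator step: add up partial sums and counts, then divide once."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for company_id, (s, c) in partial.items():
            totals[company_id][0] += s
            totals[company_id][1] += c
    return {company_id: s / c for company_id, (s, c) in totals.items()}


# Each shard could be processed on a different worker in parallel;
# here they are simply processed one after another.
partials = [worker_partial(rows) for rows in SHARDS.values()]
avg_campaign_spend = coordinator_merge(partials)
print(avg_campaign_spend)
```

In this sketch a company's rows deliberately span several shards so that the merge step has something to combine. If the table were distributed on company_id, all rows for one company would presumably sit in a single shard and the merge would reduce to a simple union of per-shard results.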